
Regular Expressions Cheat Sheet in Node.js

A detailed story to learn, write and execute regular expressions easily




Photo by Gábor Molnár on Unsplash

1. Introduction

Regular expressions confuse and frustrate many developers because of their cryptic syntax.

A regular expression is a string that contains a search pattern (for example, **[0-9]{2}** matches a two-digit number), and we use it to find that pattern in text.

2. How to Create a RegExp

In JavaScript we can create a RegExp in two ways:

// A regular expression literal with the format /<RegExp>/
**var twoDigitRegExp = /[0-9]{2}/;**

// Construct a new RegExp object from a string
**var twoDigitRegExp = new RegExp('[0-9]{2}');**
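The two forms can be compared side by side. The following is a minimal sketch; the sample string and the `digits` value are made up for illustration. Note that a backslash has to be doubled when the pattern is written as a string.

```javascript
// Two equivalent ways to build the same pattern.
const literal = /\d{2}/;
// Inside a string, the backslash itself must be escaped.
const constructed = new RegExp('\\d{2}');

console.log(literal.source === constructed.source); // both hold the same pattern text
console.log(literal.test('room 42'), constructed.test('room 42'));

// The constructor is useful when part of the pattern is only known at runtime.
const digits = 3; // hypothetical value, e.g. read from configuration
const dynamic = new RegExp(`^[0-9]{${digits}}$`);
console.log(dynamic.test('123'), dynamic.test('12'));
```

The literal form is usually easier to read; the constructor is mainly worth it for patterns assembled from variables.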

3. RegExp Building Blocks

Quantifiers

**p*** --> The **\*** matches the preceding character **p** 0 or more times.
Example: **/js*/** accepts j (s appears **zero** times), js (s appears **one** time) and jss (s appears **two** times).

**p+** --> The **+** matches the preceding character **p** 1 or more times.
Example: **/js+/** does not accept j (s appears **zero** times) but accepts js (s appears **one** time) and jss (s appears **two** times).

**p?** --> The **?** matches the preceding character **p** 0 or 1 time.
Example: **/^js?$/** accepts j (s appears **zero** times) and js (s appears **one** time) but not jss (s appears **two** times). The **^** and **$** anchors force the whole string to match; without them, **/js?/** would still find 'js' inside 'jss'.

**p{n}** --> The **{n}** matches the preceding character **p** exactly n times.
Example: **/^js{1}$/** accepts js (s appears **one** time) but not jss (s appears **two** times).

**p{n,}** --> The **{n,}** matches the preceding character **p** at least n times.

**p{x,y}** --> The **{x,y}** matches the preceding character **p** from x to y times.
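The quantifiers above can be checked against a few sample strings. This is a small sketch, not part of the original cheat sheet; the `samples` list and the `{2,}` and `{1,2}` bounds are chosen only for illustration.

```javascript
// Anchors (^ and $) force the whole string to match, so the counts are visible.
const samples = ['j', 'js', 'jss', 'jsss'];
const patterns = {
  'js*': /^js*$/,
  'js+': /^js+$/,
  'js?': /^js?$/,
  'js{1}': /^js{1}$/,
  'js{2,}': /^js{2,}$/,
  'js{1,2}': /^js{1,2}$/,
};

// For each pattern, collect the samples it accepts.
const accepted = {};
for (const [label, re] of Object.entries(patterns)) {
  accepted[label] = samples.filter((s) => re.test(s));
  console.log(label, '->', accepted[label].join(', '));
}
```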

Character classes

**/.p/** --> The **.** matches any single character except a line break. Example: **/.p/** accepts **keep** and **deep** (one character followed by p) but not **js**.

**/\d/** --> The **\d** matches any digit from 0 to 9. Example: **/festival\d/** accepts festival2020 and festival2019 but not festivalBeer.

**/\D/** --> The **\D** matches any character that is **not** a digit from 0 to 9. Example: **/festival\D/** accepts festivalBeer but not festival2020.

**/\w/** --> The **\w** matches any character **in** the basic Latin alphabet, any digit, and the underscore. Example: **/festival\w/** accepts festival2020 and festival2019 but not festival$$.

**/\W/** --> The **\W** matches any character that is **not** a basic Latin letter, a digit, or the underscore. Example: **/festival\W/** accepts festival#$ but not festival2019.

**/\s/** --> The **\s** matches a single white space character, including space, tab, form feed, line feed, and other Unicode spaces. Example: **/\s\w*/** matches ' 2020' in 'festival 2020'.

**/\S/** --> The **\S** matches a single character other than white space. Example: **/\S\w*/** matches 'festival' in 'festival 2020'.
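A quick way to get a feel for these classes is to run them against a few strings. The sketch below reuses the festival examples; the sample 'festival_20' is added here only to show that the underscore counts as a word character.

```javascript
// Each pair is a pattern and a sample string to test it against.
const checks = [
  [/festival\d/, 'festival2020'], // a digit follows "festival"
  [/festival\d/, 'festivalBeer'], // "B" is not a digit
  [/festival\D/, 'festivalBeer'], // "B" is a non-digit
  [/festival\w/, 'festival_20'],  // underscore counts as a word character
  [/festival\w/, 'festival$$'],   // "$" is not a word character
  [/festival\W/, 'festival#$'],   // "#" is a non-word character
];
const results = checks.map(([re, sample]) => re.test(sample));
checks.forEach(([re, sample], i) => console.log(String(re), sample, results[i]));

// match() returns the matched text, which shows what \s and \S consume.
const phrase = 'festival 2020';
const fromWhitespace = phrase.match(/\s\w*/)[0];    // starts at the space
const fromNonWhitespace = phrase.match(/\S\w*/)[0]; // starts at the first letter
console.log(JSON.stringify(fromWhitespace), JSON.stringify(fromNonWhitespace));
```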

Groups and ranges

**/gin|wine/** --> The **|** is the **or** operator. Example: **/gin|wine/** matches either 'gin' or 'wine'; in 'gin is my favorite drink' it matches 'gin'. The match is case sensitive, so it would not match 'Gin'.

**/[A-Z]/** --> The **[]** defines a character set and matches any single character in the set. For example, [A-Z] matches any character from A to Z, and [@#] matches only @ and #. Example: **/[@#]\w*/g** matches '@wine' and '#scotch' in '@wine $gin #scotch'.

**/[^A-Z]/** --> A **^** at the start of a character set is the **not in** operator: the set matches any character that is not listed in it.

**/(@|#)/** --> The **()** defines a capturing group. Example: **/(@|#)(gin|wine)/g** on '@wine @gin scotch' matches '@wine' and '@gin'.
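Sets, negated sets and capturing groups can be tried out as follows. This is an illustrative sketch; the `drinks` string and the 'Node.JS' sample are made up, and `matchAll` is used here as one possible way to read the captured groups.

```javascript
const drinks = '@wine $gin #scotch @gin';

// Character set: words that start with @ or #.
const tags = drinks.match(/[@#]\w*/g);
console.log(tags);

// Negated set: every character that is not an uppercase letter.
const notUpper = 'Node.JS'.match(/[^A-Z]/g);
console.log(notUpper.join(''));

// Capturing groups: index 1 holds the symbol, index 2 holds the drink.
const captured = [...drinks.matchAll(/(@|#)(gin|wine)/g)].map((m) => ({
  symbol: m[1],
  drink: m[2],
}));
console.log(captured);
```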

Regular expression flags

**/(@|#)(gin|wine)/g** --> Global search: with the **g flag** the RegExp finds every match instead of stopping at the first one. Example: **/(@|#)(gin|wine)/g** on '@wine @gin scotch' matches '@wine' and '@gin', but without the **g flag** it matches only the first word, '@wine'.

**/(@|#)(gin|wine)/i** --> Case-insensitive search with the **i flag**. Example: **/(@|#)(gin|wine)/i** on '@WINE @gin scotch' matches '@WINE'.

**/(@|#)(gin|wine)/gi** --> Combining the **g and i flags** finds all matches regardless of case. Example: **/(@|#)(gin|wine)/gi** on '@WINE @gin scotch' matches '@WINE' and '@gin'.
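The effect of each flag is easiest to see by running the same pattern with different flags on one string. A minimal sketch using `String.prototype.match`:

```javascript
const sentence = '@WINE @gin scotch';

// No flags: case sensitive, stops at the first match (so '@WINE' is skipped).
const noFlag = sentence.match(/(@|#)(gin|wine)/);
// g: every case-sensitive match.
const globalMatches = sentence.match(/(@|#)(gin|wine)/g);
// i: first match, ignoring case.
const insensitive = sentence.match(/(@|#)(gin|wine)/i);
// gi: every match, ignoring case.
const both = sentence.match(/(@|#)(gin|wine)/gi);

console.log(noFlag[0]);
console.log(globalMatches);
console.log(insensitive[0]);
console.log(both);
```

Without the g flag, `match` returns one result together with its captured groups; with the g flag it returns only the list of matched strings.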

RegExp Assertions

**@(?=gin)** --> **Lookahead assertion:** Matches '@' only if '@' **is** followed by 'gin'.

**@(?!gin)** --> **Negative lookahead assertion:** Matches '@' only if '@' **is not** followed by 'gin'.

**(?<=gin)@** --> **Lookbehind assertion:** Matches '@' only if '@' **is** preceded by 'gin'.

**(?<!gin)@** --> **Negative lookbehind assertion:** Matches '@' only if '@' **is not** preceded by 'gin'.
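To see which '@' each assertion selects, the sketch below replaces the matched '@' with '#'. The sample string is made up for illustration, and lookbehind needs a reasonably recent Node.js version.

```javascript
const sample = '@gin @wine gin@ wine@';

// Lookahead: only the '@' directly followed by 'gin'.
const lookahead = sample.replace(/@(?=gin)/g, '#');
// Negative lookahead: every '@' not followed by 'gin'.
const negativeLookahead = sample.replace(/@(?!gin)/g, '#');
// Lookbehind: only the '@' directly preceded by 'gin'.
const lookbehind = sample.replace(/(?<=gin)@/g, '#');
// Negative lookbehind: every '@' not preceded by 'gin'.
const negativeLookbehind = sample.replace(/(?<!gin)@/g, '#');

console.log(lookahead);
console.log(negativeLookahead);
console.log(lookbehind);
console.log(negativeLookbehind);
```

The assertions only check the surrounding text; 'gin' itself is never part of the match, which is why it stays untouched in the replaced strings.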


4. Real Examples

Regular expressions are mainly used for two tasks:

  1. Input validation

  2. Search-Replace patterns in texts

Input Validation Examples

Validate an email with the following format:

  1. Name with at least 4 characters

  2. Name without the special characters < > ( ) [ ] \ / . , ; : @ " or white space

  3. E-mail must have @

  4. Domain name with at least 4 characters

  5. Domain name without the special characters < > ( ) [ ] \ / . , ; : @ " or white space

  6. Domain extension only .com or .net

    // A negated character set excludes the special characters;
    // ^ and $ make sure the whole string is checked.
    var regExpEmail = /^[^<>()\[\]\\\/.,;:\s@"]{4,}@[^<>()\[\]\\\/.,;:\s@"]{4,}\.(com|net)$/;

    var email1 = 'petranb2@gmail.com';
    var email2 = 'petran@pkcoding.net';
    var email3 = 'petran@pkcoding.org';
    var email4 = 'pe<>ran@pkcoding.org';

    console.log(`Test ${email1}:` + regExpEmail.test(email1));
    console.log(`Test ${email2}:` + regExpEmail.test(email2));
    console.log(`Test ${email3}:` + regExpEmail.test(email3));
    console.log(`Test ${email4}:` + regExpEmail.test(email4));

**<-- Console output -->**

    Test petranb2@gmail.com:true
    Test petran@pkcoding.net:true
    Test petran@pkcoding.org:false
    Test pe<>ran@pkcoding.org:false

Validate a password with the following format:

  1. Password at least 6 characters

  2. At least one lowercase letter

  3. At least one uppercase letter

  4. At least one digit

  5. At least one special character from ! @ # $ % ^ & *

    // Each lookahead checks one rule without consuming any characters.
    var regExpPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*])(?=.{6,})/;

    var password1 = 'J@v@scr1pt';
    var password2 = 'N0d3js';
    var password3 = 'umbr3ll@';
    var password4 = 'A1rpl@ne';

    console.log(`Test ${password1}:` + regExpPassword.test(password1));
    console.log(`Test ${password2}:` + regExpPassword.test(password2));
    console.log(`Test ${password3}:` + regExpPassword.test(password3));
    console.log(`Test ${password4}:` + regExpPassword.test(password4));

**<-- Console output -->**

    Test J@v@scr1pt:true
    Test N0d3js:false
    Test umbr3ll@:false
    Test A1rpl@ne:true
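A single pattern only says whether a password passes or fails. When the user should be told which rule failed, one option is to keep one small pattern per rule. The following is a sketch under that assumption; `passwordRules` and `failedRules` are hypothetical names, not part of the original example.

```javascript
// One small pattern per rule, so a failure can be reported by name.
const passwordRules = [
  { name: 'lowercase letter', pattern: /[a-z]/ },
  { name: 'uppercase letter', pattern: /[A-Z]/ },
  { name: 'digit', pattern: /[0-9]/ },
  { name: 'special character', pattern: /[!@#$%^&*]/ },
  { name: 'at least 6 characters', pattern: /^.{6,}$/ },
];

// Hypothetical helper: returns the names of the rules the password breaks.
function failedRules(password) {
  return passwordRules
    .filter((rule) => !rule.pattern.test(password))
    .map((rule) => rule.name);
}

for (const password of ['J@v@scr1pt', 'N0d3js', 'umbr3ll@', 'A1rpl@ne']) {
  const failed = failedRules(password);
  console.log(password, failed.length === 0 ? 'ok' : 'missing: ' + failed.join(', '));
}
```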

Validate a product code with the following format:

  1. Code must have format (XXX)-XXX

  2. Each X must be a number

    // \( and \) match literal parentheses; the unescaped ( ) are capturing groups.
    var regExpProductCode = /^\((\d{3})\)-(\d{3})$/;

    var product1 = '(854)-458';
    var product2 = '(1234)-4sw';
    var product3 = '256-789';
    var product4 = '(123)-456';

    console.log(`Test ${product1}:` + regExpProductCode.test(product1));
    console.log(`Test ${product2}:` + regExpProductCode.test(product2));
    console.log(`Test ${product3}:` + regExpProductCode.test(product3));
    console.log(`Test ${product4}:` + regExpProductCode.test(product4));

**<-- Console output -->**

    Test (854)-458:true
    Test (1234)-4sw:false
    Test 256-789:false
    Test (123)-456:true
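Besides validating, the capturing groups in a product code pattern can be used to read the two number parts. A minimal sketch, assuming a hypothetical helper named `parseProductCode` and hypothetical field names `prefix` and `suffix`:

```javascript
// Escaped parentheses match the literal ( and ); the plain ones capture digits.
const productCodePattern = /^\((\d{3})\)-(\d{3})$/;

// Hypothetical helper: returns both parts of a valid code, or null otherwise.
function parseProductCode(code) {
  const match = productCodePattern.exec(code);
  return match ? { prefix: match[1], suffix: match[2] } : null;
}

console.log(parseProductCode('(854)-458'));
console.log(parseProductCode('(1234)-4sw'));
console.log(parseProductCode('256-789'));
```

`exec` returns null when nothing matches, so the same call covers validation and extraction.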

Search-Replace patterns in texts

Search for a dot followed by 2 or more spaces and replace the spaces with 1 space.

    // \. matches a literal dot; the lookahead requires a letter after the spaces.
    var regExpText = /\. {2,}(?=[A-Za-z])/;

    var text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.   Nunc';

    console.log(`Test ${text}:<<` + regExpText.exec(text) + `>>`);

    console.log(text.replace(regExpText, '. '));

**<-- Console output -->**

    Test Lorem ipsum dolor sit amet, consectetur adipiscing elit.   Nunc:<<.   >>
    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc

Search for a lowercase letter that begins a sentence after a dot and replace it with its uppercase form.

    // The lookbehind requires a dot and at least one white space before the letter.
    var regExpText = /(?<=\.\s{1,})[a-z]/;

    var text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. nunc';

    console.log(`Test ${text}:<<` + regExpText.exec(text) + `>>`);

    var lowerCaseChar = regExpText.exec(text)[0];

    console.log(text.replace(regExpText, lowerCaseChar.toUpperCase()));

**<-- Console output -->**

    Test Lorem ipsum dolor sit amet, consectetur adipiscing elit. nunc:<<n>>
    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc
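Both clean-ups can be applied to a whole text in one pass by adding the g flag and passing a function to `replace`. This is a sketch on a made-up sample string, not the article's exact code.

```javascript
const draft = 'lorem ipsum.  dolor sit amet.   consectetur elit. nunc';

const cleaned = draft
  // Collapse 2 or more spaces after a dot into a single space.
  .replace(/\. {2,}(?=[A-Za-z])/g, '. ')
  // Uppercase every lowercase letter that follows a dot and white space.
  .replace(/(?<=\.\s+)[a-z]/g, (letter) => letter.toUpperCase());

console.log(cleaned);
```

The replacement function receives each matched letter, so there is no need to call `exec` first to find out which character to uppercase.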

5. Security Tip

Regular expressions are a powerful tool, but a badly written one can consume a lot of CPU time and block the Node.js event loop.

Read about how to prevent this in: How a RegEx can bring your Node.js service down (medium.com)
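A common cause of this problem is a quantifier nested inside another quantifier. The sketch below compares such a pattern with a simpler one that accepts the same strings; `timedTest` is a hypothetical helper, and the input is kept deliberately short so the example finishes quickly.

```javascript
// Nested quantifiers such as (a+)+ can split the same input in very many ways.
const risky = /^(a+)+$/;
// The same set of strings, described with a single quantifier.
const safer = /^a+$/;

// Hypothetical helper: run a pattern and measure how long test() takes.
function timedTest(pattern, input) {
  const start = process.hrtime.bigint();
  const matched = pattern.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, elapsedMs };
}

// An input that almost matches: the trailing "!" forces backtracking.
// Keep it short; each extra "a" roughly doubles the work for `risky`.
const almost = 'a'.repeat(18) + '!';
const riskyRun = timedTest(risky, almost);
const saferRun = timedTest(safer, almost);
console.log('risky:', riskyRun.matched, riskyRun.elapsedMs.toFixed(3), 'ms');
console.log('safer:', saferRun.matched, saferRun.elapsedMs.toFixed(3), 'ms');
```

Both patterns give the same answer, but only the risky one slows down sharply as the input grows. Limiting input length before testing is another simple safeguard.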

6. Conclusion

Learning and using regular expressions can help you become a better software developer.

References:

  1. MDN web docs

  2. Wiki

Special Thanks to Liran Tal for the security advice

Thanks for reading my story

Feel free to comment or email me with any ideas, changes, etc.
