1. Introduction
Regular expressions confuse and frustrate a lot of developers because of their cryptic syntax.
A regular expression is a string that contains a search pattern (e.g. **[0-9]{2}**, a pattern that matches a two-digit number), and we use it to find that pattern in text.
2. How to Create a RegExp
In JavaScript we can create a RegExp in two ways:
// A regular expression literal in the form /<RegExp>/
**var twoDigitRegExp = /[0-9]{2}/;**
// Construct a new RegExp object from a string
**var twoDigitRegExp = new RegExp('[0-9]{2}');**
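As a small sketch of the two forms above (the variable names and sample strings are illustrative, not from the original example): both forms build the same pattern, but in the string form every backslash has to be doubled, and the constructor is useful when part of the pattern is only known at runtime.

```javascript
// Literal form: the pattern is written between slashes.
const literalRegExp = /[0-9]{2}/;

// Constructor form: the pattern is passed as a string.
const constructedRegExp = new RegExp('[0-9]{2}');

// Both patterns find two consecutive digits anywhere in the text.
console.log(literalRegExp.test('room 42'));     // true
console.log(constructedRegExp.test('room 42')); // true
console.log(literalRegExp.test('room 4'));      // false

// Inside a string a backslash must be escaped, so \d is written '\\d'.
const digitsFromString = new RegExp('\\d{2}');
console.log(digitsFromString.source); // \d{2}

// Building a pattern at runtime (minDigits is a hypothetical setting).
const minDigits = 3;
const dynamicRegExp = new RegExp('[0-9]{' + minDigits + '}');
console.log(dynamicRegExp.test('room 42'));  // false
console.log(dynamicRegExp.test('room 420')); // true
```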
3. RegExp Building Blocks
Quantifiers
**p*** --> The ***** matches the preceding character **p** zero or more times.
Example: **/js*/** matches j (s appears **zero** times), js (s appears **once**) and jss (s appears **twice**).
**p+** --> The **+** matches the preceding character **p** one or more times.
Example: **/js+/** does not match j (s appears **zero** times) but matches js (s appears **once**) and jss (s appears **twice**).
**p?** --> The **?** matches the preceding character **p** zero or one time.
Example: **/js?/** matches j (s appears **zero** times) and js (s appears **once**). In jss (s appears **twice**) it matches only the leading js, so the anchored form **/^js?$/** is needed to reject jss as a whole.
**p{n}** --> The **{n}** matches the preceding character **p** exactly n times.
Example: **/js{1}/** matches js (s appears **once**). In jss (s appears **twice**) it matches only the leading js, so the anchored form **/^js{1}$/** is needed to reject jss as a whole.
**p{n,}** --> The **{n,}** matches the preceding character **p** at least n times.
**p{x,y}** --> The **{x,y}** matches the preceding character **p** from x to y times.
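The quantifiers above can be compared side by side with a short sketch. It assumes the patterns are anchored with ^ and $ so that the whole word has to match; without anchors, test() also returns true when only part of the word matches. The word list and the labels are illustrative.

```javascript
// Candidate words with zero, one, two and three trailing s characters.
const words = ['j', 'js', 'jss', 'jsss'];

// ^ and $ anchor each pattern to the start and end of the word.
const patterns = {
  'js*': /^js*$/,         // zero or more s
  'js+': /^js+$/,         // one or more s
  'js?': /^js?$/,         // zero or one s
  'js{1}': /^js{1}$/,     // exactly one s
  'js{2,}': /^js{2,}$/,   // at least two s
  'js{1,2}': /^js{1,2}$/  // from one to two s
};

for (const [label, regExp] of Object.entries(patterns)) {
  const accepted = words.filter((word) => regExp.test(word));
  console.log(label + ' accepts: ' + accepted.join(', '));
}
```

Run as written, this prints:

js* accepts: j, js, jss, jsss
js+ accepts: js, jss, jsss
js? accepts: j, js
js{1} accepts: js
js{2,} accepts: jss, jsss
js{1,2} accepts: js, jss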
Character classes
**/.p/** --> The **.** matches any single character except a line break. Example: **/.p/** matches 'ep' in **keep** and **deep** but does not match **js**.
**/\d/** --> The **\d** matches any digit [0-9]. Example: **/festival\d/** matches festival2020 and festival2019 but not festivalBeer.
**/\D/** --> The **\D** matches any character that is **not** a digit [0-9]. Example: **/festival\D/** matches festivalBeer but not festival2020.
**/\w/** --> The **\w** matches any Latin letter, digit or underscore, i.e. [A-Za-z0-9_]. Example: **/festival\w/** matches festival2020 and festival2019 but not festival$$.
**/\W/** --> The **\W** matches any character that is **not** a Latin letter, digit or underscore. Example: **/festival\W/** matches festival#$ but not festival2019.
**/\s/** --> The **\s** matches a single white space character, including space, tab, form feed, line feed, and other Unicode spaces. Example: **/\s2020/** matches ' 2020' in 'festival 2020'.
**/\S/** --> The **\S** matches a single character other than white space. Example: **/\S\w*/** matches 'festival' in 'festival 2020'.
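To see exactly which part of the text each character class picks up, here is a minimal sketch. The helper firstMatch is hypothetical (it is not part of JavaScript); it wraps String.prototype.match, which returns the first match or null when the pattern has no g flag.

```javascript
// Hypothetical helper: returns the matched substring, or null if nothing matches.
const firstMatch = (regExp, text) => {
  const result = text.match(regExp);
  return result === null ? null : result[0];
};

console.log(firstMatch(/.p/, 'keep'));                 // ep
console.log(firstMatch(/.p/, 'js'));                   // null
console.log(firstMatch(/festival\d/, 'festival2020')); // festival2
console.log(firstMatch(/festival\d/, 'festivalBeer')); // null
console.log(firstMatch(/festival\D/, 'festivalBeer')); // festivalB
console.log(firstMatch(/festival\w/, 'festival$$'));   // null
console.log(firstMatch(/festival\W/, 'festival#$'));   // festival#
console.log(firstMatch(/\S\w*/, 'festival 2020'));     // festival
```

Note that a class such as \d consumes a single character, so /festival\d/ matches 'festival2' rather than the whole 'festival2020'.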
Groups and ranges
**/gin|wine/** --> The **|** is the **or** operator. Example: **/gin|wine/** matches 'gin' in 'gin is my favorite drink' and 'wine' in 'wine is my favorite drink'.
**/[A-Z]/** --> The **[]** defines a character set and matches any single character in the set. For example, [A-Z] matches any character from A to Z and [@#] matches only @ and #. Example: **/[@#]\w*/g** matches '@wine' and '#scotch' in '@wine $gin #scotch'.
**/[^A-Z]/** --> A **^** at the start of a character set is the **not in** operator: [^A-Z] matches any single character that is not in the range A to Z.
**/(@|#)/** --> The **()** creates a capturing group. Example: **/(@|#)(gin|wine)/g** matches '@wine' and '@gin' in '@wine @gin scotch'.
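A short sketch of these four building blocks in action. The sample strings are illustrative; the point worth noticing is that a capturing group makes the matched sub-parts available separately in the result of exec.

```javascript
const text = '@wine $gin #scotch';

// Alternation: matches either word.
console.log('gin is my favorite drink'.match(/gin|wine/)[0]); // gin

// Character set: words that start with @ or #.
console.log(text.match(/[@#]\w*/g)); // [ '@wine', '#scotch' ]

// Negated set: remove every character that is not an uppercase letter.
console.log('JavaScript'.replace(/[^A-Z]/g, '')); // JS

// Capturing groups: index 0 is the whole match, 1 and 2 are the groups.
const result = /(@|#)(gin|wine)/.exec('@wine @gin scotch');
console.log(result[0]); // @wine
console.log(result[1]); // @
console.log(result[2]); // wine
```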
Regular expression flags
**/(@|#)(gin|wine)/g** --> Global search: with the **g flag** the RegExp finds every match instead of stopping at the first one. Example: **/(@|#)(gin|wine)/g** matches '@wine' and '@gin' in '@wine @gin scotch', but without the **g flag** it matches only the first word, '@wine'.
**/(@|#)(gin|wine)/i** --> Case-insensitive search with the **i flag**. Example: **/(@|#)(gin|wine)/i** matches '@WINE' in '@WINE @gin scotch'.
**/(@|#)(gin|wine)/gi** --> Combining the **g and i flags** finds every match and ignores case. Example: **/(@|#)(gin|wine)/gi** matches '@WINE' and '@gin' in '@WINE @gin scotch'.
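The effect of each flag is easy to check with String.prototype.match. This is a minimal sketch using the same sample text as the examples above; the variable name is illustrative.

```javascript
const text = '@WINE @gin scotch';

// No flags: case-sensitive, first match only ('@WINE' is skipped).
console.log(text.match(/(@|#)(gin|wine)/)[0]);  // @gin

// g: every case-sensitive match.
console.log(text.match(/(@|#)(gin|wine)/g));    // [ '@gin' ]

// i: first match, ignoring case.
console.log(text.match(/(@|#)(gin|wine)/i)[0]); // @WINE

// gi: every match, ignoring case.
console.log(text.match(/(@|#)(gin|wine)/gi));   // [ '@WINE', '@gin' ]
```

With the g flag, match returns a plain array of matched strings; without it, the result also carries the capturing groups.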
RegExp Assertions
**@(?=gin)** --> **Lookahead assertion:** Matches '@' only if '@' **is** followed by 'gin'.
**@(?!gin)** --> **Negative lookahead assertion:** Matches '@' only if '@' **is not** followed by 'gin'.
**(?<=gin)@** --> **Lookbehind assertion:** Matches '@' only if '@' **is** preceded by 'gin'.
**(?<!gin)@** --> **Negative lookbehind assertion:** Matches '@' only if '@' **is not** preceded by 'gin'.
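The four assertions can be told apart by looking at where each one finds an '@'. The sketch below is illustrative: indexesOf is a hypothetical helper built on String.prototype.matchAll (which requires the g flag), and the sample text is made up so that every case occurs once.

```javascript
// '@' appears at index 0 (before gin), 5 (before wine),
// 14 (after gin) and 20 (after wine).
const text = '@gin @wine gin@ wine@';

// Hypothetical helper: positions of every match of regExp in input.
const indexesOf = (regExp, input) =>
  [...input.matchAll(regExp)].map((match) => match.index);

console.log(indexesOf(/@(?=gin)/g, text));  // [ 0 ]
console.log(indexesOf(/@(?!gin)/g, text));  // [ 5, 14, 20 ]
console.log(indexesOf(/(?<=gin)@/g, text)); // [ 14 ]
console.log(indexesOf(/(?<!gin)@/g, text)); // [ 0, 5, 20 ]
```

In every case the match itself is only the '@' character; the text inside the assertion is checked but not consumed.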
4. Real Examples
Regular expressions are used mainly for two purposes:
- Input validation
- Search-Replace patterns in texts
Input Validation Examples
Validate an email with the following format:
- Name with at least 4 characters
- Name without the special characters < > ( ) [ ] \ / . , ; : @ " and without white space
- E-mail must have @
- Domain name with at least 4 characters
- Domain name without the special characters < > ( ) [ ] \ / . , ; : @ " and without white space
- Domain extension only .com or .net
// ^ and $ make the pattern cover the whole input.
// [^...]{4,} means at least 4 characters, none of them from the special list.
var regExpEmail = /^[^<>()\[\]\\\/.,;:\s@"]{4,}@[^<>()\[\]\\\/.,;:\s@"]{4,}\.(com|net)$/;
var email1 = 'petranb2@gmail.com';
var email2 = 'petran@pkcoding.net';
var email3 = 'petran@pkcoding.org';
var email4 = 'pe<>ran@pkcoding.org';
console.log(`Test ${email1}:` + regExpEmail.test(email1));
console.log(`Test ${email2}:` + regExpEmail.test(email2));
console.log(`Test ${email3}:` + regExpEmail.test(email3));
console.log(`Test ${email4}:` + regExpEmail.test(email4));
**<-- Console output -->**
Test petranb2@gmail.com:true
Test petran@pkcoding.net:true
Test petran@pkcoding.org:false
Test pe<>ran@pkcoding.org:false
Validate a password with the following format:
- Password with at least 6 characters
- At least one lowercase letter
- At least one uppercase letter
- At least one digit
- At least one special character from ! @ # $ % ^ & *
// Each lookahead checks one rule from the start of the input (^).
var regExpPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*])(?=.{6,})/;
var password1 = 'J@v@scr1pt';
var password2 = 'N0d3js';
var password3 = 'umbr3ll@';
var password4 = 'A1rpl@ne';
console.log(`Test ${password1}:` + regExpPassword.test(password1));
console.log(`Test ${password2}:` + regExpPassword.test(password2));
console.log(`Test ${password3}:` + regExpPassword.test(password3));
console.log(`Test ${password4}:` + regExpPassword.test(password4));
**<-- Console output -->**
Test J@v@scr1pt:true
Test N0d3js:false
Test umbr3ll@:false
Test A1rpl@ne:true
Validate a product code with the following format:
- Code must have format (XXX)-XXX
- Each X must be a number
// \( and \) match literal parentheses; the unescaped ones are capturing groups.
var regExpProductCode = /^\((\d{3})\)-(\d{3})$/;
var product1 = '(854)-458';
var product2 = '(1234)-4sw';
var product3 = '256-789';
var product4 = '(123)-456';
console.log(`Test ${product1}:` + regExpProductCode.test(product1));
console.log(`Test ${product2}:` + regExpProductCode.test(product2));
console.log(`Test ${product3}:` + regExpProductCode.test(product3));
console.log(`Test ${product4}:` + regExpProductCode.test(product4));
**<-- Console output -->**
Test (854)-458:true
Test (1234)-4sw:false
Test 256-789:false
Test (123)-456:true
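Beyond a yes/no answer, the capturing groups in a product code pattern can also return the two number blocks. The following is a sketch under the assumption that the literal parentheses are escaped as \( and \); parseProductCode and the prefix/suffix property names are hypothetical and only serve the illustration.

```javascript
// \( and \) are literal parentheses; (\d{3}) are the capturing groups.
const productCodePattern = /^\((\d{3})\)-(\d{3})$/;

// Hypothetical helper: returns both number blocks, or null for an invalid code.
function parseProductCode(code) {
  const result = productCodePattern.exec(code);
  if (result === null) {
    return null;
  }
  return { prefix: result[1], suffix: result[2] };
}

console.log(parseProductCode('(854)-458')); // { prefix: '854', suffix: '458' }
console.log(parseProductCode('256-789'));   // null
```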
Search-Replace patterns in texts
Find two or more spaces after a dot and replace them with a single space.
var regExpText = /\. {2,}(?=[A-Za-z])/;
var text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.  Nunc';
console.log(`Test ${text}:<<` + regExpText.exec(text) + `>>`);
console.log(text.replace(regExpText, '. '));
**<-- Console output -->**
Test Lorem ipsum dolor sit amet, consectetur adipiscing elit.  Nunc:<<.  >>
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc
Find a lowercase letter that starts a sentence (after a dot and white space) and replace it with its uppercase form.
var regExpText = /(?<=\.\s{1,})[a-z]/;
var text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. nunc';
var lowerCaseChar = regExpText.exec(text)[0];
console.log(`Test ${text}:<<` + lowerCaseChar + `>>`);
console.log(text.replace(regExpText, lowerCaseChar.toUpperCase()));
**<-- Console output -->**
Test Lorem ipsum dolor sit amet, consectetur adipiscing elit. nunc:<<n>>
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc
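The two search-replace examples handle a single occurrence. As a possible extension (a sketch, not part of the original examples), the g flag combined with a replacer function lets String.prototype.replace fix every sentence in one pass. The sample text and variable names are made up.

```javascript
const text = 'Lorem ipsum.  dolor sit amet. consectetur elit.   nunc';

// Step 1: collapse two or more spaces after a dot into one space.
const singleSpaced = text.replace(/\. {2,}(?=[A-Za-z])/g, '. ');

// Step 2: uppercase every lowercase letter that follows a dot and white space.
// The replacer function receives each matched letter in turn.
const capitalized = singleSpaced.replace(
  /(?<=\.\s+)[a-z]/g,
  (letter) => letter.toUpperCase()
);

console.log(singleSpaced); // Lorem ipsum. dolor sit amet. consectetur elit. nunc
console.log(capitalized);  // Lorem ipsum. Dolor sit amet. Consectetur elit. Nunc
```

Using a replacer function avoids running exec separately to find the letter before replacing it.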
5. Security Tip
Regular expressions can be a great tool, but a badly written RegExp can be CPU greedy and block the Node.js event loop.
Read about how to prevent this in: How a RegEx can bring your Node.js service down (medium.com)
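To make the risk concrete, here is a small sketch of a well-known problematic shape: a quantifier nested inside another quantifier. On a backtracking engine such as the one Node.js uses by default, a non-matching input can be split in exponentially many ways before the match fails. The patterns, the input and the timeTest helper are illustrative assumptions; the input is kept short on purpose so that the demo finishes quickly.

```javascript
// Risky: (a+)+ can divide a run of a's between the inner and outer + in many ways.
const riskyRegExp = /^(a+)+$/;

// Accepts the same strings without nested quantifiers.
const saferRegExp = /^a+$/;

// The trailing '!' forces the match to fail after all alternatives are tried.
// Each extra 'a' roughly doubles the work for the risky pattern.
const input = 'a'.repeat(20) + '!';

// Hypothetical helper: runs test() and reports how long it took.
function timeTest(regExp, text) {
  const start = Date.now();
  const matched = regExp.test(text);
  return { matched: matched, milliseconds: Date.now() - start };
}

console.log(timeTest(riskyRegExp, input));
console.log(timeTest(saferRegExp, input));
```

Both calls report matched: false, but the risky pattern needs far more steps to get there. Besides rewriting such patterns, limiting the length of the input before testing it is a simple additional safeguard.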
6. Conclusion
Learning and using regular expressions can help you grow into a great software developer.
References:
Special Thanks to Liran Tal for the security advice
Thanks for reading my story
Feel free to comment, email me for any ideas, changes, etc.