Regular Expressions: Brackets

Quick Reference and Refresher Part Two


Photo by Maksym Ivashchenko on Unsplash

Find the needle in this photo.

Regex in JavaScript

Since I am using JavaScript in my code examples, I thought it appropriate to give a quick overview of how regular expressions work in this language. In JavaScript, regexes are objects of the type RegExp, so they have a prototype with associated properties and methods. That also means you might encounter them written as either a literal or a constructor call. A regex literal is just what we have been using, /slashes with rules inside/, and a constructor call looks like this:

new RegExp(/rules go here/, 'optional flags');
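One reason to reach for the constructor is that the pattern can be assembled from a string at runtime, which a literal cannot do. A minimal sketch, where the names `literal`, `animal`, and `built` are made up for illustration:

```javascript
// A literal is fixed when the script is written.
const literal = /cat/i;

// The constructor can build the same rule from a string at runtime.
const animal = 'cat'; // hypothetical value, e.g. something a user typed
const built = new RegExp(animal, 'i');

console.log(literal.source === built.source); // true
console.log(built.flags);                     // 'i'
console.log(built.test('Concatenate'));       // true
```

Note that when the pattern is passed as a string, any backslash in it has to be doubled ('\\s' rather than '\s'), because the string itself consumes one level of escaping.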

Some common string methods you might use with regular expressions are match, replace, and split. The first takes the form str.match(regex) and returns an array describing the match (or, with the g flag, an array of every match), or null if none is found. The next looks like str.replace(regex, repl), which returns a new string with repl substituted for the first match, or for every match if the g flag is set. Finally, split breaks a string into an array and is usually just given a delimiter like a space or a comma, but a regex can be used as well.
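Here is a rough side-by-side of the three methods on one string. The sample sentence and variable names are invented for the example:

```javascript
const sentence = 'one fish, two fish';

// match without the g flag: the first match, plus details such as its index
const first = sentence.match(/fish/);
console.log(first[0], first.index); // 'fish' 4

// match with the g flag: an array of every match
const all = sentence.match(/fish/g);
console.log(all); // ['fish', 'fish']

// match returns null when nothing is found
const none = sentence.match(/bird/);
console.log(none); // null

// replace: only the first match is replaced unless the g flag is set
const replacedOnce = sentence.replace(/fish/, 'cat');
const replacedAll = sentence.replace(/fish/g, 'cat');
console.log(replacedOnce); // 'one cat, two fish'
console.log(replacedAll);  // 'one cat, two cat'

// split: the delimiter is a regex, here "comma and space" or "a single space"
const words = sentence.split(/, | /);
console.log(words); // ['one', 'fish', 'two', 'fish']
```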

Brackets

In this part we will just look at one group of symbols in depth, the brackets.

[ ]

Brackets indicate a set of characters to match. The set matches any single character listed between the brackets, and you can use a hyphen to define a range of characters.

'elephant'.match(/[abcd]/) // -> matches 'a'

Placing the ^ metacharacter first inside the brackets negates the set, so it matches any single character that is not listed.

'donkey'.match(/[^abcd]/) // -> matches 'o'

You will often see ranges covering the alphabet or all numerals, such as [A-Za-z] and [0-9]. Remember that these character sets are case sensitive unless you set the i flag.

'elephant'.match(/[a-d]/) // -> matches 'a'
'elephant'.match(/[A-D]/) // -> no match
'elephant'.match(/[A-D]/i) // -> matches 'a'
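A few everyday uses of character sets, as a small sketch. The phone number and other sample strings are placeholders:

```javascript
// A negated set with the g flag strips everything that is not a digit.
const phone = '(555) 010-9999';
const digits = phone.replace(/[^0-9]/g, '');
console.log(digits); // '5550109999'

// A plain set with the g and i flags collects every vowel, ignoring case.
const vowels = 'Elephant'.match(/[aeiou]/gi);
console.log(vowels); // ['E', 'e', 'a']

// A hyphen that comes first or last inside the brackets is a literal hyphen,
// not a range.
const hyphen = 'well-known'.match(/[a-]/)[0];
console.log(hyphen); // '-'
```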

{ }

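Curly braces are used to specify an exact number of times to match. They go after the thing to be repeated and apply only to the single character or group directly before them: /na{2}/ looks for an 'n' followed by two 'a's ('naa'), while /(na){2}/, using parentheses to group (covered further down), matches 'na' exactly twice.

'panama'.match(/(na){2}/) // -> no match
'banana'.match(/na{2}/) // -> no match, there is no 'naa'
'banana'.match(/(na){2}/) // -> matches 'nana'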

You can use these in conjunction with a comma to specify a range of amounts: {2,} means two or more times, and {2,4} means between two and four times.

'banana'.match(/a{2,4}/) // -> no match
'bananaa'.match(/a{2,4}/) // -> matches 'aa'
'bananaaa'.match(/a{2,4}/) // -> matches 'aaa'
'bananaaaa'.match(/a{2,4}/) // -> matches 'aaaa'
'bananaaaaaaaaaaa'.match(/a{2,4}/) // -> matches 'aaaa'
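Curly braces combine naturally with bracket sets. The sketch below uses a made-up seven-digit number format and also shows that a quantifier binds only to the token directly before it:

```javascript
// Exactly three digits, a hyphen, then exactly four digits.
const local = /[0-9]{3}-[0-9]{4}/;
console.log('call 555-0199 now'.match(local)[0]); // '555-0199'
console.log('call 55-0199 now'.match(local));     // null

// The quantifier applies to the single token before it...
console.log('haha'.match(/ha{2}/));      // null (this looks for 'haa')
// ...so wrap a longer sequence in parentheses to repeat all of it.
console.log('haha'.match(/(ha){2}/)[0]); // 'haha'

// {2,4} stops at four characters, {2,} has no upper bound.
const longRun = 'b' + 'a'.repeat(6);
console.log(longRun.match(/a{2,4}/)[0].length); // 4
console.log(longRun.match(/a{2,}/)[0].length);  // 6
```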

( )

Parentheses represent remembered matches, also known as capture groups. This is especially useful for find-and-replace operations or any time you need to do something with part of the match. When a match is remembered, you can refer to it in a replacement string with $n, starting with $1 (JavaScript accepts up to $99), or use $& to refer to the entire match.

'Firsty McLastname'.match(/([A-Za-z]+)\s([A-Za-z]+)/) // -> matches 'Firsty McLastname' with 'Firsty' remembered as $1 and 'McLastname' as $2

'Firsty McLastname'.replace(/([A-Za-z]+)\s([A-Za-z]+)/, '$1') // -> returns 'Firsty'

'Firsty McLastname'.replace(/([A-Za-z]+)\s([A-Za-z]+)/, '$2, $1') // -> returns 'McLastname, Firsty'

'Firstipher Lasterman'.replace(/([A-Za-z]+)\s([A-Za-z]+)/, '$&') // -> returns 'Firstipher Lasterman'
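Outside of replacement strings, the remembered matches are available as numbered entries on the array that match returns. A short sketch reusing the name pattern from above (the variable names are my own):

```javascript
const nameRule = /([A-Za-z]+)\s([A-Za-z]+)/;
const result = 'Firsty McLastname'.match(nameRule);

console.log(result[0]); // 'Firsty McLastname' (the entire match)
console.log(result[1]); // 'Firsty'
console.log(result[2]); // 'McLastname'

// Array destructuring names the pieces; the leading comma skips result[0].
const [, first, last] = result;
console.log(`${last}, ${first}`); // 'McLastname, Firsty'

// replace also accepts a function, which receives the match and each group.
const shouted = 'Firsty McLastname'.replace(
  nameRule,
  (whole, firstName, lastName) => `${lastName.toUpperCase()}, ${firstName}`
);
console.log(shouted); // 'MCLASTNAME, Firsty'
```

The function form is handy when the replacement needs logic that the $n placeholders cannot express, such as changing case.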

Instead of using numbers you can name the matched groups as well, with the syntax (?<name>...).

'Firsty McLastname'.replace(/(?<first>[A-Za-z]+)\s(?<last>[A-Za-z]+)/, '$<last>, $<first>') // -> returns 'McLastname, Firsty'

If you keep the result of match, the named groups also appear on its groups property, for example result.groups.myGroupName.
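To make the groups property concrete, here is a small sketch that pulls a date apart. The date string and the group names year, month, and day are invented for the example:

```javascript
const dateRule = /(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})/;
const found = 'Released 2019-07-04.'.match(dateRule);

console.log(found.groups.year);  // '2019'
console.log(found.groups.month); // '07'

// Object destructuring pulls out all the named groups at once.
const { year, month, day } = found.groups;
console.log(`${day}/${month}/${year}`); // '04/07/2019'

// Named groups are still numbered, so found[1] is the year as well.
console.log(found[1]); // '2019'
```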

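Parentheses are also used in regex to group parts of the expression together into subgroups, like in /(\$|¥)([0-9]+)\.([0-9]{2})/. The dollar sign and the period are escaped with a backslash because both are metacharacters on their own. This would match '$149.99' and remember the subgroups '$', '149', and '99'. But what if you don't want to remember the dollar sign? You can start the group with ?: to make it non-capturing.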

'¥2000.50'.match(/(?:\$|¥)([0-9]+)\.([0-9]{2})/) // -> matches '¥2000.50' but only remembers '2000' as $1 and '50' as $2
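Comparing the two versions side by side shows what ?: changes in the result. A minimal sketch with variable names of my own choosing:

```javascript
const capturing = '¥2000.50'.match(/(\$|¥)([0-9]+)\.([0-9]{2})/);
const nonCapturing = '¥2000.50'.match(/(?:\$|¥)([0-9]+)\.([0-9]{2})/);

// Index 0 is always the whole match; the rest are the remembered groups.
console.log(capturing.length);    // 4 -> '¥2000.50', '¥', '2000', '50'
console.log(nonCapturing.length); // 3 -> '¥2000.50', '2000', '50'

// Without the currency group, the amount moves up to position 1.
console.log(capturing[1]);    // '¥'
console.log(nonCapturing[1]); // '2000'
```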

One last example using split, where you can split a string but also return the matched results:

'Howard the Duck'.split(/the/)
// -> returns ['Howard ', ' Duck']

'Howard the Duck'.split(/(the)/)
// -> returns ['Howard ', 'the', ' Duck']
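Keeping the delimiters is useful when they carry meaning, for instance when breaking a small arithmetic expression into tokens. A sketch with a made-up input string:

```javascript
const sum = '12+7-3';

// No parentheses: the operators are thrown away.
const numbersOnly = sum.split(/[+-]/);
console.log(numbersOnly); // ['12', '7', '3']

// Capturing parentheses: the operators are kept between the numbers.
const tokens = sum.split(/([+-])/);
console.log(tokens); // ['12', '+', '7', '-', '3']

// A non-capturing group behaves like no parentheses at all.
const grouped = sum.split(/(?:[+-])/);
console.log(grouped); // ['12', '7', '3']
```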

Photo by Boris Smokrovic on Unsplash

Stay tuned for more regex fun.

Part 1: The Basics ~ You Are Here ~ Part 3: Operators ~ Part 4: Examples
