Regex
Related
- Character classes cheatsheet in the general Regex section.
Resources
- RegExp on Mozilla docs.
- JavaScript Regular Expressions on W3 Schools.
- JavaScript RegExp Reference on W3 Schools, which has a long reference list for more info.
Regex object
How to create a RegExp instance.
With a constructor
new RegExp("e");
new RegExp('ab+c', 'i') // string pattern as argument
new RegExp(/ab+c/, 'i') // regular expression literal as first argument - from ES6
Literal notation
/e/
/ab+c/i
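The constructor form is mostly useful when the pattern is only known at runtime. A minimal sketch, assuming a hypothetical `escapeRegExp` helper (not a built-in, just an illustration) that neutralises special characters before the string is passed to `new RegExp`:

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(value) {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = 'a.b';

// Escaped: the dot only matches a literal dot.
const literalDot = new RegExp(escapeRegExp(userInput), 'i');
// Unescaped: the dot matches any character.
const anyChar = new RegExp(userInput, 'i');

const results = [
  literalDot.test('A.B'), // true
  literalDot.test('axb'), // false
  anyChar.test('axb'),    // true
];
```

For a fixed pattern, the literal notation is usually shorter and avoids the extra string escaping.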
Use with string methods:
// inline
text.match(/my pattern/i);
// with object
const pattern = /my pattern/i;
text.match(pattern);
e.g.
const text = 'Rain and Spain'
text.match(/ai/gi)
// [ 'ai', 'ai' ]
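Using a character class. JavaScript does not support POSIX classes such as [[:blank:]], so spell out the characters instead, e.g. space or tab:
/[ \t]/g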
const y = "R096  bar.txt    fizz/foo.txt"
y.split(/\s+/)
// [ 'R096', 'bar.txt', 'fizz/foo.txt' ]
// These won't work so well:
y.split(" ")
// [ 'R096', '', 'bar.txt', '', '', '', 'fizz/foo.txt' ]
y.split(/" "/)
// [ 'R096  bar.txt    fizz/foo.txt' ]
y.split(/\s/)
// [ 'R096', '', 'bar.txt', '', '', '', 'fizz/foo.txt' ]
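One more case that may be worth keeping in mind: splitting on `/\s+/` still leaves empty strings when the input has leading or trailing whitespace. A small sketch, using a made-up `padded` string, that trims first:

```javascript
const padded = '  a  b ';

// Leading and trailing whitespace produce empty entries at the ends.
const raw = padded.split(/\s+/);
// [ '', 'a', 'b', '' ]

// Trimming before the split avoids them.
const clean = padded.trim().split(/\s+/);
// [ 'a', 'b' ]
```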
Flags
/\w+/
Or
new RegExp('\\w+')
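The two forms above describe the same pattern; the constructor takes a string, so the backslash has to be doubled. Flags go after the closing slash of a literal, or in the second argument of the constructor. A brief sketch using the common `g`, `i` and `m` flags:

```javascript
const literal = /\w+/gi;
const constructed = new RegExp('\\w+', 'gi');

const sameSource = literal.source === constructed.source; // true
const flags = constructed.flags;                          // 'gi'

// m makes ^ and $ match at line breaks as well as at the ends of the string.
const lineStarts = 'one\ntwo'.match(/^\w+/gm);
// [ 'one', 'two' ]
```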
Search method
A method on a string.
This method searches a string for a match and returns the index of the first match, or -1 if there is none.
The search is case-sensitive unless the pattern uses the i flag.
text.search(pattern)
Examples
const text = "My Example";
text.search(/example/i);
// 3
The match starts at index 3.
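To illustrate the return values, here is a short sketch comparing the same search with and without the `i` flag, plus a pattern that is not present at all:

```javascript
const text = 'My Example';

const caseSensitive = text.search(/example/);    // -1, no match without the i flag
const caseInsensitive = text.search(/example/i); // 3
const missing = text.search(/xyz/i);             // -1
```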
Match method
A method on a string.
Find a match in text and get back an array-like match object, or null if nothing matches.
const pattern = new RegExp("e");
const text = "The best things in life are free";
const matches = text.match(pattern);
// ["e", index: 2, input: "The best things in life are free", groups: undefined]
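Since `match` returns `null` when nothing matches, it can help to guard before reading from the result. A minimal sketch of one way to do that:

```javascript
const text = 'The best things in life are free';

const found = text.match(/e/);
const notFound = text.match(/z/); // null

// Guard against null before reading attributes.
const firstIndex = found ? found.index : -1; // 2
const safeLength = (notFound ?? []).length;  // 0
```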
Test method
A method on a regex object.
Check whether a string matches a regex pattern and return true or false.
patt.test(text)
This is equivalent to using match and then !! to convert the result to a boolean.
!!text.match(patt)
Examples
/Hello/.test("Hello world!")
// true
From W3 Schools.
const patt = /e/;
const text = "The best things in life are free";
const result = patt.test(text);
// true
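One caveat that seems worth a sketch: when the pattern has the `g` flag, `test` is stateful. It resumes from the regex's `lastIndex`, so repeated calls on the same string can give different answers. The variable names here are just for illustration:

```javascript
const globalPatt = /e/g;
const text = 'free';

const first = globalPatt.test(text);  // true, lastIndex is now 3
const second = globalPatt.test(text); // true, lastIndex is now 4
const third = globalPatt.test(text);  // false, lastIndex resets to 0

// Without the g flag every call starts from the beginning.
const plainPatt = /e/;
const repeated = [plainPatt.test(text), plainPatt.test(text)];
// [ true, true ]
```

For a plain yes/no check, leaving off the `g` flag avoids the surprise.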
Replace
Replace method vs replace all method
Here both are used to replace every occurrence: replaceAll with a string, and then replace with a global regex.
> 'ABC ABC ABC'.replaceAll('ABC', "X")
'X X X'
> 'ABC ABC ABC'.replace(/ABC/g, "X")
'X X X'
Replace one.
'ABC ABC ABC'.replace('ABC', "X")
'X ABC ABC'
Replace method
A method on a string.
In a string, replace a matched pattern with another value.
The pattern can be a string or a regex pattern.
From Mozilla docs.
text.replace(pattern, replaceStr)
Warning - the default behavior is to replace only the first occurrence. To replace every occurrence, use the g global flag in your regex, even if that means converting a plain string to a regex. You can use the replaceAll method to replace all occurrences of a string pattern, but a regex pattern still needs the global flag.
Examples
Use a 'dog' string or a /Dog/i regex as the pattern. Either way, it will only replace once.
const pattern = /Dog/i;
const text = 'The quick brown fox jumps over the lazy dog. If the dog reacted, was it really lazy?';
text.replace(pattern, 'ferret')
// Only the first "dog" is replaced; the second is left as is.
// "The quick brown fox jumps over the lazy ferret. If the dog reacted, was it really lazy?"
Regex pattern - replace all occurrences using global and case-insensitive flags.
const pattern = /Dog/gi;
const text = 'The quick brown fox jumps over the lazy dog. If the dog reacted, was it really lazy?';
text.replace(pattern, 'ferret')
// Both occurrences of "dog" are replaced.
// "The quick brown fox jumps over the lazy ferret. If the ferret reacted, was it really lazy?"
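The replacement string can also refer back to capture groups with `$1`, `$2` and so on. A small sketch, using a made-up name string, that swaps two words:

```javascript
const fullName = 'John Smith';

// $1 and $2 refer to the first and second capture groups.
const swapped = fullName.replace(/(\w+)\s(\w+)/, '$2, $1');
// 'Smith, John'
```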
Replace all method
From Mozilla docs.
The replaceAll() method returns a new string with all matches of a pattern replaced by a replacement. The pattern can be a string or a RegExp, and the replacement can be a string or a function to be called for each match.
Syntax:
str.replaceAll(regexp|substr, newSubstr|function)
Note that, as per the linked docs, a string pattern replaces all occurrences. A regex pattern must still have the g flag; without it, replaceAll throws a TypeError rather than replacing just one occurrence. Which is annoying. So for regex you might as well use .replace.
This is a newer method, so it is not supported on all browsers - see the Can I Use? page.
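A quick sketch of how `replaceAll` treats a regex with and without the `g` flag, assuming a runtime that supports the method:

```javascript
const text = 'dog dog';

let errorName = null;
try {
  text.replaceAll(/dog/, 'cat'); // regex without the g flag
} catch (err) {
  errorName = err.name; // 'TypeError'
}

const replaced = text.replaceAll(/dog/g, 'cat');
// 'cat cat'
```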
Examples
String pattern to replace all.
const text = 'The quick brown fox jumps over the lazy dog. If the dog reacted, was it really lazy?';
text.replaceAll('dog', 'monkey');
// Both occurrences of "dog" are replaced.
// "The quick brown fox jumps over the lazy monkey. If the monkey reacted, was it really lazy?"
Regex pattern - global flag still required to replace all.
const text = 'The quick brown fox jumps over the lazy dog. If the dog reacted, was it really lazy?';
text.replaceAll(/Dog/gi, 'monkey')
// Both occurrences of "dog" are replaced.
// "The quick brown fox jumps over the lazy monkey. If the monkey reacted, was it really lazy?"
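As the quoted docs mention, the replacement can also be a function that is called for each match. A minimal sketch, where the numbering scheme is just an illustration:

```javascript
const text = 'The quick brown fox jumps over the lazy dog. If the dog reacted, was it really lazy?';

let count = 0;
// The callback receives the matched text and returns its replacement.
const numbered = text.replaceAll(/dog/gi, (match) => {
  count += 1;
  return `${match} #${count}`;
});
// "... the lazy dog #1. If the dog #2 reacted, ..."
```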
Groups and ranges
- Groups and ranges in Mozilla docs.
A match returns positional values in an array. The first is the matched string, then index 1 is the first matched group, index 2 the second, and so on. The match object also has some key-value attributes: index, input and groups.
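The `groups` attribute is only populated when the pattern uses named groups. A short sketch, with a made-up date string, showing the positional values next to the attributes:

```javascript
const match = '2024-05-17'.match(/(?<year>\d{4})-(?<month>\d{2})/);

const whole = match[0];            // '2024-05'
const firstGroup = match[1];       // '2024'
const byName = match.groups.month; // '05'
const position = match.index;      // 0
```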
Examples copied from link above.
Basic
Get as many matches as are found. Here these are words without the letter e.
const aliceExcerpt = 'The Caterpillar and Alice looked at each other';
const regexpWithoutE = /\b[a-df-z]+\b/ig;
aliceExcerpt.match(regexpWithoutE)
// ["and", "at"]
Groups
Capture groups are created with parentheses.
const imageDescription = 'This image has a resolution of 1440×900 pixels.';
const regexpSize = /([0-9]+)×([0-9]+)/;
const match = imageDescription.match(regexpSize);
`Width: ${match[1]} / Height: ${match[2]}.`
// "Width: 1440 / Height: 900."
Note that group indexes start at 1; index 0 holds the whole match.
Non-capturing group
Here the protocol must be present for the pattern to match, but the non-capturing group keeps it out of the result.
const patt = new RegExp('(?:https?)://([^/\r\n]+)(/[^\r\n]*)?')
const matches = 'http://stackoverflow.com/'.match(patt)
matches[1]
// stackoverflow.com
Topic example
Get hashtag keywords.
This only matches once.
pat = /#(\w+)/
m = 'abc #python #def'.match(pat)
m
// [ '#python',
// 'python',
// index: 4,
// input: 'abc #python #def',
// groups: undefined ]
Using g for a global match returns more than one match, but the attributes disappear: with the g flag, match returns only the full matched strings, without capture groups, index, input or groups. To keep those details for every match, use matchAll instead.
pat = /#(\w+)/g
m = 'abc #python #def'.match(pat)
m
// [
// '#python',
// '#def'
// ]
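One way to keep the capture groups with a global pattern is `matchAll`, which yields a full match object per occurrence. A sketch under the assumption that the runtime supports it (it is a newer method, like `replaceAll`):

```javascript
const text = 'abc #python #def';

// matchAll requires the g flag and returns an iterator of match objects.
const matches = [...text.matchAll(/#(\w+)/g)];

const keywords = matches.map((m) => m[1]);
// [ 'python', 'def' ]
const positions = matches.map((m) => m.index);
// [ 4, 12 ]
```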
Extending the pattern to capture hyphens (not valid hashtags on Twitter but useful on other platforms for keyword topics).
pat = /#([\w-]+)/g
m = 'abc #ruby #data-science'.match(pat)
m
// [ '#ruby', '#data-science' ]