String functions let us search a string for exact occurrences of another string, or of a string fragment, and program a reaction to what is found. Regular expressions allow much more sophisticated searching: they go beyond exact matches and instead let us look for patterns and react to those. Colloquially, a pattern is something that other things may resemble without necessarily looking exactly like it. Like most other programming languages, JavaScript has a series of functions for working with patterns.
Regular expressions are a prominent part of several essential programs of the UNIX operating system, the mother of all Linuxes, BSDs, macOS, etc. These operating systems have command-line tools that use them. From there regular expressions spread into programming languages.
There are three uses for regular expressions: matching, which can also be used to extract information from a string; substituting new text for matching text; and splitting a string into an array of smaller chunks [Tatr13].
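The three uses can be sketched with the standard string methods match, replace and split. The sample sentence and the variable names are made up for this illustration.

```javascript
'use strict';
// Hypothetical sample text, not taken from the examples in this text.
const sentence = 'Avital paid 25 kroner for 3 toys';

// 1. Matching, and extracting what matched
const firstNumber = sentence.match(/\d+/);      // a result array, or null if nothing matched
console.log(firstNumber[0]);                    // 25

// 2. Substituting new text for matching text
const masked = sentence.replace(/\d+/g, '#');
console.log(masked);                            // Avital paid # kroner for # toys

// 3. Splitting a string into an array of smaller chunks
const words = sentence.split(/\s+/);
console.log(words.length);                      // 7
```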
A regular expression is a pattern written between delimiters so that the language functions catering for it can recognise it. In JavaScript the delimiter is the slash, /, as in many other languages and tools. It looks like this:
let regex0 = /cat/; // a literal regex object
// or
let regex1 = new RegExp('cat'); // a constructed regex object
Both lines above create a regular expression that searches for the word cat. The first is a regex literal: like a literal string, the pattern is written directly in the source code and is fixed there. The second builds the pattern from a string, which may come from a variable, so that is the form to use when the pattern is not known until run time. The examples below use the literal form, together with two very common special characters in regular expressions.
'use strict';
let s = '';
let haystack = 'Avital named her cat Toulouse.';
let re0 = /cat/;
if (re0.test(haystack)) { // returns true
s = `I found the cat in "${haystack}"`;
} else {
s = `No cat found in "${haystack}"`;
}
console.log(s);
haystack = 'The human cat they called him.';
re0 = /^cat/;
if (re0.test(haystack)) { // returns false
s = `I found the cat at the beginning of "${haystack}"`;
} else {
s = `Not found at beginning of "${haystack}"`;
}
console.log(s);
haystack = 'She loves that cat';
re0 = /cat$/;
if (re0.test(haystack)) { // returns true
s = `I found the cat at the end of "${haystack}"`;
} else {
s = `Not found at end of "${haystack}"`;
}
console.log(s);
The first example returns true because the string cat is found in the haystack. In the second example the caret (^) anchors the regular expression to the beginning of the haystack, and because there's no cat there, it returns false. The meta character ($) anchors the regular expression at the end of the haystack, finds its target and consequently returns true.
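The constructed form is useful when the word to look for is only known at run time. The sketch below builds anchored patterns from a variable. The helper names startsWithWord and endsWithWord are invented for this illustration, and the word is assumed to consist of letters only, so that nothing in it has a special meaning in a regex.

```javascript
'use strict';
// Hypothetical helpers: build an anchored regex from a plain word.
// Assumes word contains only letters, so no character needs escaping.
const startsWithWord = function (word, haystack) {
    const re = new RegExp('^' + word);      // same as /^cat/ when word is 'cat'
    return re.test(haystack);
};
const endsWithWord = function (word, haystack) {
    const re = new RegExp(word + '$');      // same as /cat$/ when word is 'cat'
    return re.test(haystack);
};

console.log(startsWithWord('cat', 'The human cat they called him.'));   // false
console.log(endsWithWord('cat', 'She loves that cat'));                 // true
console.log(startsWithWord('The', 'The human cat they called him.'));   // true
```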
Regular expression syntax defines a series of meta characters, i.e. characters with a special meaning in regular expressions. They are:
. \ + * ? [ ^ ] $ ( ) { } = ! < > | :
A few of them, such as = ! < > and :, only have a special meaning inside certain parenthesised constructs. If you need to include one of the meta characters in a search string, i.e. if you need to search for a $ for example, you must escape it with a backslash. See these examples:
'use strict';
/love\?/.test('What is love? He asked.'); // returns true
/http\:\/\//.test('http://x15.dk'); // returns true
Arguably a bit more pedagogical, but equivalent:
'use strict';
let re0 = /love\?/;
re0.test('What is love? He asked.'); // returns true
let re1 = /http\:\/\//;
re1.test('http://x15.dk'); // returns true
To search for any one of a group of characters you may formulate a class in square brackets, with single characters, ranges of characters, or a combination of those.
'use strict';
let re0 = /[abc123]/;
re0.test('b'); // true
re0 = /[a-d1-5]/;
re0.test('d'); // true
re0 = /[a-d1-5]/;
re0.test('9'); // false
re0 = /[0-9a-zA-Z_]/;
re0.test('_$x'); //true
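If you need to group characters according to type, regular expressions offer predefined character classes for your typing convenience. Each represents a group of characters as opposed to one exact character: \d is a digit, \w a word character, \s a whitespace character, the capitalised forms \D, \W and \S mean the opposite, and the dot matches any character except a line break.
The following two expressions both search for a digit: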
/[0-9]/
/\d/
It is legal to use the character classes inside your self-defined character classes:
/[nm\d]/
This searches for an n, an m, or any digit.
'use strict';
let re0 = /\d[A-Z]/;
re0.test('3D'); // true
re0.test('CD'); // false
re0 = /\S\S\S/;
re0.test('6&c'); // true
re0.test('6 c'); // false
re0 = /Pl..../;
re0.test('Please'); // true
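As a small illustration of combining the classes with anchors, the sketch below checks strings against a made-up format: two capital letters, one whitespace character, then five digits. The format and the function name looksLikePlate are assumptions for this example only.

```javascript
'use strict';
// Hypothetical format: two capital letters, one whitespace character, five digits.
const looksLikePlate = function (s) {
    return /^[A-Z][A-Z]\s\d\d\d\d\d$/.test(s);
};

console.log(looksLikePlate('AB 12345'));   // true
console.log(looksLikePlate('ab 12345'));   // false, lower case letters
console.log(looksLikePlate('AB12345'));    // false, the whitespace is missing
console.log(looksLikePlate('AB 1234x'));   // false, the last character is not a digit
```

Writing \d five times in a row is clumsy, which is one reason for the multiplicity notation.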
Sometimes we need to match a number of occurrences of, for example, digits, letters, or special characters. We may introduce multiplicity in the regexes as follows:
?       zero or one time
*       zero or more times
+       one or more times
{n}     n times
{n,}    n or more times
{n,m}   between n and m times
'use strict';
let re0 = /row,? your boat/;
re0.test('row, row, row your boat');
row,? is the word row followed by zero or one comma.
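A sketch of the counted forms, which are written with braces as {n}, {n,} and {n,m}. The patterns are illustrative assumptions: a four digit postal code, a user name of three to eight word characters, and a run of at least three digits.

```javascript
'use strict';
const postalCode = /^\d{4}$/;          // exactly four digits
const userName = /^\w{3,8}$/;          // between three and eight word characters
const hasLongNumber = /\d{3,}/;        // three or more digits in a row, anywhere

console.log(postalCode.test('6000'));          // true
console.log(postalCode.test('60000'));         // false
console.log(userName.test('nml'));             // true
console.log(userName.test('no'));              // false
console.log(hasLongNumber.test('room 12'));    // false
console.log(hasLongNumber.test('year 2024'));  // true
```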
Subpatterns are placed in parentheses, (). This allows a multiplicity designator to be applied to the whole subpattern. Reusing the example from above: the subpattern, the word row with an optional comma and followed by a space, must occur at least once (+).
'use strict';
let re0 = /(row,? )+your boat/; // the subpattern already ends with the space before your
re0.test('row, row, row your boat'); // true
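The difference the parentheses make can be seen by comparing a multiplicity designator applied to a subpattern with one applied to a single character. The test strings are made up for the illustration.

```javascript
'use strict';
const grouped = /^(ha)+$/;     // one or more repetitions of the whole subpattern 'ha'
const ungrouped = /^ha+$/;     // an 'h' followed by one or more 'a'

console.log(grouped.test('hahaha'));     // true
console.log(ungrouped.test('hahaha'));   // false
console.log(grouped.test('haaa'));       // false
console.log(ungrouped.test('haaa'));     // true
```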
Subpatterns may be reused without typing them again. The first subpattern in a regex is implicitly numbered 1, the second 2, etc. Reusing a subpattern is done by writing \1, or \2, and so forth; \1 matches the same text that the first subpattern matched. The backslash is a somewhat unfortunate choice of designator, because it has a side effect. In a regex literal \1 works as written, but if you build the regex from a string with new RegExp, the backslash must be doubled, '\\1'. Otherwise the string itself will interpret \1 as an escape sequence before the regex is ever built. Let me use Dow10's example, slightly adapted, to illustrate the use of subpatterns.
'use strict';
let myPets = 'favoritePet=Blondie, Maverick=dog, Blondie=cat';
let rx = /favoritePet\=(\w+).*\1\=(\w+)/;
let matches = myPets.match(rx);
console.log(`My favorite pet is a ${matches[2]} called ${matches[1]}.`);
// Displays "My favorite pet is a cat called Blondie."
You find two subpatterns in the above example. The first matches the name in 'favoritePet=Blondie'; the second finds the species in 'Blondie=cat'. The '.*' consumes any number of characters before \1 looks for Blondie again, that is, for the text matched by the first subpattern. The second subpattern then matches the word characters after the equals sign (=) that follows.
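Another common use of \1 is finding a word that has been typed twice in a row. The sketch below is an illustration with a made-up sentence; it also shows the same pattern built from a string, where every backslash is doubled. The sequence \b, a word boundary, keeps the pattern from matching in the middle of a word.

```javascript
'use strict';
// \b(\w+) captures a whole word; \s+\1\b requires the same word again right after it.
const doubled = /\b(\w+)\s+\1\b/;

// The same pattern built from a string: each backslash must be written twice.
const doubledFromString = new RegExp('\\b(\\w+)\\s+\\1\\b');

const m = 'It was the the best of times'.match(doubled);
console.log(m[0]);                                   // the the
console.log(m[1]);                                   // the
console.log(doubled.test('This is fine'));           // false
console.log(doubledFromString.source === doubled.source);   // true
```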
'use strict';
const numero = function (s) {
let r = /(\d+)/;
let a = s.match(r); // null when s contains no digits
return a === null ? NaN : Number(a[1]);
};
while (true) {
let s = prompt('enter string with a number');
if (s === null) // the user cancelled the dialog
break;
let n = numero(s);
console.log(n);
if (n === 666)
break;
}

'use strict';
let s = 'mælk=800 bla bla ost=123 bla bla rugbrød=14';
let r = /(\d+)/;
let a = s.match(r);
console.log(a);
r = /(\d+)/g;
a = s.match(r);
console.log(a);
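The output array differs depending on the presence of the g flag. Without g, match returns the first match followed by its subpatterns; with g, it returns every complete match but leaves the subpatterns out. In the example the first call yields 800 twice, once as the match and once as the subpattern, while the second yields 800, 123 and 14.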
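When both all matches and their subpatterns are needed, the string method matchAll can be used. The sketch below reuses the sample string but with a pattern of its own, (\S+)=(\d+), chosen here for illustration; \S is used for the name because \w does not cover letters such as æ and ø.

```javascript
'use strict';
const s = 'mælk=800 bla bla ost=123 bla bla rugbrød=14';

// Without g: the first match only, followed by its two subpatterns.
const first = s.match(/(\S+)=(\d+)/);
console.log(first[0], first[1], first[2]);     // mælk=800 mælk 800

// With g: every complete match, but the subpatterns are left out.
const all = s.match(/(\S+)=(\d+)/g);
console.log(all);                              // [ 'mælk=800', 'ost=123', 'rugbrød=14' ]

// matchAll() requires the g flag and yields one full result per match,
// so the subpatterns of every match stay available.
const prices = {};
for (const m of s.matchAll(/(\S+)=(\d+)/g)) {
    prices[m[1]] = Number(m[2]);
}
console.log(prices.ost);                       // 123
```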
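The sequence \b matches at a word boundary, so a word can be required to stand alone rather than be part of a longer word:
'use strict';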
let re0 = /over/;
re0.test('My hovercraft is full of eels'); // true
let re1 = /\bover\b/;
re1.test('My hovercraft is full of eels'); // false
re1.test('One flew over the cuckoo’s nest'); // true
regexDoyle.html
<!doctype html>
<html>
<head>
<meta charset='utf-8'/>
<meta name='viewport' content='width=device-width, initial-scale=1.0'>
<title class='title'></title>
<style>
form {
width: 30em;
}
#url {
width: 90%;
}
</style>
<script src='regexDoyle.js'></script>
</head>
<body>
<header><h1 class='title'></h1></header>
<main>
<h2>Enter a URL to scan:</h2>
<form action='#' method='post'>
<div>
<label for='url'>URL:</label>
<input type='url' id='url' placeholder='http://www.example.org/example.txt'/>
<p></p>
<label> </label>
<input type='button' id='btn' value='Find Links on that page'/>
</div>
</form>
<p id='remote'></p>
</main>
<footer></footer>
</body>
</html>
regexDoyle.js
'use strict';
const $ = function (foo) { return document.getElementById(foo); };
const ajaxobj = new XMLHttpRequest();
const getFile = function (ajax, url, callback) {
console.log(`Fetching ${url}`);
try {
// assign the handler rather than add it, so repeated clicks do not stack listeners
ajax.onload = callback;
ajax.open('get', url);
ajax.send();
} catch(err) {
window.alert(`Request failed:\n${err.message}`);
}
};
const handler = function (e) {
$('remote').innerHTML = '';
let s = e.target.responseText;
let url = $('url').value; // the address that was requested
let r = /<a\s+href=['"](.+?)['"].*?>/gi;
// with the g flag match() would return whole tags only, so use matchAll()
// and keep the first subpattern, the URL, of every match
let a = Array.from(s.matchAll(r), function (m) { return m[1]; });
let di = document.createElement('div');
let h2 = document.createElement('h2');
let h2t = document.createTextNode(`Linked URLs found at ${url}`);
h2.appendChild(h2t);
di.appendChild(h2);
let ul = document.createElement('ul');
for (let elm of a) {
let li = document.createElement('li');
let lit = document.createTextNode(elm);
li.appendChild(lit);
ul.appendChild(li);
}
di.appendChild(ul);
$('remote').appendChild(di);
};
const getRemoteContent = function (e) {
let url = $('url').value;
e.preventDefault();
getFile(ajaxobj, url, handler);
}
const showStarter = function () {
const titles = document.getElementsByClassName('title');
for (let title of titles)
title.innerHTML = 'Get Linked URLs from Webpage';
$('btn').addEventListener('click', getRemoteContent);
}
window.addEventListener("load", showStarter); // kick off JS
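The pattern matching at the heart of the handler can be tried without a browser. The sketch below isolates it in a function, here called extractLinks (a name invented for this example), and runs it on a small hard-coded HTML fragment. It is built on the same pattern idea as the handler, so it only finds anchors whose href is the first attribute.

```javascript
'use strict';
// Hypothetical helper: return the href values of the <a> tags in an HTML string.
const extractLinks = function (html) {
    const r = /<a\s+href=['"](.+?)['"].*?>/gi;
    // matchAll() keeps the subpattern of each match; m[1] is the URL
    return Array.from(html.matchAll(r), function (m) { return m[1]; });
};

// Made-up test page with one lower case and one upper case anchor tag.
const page = '<p><a href="http://x15.dk">x15</a> and ' +
             "<A HREF='https://www.example.org/example.txt' class='ext'>example</A></p>";

console.log(extractLinks(page));
// [ 'http://x15.dk', 'https://www.example.org/example.txt' ]
console.log(extractLinks('<p>no links here</p>').length);   // 0
```

Returning an empty array when nothing matches means the calling code can loop over the result without a separate null check.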