A quick reference for regular expressions (regex), covering symbols, ranges, grouping, assertions, and sample patterns to get you started.
| Pattern | Description |
|---|---|
| [abc] | A single character of: a, b or c |
| [^abc] | A character except: a, b or c |
| [a-z] | A character in the range: a-z |
| [^a-z] | A character not in the range: a-z |
| [0-9] | A digit in the range: 0-9 |
| [a-zA-Z] | A character in the range: a-z or A-Z |
| [a-zA-Z0-9] | A character in the range: a-z, A-Z or 0-9 |
| Pattern | Description |
|---|---|
| a? | Zero or one of a |
| a* | Zero or more of a |
| a+ | One or more of a |
| [0-9]+ | One or more of 0-9 |
| a{3} | Exactly 3 of a |
| a{3,} | 3 or more of a |
| a{3,6} | Between 3 and 6 of a |
| a* | Greedy quantifier |
| a*? | Lazy quantifier |
| a*+ | Possessive quantifier |
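
The counted quantifiers in the table above can be tried out with a short Python sketch. The sample string and variable names are made up for illustration; `\b` is added so that each run of letters is treated as a whole word.

```python
import re

# Hypothetical sample text: runs of 1, 2, 3 and 7 letters.
text = "a aa aaa aaaaaaa"

exactly_three = re.findall(r"\ba{3}\b", text)
three_to_six = re.findall(r"\ba{3,6}\b", text)
three_or_more = re.findall(r"\ba{3,}\b", text)

print(exactly_three)   # ['aaa']
print(three_to_six)    # ['aaa']  (the run of seven is too long)
print(three_or_more)   # ['aaa', 'aaaaaaa']
```

Without the `\b` anchors, `a{3}` would also match inside the longer run, so the boundaries matter as much as the counts.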
Escape these special characters with \ to match them literally: [ \ ^ $ . | ? * + ( ) { }
| Pattern | Description |
|---|---|
| . | Any single character |
| \s | Any whitespace character |
| \S | Any non-whitespace character |
| \d | Any digit, same as [0-9] |
| \D | Any non-digit, same as [^0-9] |
| \w | Any word character |
| \W | Any non-word character |
| \X | Any Unicode sequences, linebreaks included |
| \C | Match one data unit |
| \R | Unicode newlines |
| \v | Vertical whitespace character |
| \V | Negation of \v: anything except newlines and vertical tabs |
| \h | Horizontal whitespace character |
| \H | Negation of \h |
| \K | Reset match |
| \n | Match nth subpattern (n is a digit, e.g. \1) |
| \pX | Unicode property X |
| \p{...} | Unicode property or script category |
| \PX | Negation of \pX |
| \P{...} | Negation of \p{...} |
| \Q...\E | Quote; treat as literals |
| \k<name> | Match subpattern name |
| \k'name' | Match subpattern name |
| \k{name} | Match subpattern name |
| \gn | Match nth subpattern |
| \g{n} | Match nth subpattern |
| \g<n> | Recurse nth capture group |
| \g'n' | Recurse nth capture group |
| \g{-n} | Match nth relative previous subpattern |
| \g<+n> | Recurse nth relative upcoming subpattern |
| \g'+n' | Recurse nth relative upcoming subpattern |
| \g'letter' | Recurse named capture group letter |
| \g{letter} | Match previously-named capture group letter |
| \g<letter> | Recurse named capture group letter |
| \xYY | Hex character YY |
| \x{YYYY} | Hex character YYYY |
| \ddd | Octal character ddd |
| \cY | Control character Y |
| [\b] | Backspace character |
| \ | Makes any character literal |
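| Pattern | Description |
|---|---|
| \G | Start of match |
| ^ | Start of string (or of a line in multiline mode) |
| $ | End of string (or of a line in multiline mode) |
| \A | Start of string |
| \Z | End of string, or just before a final newline |
| \z | Absolute end of string |
| \b | A word boundary |
| \B | Non-word boundary |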
| Pattern | Description |
|---|---|
| \0 | Complete match contents |
| \1 | Contents in capture group 1 |
| $1 | Contents in capture group 1 |
| ${foo} | Contents in capture group foo |
| \x20 | Hexadecimal replacement values |
| \x{06fa} | Hexadecimal replacement values |
| \t | Tab |
| \r | Carriage return |
| \n | Newline |
| \f | Form-feed |
| \U | Uppercase transformation |
| \L | Lowercase transformation |
| \E | Terminate any transformation |
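
Replacement syntax varies by engine. As a rough sketch of how the substitution table maps onto Python's `re.sub`: Python uses `\1` and `\g<name>` rather than `$1` and `${foo}`, and it has no `\U` or `\L`, so a replacement function stands in for case changes. The sample strings and variable names below are illustrative only.

```python
import re

date = "2024-03-15"

# Numbered groups are referenced as \1, \2, ... in the replacement.
numbered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", date)

# Named groups are referenced as \g<name>.
named = re.sub(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})",
    r"\g<day>.\g<month>.\g<year>",
    date,
)

# No \U or \L in Python's re; a function can change case instead.
capitalized = re.sub(r"\b\w", lambda m: m.group().upper(), "hello regex world")

print(numbered)     # 15/03/2024
print(named)        # 15.03.2024
print(capitalized)  # Hello Regex World
```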
| Pattern | Description |
|---|---|
| (...) | Capture everything enclosed |
| (a\|b) | Match either a or b |
| (?:...) | Match everything enclosed without capturing it |
| (?>...) | Atomic group (non-capturing) |
| (?\|...) | Branch reset: alternatives share the same group numbers |
| (?#...) | Comment |
| (?'name'...) | Named capturing group |
| (?<name>...) | Named capturing group |
| (?P<name>...) | Named capturing group |
| (?imsxXU) | Inline modifiers |
| (?(DEFINE)...) | Pre-define patterns before using them |
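
To make the difference between capturing, named, and non-capturing groups concrete, here is a small Python sketch. The e-mail-like pattern and the sample text are deliberately simplified assumptions, not a validator.

```python
import re

# (?P<name>...) captures under a name; (?:...) groups without capturing.
pattern = re.compile(r"(?P<user>[\w.]+)@(?P<host>(?:\w+\.)+\w+)")

m = pattern.search("contact: jane.doe@example.org")

print(m.group("user"))  # jane.doe
print(m.group("host"))  # example.org
print(m.groups())       # ('jane.doe', 'example.org')
```

The inner `(?:\w+\.)+` repeats as a unit but does not appear in `groups()`, which keeps group numbering stable when a pattern grows.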
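| Pattern | Description |
|---|---|
| (?(1)yes\|no) | Conditional: yes if group 1 matched, otherwise no |
| (?(R)yes\|no) | Recursion conditional: yes if inside any recursion |
| (?(R#)yes\|no) | Recursion conditional: yes if inside a recursion of group # |
| (?(R&name)yes\|no) | Recursion conditional: yes if inside a recursion of group name |
| (?(?=...)yes\|no) | Lookahead conditional |
| (?(?<=...)yes\|no) | Lookbehind conditional |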
| Pattern | Description |
|---|---|
| (?=...) | Positive lookahead |
| (?!...) | Negative lookahead |
| (?<=...) | Positive lookbehind |
| (?<!...) | Negative lookbehind |
Lookaround lets you match a group before (lookbehind) or after (lookahead) your main pattern without including it in the result.
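
A minimal Python sketch of the lookaround idea described above. The price list is a made-up sample; in each case the lookaround text is checked but left out of the result.

```python
import re

prices = "apple $5, pear $12, fig 7"

# Positive lookbehind: digits that come right after a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", prices)

# Positive lookahead: words that are followed by " $".
priced_items = re.findall(r"\w+(?=\s\$)", prices)

# Negative lookbehind: digit runs not preceded by "$" or another digit.
bare_numbers = re.findall(r"(?<![$\d])\d+", prices)

print(after_dollar)   # ['5', '12']
print(priced_items)   # ['apple', 'pear']
print(bare_numbers)   # ['7']
```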
| Pattern | Description |
|---|---|
| g | Global |
| m | Multiline |
| i | Case insensitive |
| x | Ignore whitespace |
| s | Single line |
| u | Unicode |
| X | eXtended |
| U | Ungreedy |
| A | Anchor |
| J | Duplicate group names |
| Pattern | Description |
|---|---|
| (?R) | Recurse entire pattern |
| (?1) | Recurse first subpattern |
| (?+1) | Recurse first relative subpattern |
| (?&name) | Recurse subpattern name |
| (?P=name) | Match subpattern name |
| (?P>name) | Recurse subpattern name |
| Character Class | Same as | Meaning |
|---|---|---|
| [[:alnum:]] | [0-9A-Za-z] | Letters and digits |
| [[:alpha:]] | [A-Za-z] | Letters |
| [[:ascii:]] | [\x00-\x7F] | ASCII codes 0-127 |
| [[:blank:]] | [\t ] | Space or tab only |
| [[:cntrl:]] | [\x00-\x1F\x7F] | Control characters |
| [[:digit:]] | [0-9] | Decimal digits |
| [[:graph:]] | [[:alnum:][:punct:]] | Visible characters (not space) |
| [[:lower:]] | [a-z] | Lowercase letters |
| [[:print:]] | [ -~] == [ [:graph:]] | Visible characters |
| [[:punct:]] | [!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{\|}~] | Visible punctuation characters |
| [[:space:]] | [\t\n\v\f\r ] | Whitespace |
| [[:upper:]] | [A-Z] | Uppercase letters |
| [[:word:]] | [0-9A-Za-z_] | Word characters |
| [[:xdigit:]] | [0-9A-Fa-f] | Hexadecimal digits |
| [[:<:]] | \b(?=\w) | Start of word |
| [[:>:]] | \b(?<=\w) | End of word |
| Pattern | Description |
|---|---|
| (*ACCEPT) | Control verb |
| (*FAIL) | Control verb |
| (*MARK:NAME) | Control verb |
| (*COMMIT) | Control verb |
| (*PRUNE) | Control verb |
| (*SKIP) | Control verb |
| (*THEN) | Control verb |
| (*UTF) | Pattern modifier |
| (*UTF8) | Pattern modifier |
| (*UTF16) | Pattern modifier |
| (*UTF32) | Pattern modifier |
| (*UCP) | Pattern modifier |
| (*CR) | Line break modifier |
| (*LF) | Line break modifier |
| (*CRLF) | Line break modifier |
| (*ANYCRLF) | Line break modifier |
| (*ANY) | Line break modifier |
| \R | Line break sequence (what it matches is set by the BSR verbs below) |
| (*BSR_ANYCRLF) | Line break modifier |
| (*BSR_UNICODE) | Line break modifier |
| (*LIMIT_MATCH=x) | Regex engine modifier |
| (*LIMIT_RECURSION=d) | Regex engine modifier |
| (*NO_AUTO_POSSESS) | Regex engine modifier |
| (*NO_START_OPT) | Regex engine modifier |
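| Pattern | Matches |
|---|---|
| ring | Match the literal text ring, as in springboard |
| . | Match any single character (except a newline by default) |
| h.o | Match h, any one character, then o, as in hoo, h2o, h/o |
| ring\? | Match the literal text ring? |
| \(quiet\) | Match the literal text (quiet) |
| c:\\windows | Match the literal text c:\windows |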
Use \ to search for these special characters: 
 [ \ ^ $ . | ? * + ( ) { }
| Pattern | Matches |
|---|---|
| cat\|dog | Match cat or dog |
| id\|identity | Match id; in identity, only the leading id is matched |
| identity\|id | Match id or identity |
Order longer to shorter when alternatives overlap
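
The ordering rule is easy to check in Python: the engine takes the first alternative that succeeds at a position, not the longest one. The sample string is arbitrary.

```python
import re

text = "id identity"

# Shorter alternative first: "identity" can never be reached.
short_first = re.findall(r"id|identity", text)

# Longer alternative first: both words are matched in full.
long_first = re.findall(r"identity|id", text)

print(short_first)  # ['id', 'id']
print(long_first)   # ['id', 'identity']
```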
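| Pattern | Matches |
|---|---|
| [aeiou] | Match any vowel |
| [^aeiou] | Match a NON vowel |
| r[iau]ng | Match ring, rang, or rung |
| gr[ae]y | Match gray or grey |
| [a-zA-Z0-9] | Match any letter or digit |
| [\u3a00-\ufa99] | Match characters in U+3A00 to U+FA99, roughly the Unicode Hàn (中文) range |
In [ ] always escape \ and ]; escape ^ when it comes first and - when it sits between two characters. A literal . needs no escape there.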
| Pattern | Meaning |
|---|---|
| \w | "Word" character (letter, digit, or underscore) |
| \d | Digit |
| \s | Whitespace (space, tab, vtab, newline) |
| \W, \D, or \S | Not word, digit, or whitespace |
| [\D\S] | Not digit OR not whitespace: matches every character, since none is both |
| [^\d\s] | Neither digit nor whitespace |
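
The difference between `[\D\S]` and `[^\d\s]` is a common stumbling block, so here is a short Python check on a made-up five-character sample.

```python
import re

# Sample characters: letter, digit, space, underscore, tab.
sample = "a1 _\t"

# A class of negated shorthands is a union: every character passes one of them.
either = re.findall(r"[\D\S]", sample)

# A negated class of shorthands excludes both digits and whitespace.
neither = re.findall(r"[^\d\s]", sample)

print(either == list(sample))  # True
print(neither)                 # ['a', '_']
```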
| Pattern | Matches |
|---|---|
| colou?r | Match color or colour |
| [BW]ill[ieamy's]* | Match Bill, Willy, William's, etc. |
| [a-zA-Z]+ | Match 1 or more letters |
| \d{3}-\d{2}-\d{4} | Match a US Social Security number (SSN) |
| [a-z]\w{1,7} | Match a UW NetID |
| Pattern | Meaning |
|---|---|
| * + {n,} | Greedy: match as much as possible |
| <.+> | Finds 1 big match in `<b>bold</b>` |
| *? +? {n,}? | Lazy: match as little as possible |
| <.+?> | Finds 2 matches in `<b>bold</b>` |
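
A quick Python sketch of greedy versus lazy matching on a small tag-like string (the string is an assumed sample):

```python
import re

html = "<b>bold</b>"

# Greedy: .+ runs to the last ">" it can reach.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" that lets the match succeed.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b>']
print(lazy)    # ['<b>', '</b>']
```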
| Pattern | Meaning |
|---|---|
| \b | "Word" edge (next to non "word" character) |
| \bring | Word starts with "ring", ex. ringtone |
| ring\b | Word ends with "ring", ex. spring |
| \b9\b | Match single digit 9, not 19, 91, 99, etc. |
| \b[a-zA-Z]{6}\b | Match 6-letter words |
| \B | Not word edge |
| \Bring\B | Match ring inside a word, ex. springs and wringer |
| ^\d*$ | Entire string must be digits (also matches an empty string) |
| ^[a-zA-Z]{4,20}$ | String must have 4-20 letters |
| ^[A-Z] | String must begin with capital letter |
| [\.!?"')]$ | String must end with terminal punctuation |
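
The boundary and anchor patterns above can be exercised with a short Python sketch. The word list is an assumed sample, and `\w*` is added around some patterns only so that whole words show up in the results.

```python
import re

words = "ring spring ringtone wringer"

starts_with_ring = re.findall(r"\bring\w*", words)
ends_with_ring = re.findall(r"\w*ring\b", words)
ring_inside = re.findall(r"\w*\Bring\B\w*", words)

print(starts_with_ring)  # ['ring', 'ringtone']
print(ends_with_ring)    # ['ring', 'spring']
print(ring_inside)       # ['wringer']

# ^ and $ pin the pattern to the whole string.
all_digits = bool(re.search(r"^\d*$", "12345"))
not_digits = bool(re.search(r"^\d*$", "12a45"))
empty_ok = bool(re.search(r"^\d*$", ""))

print(all_digits, not_digits, empty_ok)  # True False True
```

Because `*` allows zero repetitions, `^\d*$` accepts an empty string; `^\d+$` is the usual choice when at least one digit is required.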
| Pattern | Meaning |
|---|---|
| (?i)[a-z]*(?-i) | Ignore case ON / OFF |
| (?s).*(?-s) | Match multiple lines (causes . to match newline) |
| (?m)^.*;$(?-m) | ^ and $ match at line boundaries, not just at the ends of the string |
| (?x) | Free-spacing mode ON; whitespace and # end-of-line comments are ignored |
| (?-x) | Free-spacing mode OFF |
| /regex/ismx | Modify mode for entire string |
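
Support for inline modifiers differs between engines. The sketch below shows the forms that Python's `re` module accepts, as far as I know: whole-pattern flags such as `(?i)` must come at the very start, and a scoped group like `(?i:...)` (Python 3.6+) takes the place of the ON/OFF pair. The sample strings are illustrative.

```python
import re

# (?i:...) limits case-insensitivity to one group.
scoped = re.findall(r"(?i:abc)DEF", "abcDEF ABCDEF abcdef")

# (?s) lets . match newlines for the whole pattern.
dotall = re.search(r"(?s)start.*end", "start\nmiddle\nend")
no_dotall = re.search(r"start.*end", "start\nmiddle\nend")

# (?m) makes ^ and $ match at every line boundary.
lines = re.findall(r"(?m)^.*;$", "a = 1;\nb = 2\nc = 3;")

# (?x) ignores pattern whitespace and # comments.
phone = re.search(r"(?x) \d{3} - \d{4}  # local number", "call 555-1234")

print(scoped)                     # ['abcDEF', 'ABCDEF']
print(dotall is not None)         # True
print(no_dotall)                  # None
print(lines)                      # ['a = 1;', 'c = 3;']
print(phone.group())              # 555-1234
```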
| Pattern | Meaning |
|---|---|
| (in\|out)put | Match input or output |
| \d{5}(-\d{4})? | US zip code ("+ 4" optional) |
If the match fails after the group, the parser tries EACH alternative in turn, which can lead to catastrophic backtracking.
| Pattern | Matches |
|---|---|
| (to) (be) or not \1 \2 | Match to be or not to be |
| ([^\s])\1{2} | Match non-space, then same twice more, ex. aaa |
| \b(\w+)\s+\1\b | Match doubled words |
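
Back references are easiest to see in action. This Python sketch reuses the three patterns from the table on assumed sample strings, including a substitution that collapses doubled words.

```python
import re

# \1 and \2 must repeat exactly what groups 1 and 2 captured.
quote = re.fullmatch(r"(to) (be) or not \1 \2", "to be or not to be")

# A non-space character followed by two more copies of itself.
triple = re.search(r"(\S)\1{2}", "booo!")

# Replace each doubled word with a single copy.
deduped = re.sub(r"\b(\w+)\s+\1\b", r"\1", "this is is a test test")

print(quote is not None)  # True
print(triple.group())     # ooo
print(deduped)            # this is a test
```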
| Pattern | Meaning |
|---|---|
| on(?:click\|load) | Faster than: on(click\|load) |
Use non-capturing or atomic groups when possible
| Pattern | Meaning |
|---|---|
| (?>red\|green\|blue) | Faster than non-capturing |
| (?>id\|identity)\b | Match id, but not identity |
"id" matches, but \b fails after atomic group,
parser doesn't backtrack into group to retry 'identity'
If alternatives overlap, order longer to shorter.
| Pattern | Meaning |
|---|---|
| (?= ) | Lookahead, if you can find ahead |
| (?! ) | Lookahead, if you can NOT find ahead |
| (?<= ) | Lookbehind, if you can find behind |
| (?<! ) | Lookbehind, if you can NOT find behind |
| \b\w+?(?=ing\b) | Match the part of a word before a final "ing", ex. fish in fishing |
| \b(?!\w+ing\b)\w+\b | Words NOT ending in "ing" |
| (?<=\bpre).*?\b | Match what follows a leading "pre", ex. tend in pretend |
| \b\w{3}(?<!pre)\w*?\b | Words NOT starting with "pre" |
| \b\w+(?<!ing)\b | Match words NOT ending in "ing" |
Match "Ms." if the word "her" appears later in the string, otherwise match "Mr.":
M(?(?=.*?\bher\b)s|r)\.
The IF condition here is a lookaround, so the engine must support lookaround conditionals.
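
Python's `re` module accepts only a group number or name as the condition in `(?(...)yes|no)`, not a lookaround. One possible workaround, sketched below under that assumption, is to guard each alternative with its own lookahead; the compiled name `title` and the sample sentences are illustrative.

```python
import re

# Emulates M(?(?=.*?\bher\b)s|r)\. with two guarded alternatives.
title = re.compile(r"M(?:(?=.*?\bher\b)s|(?!.*?\bher\b)r)\.")

with_her = title.search("Ms. Smith lost her keys")
without_her = title.search("Mr. Smith lost his keys")
mismatch = title.search("Mr. Smith lost her keys")

print(with_her.group())     # Ms.
print(without_her.group())  # Mr.
print(mismatch)             # None
```

The second lookahead repeats the condition in negated form, which is what makes the two branches mutually exclusive.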
Import the regular expressions module
import re
>>> sentence = 'This is a sample string'
>>> bool(re.search(r'this', sentence, flags=re.I))
True
>>> bool(re.search(r'xyz', sentence))
False
>>> re.findall(r'\bs?pare?\b', 'par spar apparent spare part pare')
['par', 'spar', 'spare', 'pare']
>>> re.findall(r'\b0*[1-9]\d{2,}\b', '0501 035 154 12 26 98234')
['0501', '154', '98234']
>>> m_iter = re.finditer(r'[0-9]+', '45 349 651 593 4 204')
>>> [m[0] for m in m_iter if int(m[0]) < 350]
['45', '349', '4', '204']
>>> re.split(r'\d+', 'Sample123string42with777numbers')
['Sample', 'string', 'with', 'numbers']
>>> ip_lines = "catapults\nconcatenate\ncat"
>>> print(re.sub(r'^', r'* ', ip_lines, flags=re.M))
* catapults
* concatenate
* cat
>>> pet = re.compile(r'dog')
>>> type(pet)
<class 're.Pattern'>
>>> bool(pet.search('They bought a dog'))
True
>>> bool(pet.search('A cat crossed their path'))
False
| Function | Description |
|---|---|
| re.findall | Returns a list containing all matches |
| re.finditer | Returns an iterator of match objects (one for each match) |
| re.search | Returns a Match object if there is a match anywhere in the string |
| re.split | Returns a list where the string has been split at each match |
| re.sub | Replaces one or many matches with a string |
| re.compile | Compiles a regular expression pattern for later use |
| re.escape | Returns the string with regex special characters backslash-escaped |
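
Of the functions listed, `re.escape` is the one most often forgotten. The sketch below shows why it matters when searching for literal text that happens to contain metacharacters; `needle` and `haystack` are made-up names and values.

```python
import re

needle = "1+1=2 (really?)"
haystack = "proof: 1+1=2 (really?) done"

# Unescaped, "+", "(", ")" and "?" are read as regex syntax.
unescaped = re.search(needle, haystack)

# Escaped, every character is matched literally.
escaped = re.search(re.escape(needle), haystack)

print(unescaped)        # None
print(escaped.group())  # 1+1=2 (really?)
```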
| Flag | Long name | Description |
|---|---|---|
| re.I | re.IGNORECASE | Ignore case |
| re.M | re.MULTILINE | Multiline |
| re.L | re.LOCALE | Make \w,\b,\s locale dependent |
| re.S | re.DOTALL | Dot matches all (including newline) |
| re.U | re.UNICODE | Make \w,\b,\d,\s unicode dependent |
| re.X | re.VERBOSE | Readable style |
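
Flags can be combined with `|`. As a small sketch, the pattern below uses `re.VERBOSE` for layout and comments together with `re.MULTILINE` for per-line anchors; the `key = value` format and the `config` text are assumptions made up for the example.

```python
import re

pattern = re.compile(
    r"""
    ^\s*(?P<key>\w+)     # key at the start of a line
    \s*=\s*              # equals sign with optional spaces
    (?P<value>.+?)\s*$   # value, without trailing spaces
    """,
    re.VERBOSE | re.MULTILINE,
)

config = "Name = Ada\nLANG=python  \n# comment"

pairs = [(m["key"], m["value"]) for m in pattern.finditer(config)]

print(pairs)  # [('Name', 'Ada'), ('LANG', 'python')]
```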
let textA = 'I like APPles very much';
let textB = 'I like APPles';
let regex = /apples$/i
 
// Output: false
console.log(regex.test(textA));
 
// Output: true
console.log(regex.test(textB));
let text = 'I like APPles very much';
let regexA = /apples/;
let regexB = /apples/i;
 
// Output: -1
console.log(text.search(regexA));
 
// Output: 7
console.log(text.search(regexB));
let text = 'Do you like apples?';
let regex= /apples/;
 
// Output: apples
console.log(regex.exec(text)[0]);
 
// Output: Do you like apples?
console.log(regex.exec(text).input);
let text = 'Here are apples and apPleS';
let regex = /apples/gi;
 
// Output: [ "apples", "apPleS" ]
console.log(text.match(regex));
let text = 'This 593 string will be brok294en at places where d1gits are.';
let regex = /\d+/g
 
// Output: [ "This ", " string will be brok", "en at places where d", "gits are." ] 
console.log(text.split(regex))
let regex = /t(e)(st(\d?))/g;
let text = 'test1test2';
let array = [...text.matchAll(regex)];
// Output: ["test1", "e", "st1", "1"]
console.log(array[0]);
// Output: ["test2", "e", "st2", "2"]
console.log(array[1]);
let text = 'Do you like aPPles?';
let regex = /apples/i
 
// Output: Do you like mangoes?
let result = text.replace(regex, 'mangoes');
console.log(result);
let regex = /apples/gi;
let text = 'Here are apples and apPleS';
// Output: Here are mangoes and mangoes
let result = text.replaceAll(regex, "mangoes");
console.log(result);
| Function | Description |
|---|---|
| preg_match() | Performs a regex match |
| preg_match_all() | Performs a global regular expression match |
| preg_replace_callback() | Performs a regular expression search and replace using a callback |
| preg_replace() | Performs a regular expression search and replace |
| preg_split() | Splits a string by regex pattern |
| preg_grep() | Returns array entries that match a pattern |
$str = "Visit Microsoft!";
$regex = "/microsoft/i";
// Output: Visit QuickRef!
echo preg_replace($regex, "QuickRef", $str); 
$str = "Visit QuickRef";
$regex = "#quickref#i";
// Output: 1
echo preg_match($regex, $str);
$regex = "/[a-zA-Z]+ (\d+)/";
$input_str = "June 24, August 13, and December 30";
if (preg_match_all($regex, $input_str, $matches_out)) {
    // Output: 2
    echo count($matches_out);
    // Output: 3
    echo count($matches_out[0]);
    // Output: Array("June 24", "August 13", "December 30")
    print_r($matches_out[0]);
    // Output: Array("24", "13", "30")
    print_r($matches_out[1]);
}
$arr = ["Jane", "jane", "Joan", "JANE"];
$regex = "/Jane/";
// Output: Array("Jane")
print_r(preg_grep($regex, $arr));
$str = "Jane\tKate\nLucy Marion";
$regex = "@\s@";
// Output: Array("Jane", "Kate", "Lucy", "Marion")
print_r(preg_split($regex, $str));
Pattern p = Pattern.compile(".s", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("aS");  
boolean s1 = m.matches();  
System.out.println(s1);   // Outputs: true
boolean s2 = Pattern.compile("[0-9]+").matcher("123").matches();  
System.out.println(s2);   // Outputs: true
boolean s3 = Pattern.matches(".s", "XXXX");  
System.out.println(s3);   // Outputs: false
| Field | Description |
|---|---|
| CANON_EQ | Canonical equivalence |
| CASE_INSENSITIVE | Case-insensitive matching |
| COMMENTS | Permits whitespace and comments |
| DOTALL | Dotall mode |
| MULTILINE | Multiline mode |
| UNICODE_CASE | Unicode-aware case folding |
| UNIX_LINES | Unix lines mode |
There are more methods ...
Replace sentence:
String regex = "[A-Z\n]{5}$";
String str = "I like APP\nLE";
Pattern p = Pattern.compile(regex, Pattern.MULTILINE);
Matcher m = p.matcher(str);
// Outputs: I like Apple!
System.out.println(m.replaceAll("pple!"));
Array of all matches:
String str = "She sells seashells by the Seashore";
String regex = "\\w*se\\w*";
Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(str);
List<String> matches = new ArrayList<>();
while (m.find()) {
    matches.add(m.group());
}
// Outputs: [sells, seashells, Seashore]
System.out.println(matches);
| Name | Description |
|---|---|
| REGEXP | Whether string matches regex |
| REGEXP_INSTR() | Starting index of substring matching regex (NOTE: Only MySQL 8.0+) |
| REGEXP_LIKE() | Whether string matches regex (NOTE: Only MySQL 8.0+) |
| REGEXP_REPLACE() | Replace substrings matching regex (NOTE: Only MySQL 8.0+) |
| REGEXP_SUBSTR() | Return substring matching regex (NOTE: Only MySQL 8.0+) |
expr REGEXP pat 
mysql> SELECT 'abc' REGEXP '^[a-d]';
1
mysql> SELECT name FROM cities WHERE name REGEXP '^A';
mysql> SELECT name FROM cities WHERE name NOT REGEXP '^A';
mysql> SELECT name FROM cities WHERE name REGEXP 'A|B|R';
mysql> SELECT 'a' REGEXP 'A', 'a' REGEXP BINARY 'A';
1   0
REGEXP_REPLACE(expr, pat, repl[, pos[, occurrence[, match_type]]])
mysql> SELECT REGEXP_REPLACE('a b c', 'b', 'X');
a X c
mysql> SELECT REGEXP_REPLACE('abc ghi', '[a-z]+', 'X', 1, 2);
abc X
REGEXP_SUBSTR(expr, pat[, pos[, occurrence[, match_type]]])
mysql> SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+');
abc
mysql> SELECT REGEXP_SUBSTR('abc def ghi', '[a-z]+', 1, 3);
ghi
REGEXP_LIKE(expr, pat[, match_type])
mysql> SELECT regexp_like('aba', 'b+');
1
mysql> SELECT regexp_like('aba', 'b{2}');
0
mysql> # i: case-insensitive
mysql> SELECT regexp_like('Abba', 'ABBA', 'i');
1
mysql> # m: multi-line
mysql> SELECT regexp_like('a\nb\nc', '^b$', 'm');
1
REGEXP_INSTR(expr, pat[, pos[, occurrence[, return_option[, match_type]]]])
mysql> SELECT regexp_instr('aa aaa aaaa', 'a{3}');
4
mysql> SELECT regexp_instr('abba', 'b{2}', 2);
2
mysql> SELECT regexp_instr('abbabba', 'b{2}', 1, 2);
5
mysql> SELECT regexp_instr('abbabba', 'b{2}', 1, 2, 1);
7