Regex Cheat Sheet

Simple Examples

  • hello: Matches the string “hello” exactly.
  • 123: Matches the string “123” exactly.
  • .: Matches any single character except a newline (unless the s flag is set).
  • \d: Matches any digit character (0-9).
  • \w: Matches any word character (a-z, A-Z, 0-9, _).
  • \s: Matches any whitespace character (space, tab, newline).
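The tokens above can be checked with a minimal sketch. It assumes Python's built-in re module; the helper name matches and the sample strings are illustrative, not part of the original sheet.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # re.fullmatch succeeds only if the whole string matches the pattern.
    return re.fullmatch(pattern, text) is not None

print(matches(r"hello", "hello"))  # literal text
print(matches(r".", "x"))          # any single character
print(matches(r".", "\n"))         # False: "." skips newlines by default
print(matches(r"\d", "7"))         # digit
print(matches(r"\w", "_"))         # the underscore counts as a word character
print(matches(r"\s", "\t"))        # a tab is whitespace
```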

Examples

  • hello world: Matches the string “hello world” exactly.
  • hello|world: Matches either “hello” or “world”.
  • hello.*: Matches “hello” followed by zero or more characters, as many as possible up to the end of the line.
  • hello\s: Matches “hello” followed by a whitespace character.
  • hello\d: Matches “hello” followed by a digit character.
  • hello\w: Matches “hello” followed by a word character.
  • hello\s\d: Matches “hello” followed by a whitespace character and a digit character.
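A short sketch of how these combined patterns behave on one sample string, assuming Python's re module (the sample text is made up for illustration):

```python
import re

text = "hello 5 and world"

# Alternation: findall returns every non-overlapping match, left to right.
print(re.findall(r"hello|world", text))       # ['hello', 'world']

# "hello", then one whitespace character, then one digit.
print(re.search(r"hello\s\d", text).group())  # 'hello 5'

# .* is greedy, so it runs on to the end of the line.
print(re.search(r"hello.*", text).group())    # 'hello 5 and world'
```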

Quantifiers

  • a+: Matches one or more “a” characters.
  • a*: Matches zero or more “a” characters.
  • a?: Matches zero or one “a” character.
  • a{3}: Matches exactly three “a” characters.
  • a{3,}: Matches three or more “a” characters.
  • a{3,5}: Matches between three and five “a” characters.
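To see where each quantifier draws the line, the following sketch (Python's re module assumed; the helper accepted and the sample list are hypothetical) tests runs of “a” of different lengths against each pattern:

```python
import re

samples = ["", "a", "aa", "aaa", "aaaa", "aaaaaa"]

def accepted(pattern: str) -> list[str]:
    # Keep only the samples that the pattern matches in full.
    return [s for s in samples if re.fullmatch(pattern, s) is not None]

print(accepted(r"a+"))      # everything except the empty string
print(accepted(r"a*"))      # everything, including the empty string
print(accepted(r"a?"))      # ['', 'a']
print(accepted(r"a{3}"))    # ['aaa']
print(accepted(r"a{3,}"))   # ['aaa', 'aaaa', 'aaaaaa']
print(accepted(r"a{3,5}"))  # ['aaa', 'aaaa']
```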

Character Classes

  • [abc]: Matches any of the characters “a”, “b”, or “c”.
  • [^abc]: Matches any character that is not “a”, “b”, or “c”.
  • [a-z]: Matches any lowercase ASCII letter.
  • [A-Z]: Matches any uppercase ASCII letter.
  • [0-9]: Matches any digit character.
  • [\w]: Matches any word character.
  • [\W]: Matches any non-word character.
  • [\s]: Matches any whitespace character.
  • [\S]: Matches any non-whitespace character.
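The classes above can be compared side by side on a single string. This is a small sketch assuming Python's re module; the sample text is arbitrary.

```python
import re

text = "Cab 42_x!"

print(re.findall(r"[abc]", text))   # ['a', 'b'] (the capital C is not in the set)
print(re.findall(r"[^abc]", text))  # ['C', ' ', '4', '2', '_', 'x', '!']
print(re.findall(r"[a-z]", text))   # ['a', 'b', 'x']
print(re.findall(r"[A-Z]", text))   # ['C']
print(re.findall(r"[0-9]", text))   # ['4', '2']
print(re.findall(r"[\W]", text))    # [' ', '!'] (the underscore is a word character)
```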

Anchors

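  • ^hello: Matches “hello” at the beginning of the string (or of each line when the m flag is set).
  • world$: Matches “world” at the end of the string (or of each line when the m flag is set).
  • \bhello\b: Matches “hello” as a whole word.
  • \Bhello\B: Matches “hello” only inside a longer word, with a word character on each side (e.g. in “xhellox”).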
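A sketch of the anchors in action, assuming Python's re module (the helper found and the test strings are illustrative):

```python
import re

def found(pattern: str, text: str) -> bool:
    # re.search looks for the pattern anywhere in the text.
    return re.search(pattern, text) is not None

# Without the multiline flag, ^ and $ anchor to the whole string.
print(found(r"^hello", "hello world"))    # True
print(found(r"^hello", "say hello"))      # False
print(found(r"world$", "hello world"))    # True

# \b needs a word boundary; \B needs the absence of one.
print(found(r"\bhello\b", "say hello!"))  # True
print(found(r"\bhello\b", "othellos"))    # False: letters on both sides
print(found(r"\Bhello\B", "xhellox"))     # True
print(found(r"\Bhello\B", "say hello!"))  # False
```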

Groups

  • (hello): Matches “hello” and captures it as a group.
  • (hello|world): Matches either “hello” or “world” and captures whichever one matched.
  • (?:hello): Matches “hello” but does not capture it as a group.
  • (?=hello): Positive lookahead. Succeeds at a position where “hello” comes next, without consuming it.
  • (?!hello): Negative lookahead. Succeeds at a position where “hello” does not come next.
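The difference between capturing and non-capturing groups shows up in what the match object reports. A minimal sketch, assuming Python's re module and a made-up pattern that combines both kinds:

```python
import re

# Two capturing groups with an optional non-capturing group between them.
m = re.search(r"(hello) (?:big )?(world|there)", "hello big world")

print(m.group(0))  # 'hello big world' (the whole match)
print(m.group(1))  # 'hello'
print(m.group(2))  # 'world'
print(m.groups())  # ('hello', 'world'): "(?:big )" contributes no entry
```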

Flags

  • /hello/i: Matches “hello” case-insensitively.
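  • /hello/g: Finds every occurrence of “hello” instead of stopping at the first.
  • /hello/m: Multiline mode. Has no effect on this pattern; it only makes ^ and $ match at the start and end of each line.
  • /hello/s: Dot-all mode. Has no effect on this pattern; it only allows “.” to match newline characters.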
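The /pattern/flags notation above is the JavaScript and Perl style. As a rough sketch of the same ideas in Python's re module (an assumption made for illustration): i corresponds to re.IGNORECASE, m to re.MULTILINE, and s to re.DOTALL, while g has no flag and is expressed by calling re.findall instead of re.search.

```python
import re

text = "Hello\nhello"

# i: case-insensitive matching.
print(re.findall(r"hello", text))                 # ['hello']
print(re.findall(r"hello", text, re.IGNORECASE))  # ['Hello', 'hello']

# g: re.search stops at the first match, re.findall returns them all.
print(re.search(r"hello", text, re.IGNORECASE).group())  # 'Hello'

# m: ^ and $ also match at line boundaries.
print(re.findall(r"^hello", text))                # []
print(re.findall(r"^hello", text, re.MULTILINE))  # ['hello']

# s: "." also matches a newline.
print(re.findall(r"Hello.hello", text))             # []
print(re.findall(r"Hello.hello", text, re.DOTALL))  # ['Hello\nhello']
```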

Special Characters

  • \: Escapes a special character.
  • .: Matches any single character except newline characters.
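  • ^: Matches the beginning of the string (or of each line when the m flag is set).
  • $: Matches the end of the string (or of each line when the m flag is set).
  • *: Matches zero or more of the preceding element (a character, class, or group).
  • +: Matches one or more of the preceding element.
  • ?: Matches zero or one of the preceding element.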
  • (: Begins a capturing group.
  • ): Ends a capturing group.
  • [: Begins a character class.
  • ]: Ends a character class.
  • {: Begins a quantifier.
  • }: Ends a quantifier.
  • |: Matches either the expression before or after the operator.
  • /: Delimits a regex literal in languages such as JavaScript and Perl; it is not part of the pattern itself.
  • i: Flag. Makes the regex case-insensitive.
  • g: Flag. Matches all occurrences of the pattern.
  • m: Flag. Makes the ^ and $ anchors match the beginning and end of each line.
  • s: Flag. Allows . to match newline characters.
  • \n: Matches a newline character.
  • \r: Matches a carriage return character.
  • \t: Matches a tab character.
  • \v: Matches a vertical tab character.
  • \0: Matches a null character (U+0000 NULL).
  • \xhh: Matches a character with the given hex code (e.g. \x0A matches a newline character).
  • \uhhhh: Matches a character with the given Unicode code point (e.g. \u0009 matches a tab character).
  • \cX: Matches a control character using caret notation (e.g. \cJ matches a newline character).
  • \u{hhhh}: Matches a character with the given Unicode code point, written with one or more hex digits (e.g. \u{0009} matches a tab character). In JavaScript this form requires the u flag.
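A few of the entries above are easy to get wrong, so here is a hedged sketch using Python's re module (an assumption; the helper full is hypothetical). It exercises only escapes that Python accepts (\x, \u with four digits, and backslash-escaping), so \cX and \u{hhhh} are left out.

```python
import re

def full(pattern: str, text: str) -> bool:
    # True only if the whole text matches the pattern.
    return re.fullmatch(pattern, text) is not None

# Unescaped, + is a quantifier; escaped, it is a literal plus sign.
print(full(r"1+1", "1+1"))    # False
print(full(r"1\+1", "1+1"))   # True

# A quantifier applies to the preceding element, which can be a group.
print(full(r"(ab)+", "ababab"))  # True
print(full(r"ab+", "ababab"))    # False: + repeats only the "b"
print(full(r"ab+", "abbb"))      # True

# Hex and Unicode escapes name characters by code.
print(full(r"\x0A", "\n"))    # True
print(full(r"\u0009", "\t"))  # True

# re.escape backslash-escapes metacharacters in arbitrary text.
literal = "1+1=2?"
print(full(re.escape(literal), literal))  # True
```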

Lookarounds

  • (?=hello): Positive lookahead. Succeeds at a position that is immediately followed by “hello”, without consuming it.
  • (?!hello): Negative lookahead. Succeeds at a position that is not immediately followed by “hello”.
  • (?<=hello): Positive lookbehind. Succeeds at a position that is immediately preceded by “hello”, without consuming it.
  • (?<!hello): Negative lookbehind. Succeeds at a position that is not immediately preceded by “hello”.
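Lookarounds test the surrounding text without including it in the match. The sketch below assumes Python's re module and an invented sample string; note that the matched text never contains the part inside the lookaround.

```python
import re

text = "hello world, hello there"

# Positive lookahead: "hello" only where " world" follows.
print([m.start() for m in re.finditer(r"hello(?= world)", text)])  # [0]

# Negative lookahead: "hello" only where " world" does not follow.
print([m.start() for m in re.finditer(r"hello(?! world)", text)])  # [13]

# Positive lookbehind: a word that comes right after "hello ".
print(re.findall(r"(?<=hello )\w+", text))    # ['world', 'there']

# Negative lookbehind: "there" not preceded by "hello " (none here).
print(re.findall(r"(?<!hello )there", text))  # []

# The lookahead text is not consumed.
print(re.search(r"hello(?= world)", text).group())  # 'hello'
```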

Unicode Categories

  • \p{L}: Matches any letter character.
  • \p{M}: Matches any combining mark character.
  • \p{Z}: Matches any separator character.
  • \p{S}: Matches any symbol character.
  • \p{N}: Matches any number character.
  • \p{P}: Matches any punctuation character.
  • \p{C}: Matches any character in the “Other” category (control, format, surrogate, private-use, or unassigned).
  • \p{Ll}: Matches any lowercase letter.
  • \p{Lu}: Matches any uppercase letter.
  • \p{Lt}: Matches any titlecase letter.
  • \p{L&}: Matches any cased letter (lowercase, uppercase, or titlecase); supported only by some engines, such as PCRE.
  • \p{Lm}: Matches any modifier letter.
  • \p{Lo}: Matches any other letter character.
  • \p{Mn}: Matches any non-spacing mark character.
  • \p{Mc}: Matches any spacing combining mark character.
  • \p{Me}: Matches any enclosing mark character.
  • \p{Zs}: Matches any space separator character.
  • \p{Zl}: Matches any line separator character.
  • \p{Zp}: Matches any paragraph separator character.
  • \p{Sm}: Matches any mathematical symbol character.
  • \p{Sc}: Matches any currency symbol character.
  • \p{Sk}: Matches any modifier symbol character.
  • \p{So}: Matches any other symbol character.
  • \p{Nd}: Matches any decimal digit character.
  • \p{Nl}: Matches any letter number character.
  • \p{No}: Matches any other number character.
  • \p{Pc}: Matches any connector punctuation character.
  • \p{Pd}: Matches any dash punctuation character.
  • \p{Ps}: Matches any open punctuation character.
  • \p{Pe}: Matches any close punctuation character.
  • \p{Pi}: Matches any initial punctuation character.
  • \p{Pf}: Matches any final punctuation character.
  • \p{Po}: Matches any other punctuation character.
  • \p{Cc}: Matches any control character.
  • \p{Cf}: Matches any format character.
  • \p{Cs}: Matches any surrogate character.
  • \p{Co}: Matches any private-use character.
  • \p{Cn}: Matches any unassigned code point.
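Python's built-in re module does not accept \p{...}, so the sketch below takes a different route: it looks up each character's general category with the standard unicodedata module, which uses the same one- and two-letter names as the list above. The helper in_category is hypothetical and only approximates a \p{name} test (it does not handle L&).

```python
import unicodedata

def in_category(ch: str, name: str) -> bool:
    # unicodedata.category returns a two-letter code such as "Lu" or "Nd".
    # A one-letter name (L, N, ...) covers every subcategory starting with it.
    return unicodedata.category(ch).startswith(name)

for ch in ["a", "A", "5", " ", "$", "-", "\u0301", "\t"]:
    print(repr(ch), unicodedata.category(ch))

print(in_category("a", "L"))   # True: Ll is a kind of L
print(in_category("a", "Lu"))  # False: "a" is lowercase
```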

POSIX Classes

  • [:alnum:]: Matches any alphanumeric character. Like every class in this section, it must appear inside a bracket expression, e.g. [[:alnum:]].
  • [:alpha:]: Matches any alphabetic character.
  • [:ascii:]: Matches any ASCII character (an extension, not part of the POSIX standard).
  • [:blank:]: Matches a space or a tab.
  • [:cntrl:]: Matches any control character.
  • [:digit:]: Matches any digit character.
  • [:graph:]: Matches any visible character (printable, excluding the space).
  • [:lower:]: Matches any lowercase character.
  • [:print:]: Matches any printable character, including the space.
  • [:punct:]: Matches any punctuation character.
  • [:space:]: Matches any whitespace character (space, tab, newline, carriage return, form feed, vertical tab).
  • [:upper:]: Matches any uppercase character.
  • [:word:]: Matches any word character (an extension, not part of the POSIX standard).
  • [:xdigit:]: Matches any hexadecimal digit character.
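Python's re module does not support POSIX class names, so as a rough illustration the sketch below spells out a few of them as ordinary bracket expressions. It assumes ASCII text (the C locale); the table POSIX_ASCII and the helper is_class are hypothetical names introduced for this example.

```python
import re

# ASCII-only equivalents of some POSIX classes (an assumption: C locale).
POSIX_ASCII = {
    "alnum": r"[A-Za-z0-9]",
    "alpha": r"[A-Za-z]",
    "blank": r"[ \t]",
    "digit": r"[0-9]",
    "lower": r"[a-z]",
    "upper": r"[A-Z]",
    "space": r"[ \t\n\r\f\v]",
    "xdigit": r"[0-9A-Fa-f]",
}

def is_class(name: str, ch: str) -> bool:
    # True if the single character ch belongs to the named class.
    return re.fullmatch(POSIX_ASCII[name], ch) is not None

print(is_class("blank", " "))    # True
print(is_class("blank", "\n"))   # False: blank is only space and tab
print(is_class("space", "\n"))   # True
print(is_class("xdigit", "f"))   # True
print(is_class("xdigit", "g"))   # False
```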
