Simple Examples
hello
: Matches the string “hello” exactly.
123
: Matches the string “123” exactly.
.
: Matches any single character except a newline.
\d
: Matches any digit character (0-9).
\w
: Matches any word character (a-z, A-Z, 0-9, _).
\s
: Matches any whitespace character (space, tab, newline).
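The patterns above can be tried directly in code. A minimal sketch, assuming a JavaScript engine (an assumption suggested by the /pattern/flags syntax used in the Flags section); the variable name and sample strings are illustrative only:

```javascript
// RegExp.prototype.test returns true when the pattern matches anywhere
// in the input string.
const simpleChecks = {
  literal: /hello/.test("say hello"), // literal text
  digit: /\d/.test("room 7"),         // one digit
  word: /\w/.test("_"),               // underscore is a word character
  space: /\s/.test("a\tb"),           // a tab counts as whitespace
  dotNewline: /./.test("\n"),         // "." skips newlines by default
};
console.log(simpleChecks);
```

Only the last check is false: the input holds nothing but a newline, which "." does not match by default.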
Examples
hello world
: Matches the string “hello world” exactly.
hello|world
: Matches either “hello” or “world”.
hello.*
: Matches “hello” followed by zero or more characters, up to the end of the line.
hello\s
: Matches “hello” followed by a whitespace character.
hello\d
: Matches “hello” followed by a digit character.
hello\w
: Matches “hello” followed by a word character.
hello\s\d
: Matches “hello” followed by a whitespace character and a digit character.
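The combined patterns above can be checked against a few sample strings. This is a sketch that assumes JavaScript; the names `samples`, `eitherWord`, `spaceThenDigit`, and `tail` are hypothetical:

```javascript
const samples = ["hello world", "hello 5", "helloX", "world"];

// Alternation: keep every sample that contains "hello" or "world".
const eitherWord = samples.filter((s) => /hello|world/.test(s));

// "hello", then one whitespace character, then one digit.
const spaceThenDigit = samples.filter((s) => /hello\s\d/.test(s));

// ".*" is greedy: it consumes the rest of the line.
const tail = "say hello there".match(/hello.*/)[0];

console.log(eitherWord, spaceThenDigit, tail);
```

Here `eitherWord` keeps all four samples, `spaceThenDigit` keeps only "hello 5", and `tail` is "hello there".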
Quantifiers
a+
: Matches one or more “a” characters.
a*
: Matches zero or more “a” characters.
a?
: Matches zero or one “a” character.
a{3}
: Matches exactly three “a” characters.
a{3,}
: Matches three or more “a” characters.
a{3,5}
: Matches between three and five “a” characters, inclusive.
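The quantifiers above are easiest to compare on runs of different lengths. A minimal sketch, assuming JavaScript; the word boundaries (\b) are an addition here so that a{3} cannot match inside a longer run, and all names are illustrative:

```javascript
const runs = "a aa aaa aaaa aaaaaa";

const quantified = {
  oneOrMore: runs.match(/a+/g),
  // \b pins each match to a whole run of letters.
  exactlyThree: runs.match(/\ba{3}\b/g),
  threeOrMore: runs.match(/\ba{3,}\b/g),
  threeToFive: runs.match(/\ba{3,5}\b/g),
  // "a?" makes the single "a" optional.
  optional: ["bt", "bat", "baat"].filter((s) => /^ba?t$/.test(s)),
  // "a*" allows any number of "a", including none.
  zeroOrMore: ["bt", "bat", "baat"].filter((s) => /^ba*t$/.test(s)),
};
console.log(quantified);
```

Without the boundaries, a{3} would also match the first three letters of "aaaa", which is usually not what a reader expects from "exactly three".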
Character Classes
[abc]
: Matches any of the characters “a”, “b”, or “c”.
[^abc]
: Matches any character that is not “a”, “b”, or “c”.
[a-z]
: Matches any lowercase ASCII letter (a through z).
[A-Z]
: Matches any uppercase ASCII letter (A through Z).
[0-9]
: Matches any digit character.
[\w]
: Matches any word character.
[\W]
: Matches any non-word character.
[\s]
: Matches any whitespace character.
[\S]
: Matches any non-whitespace character.
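A short sketch of the character classes above, assuming JavaScript; the sample string and property names are made up for illustration:

```javascript
const sample = "Abc-42 xyz!";

const classes = {
  anyOfAbc: sample.match(/[abc]/g),   // case-sensitive: "A" is skipped
  lowercase: sample.match(/[a-z]/g),
  uppercase: sample.match(/[A-Z]/g),
  digits: sample.match(/[0-9]/g),
  nonWord: sample.match(/[\W]/g),     // "-", " " and "!"
  // Negated class: delete everything that is not a, b, or c.
  onlyAbc: sample.replace(/[^abc]/g, ""),
};
console.log(classes);
```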
Anchors
^hello
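: Matches “hello” at the beginning of the input, or at the beginning of any line when the m flag is set.
world$
: Matches “world” at the end of the input, or at the end of any line when the m flag is set.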
\bhello\b
: Matches “hello” as a whole word.
\Bhello\B
: Matches “hello” only when it is embedded in a longer word, with a word character on both sides (as in “othellos”).
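The anchors above match positions rather than characters, which a few boolean checks make visible. A minimal sketch, assuming JavaScript; the sample strings are illustrative:

```javascript
const anchors = {
  atStart: /^hello/.test("hello world"),
  notAtStart: /^hello/.test("say hello"),
  atEnd: /world$/.test("hello world"),
  // Without the m flag, ^ only matches at the start of the whole input.
  secondLine: /^hello/.test("hi\nhello"),
  secondLineM: /^hello/m.test("hi\nhello"),
  wholeWord: /\bhello\b/.test("oh hello there"),
  insideWord: /\bhello\b/.test("othellos"),
  // \B requires a word character on each side of "hello".
  embedded: /\Bhello\B/.test("othellos"),
  standalone: /\Bhello\B/.test("hello"),
};
console.log(anchors);
```

The false results are `notAtStart`, `secondLine`, `insideWord`, and `standalone`; every other check is true.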
Groups
(hello)
: Matches “hello” and captures it as a group.
(hello|world)
: Matches either “hello” or “world” and captures whichever one matched.
(?:hello)
: Matches “hello” but does not capture it as a group.
(?=hello)
: Positive lookahead. Matches a position that is immediately followed by “hello”, without consuming it.
(?!hello)
: Negative lookahead. Matches a position that is not immediately followed by “hello”.
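The difference between capturing and non-capturing groups shows up in the match result. A sketch assuming JavaScript, where `String.prototype.match` returns the full match at index 0 followed by one entry per capturing group; the variable names are hypothetical:

```javascript
// Two capturing groups: indexes 1 and 2 hold their contents.
const full = "hello world".match(/(hello) (world)/);

// (?:...) groups without capturing, so only one group is recorded.
const partial = "hello world".match(/(?:hello) (world)/);

// With alternation, the group captures whichever branch matched.
const either = "big world".match(/(hello|world)/);

// Captured groups can be reused in a replacement as $1, $2, ...
const swapped = "hello world".replace(/(hello) (world)/, "$2 $1");

console.log(full.length, partial.length, either[1], swapped);
// prints: 3 2 world world hello
```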
Flags
/hello/i
: Matches “hello” case-insensitively.
/hello/g
: Matches all occurrences of “hello”.
/hello/m
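: Multiline mode. Makes ^ and $ match at the start and end of each line; it has no effect on a pattern without anchors, such as this one.
/hello/s
: Dot-all mode. Allows “.” to match newline characters; it has no effect on a pattern without “.”, such as this one.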
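The effect of each flag is easiest to see on a multi-line input. A minimal sketch, assuming JavaScript regex literals; the anchored and dotted patterns are additions here, because m and s only change how ^, $ and "." behave:

```javascript
const lines = "Hello\nhello\nHELLO";

const flags = {
  i: lines.match(/hello/i)[0],           // first match, ignoring case
  g: lines.match(/hello/g),              // all exact-case matches
  gi: lines.match(/hello/gi),            // all matches, ignoring case
  anchoredNoM: lines.match(/^hello/gi),  // ^ = start of input only
  anchoredM: lines.match(/^hello/gim),   // ^ = start of every line
  dotNoS: /Hello.hello/.test(lines),     // "." stops at the newline
  dotS: /Hello.hello/s.test(lines),      // "." may match the newline
};
console.log(flags);
```

With these inputs, `anchoredNoM` finds only "Hello" while `anchoredM` finds all three lines, and `dotS` is true where `dotNoS` is false.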
Special Characters
\
: Escapes a special character.
.
: Matches any single character except newline characters.
^
: Matches the beginning of the input, or of each line when the m flag is set.
$
: Matches the end of the input, or of each line when the m flag is set.
*
: Matches zero or more of the preceding token (a character, character class, or group).
+
: Matches one or more of the preceding token.
?
: Matches zero or one of the preceding token.
(
: Begins a capturing group.
)
: Ends a capturing group.
[
: Begins a character class.
]
: Ends a character class.
{
: Begins a counted quantifier, as in a{3,5}.
}
: Ends a counted quantifier.
|
: Matches either the expression before or after the operator.
/
: Begins and ends a regular expression literal in languages that use /pattern/flags syntax, such as JavaScript.
i
: Makes the regex case-insensitive.
g
: Matches all occurrences of the pattern.
m
: Makes the ^ and $ anchors match the beginning and end of lines.
s
: Allows . to match newline characters.
\n
: Matches a newline character.
\r
: Matches a carriage return character.
\t
: Matches a tab character.
\v
: Matches a vertical tab character.
\0
: Matches a null character (U+0000 NULL).
\xhh
: Matches a character with the given hex code (e.g. \x0A matches a newline character).
\uhhhh
: Matches a character with the given Unicode code point (e.g. \u0009 matches a tab character).
\cX
: Matches a control character using caret notation (e.g. \cJ matches a newline character).
\u{hhhh}
: Matches a character with the given Unicode code point (e.g. \u{0009} matches a tab character). In JavaScript this form requires the u flag.
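The escapes above can be verified one by one. A sketch assuming JavaScript, where \u{...} is only read as a code point escape when the u flag is present; the property names are illustrative:

```javascript
const escapes = {
  // A backslash turns the metacharacter "." into a literal dot.
  escapedDot: /3\.14/.test("3x14"),
  bareDot: /3.14/.test("3x14"),
  hex: /\x0A/.test("\n"),
  unicode4: /\u0009/.test("\t"),
  control: /\cJ/.test("\n"),
  nul: /\0/.test("\0"),
  // The braced form needs the u flag to be read as a code point.
  bracedWithU: /\u{0009}/u.test("\t"),
  bracedNoU: /\u{0009}/.test("\t"),
};
console.log(escapes);
```

Only `escapedDot` and `bracedNoU` are false: the escaped dot rejects "x", and without the u flag the braced form is not treated as a tab.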
Lookarounds
(?=hello)
: Positive lookahead. Matches a position that is immediately followed by “hello”, without consuming it.
(?!hello)
: Negative lookahead. Matches a position that is not immediately followed by “hello”.
(?<=hello)
: Positive lookbehind. Matches a position that is immediately preceded by “hello”, without consuming it.
(?<!hello)
: Negative lookbehind. Matches a position that is not immediately preceded by “hello”.
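Lookarounds check the surrounding text without including it in the match. A minimal sketch, assuming a JavaScript engine with lookbehind support (ES2018 or later); the sample string and the \b boundaries are additions that keep the negative forms from matching part of a number:

```javascript
const items = "$10, 20 kg, 30";

const around = {
  // Digits followed by " kg".
  lookahead: items.match(/\d+(?= kg)/g),
  // Whole numbers not followed by " kg".
  negLookahead: items.match(/\d+\b(?! kg)/g),
  // Digits preceded by a dollar sign.
  lookbehind: items.match(/(?<=\$)\d+/g),
  // Whole numbers not preceded by a dollar sign.
  negLookbehind: items.match(/(?<!\$)\b\d+/g),
  // The asserted text is not part of the match itself.
  zeroWidth: "hello world".match(/hello(?= world)/)[0],
};
console.log(around);
```

The last entry is just "hello": the lookahead confirms that " world" follows but leaves it out of the result.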
Unicode Categories
\p{L}
: Matches any letter character.
\p{M}
: Matches any combining mark character.
\p{Z}
: Matches any separator character.
\p{S}
: Matches any symbol character.
\p{N}
: Matches any number character.
\p{P}
: Matches any punctuation character.
\p{C}
: Matches any character in the “Other” category (control, format, surrogate, private-use, or unassigned).
\p{Ll}
: Matches any lowercase letter.
\p{Lu}
: Matches any uppercase letter.
\p{Lt}
: Matches any titlecase letter.
\p{L&}
: Matches any cased letter (lowercase, uppercase, or titlecase). Some engines spell this \p{LC}.
\p{Lm}
: Matches any modifier letter.
\p{Lo}
: Matches any other letter character.
\p{Mn}
: Matches any non-spacing mark character.
\p{Mc}
: Matches any spacing combining mark character.
\p{Me}
: Matches any enclosing mark character.
\p{Zs}
: Matches any space separator character.
\p{Zl}
: Matches any line separator character.
\p{Zp}
: Matches any paragraph separator character.
\p{Sm}
: Matches any mathematical symbol character.
\p{Sc}
: Matches any currency symbol character.
\p{Sk}
: Matches any modifier symbol character.
\p{So}
: Matches any other symbol character.
\p{Nd}
: Matches any decimal digit character.
\p{Nl}
: Matches any letter number character.
\p{No}
: Matches any other number character.
\p{Pc}
: Matches any connector punctuation character.
\p{Pd}
: Matches any dash punctuation character.
\p{Ps}
: Matches any open punctuation character.
\p{Pe}
: Matches any close punctuation character.
\p{Pi}
: Matches any initial punctuation character.
\p{Pf}
: Matches any final punctuation character.
\p{Po}
: Matches any other punctuation character.
\p{Cc}
: Matches any control character.
\p{Cf}
: Matches any format character.
\p{Cs}
: Matches any surrogate character.
\p{Co}
: Matches any private-use character.
\p{Cn}
: Matches any unassigned code point.
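A few of the categories above, tried on a mixed string. This sketch assumes JavaScript, where \p{...} is only available with the u flag; the sample text is invented and written with escapes so the code points are unambiguous:

```javascript
// "A", n-with-tilde (U+00F1), "o", digits, euro sign (U+20AC),
// plus sign, and ROMAN NUMERAL EIGHT (U+2167).
const mixed = "A\u00F1o 2024: 5\u20AC + \u2167";

const categories = {
  letters: mixed.match(/\p{L}+/gu),
  uppercase: mixed.match(/\p{Lu}/gu),
  decimalDigits: mixed.match(/\p{Nd}+/gu),
  letterNumbers: mixed.match(/\p{Nl}/gu),   // the Roman numeral
  currency: mixed.match(/\p{Sc}/gu),
  mathSymbols: mixed.match(/\p{Sm}/gu),
  punctuation: mixed.match(/\p{P}/gu),
};
console.log(categories);
```

The Roman numeral is a letter number (Nl), so it appears under `letterNumbers` but not under `letters` or `decimalDigits`.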
POSIX Classes
[:alnum:]
: Matches any alphanumeric character. Like every POSIX class, it is only valid inside a bracket expression, e.g. [[:alnum:]].
[:alpha:]
: Matches any alphabetic character.
[:ascii:]
: Matches any ASCII character (a non-POSIX extension offered by some engines).
[:blank:]
: Matches a space or a tab.
[:cntrl:]
: Matches any control character.
[:digit:]
: Matches any digit character.
[:graph:]
: Matches any visible character (printable characters other than space).
[:lower:]
: Matches any lowercase character.
[:print:]
: Matches any printable character, including space.
[:punct:]
: Matches any punctuation character.
[:space:]
: Matches any whitespace character.
[:upper:]
: Matches any uppercase character.
[:word:]
: Matches any word character (a non-POSIX extension offered by some engines).
[:xdigit:]
: Matches any hexadecimal digit character.
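JavaScript regular expressions do not support POSIX classes, so the sketch below translates a few of them into ASCII ranges before compiling. The helper `expandPosix` and the table `POSIX_ASCII` are hypothetical names, and the ASCII-only ranges are an assumption; a real POSIX engine applies locale rules:

```javascript
// ASCII-only stand-ins for a subset of the POSIX classes (assumption).
const POSIX_ASCII = new Map([
  ["alnum", "A-Za-z0-9"],
  ["alpha", "A-Za-z"],
  ["blank", " \\t"],
  ["digit", "0-9"],
  ["lower", "a-z"],
  ["upper", "A-Z"],
  ["space", " \\t\\n\\r\\f\\v"],
  ["xdigit", "0-9A-Fa-f"],
]);

// Hypothetical helper: replaces each [:name:] token inside a bracket
// expression with the matching ASCII range.
function expandPosix(source) {
  return source.replace(/\[:(\w+):\]/g, (token, name) => {
    if (!POSIX_ASCII.has(name)) {
      throw new Error(`unsupported class ${token}`);
    }
    return POSIX_ASCII.get(name);
  });
}

const hexColorSource = expandPosix("^#[[:xdigit:]]{6}$");
const hexColor = new RegExp(hexColorSource);
console.log(hexColorSource);               // prints: ^#[0-9A-Fa-f]{6}$
console.log(hexColor.test("#1a2B3c"));     // prints: true
console.log(hexColor.test("#12345g"));     // prints: false
```

Several classes can share one bracket expression: `expandPosix("[[:alpha:][:digit:]_]+")` yields `[A-Za-z0-9_]+`.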