Regular expression

Special Characters:


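In most regex flavors, the special characters (metacharacters) are the backslash \, the caret ^, the dollar sign $, the period ., the pipe |, the question mark ?, the asterisk *, the plus sign +, the round brackets ( and ), the opening square bracket [, and the opening curly brace {. If you want to use any of these characters as a literal in a regular expression, you need to escape it with a backslash. For example, 1\+1=2 matches the literal text 1+1=2.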
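As a small illustration of escaping, here is a sketch using Python's standard re module (the choice of Python and the sample strings are assumptions for illustration, not part of the original text). It compares the escaped and unescaped forms of the pattern above.

```python
import re

text = "We know 1+1=2."

# Escaped: \+ is a literal plus sign, so the pattern matches the text 1+1=2.
escaped = re.search(r"1\+1=2", text)

# Unescaped: 1+ means "one or more 1s", so the pattern looks for 11=2, 111=2, ...
unescaped = re.search(r"1+1=2", text)
unescaped_other = re.search(r"1+1=2", "111=2")

print(escaped.group())          # 1+1=2
print(unescaped)                # None
print(unescaped_other.group())  # 111=2

# re.escape adds the backslashes for you when the literal text is not known in advance.
auto_pattern = re.escape("1+1=2")
auto_match = re.fullmatch(auto_pattern, "1+1=2")
```

re.escape is convenient when the literal text comes from user input.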

With a character class (also called a character set), you can tell the regex engine to match exactly one out of several characters. To do this, place the characters you want to match between square brackets. For example, if you want to match ‘a’ or ‘e’, use [ae]. Thus, gr[ae]y will match either gray or grey, but not graay, graey, or similar strings.
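The gr[ae]y example can be checked with a short Python sketch (the word list is made up for illustration). re.fullmatch is used so that the whole word has to match the pattern.

```python
import re

pattern = re.compile(r"gr[ae]y")
words = ["gray", "grey", "graay", "graey", "groy"]

# Keep only the words that match the pattern from start to end.
matched = [w for w in words if pattern.fullmatch(w)]
print(matched)  # ['gray', 'grey']
```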

To specify a range of characters or numbers, you can use a hyphen inside a character class. For example, [a-z] will match a single character from a to z, and [0-9] will match a single digit from 0 to 9.
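A quick sketch of ranges in Python (the sample strings are assumptions chosen for illustration). Each class matches a single character, so re.findall returns the characters one at a time.

```python
import re

# [a-z] matches one lowercase letter at a time; 'B' and '3' are skipped.
lowercase = re.findall(r"[a-z]", "aB3z")

# [0-9] matches one digit at a time.
digits = re.findall(r"[0-9]", "Room B12, floor 3")

print(lowercase)  # ['a', 'z']
print(digits)     # ['1', '2', '3']
```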

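Typing a ^ immediately after the opening bracket negates the character class. For example, q[^u] means a ‘q’ followed by a single character that is not a ‘u’. Note that the negated class still has to match a character, so q[^u] does not match a ‘q’ at the very end of a string.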
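The behavior of q[^u] can be sketched in Python as follows (the test words are assumptions picked for illustration). The last case shows that a negated class must still consume one character.

```python
import re

pattern = re.compile(r"q[^u]")

in_iraqi = pattern.search("Iraqi")        # 'q' followed by 'i'
in_question = pattern.search("question")  # 'q' followed by 'u': no match
in_iraq = pattern.search("Iraq")          # 'q' followed by nothing: no match

print(in_iraqi.group())  # qi
print(in_question)       # None
print(in_iraq)           # None
```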

Note that the only metacharacters inside a character class are the closing bracket ], the backslash \, the caret ^, and the hyphen -. To include these characters inside a character class, you need to escape them with a backslash.
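To illustrate, here is a sketch in Python (the sample strings are assumptions). Characters such as + . and * lose their special meaning inside a class, while ] \ ^ and - are escaped with a backslash.

```python
import re

# Inside a class, + . and * are ordinary characters and need no backslash.
plain = re.findall(r"[+.*]", "1+1=2.0*3")

# The four class metacharacters, each escaped with a backslash.
escaped = re.findall(r"[\]\\\^\-]", r"a]b\c^d-e")

print(plain)    # ['+', '.', '*']
print(escaped)  # [']', '\\', '^', '-']
```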

Shorthand character classes include \d for [0-9], \w for “word character” or [0-9a-zA-Z_], and \s for “whitespace character”. Negated shorthand character classes include \D for [^\d], \W for [^\w], and \S for [^\s].
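The shorthand classes can be tried out with this Python sketch (the sample text is an assumption). One caveat specific to Python 3: on str patterns, \d, \w, and \s also match Unicode digits, letters, and whitespace unless the re.ASCII flag is passed, so the flag is used here to get the ASCII definitions given above.

```python
import re

text = "Order 66, aisle_7!"

digits = re.findall(r"\d", text, flags=re.ASCII)    # single digits
words = re.findall(r"\w+", text, flags=re.ASCII)    # runs of word characters
spaces = re.findall(r"\s", text, flags=re.ASCII)    # single whitespace characters
non_digits = re.findall(r"\D+", "a1b22c", flags=re.ASCII)  # runs of non-digits

print(digits)      # ['6', '6', '7']
print(words)       # ['Order', '66', 'aisle_7']
print(spaces)      # [' ', ' ']
print(non_digits)  # ['a', 'b', 'c']
```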

The dot (.) matches any single character, whatever that character is, except line break characters.
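A short Python sketch of the dot (sample strings are assumptions). The second call uses Python's re.DOTALL flag, which goes beyond the text above: it makes the dot match line breaks as well.

```python
import re

text = "cat cut c\nt c-t"

# By default the dot does not match the newline in "c\nt".
default_matches = re.findall(r"c.t", text)

# With re.DOTALL the dot matches any character, including a newline.
dotall_matches = re.findall(r"c.t", "c\nt", flags=re.DOTALL)

print(default_matches)  # ['cat', 'cut', 'c-t']
print(dotall_matches)   # ['c\nt']
```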

Anchors match a position before, between, or after characters rather than a character itself. The caret ^ matches the position before the first character in the string. For example, applying ^a to “abc” matches ‘a’. But ^b will not match “abc” at all because ‘b’ cannot be matched at the start of the string. Similarly, $ matches the position right after the last character in the string. c$ matches ‘c’ in “abc”, while a$ does not match at all.
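The four anchor examples above can be reproduced with a Python sketch like this one (variable names are hypothetical).

```python
import re

caret_a = re.search(r"^a", "abc")   # 'a' is at the start: match
caret_b = re.search(r"^b", "abc")   # 'b' is not at the start: no match
dollar_c = re.search(r"c$", "abc")  # 'c' is at the end: match
dollar_a = re.search(r"a$", "abc")  # 'a' is not at the end: no match

print(caret_a.group())   # a
print(caret_b)           # None
print(dollar_c.group())  # c
print(dollar_a)          # None
```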

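\b matches at a word boundary, that is, a position between a word character and a non-word character (or the start or end of the string). For example, \bis\b will match only the standalone word “is” in “This island is beautiful.”, not the “is” inside “This” or “island”.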
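Here is a Python sketch of the word-boundary example. It compares \bis\b with the bare pattern is on the same sentence.

```python
import re

text = "This island is beautiful."

# Without boundaries, "is" is found inside "This" and "island" too.
all_is = re.findall(r"is", text)

# With boundaries, only the standalone word matches.
whole_word = re.search(r"\bis\b", text)

print(len(all_is))        # 3
print(whole_word.span())  # (12, 14)
```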

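Alternation, written with the pipe |, is used to match one regular expression out of several. For example, apple|banana|oranges matches “apple”, “banana”, or “oranges”. Note that you can use round brackets to group the alternatives (e.g., \b(cat|dog)\b), so that the rest of the pattern applies to every alternative.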
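The following Python sketch (sample strings are assumptions) shows alternation and why grouping matters. Without the round brackets, \bcat|dog\b reads as "\bcat" or "dog\b", so the boundaries no longer apply to both alternatives.

```python
import re

fruit = re.findall(r"apple|banana|oranges", "apple pie, banana bread, oranges")

text = "cat dog catalog hotdog"

# Grouped: both word boundaries apply to cat and to dog.
grouped = [m.group() for m in re.finditer(r"\b(cat|dog)\b", text)]

# Ungrouped: matches "cat" at a word start or "dog" at a word end.
ungrouped = [m.group() for m in re.finditer(r"\bcat|dog\b", "catalog hotdog")]

print(fruit)      # ['apple', 'banana', 'oranges']
print(grouped)    # ['cat', 'dog']
print(ungrouped)  # ['cat', 'dog'] (taken from "catalog" and "hotdog")
```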

The question mark ? makes the preceding token in a regular expression optional. For example, colou?r matches both colour and color. You can make several tokens optional by grouping them together and placing the question mark after the closing round bracket, such as Nov(ember)?, which will match “Nov” and “November”.
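Both optional-token examples can be sketched in Python as follows (the word list and sample sentence are assumptions).

```python
import re

spellings = ["colour", "color", "colouur"]
# The u may appear zero times or once, but not twice.
accepted = [w for w in spellings if re.fullmatch(r"colou?r", w)]

# The whole group (ember) is optional.
months = [m.group() for m in re.finditer(r"Nov(ember)?", "Nov 3 and November 5")]

print(accepted)  # ['colour', 'color']
print(months)    # ['Nov', 'November']
```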

The asterisk (or star) * tells the engine to match the preceding token zero or more times. The plus + tells the engine to match the preceding token one or more times. You can limit the repetition by using {min,max}, where min is zero or a positive integer indicating the minimum number of matches and max is an integer equal to or greater than min indicating the maximum number of matches. If the comma is present but max is omitted, the maximum number of matches is unlimited.
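The repetition operators can be compared with a Python sketch like the one below (patterns and sample strings are assumptions chosen for illustration). The \b anchors keep {2,4} from matching part of a longer number.

```python
import re

candidates = ["ac", "abc", "abbbc"]

# b* allows zero or more b's; b+ requires at least one.
star = [s for s in candidates if re.fullmatch(r"ab*c", s)]
plus = [s for s in candidates if re.fullmatch(r"ab+c", s)]

numbers = "7 42 123 2024 12345"
two_to_four = re.findall(r"\b\d{2,4}\b", numbers)    # between 2 and 4 digits
three_or_more = re.findall(r"\b\d{3,}\b", numbers)   # max omitted: no upper limit

print(star)           # ['ac', 'abc', 'abbbc']
print(plus)           # ['abc', 'abbbc']
print(two_to_four)    # ['42', '123', '2024']
print(three_or_more)  # ['123', '2024', '12345']
```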
