Introduction
Yesterday, my colleague Kumar Abhishek asked me to draft a Regular Expression to validate a URL like regularexp.indotnet.com. He found it hard to draft one himself, as he was not yet aware of the power of Regular Expressions. The following lines are for Kumar Abhishek.
The following are special characters when working with Regular Expressions.
They will be discussed throughout the tip.
. $ ^ { [ ( ) * + ? \
Matching any Character with Dot – The Period Sign [.]
The full stop or period character (.) is known as the dot. It is a wildcard that matches any character except a new line (\n). For example, to match the ‘g’ character followed by any two characters:
Text: gau shu gnt cow
Regex: g..
Matches: gau shu gnt cow
gau
gnt
If the Singleline option is enabled, a dot matches any character including the new line character.
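To try this out quickly, here is a minimal sketch using Python's re module as a stand-in for the .NET engine this tip targets. It assumes that re.DOTALL plays the role of the Singleline option; the variable names are only for illustration.

```python
import re

text = "gau shu gnt cow"

# "g.." matches a 'g' followed by any two characters (except a newline).
matches = re.findall(r"g..", text)
print(matches)  # ['gau', 'gnt']

# Without DOTALL, the dot refuses to match the newline in "g\nt".
no_match = re.search(r"g..", "g\nt")
print(no_match)  # None

# With DOTALL (the Singleline counterpart), the dot matches the newline too.
dotall_match = re.search(r"g..", "g\nt", re.DOTALL)
print(repr(dotall_match.group()))  # 'g\nt'
```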
Matching Word Characters – The Word Sign [\w]
Backslash and a lowercase ‘w’ (\w) is a character class that matches any word character (a letter, a digit or the underscore). The following Regular Expression matches ‘a’ followed by two word characters.
Text: abc anaconda ant cow apple
Regex: a\w\w
Matches: abc anaconda ant cow apple
abc
ana
ant
app
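The word-character example can be reproduced with a short script. This is only a sketch in Python's re module (an assumption on my side, since the tip is about .NET), and the second sample string is made up to show that digits and the underscore count as word characters.

```python
import re

text = "abc anaconda ant cow apple"

# 'a' followed by exactly two word characters.
word_matches = re.findall(r"a\w\w", text)
print(word_matches)  # ['abc', 'ana', 'ant', 'app']

# Digits and the underscore are word characters; the hyphen is not.
mixed_matches = re.findall(r"a\w\w", "a_1 a-b")
print(mixed_matches)  # ['a_1']
```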
Backslash and an uppercase ‘W’ (\W) will match any non-word character.
Matching White-space – The Space Sign [\s]
White-space can be matched using \s (backslash and ‘s’). The following Regular Expression matches the letter ‘a’ followed by two word characters then a white space character.
Text: "abc anaconda ant"
Regex: a\w\w\s
Matches:
"abc "
Note that ant was not matched as it is not followed by a white space character.
White-space is defined as the space character, new line (\n), form feed (\f), carriage return (\r), tab (\t) and vertical tab (\v). Be careful using \s as it can lead to unexpected behaviour by matching line breaks (\n and \r). Sometimes it is better to explicitly specify the characters to match instead of using \s. E.g. to match Tab and Space, use [\t\x20]
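The difference between \s and an explicit set is easier to see when a line break is involved. The sketch below uses Python's re module as a stand-in for .NET; the two-line sample string is my own assumption and not part of the original example.

```python
import re

text = "abc anaconda ant"

# 'a', two word characters, then one white-space character.
space_matches = re.findall(r"a\w\w\s", text)
print(space_matches)  # ['abc ']

# Hypothetical two-line input: \s happily matches the line break.
lines = "abc\nant"
with_s = re.findall(r"a\w\w\s", lines)
print(with_s)  # ['abc\n']

# An explicit set of tab and space does not match the line break.
explicit = re.findall(r"a\w\w[\t\x20]", lines)
print(explicit)  # []
```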
Matching Digits – The Digit Sign [\d]
The digits zero to nine can be matched using \d (backslash and lowercase ‘d’). For example, the following Regular Expression matches any three digits in a row.
Text: 123 12 843 8472
Regex: \d\d\d
Matches: 123 12 843 8472
123
843
847
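The digit example can be checked with a few lines of Python (used here only as a convenient stand-in for the .NET engine). Note that the last match takes the first three digits of 8472 and leaves the final 2 unmatched.

```python
import re

text = "123 12 843 8472"

# Three digits in a row; "12" is too short to match.
digit_matches = re.findall(r"\d\d\d", text)
print(digit_matches)  # ['123', '843', '847']
```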
Matching Sets of Single Characters – The Square-Brackets Sign [[ ]]
The square brackets are used to specify a set of single characters to match. Any single character within the set will match.
For example, the following Regular Expression matches any three characters where the first character is either ‘d’ or ‘a’.
Text: abc def ant cow
Regex: [da]..
Matches: abc def ant cow
abc
def
ant
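As a quick check of the set example, here is a small sketch in Python's re module (my choice for illustration; the tip itself is about .NET).

```python
import re

text = "abc def ant cow"

# First character must be 'd' or 'a', followed by any two characters.
in_set = re.findall(r"[da]..", text)
print(in_set)  # ['abc', 'def', 'ant']
```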
The caret (^) can be added to the start of the set of characters to specify that none of the characters in the character set should be matched.
The following Regular Expression matches any three characters where the first character is not ‘d’ and not ‘a’.
Text: abc def ant cow
Regex: [^da]..
Matches:
"bc "
"ef "
"nt "
"cow"
Matching Ranges of Characters – The Hyphen Sign [-]
Ranges of characters can be matched using the hyphen (-). The following Regular Expression matches any three characters where the second character is either ‘a’, ‘b’, ‘c’ or ‘d’.
Text: abc pen nda uml
Regex: .[a-d].
Matches: abc pen nda uml
abc
nda
Ranges of characters can also be combined together. The following Regular Expression matches any of the characters from ‘a’ to ‘z’ or any digit from ‘0’ to ‘9’ followed by two word characters.
Text: abc no 0aa i8i
Regex: [a-z0-9]\w\w
Matches: abc no 0aa i8i
abc
0aa
i8i
The pattern could be written more simply as [a-z\d]\w\w
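The negated set and the range examples above can be reproduced with the sketch below. It uses Python's re module as a stand-in for .NET, and the variable names are only illustrative.

```python
import re

# Negated set: the first character must be neither 'd' nor 'a'.
not_in_set = re.findall(r"[^da]..", "abc def ant cow")
print(not_in_set)  # ['bc ', 'ef ', 'nt ', 'cow']

# Range: the second character must be between 'a' and 'd'.
second_in_range = re.findall(r".[a-d].", "abc pen nda uml")
print(second_in_range)  # ['abc', 'nda']

# Combined ranges, and the shorter form that uses \d for the digits.
combined = re.findall(r"[a-z0-9]\w\w", "abc no 0aa i8i")
shorthand = re.findall(r"[a-z\d]\w\w", "abc no 0aa i8i")
print(combined, shorthand == combined)  # ['abc', '0aa', 'i8i'] True
```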
Specifying the Number of Times to Match with Quantifiers – The Plus and Star Sign [+ and *]
Quantifiers let you specify the number of times that an expression must match. The most frequently used quantifiers are the asterisk character (*) and the plus sign (+). Note that the asterisk (*) is usually called the star when talking about Regular Expressions.
Matching Zero or More Times with Star (*)
The star tells the Regular Expression to match the character, group, or character class that immediately precedes it zero or more times. This means that the character, group, or character class is optional: it can be matched, but it does not have to match.
The following Regular Expression matches the character ‘a’ followed by zero or more word characters.
Text: Anna Jones and a friend owned an anaconda
Regex: a\w*
Options: IgnoreCase
Matches: Anna Jones and a friend owned an anaconda
Anna
and
a
an
anaconda
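Here is a minimal sketch of the star example in Python's re module, assuming re.IGNORECASE behaves like the IgnoreCase option used above.

```python
import re

text = "Anna Jones and a friend owned an anaconda"

# 'a' followed by zero or more word characters, ignoring case.
star_matches = re.findall(r"a\w*", text, re.IGNORECASE)
print(star_matches)  # ['Anna', 'and', 'a', 'an', 'anaconda']
```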
Matching One or More Times with Plus (+)
The plus sign tells the Regular Expression to match the character, group, or character class that immediately precedes it one or more times. This means that the character, group, or character class must be found at least once. After it is found once, it will be matched again if it follows the first match.
The following Regular Expression matches the character ‘a’ followed by at least one word character.
Text: Anna Jones and a friend owned an anaconda
Regex: a\w+
Options: IgnoreCase
Matches: Anna Jones and a friend owned an anaconda
Anna
and
an
anaconda
Note that “a” was not matched as it is not followed by any word characters.
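The same text run against the plus quantifier, again as a sketch in Python's re module with re.IGNORECASE assumed to stand in for IgnoreCase. The standalone “a” drops out of the result.

```python
import re

text = "Anna Jones and a friend owned an anaconda"

# 'a' followed by one or more word characters, ignoring case.
plus_matches = re.findall(r"a\w+", text, re.IGNORECASE)
print(plus_matches)  # ['Anna', 'and', 'an', 'anaconda']
```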
Matching Zero or One Times with Question Mark (?)
To specify an optional match, use the question mark (?). The question mark matches zero or one times. The following Regular Expression matches the character ‘a’ optionally followed by a single ‘n’.
Text: Anna Jones and a friend owned an anaconda
Regex: an?
Options: IgnoreCase
Matches: Anna Jones and a friend owned an anaconda
An
a
an
a
an
an
a
a
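The list of matches above is long, so it helps to let a script produce it. This sketch uses Python's re module (an assumption, as the tip targets .NET) with re.IGNORECASE in place of IgnoreCase.

```python
import re

text = "Anna Jones and a friend owned an anaconda"

# 'a' followed by at most one 'n', ignoring case.
optional_matches = re.findall(r"an?", text, re.IGNORECASE)
print(optional_matches)
# ['An', 'a', 'an', 'a', 'an', 'an', 'a', 'a']
```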
Specifying the Number of Matches
The exact number of matches required for a character, group, or character class can be specified with the curly brackets ({n}). The following Regular Expression matches the character ‘a’ followed by exactly two ‘n’ characters. There must be two ‘n’ characters for a match to occur.
Text: Anna Jones and Anne owned an anaconda
Regex: an{2}
Options: IgnoreCase
Matches: Anna Jones and Anne owned an anaconda
Ann
Ann
A range of matches can be specified by curly brackets with two numbers inside ({n,m}). The first number (n) is the minimum number of matches required, the second (m) is the maximum number of matches permitted. This Regular Expression matches the character ‘a’ followed by a minimum of two ‘n’ characters and a maximum of three ‘n’ characters.
Text: Anna and Anne lunched with an anaconda annnnnex
Regex: an{2,3}
Options: IgnoreCase
Matches: Anna and Anne lunched with an anaconda annnnnex
Ann
Ann
annn
The Regex stops matching after the maximum number of matches has been found.
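Both curly-bracket forms can be tried with the sketch below, written in Python's re module as a stand-in for .NET. The variable names are only illustrative.

```python
import re

# {2}: 'a' followed by exactly two 'n' characters.
exactly_two = re.findall(
    r"an{2}", "Anna Jones and Anne owned an anaconda", re.IGNORECASE
)
print(exactly_two)  # ['Ann', 'Ann']

# {2,3}: 'a' followed by two or three 'n' characters.
# In "annnnnex" the match stops after the third 'n'.
two_to_three = re.findall(
    r"an{2,3}", "Anna and Anne lunched with an anaconda annnnnex", re.IGNORECASE
)
print(two_to_three)  # ['Ann', 'Ann', 'annn']
```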
Matching the Start and End of a String
To specify that a match must occur at the beginning of a string, use the caret character (^). For example, I want a Regular Expression pattern to match the beginning of the string followed by the character ‘a’.
Text: an anaconda ate Anna Jones
Regex: ^a
Matches: an anaconda ate Anna Jones
"a" at position 1
The pattern above only matches the a in “an”.
Note that the caret (^) has different behaviour when used inside the square brackets. If the Multiline option is on, the caret (^) will match the beginning of each line in a multiline string rather than only the start of the string.
To specify that a match must occur at the end of a string, use the dollar character ($). If the Multiline option is on, then the pattern will match at the end of each line in a multiline string. This Regular Expression pattern matches the word at the end of the line in a multiline string.
Text: "an anaconda
ate Anna
Jones"
Regex: \w+$
Options: Multiline, IgnoreCase
Matches:
anaconda
Anna
Jones
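The effect of the Multiline option on ^ and $ can be seen by running the same pattern with and without it. The sketch below uses Python's re module, assumes re.MULTILINE corresponds to the Multiline option, and assumes the lines of the text are separated by \n only.

```python
import re

# ^ anchors the match at the start of the string.
start = re.findall(r"^a", "an anaconda ate Anna Jones")
print(start)  # ['a']

text = "an anaconda\nate Anna\nJones"

# Without MULTILINE, $ only matches at the end of the whole string.
last_word = re.findall(r"\w+$", text)
print(last_word)  # ['Jones']

# With MULTILINE, $ matches at the end of every line.
line_ends = re.findall(r"\w+$", text, re.MULTILINE)
print(line_ends)  # ['anaconda', 'Anna', 'Jones']

# Likewise, ^ matches at the start of every line.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
print(line_starts)  # ['an', 'ate', 'Jones']
```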
Finally, here is the Question & Answer from Kumar Abhishek.
Question
How do I write a regular expression in ASP.NET to validate a domain name such as wellatbell.whdev.com?
Answer
<asp:regularexpressionvalidator runat="server" controltovalidate="txtDomainName"
errormessage="Please enter a Domain in the correct format."
validationexpression="^([0-9a-zA-Z]+)\.([0-9a-zA-Z]+)\.([a-zA-Z]{3})$"
cssclass="clsForm" id="revDomainName">****</asp:regularexpressionvalidator>
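A quick way to sanity-check a validation expression outside ASP.NET is to run it in a small script. The sketch below uses Python's re module and assumes the intended pattern is three labels separated by literal dots: one or more alphanumeric characters, then one or more alphanumeric characters, then exactly three letters. The helper name is_valid_domain is hypothetical and not part of any library.

```python
import re

# Assumed pattern: label.label.tld, where tld is exactly three letters.
DOMAIN_PATTERN = re.compile(r"^([0-9a-zA-Z]+)\.([0-9a-zA-Z]+)\.([a-zA-Z]{3})$")


def is_valid_domain(name: str) -> bool:
    """Return True if name has the form label.label.tld described above."""
    return DOMAIN_PATTERN.match(name) is not None


print(is_valid_domain("wellatbell.whdev.com"))  # True
print(is_valid_domain("whdev.com"))             # False (only two labels)
print(is_valid_domain("wellatbell_whdev_com"))  # False (no literal dots)
```

The dots are escaped on purpose: an unescaped dot would accept any character in that position, and without the plus sign each label could only be one character long.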
End Notes
There are a lot of resources describing Regular Expressions; I have just tried to share my view. I hope you enjoyed this tip. Do not forget to rate/vote.
Source from
http://www.codeproject.com/Tips/832995/Regular-Expressions-How-to-Use