Regular Expressions (RE)
Write a program to check a password. (Conditions: it should be at least 8 characters long and should contain a number, a special character and letters, including at least one capital and one small letter.)
A regular expression is a
special sequence of characters that helps you match or find other strings or
sets of strings, using a specialized syntax held in a pattern.
re.compile():
We can compile a regular expression pattern into a pattern object, which can be used for pattern matching. It also lets us search for the same pattern again without rewriting it.
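As a minimal sketch of that reuse, the snippet below compiles the special-character pattern used in the password program once and applies the same pattern object to several strings. The list name `candidates` and its sample passwords are made up for illustration.

```python
import re

# Compile the pattern once...
rex = re.compile('[@_!#$%^&*()<>?/|}{~:]')

# ...and reuse the same pattern object for several strings.
candidates = ['rajendra123', 'Rajendra@123', 'rAjen$@123']  # illustrative inputs
has_special = [rex.search(pwd) is not None for pwd in candidates]

print(has_special)  # [False, True, True]
```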
# user-defined exceptions and regular expressions
import re

class Error(Exception):
    """Super class for other exceptions"""
    pass

class NoSpecialChar(Error):
    """Raised when the password has no special character"""
    pass

class PasswrdSize(Error):
    """Raised when the password is too short"""
    pass

class AlphaNumeric(Error):
    """Raised when the password lacks a letter or a digit"""
    pass

class NoCapitalLetter(Error):
    """Raised when the password has no capital letter"""
    pass

class NoSmallLetter(Error):
    """Raised when the password has no small letter"""
    pass

n = 8  # minimum password length
rex = re.compile('[@_!#$%^&*()<>?/|}{~:]')  # special characters

while True:
    try:
        a = input("enter pwd: ")
        b = len(a)
        for c in a:
            if c.isupper():
                # a contains a capital letter; one such letter is enough
                break
        else:
            raise NoCapitalLetter
        for c in a:
            if c.islower():
                break
        else:
            raise NoSmallLetter
        if b < n:
            raise PasswrdSize
        elif rex.search(a) is None:
            raise NoSpecialChar
        elif not (re.search('[a-zA-Z]', a) and re.search('[0-9]', a)):
            raise AlphaNumeric
        break
    except PasswrdSize:
        print("Password length should be at least 8")
    except NoSpecialChar:
        print("pwd must have at least one special char")
    except AlphaNumeric:
        print("pwd should be alphanumeric")
    except NoCapitalLetter:
        print("At least one capital letter must be there in pwd")
    except NoSmallLetter:
        print("At least one small letter should be there in pwd")
print("Password created successfully")
"""output:
case1:
enter pwd: rajendra@123
At least one capital letter must be there in pwd
case2:
enter pwd: Rajendra123
pwd must have at least one special char
case3:
enter pwd: Rajendra@123
Password created successfully
case4:
enter pwd: rAjen$@123
Password created successfully
case5:
enter pwd: RAJEnDRA@123
Password created successfully
"""
Regular Expressions:
If you want to represent a group of words (strings) according to a particular pattern, then you should go for regular expressions.
Most people perform validation by using JavaScript, so why do we learn regular expressions in Python? Inside JavaScript, people also use regular expressions. Hence regular expressions are a language-independent concept; we can use the same concept in Java as well.
Applications of Regular Expressions:
1. To perform validations
Ex: email validations, mobile number validations, validating passwords, generating OTPs
mobile number validation: xxxxxxxxxx (10 digits)
2. To develop pattern matching applications
i.e. in MS Word we use the Find command, which searches for a particular pattern; in a Unix/Linux environment we use the grep and egrep commands.
3. Regular expressions play a key role when we develop translators such as assemblers, interpreters and compilers. The important phases of compiler design are:
· Lexical analysis: scanning or tokenisation
· Syntax analysis: i.e. parsing
· Semantic analysis
· Intermediate code generation
· Code optimisation
· Target code generation
4. To develop digital circuits: finite automata with output, i.e. Moore machine and Mealy machine, binary adder
5. To develop communication protocols. Ex: TCP/IP, …
Python has a special predefined module for working with regular expressions (RE), i.e. the "re" module.
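The mobile number validation mentioned above (ten digits) could be sketched with the re module as follows. The function name `is_valid_mobile` and the sample numbers are hypothetical, and real validation rules (country codes, allowed leading digits) may differ.

```python
import re

# Assumption: a valid mobile number is exactly ten digits and nothing else.
MOBILE_RE = re.compile('[0-9]{10}')

def is_valid_mobile(number):
    """Return True if the whole string is exactly ten digits."""
    return MOBILE_RE.fullmatch(number) is not None

print(is_valid_mobile('9876543210'))    # True
print(is_valid_mobile('98765'))         # False (too short)
print(is_valid_mobile('98765abcde'))    # False (contains letters)
print(is_valid_mobile('98765432101'))   # False (eleven digits)
```

fullmatch() is used here so that extra characters before or after the ten digits cause the check to fail.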
re module:
This module has several predefined functions to apply regular expressions in our programs (applications).
1. compile() function:
This function converts the desired search pattern into a regular expression object, i.e. a RegEx (pattern) object.
Ex:
import re  # Step 1
pattern = re.compile('corona')  # Step 2
# here pattern is a variable name, so we can use any valid identifier instead
print(type(pattern))
output:
<class 're.Pattern'>  # Python 3.7 and later; older versions show <class '_sre.SRE_Pattern'>, where _sre is the internal module behind re
2. finditer() function: returns an iterator of match objects, so we can check the number of matches of the pattern.
Ex:
import re
pattern = re.compile('corona')
matcher = pattern.finditer('Entire world is suffering from corona virus, due to corona virus 20k people died with corona positive……..')
# once we get a match object from the matcher, we can call other predefined methods on it…
(2.1) start() method: returns the start index of the match
(2.2) end() method: returns the end+1 index of the match
(2.3) group() method: returns the matched string/word
Write a pattern matching application:
Write a Python program to find whether a particular pattern is available or not. If it is available, where is it available and how many times?
import re
ctr = 0
pattern = re.compile("or")
matcher = pattern.finditer('corona 20k in . world')
for i in matcher:
    ctr += 1  # no. of occurrences
    print('match is available at index:', i.start())
print('no of occurrences:', ctr)
"""output:
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20
char | 'c' | 'o' | 'r' | 'o' | 'n' | 'a' | ' ' | '2' | '0' | 'k' | ' ' | 'i' | 'n' | ' ' | '.' | ' ' | 'w' | 'o' | 'r' | 'l' | 'd'
match is available at index: 1
match is available at index: 17
no of occurrences: 2
"""
Example 2:
import re
ctr = 0
pattern = re.compile("or")
matcher = pattern.finditer('corona 20k in . world')
for i in matcher:
    ctr += 1  # no. of occurrences
    print('start:{},end:{},group:{}'.format(i.start(), i.end(), i.group()))
print('no of occurrences:', ctr)
"""output:
index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20
char | 'c' | 'o' | 'r' | 'o' | 'n' | 'a' | ' ' | '2' | '0' | 'k' | ' ' | 'i' | 'n' | ' ' | '.' | ' ' | 'w' | 'o' | 'r' | 'l' | 'd'
start:1,end:3,group:or
start:17,end:19,group:or
no of occurrences: 2
"""
Character Classes:
Characters | Searches for
[abc] | a or b or c
[^abc] | Any character except a, b and c
[a-z] | Any lower case letter
[A-Z] | Any upper case letter
[a-zA-Z] | Any letter
[0-9] | Any digit
[a-zA-Z0-9] | Alphanumeric characters
[^a-zA-Z0-9] | Special characters
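A quick way to try the classes from the table is to sort the characters of one string into letters, digits and special characters. This is only a sketch: it uses findall() from the re module (which returns a list of all matches), and the variable names are illustrative.

```python
import re

text = 'a7b@k9z'  # sample string

letters = re.findall('[a-zA-Z]', text)       # any letter
digits = re.findall('[0-9]', text)           # any digit
specials = re.findall('[^a-zA-Z0-9]', text)  # anything that is not alphanumeric

print(letters)   # ['a', 'b', 'k', 'z']
print(digits)    # ['7', '9']
print(specials)  # ['@']
```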
Example program:
import re
a = re.finditer('[abc]', 'a7b@k9z')
for i in a:
    print(i.start(), '....', i.group())
"""
index char name
start() group()
0 .... a
2 .... b
Char 'a' is available at 0th index
Char 'b' is available at 2nd index
i.group() returns the matched pattern
"""
Example 2:
# search for every character except a, b and c (special characters, digits, d..z) in 'a7b@k9z'
import re
a = re.finditer('[^abc]', 'a7b@k9z')
for i in a:
    print(i.start(), '....', i.group())
""":
index char name
start() group()
1 .... 7
3 .... @
4 .... k
5 .... 9
6 .... z
"""
Example 3:
# search for digits and letters in 'a7b@k9z'
import re
a = re.finditer('[a-zA-Z0-9]', 'a7b@k9z')
for i in a:
    print(i.start(), '....', i.group())
""":
index char name
start() group()
0 .... a
1 .... 7
2 .... b
4 .... k
5 .... 9
6 .... z
"""
Example 4:
# search for special characters in 'a7b@k9z'
import re
a = re.finditer('[^a-zA-Z0-9]', 'a7b@k9z')
for i in a:
    print(i.start(), '....', i.group())
""":
index char name
start() group()
3 .... @
"""
Predefined character classes:
Characters | Searches for
\s | Whitespace character (space, tab, newline)
\S | Any character except whitespace
\d | Any digit
\D | Any character except digits
\w | Any word character (alphanumeric characters and underscore, i.e. [a-zA-Z0-9_])
\W | Any character except word characters (special characters), i.e. [^a-zA-Z0-9_]
. | Any character except newline
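Two details of the table are easy to miss: \w also matches the underscore, and . does not match a newline unless the re.DOTALL flag is passed. The sketch below illustrates both on a made-up string.

```python
import re

sample = 'a_b c\n'  # illustrative string: letters, underscore, space, newline

word_chars = re.findall(r'\w', sample)        # underscore counts as a word character
spaces = re.findall(r'\s', sample)            # space and newline are both whitespace
dot = re.findall(r'.', sample)                # '.' skips the newline
dot_all = re.findall(r'.', sample, re.DOTALL) # with DOTALL, '.' matches it too

print(word_chars)  # ['a', '_', 'b', 'c']
print(spaces)      # [' ', '\n']
print(dot)         # ['a', '_', 'b', ' ', 'c']
print(dot_all)     # ['a', '_', 'b', ' ', 'c', '\n']
```

The patterns are written as raw strings (r'...') so that Python passes the backslash through to the re module unchanged.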
Example 1:
# search for whitespace in 'rajendra19 @k9z'
import re
a = re.finditer(r'\s', 'rajendra19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
10 ........
(space)
"""
Example 2:
# search for every character except whitespace in 'rajendra19 @k9z'
import re
a = re.finditer(r'\S', 'rajendra19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
0 ........ r
1 ........ a
2 ........ j
3 ........ e
4 ........ n
5 ........ d
6 ........ r
7 ........ a
8 ........ 1
9 ........ 9
11 ........ @
12 ........ k
13 ........ 9
14 ........ z
"""
#Ex3: search for only digits in 'rajendra19 @k9z'
import re
a = re.finditer(r'\d', 'rajendra19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
8 ........ 1
9 ........ 9
13 ........ 9
"""
#Ex4: search for every character except digits in 'rajendra19 @k9z'
import re
a = re.finditer(r'\D', 'rajendra19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
0 ........ r
1 ........ a
2 ........ j
3 ........ e
4 ........ n
5 ........ d
6 ........ r
7 ........ a
10 ........
11 ........ @
12 ........ k
14 ........ z
"""
#Ex5: search for any word (alphanumeric) character in 'rajendra19 @k9z'
import re
a = re.finditer(r'\w', 'rajendra19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
0 ........ r
1 ........ a
2 ........ j
3 ........ e
4 ........ n
5 ........ d
6 ........ r
7 ........ a
8 ........ 1
9 ........ 9
12 ........ k
13 ........ 9
14 ........ z
"""
#Ex6: search for any character except word (alphanumeric) characters in 'rajendra19 @k9z'
import re
a = re.finditer(r'\W', 'rajendra19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
10 ........
11 ........ @
"""
#Ex7: search for all the characters in 'rajendra19 @k9z'
import re
a = re.finditer('.', 'rajendra19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
0 ........ r
1 ........ a
2 ........ j
3 ........ e
4 ........ n
5 ........ d
6 ........ r
7 ........ a
8 ........ 1
9 ........ 9
10 ........
11 ........ @
12 ........ k
13 ........ 9
14 ........ z
"""
Quantifiers:
Quantifiers can be used to specify the number of occurrences to match.
Characters | Searches for
a | Exactly one 'a'
a+ | At least one 'a'
a* | Any number of 'a's, including zero
a? | At most one 'a': either one 'a' or zero 'a's
a{n} | Exactly n 'a's
a{m,n} | At least m and at most n 'a's. Ex: a{2,3}
[^a] | Any character except 'a'
^a | Checks whether the target string starts with 'a' or not
a$ | Checks whether the target string ends with 'a' or not
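The last two rows of the table (the start and end checks) can be tried with a short sketch like the one below. The target string and variable names are illustrative, and search() from the re module is used to test for a match.

```python
import re

target = 'rajendra'  # illustrative target string

starts_with_r = re.search('^r', target) is not None  # '^' anchors at the start
starts_with_a = re.search('^a', target) is not None
ends_with_a = re.search('a$', target) is not None    # '$' anchors at the end
ends_with_r = re.search('r$', target) is not None

print(starts_with_r, starts_with_a)  # True False
print(ends_with_a, ends_with_r)      # True False
```

Note that the end anchor is written after the character ('a$'), while the start anchor is written before it ('^a').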
#Ex8: Quantifiers: search for the char 'a' in 'rajendra19 @k9z'
import re
a = re.finditer('a', 'rajendra19 @k9z')  # exactly 'a'
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
1 ........ a
7 ........ a
"""
#Ex8 (continued): Quantifiers: search for one or more 'a's in 'raajendraaa19 @k9z'
import re
a = re.finditer('a+', 'raajendraaa19 @k9z')  # at least one 'a' in the sequence
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
1 ........ aa
8 ........ aaa
"""
#Ex9:
import re
a = re.finditer('a*', 'rajendra19 @k9z')  # any number of 'a's, including zero
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
0 ........
1 ........ a
2 ........
3 ........
4 ........
5 ........
6 ........
7 ........ a
8 ........
9 ........
10 ........
11 ........
12 ........
13 ........
14 ........
15 ........
"""
#Ex10:
import re
a = re.finditer('a?', 'rajendra19 @k9z')  # at most one 'a': one or zero
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
0 ........
1 ........ a
2 ........
3 ........
4 ........
5 ........
6 ........
7 ........ a
8 ........
9 ........
10 ........
11 ........
12 ........
13 ........
14 ........
15 ........
"""
#Ex11:
import re
a = re.finditer('a{2}', 'rajendraa19 @k9z')  # exactly 2 'a's in the sequence
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
7 ........ aa
"""
#Ex12:
import re
# min no. of 'a's and max no. of 'a's in the sequence
a = re.finditer('a{1,2}', 'rajendraa19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
case1: a{2,3}
7 ........ aa
case2: a{2,1}
error, because the minimum (2) is greater than the maximum (1)
case3: a{1,2}
1 ........ a
7 ........ aa
"""
#Ex13:
import re
# every character except 'a'
a = re.finditer('[^a]', 'rajendraa19 @k9z')
for i in a:
    print(i.start(), '........', i.group())
""":
index char name
start() group()
0 ........ r
2 ........ j
3 ........ e
4 ........ n
5 ........ d
6 ........ r
9 ........ 1
10 ........ 9
11 ........
12 ........ @
13 ........ k
14 ........ 9
15 ........ z
"""
Functions available in the re module:
1. match()
2. fullmatch()
3. search()
4. findall()
5. finditer()
6. sub()
7. subn()
8. split()
9. compile()
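Of the functions in this list, sub(), subn() and split() can be sketched briefly on a small string. The replacement character '#' and the sample string are illustrative choices.

```python
import re

text = 'a7b@k9z'  # illustrative string

replaced = re.sub('[0-9]', '#', text)    # replace every digit with '#'
counted = re.subn('[0-9]', '#', text)    # same, plus the number of replacements
pieces = re.split('[0-9]', text)         # split the string at every digit

print(replaced)  # a#b@k#z
print(counted)   # ('a#b@k#z', 2)
print(pieces)    # ['a', 'b@k', 'z']
```

sub() returns only the new string, whereas subn() returns a tuple of the new string and the replacement count.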
1. match(): checks whether the given pattern is available at the beginning of the target string or not. If it is available, it returns a match object, None otherwise.
Example 1
import re
s = input("enter pattern to check: ")
m = re.match(s, 'Blore is a green city')
if m is not None:
    print("match is available at the beginning of the string")
    print('start index:{} and end index:{}'.format(m.start(), m.end()))
else:
    print('match is not available at the beginning of the string')
"""output:
case1:
enter pattern to check: is
match is not available at the beginning of the string
case2:
enter pattern to check: B
match is available at the beginning of the string
start index:0 and end index:1
case3:
enter pattern to check: Blo
match is available at the beginning of the string
start index:0 and end index:3
"""
2. fullmatch() method:
It checks whether the full target string matches the pattern. If it does, it returns a match object, None otherwise.
import re
s = input("enter pattern to check: ")
m = re.fullmatch(s, 'Blore is a green city')
if m is not None:
    print("full string matched")
    print('start index:{} and end index:{}'.format(m.start(), m.end()))
else:
    print('full string match is not available')
"""output:
case1:
enter pattern to check: Blore is a green city
full string matched
start index:0 and end index:21
case2:
enter pattern to check: is
full string match is not available
case3:
enter pattern to check: Blore
full string match is not available
"""
3. search():
It searches for the pattern everywhere in the target string and returns a match object for the first occurrence. It returns None if the pattern is not found.
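To see how match(), fullmatch() and search() differ, the sketch below runs all three on the same target string. The helper name `where_found` is hypothetical. It also shows re.escape(), which may be useful when the pattern is typed in by a user and should be treated as literal text rather than as regular expression syntax.

```python
import re

target = 'Blore is a green city'

def where_found(pattern):
    """Report which of the three functions find `pattern` in target."""
    return {
        'match': re.match(pattern, target) is not None,
        'fullmatch': re.fullmatch(pattern, target) is not None,
        'search': re.search(pattern, target) is not None,
    }

print(where_found('Blore'))  # {'match': True, 'fullmatch': False, 'search': True}
print(where_found('is'))     # {'match': False, 'fullmatch': False, 'search': True}

# In a pattern, '+' is a quantifier, so '1+1' means "one or more 1s, then 1".
unescaped = re.search('1+1', '1+1=2')
# re.escape() makes every special character match literally.
escaped = re.search(re.escape('1+1'), '1+1=2')

print(unescaped)        # None
print(escaped.group())  # 1+1
```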
Example:
import re
s = input("enter pattern to check: ")
m = re.search(s, 'rajendraa')
if m is not None:
    print("match is available")
    print('first occurrence with start index:{} and end index:{}'.format(m.start(), m.end()))
else:
    print('match is not available')
"""output:
case1:
enter pattern to check: aa
match is available
first occurrence with start index:7 and end index:9
case2:
enter pattern to check: rajendra
match is available
first occurrence with start index:0 and end index:8
"""
4. findall() method: returns a list of all the matches of the pattern in the target string
import re
m = re.findall('[0-9]', 'abc123')
print(m)
"""output:
['1', '2', '3']
"""
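As a small extension of the findall() example, the sketch below combines it with a quantifier and with groups. The sample string is made up for illustration.

```python
import re

text = 'abc123def45'  # illustrative string

# With the + quantifier, each run of digits is returned as one item.
numbers = re.findall('[0-9]+', text)

# When the pattern contains groups, findall() returns a tuple per match.
pairs = re.findall('([a-z]+)([0-9]+)', text)

print(numbers)  # ['123', '45']
print(pairs)    # [('abc', '123'), ('def', '45')]
```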
https://youtu.be/TeVaBIo-WQo