Adventures in Machine Learning

Regex in Python: Understanding match(), search(), and fullmatch()

Python is a popular high-level programming language used by millions of developers worldwide. One of the important components of Python is the ability to work with regular expressions or regex.

Regular expressions are used to validate and parse text patterns. Python’s built-in re module offers several functions to search for or match a regex pattern in a string.

In this article, we will explore two important regex concepts in Python – the re.match() method and matching operations in Python regex.

Understanding Python re.match() Method

The re.match() function is used to match a regex pattern at the beginning of a string. It takes two required arguments, a regex pattern and a string, plus an optional flags argument.

The function returns a re.Match object if the pattern is found at the beginning of the string, otherwise it returns None. Let’s understand the syntax of re.match() function.

Syntax of re.match():

re.match(pattern, string, flags=0)

The pattern is a regex pattern to be matched, and the string is the input string. There is an optional flags parameter that modifies the behavior of the regex pattern.
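As a small illustration of the flags parameter, the sketch below uses re.IGNORECASE, one of the standard flags in the re module. The sample text and variable names are made up for this example.

```python
import re

# Without flags, matching is case-sensitive, so "hello" does not match "Hello".
case_sensitive = re.match(r"hello", "Hello world")

# re.IGNORECASE makes the same pattern ignore letter case.
case_insensitive = re.match(r"hello", "Hello world", flags=re.IGNORECASE)

print(case_sensitive)            # None
print(case_insensitive.group())  # Hello
```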

The re.Match object returned by re.match() provides methods for examining the matched text and its groups.

Return value:

If the regex pattern is found at the beginning of the string, the function returns a re.Match object. The matched text is available through the object's group() method, and its start() and end() methods give the positions where the match begins and ends. If the pattern is not found at the beginning of the string, the function returns None.
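The Match object methods mentioned above can be tried out with a short sketch like the following. The input string is only an example; span() is another standard Match method that returns the start and end positions as a tuple.

```python
import re

result = re.match(r"hello", "hello world")

# Check for None before calling methods, because a failed match returns None.
if result is not None:
    print(result.group())  # hello   (the matched text)
    print(result.start())  # 0       (index where the match begins)
    print(result.end())    # 5       (index just past the end of the match)
    print(result.span())   # (0, 5)  (start and end as a tuple)
```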

Match Regex Pattern at the Beginning of the String:

The re.match() function is specifically designed to match a regex pattern at the beginning of a string. It checks whether the regex pattern matches the first few characters of the input string or not.

If it matches, it returns a Match object, otherwise, it returns None. Let’s see an example:


import re
# defining a regex pattern
pattern = r"hello"
# input string
string = "hello world"
# using re.match() method to find pattern
result = re.match(pattern, string)
# printing the result
if result:
    print("Match found!")
else:
    print("Match not found.")

Here, we defined a regex pattern to match the word ‘hello’. We then defined an input string ‘hello world’ and applied the re.match() function to find the pattern at the beginning of the string.

As the pattern is found at the beginning of the string, the function returns a Match object and prints “Match found!”.
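For contrast, here is a minimal sketch of the failing case: the same input string, but with a pattern that occurs later in the string rather than at the start. The variable names are illustrative.

```python
import re

text = "hello world"

at_start = re.match(r"hello", text)      # pattern is at the beginning
not_at_start = re.match(r"world", text)  # pattern occurs, but not at the beginning

print(at_start is not None)  # True
print(not_at_start)          # None
```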

Match Regex Pattern Anywhere in the String:

The re.match() function can only match the regex pattern at the beginning of a string.

If we want to match a pattern anywhere in the string, we can use re.search() or re.findall() functions. The re.search() method scans the entire string and returns the first occurrence of the pattern, whereas the re.findall() method returns all the non-overlapping matches of the pattern in the string.

Let’s see an example:


import re
# defining a regex pattern
pattern = r"world"
# input string
string = "hello world, world is beautiful"
# using re.search() method to find pattern
result = re.search(pattern, string)
# printing the result
if result:
    print("Match found!")
else:
    print("Match not found.")

Here, we defined a regex pattern to match the word ‘world’. We then defined an input string ‘hello world, world is beautiful’ and applied the re.search() function to find the pattern in the string.

As the pattern is found in the string, the function returns a Match object and prints “Match found!”.
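To make the difference between re.search() and re.findall() concrete, the sketch below runs both on the same input string. It assumes nothing beyond the standard re module.

```python
import re

text = "hello world, world is beautiful"

# re.search() stops at the first occurrence and returns a Match object.
first = re.search(r"world", text)

# re.findall() returns every non-overlapping occurrence as a list of strings.
every = re.findall(r"world", text)

print(first.span())  # (6, 11)
print(every)         # ['world', 'world']
```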

Match Regex at the End of the String:

Similar to the beginning of a string, we can match a pattern at the end of a string.

To match a pattern at the end of a string, we use the dollar ($) metacharacter. This metacharacter indicates the end of the string.

Let’s see an example:


import re
# defining a regex pattern
pattern = r"beautiful$"
# input string
string = "The sunset was beautiful"
# using re.search() method to find pattern
result = re.search(pattern, string)
# printing the result
if result:
    print("Match found!")
else:
    print("Match not found.")

Here, we defined a regex pattern to match the word ‘beautiful’ at the end of the string. We then defined an input string ‘The sunset was beautiful’ and applied the re.search() function to find the pattern at the end of the string.

As the pattern is found at the end of the string, the function returns a Match object and prints “Match found!”.

Match the Exact Word or String:

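Sometimes we want to check that the whole input string is exactly a given word or string, with nothing before or after it.

For this, we use the caret (^) and dollar ($) metacharacters together: the caret anchors the pattern to the start of the string and the dollar anchors it to the end. Let’s see an example: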


import re
# defining a regex pattern
pattern = r"^beautiful$"
# input string
string = "beautiful"
# using re.search() method to find pattern
result = re.search(pattern, string)
# printing the result
if result:
    print("Match found!")
else:
    print("Match not found.")

Here, we defined a regex pattern to match the exact word ‘beautiful’. We then defined an input string ‘beautiful’ and applied the re.search() function to find the pattern in the string.

As the pattern is found in the string, the function returns a Match object and prints “Match found!”.
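A quick way to see what the two anchors add is to try the same pattern on strings that contain extra text. This is a small sketch with made-up inputs.

```python
import re

pattern = r"^beautiful$"

exact = re.search(pattern, "beautiful")         # whole string is the word
trailing = re.search(pattern, "beautiful day")  # extra text after the word
leading = re.search(pattern, "so beautiful")    # extra text before the word

print(exact is not None)  # True
print(trailing)           # None
print(leading)            # None
```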

Match Regex Pattern that Starts and Ends with the Given Text:

Sometimes we want to match a regex pattern that starts with a particular text and ends with a particular text.

For this, we use the caret (^) and dollar ($) metacharacters together along with the regex pattern. Let’s see an example:


import re
# defining a regex pattern
pattern = r"^hello.*world$"
# input string
string = "hello, welcome to the world"
# using re.search() method to find pattern
result = re.search(pattern, string)
# printing the result
if result:
    print("Match found!")
else:
    print("Match not found.")

Here, we defined a regex pattern to match the pattern that starts with ‘hello’ and ends with ‘world’. We then defined an input string ‘hello, welcome to the world’ and applied the re.search() function to find the pattern in the string.

As the pattern is found in the string, the function returns a Match object and prints “Match found!”.

Matching Operations in Python Regex

Matching operations in Python regex allow us to match a specific character or a set of characters in a string. Let’s see some of the commonly used matching operations in Python regex.

Match Any Character:

The dot (.) metacharacter matches any single character except the newline character. Let’s see an example:


import re
# defining a regex pattern
pattern = r"he..o"
# input string
string1 = "hello"
string2 = "hezlo"
string3 = "he2lo"
string4 = "he$llo"
# using re.match() method to find pattern
result1 = re.match(pattern, string1)
# using re.match() method to find pattern
result2 = re.match(pattern, string2)
# using re.match() method to find pattern
result3 = re.match(pattern, string3)
# using re.match() method to find pattern
result4 = re.match(pattern, string4)
# printing the result
if result1:
    print("Match found in string1!")
else:
    print("Match not found in string1.")
if result2:
    print("Match found in string2!")
else:
    print("Match not found in string2.")
if result3:
    print("Match found in string3!")
else:
    print("Match not found in string3.")
if result4:
    print("Match found in string4!")
else:
    print("Match not found in string4.")

Here, we defined a regex pattern that matches ‘he’, followed by any two characters, followed by ‘o’. We then defined four input strings that differ in the characters following ‘he’.

We applied the re.match() function to each string to check if it matched the pattern or not. The function returns a Match object for all the strings except string4. In string4 there are three characters (‘$ll’) between ‘he’ and ‘o’, one more than the two dots allow. The dollar sign itself is not the problem, because the dot matches it like any other character.

Match Number or Digit:

The \d metacharacter matches any decimal digit (0-9). Let’s see an example:


import re
# defining a regex pattern
pattern = r"\d+"
# input string
string = "The price of the book is $50"
# using re.findall() method to find pattern
result = re.findall(pattern, string)
# printing the result
print(result)

Here, we defined a regex pattern to match one or more consecutive digits in a string. We then defined an input string ‘The price of the book is $50’ and applied the re.findall() function to find all runs of digits in the string.

The function returns a list containing a single match, ['50'], because the plus (+) quantifier groups adjacent digits into one match.
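The effect of the + quantifier is easier to see on a string with several numbers. The sketch below compares the digit pattern with and without the quantifier; the input string is invented for illustration.

```python
import re

text = "Order 66 shipped 3 items for $120"

# With +, each run of adjacent digits becomes one match.
numbers = re.findall(r"\d+", text)

# Without +, every digit is matched separately.
digits = re.findall(r"\d", text)

print(numbers)  # ['66', '3', '120']
print(digits)   # ['6', '6', '3', '1', '2', '0']
```

Note that re.findall() returns strings, so the results need int() if they are to be used as numbers.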

Match Special Characters:

The \W metacharacter matches any character that is not a letter, digit, or underscore, such as punctuation marks, symbols, and whitespace characters.

Let’s see an example:


import re
# defining a regex pattern
pattern = r"\W+"
# input string
string = "This is a sentence. It contains punctuation marks!"
# using re.findall() method to find pattern
result = re.findall(pattern, string)
# printing the result
print(result)

Here, we defined a regex pattern to match one or more consecutive non-word characters in a string. We then defined an input string ‘This is a sentence. It contains punctuation marks!’ and applied the re.findall() function to find all runs of such characters in the string.

The function returns the list [' ', ' ', ' ', '. ', ' ', ' ', ' ', '!']. Every space between words is a match, and the period and the space after it form a single match because they are adjacent.

In conclusion, regular expressions are an essential tool in text processing and Python provides rich support for regex with its powerful regex module. In this article, we explored two important concepts – the re.match() function and matching operations in Python regex.

Now that you have a basic understanding of these concepts, you can apply them in your Python projects to manipulate and validate text patterns efficiently.

Regex is a powerful tool in Python that allows developers to match, validate, and manipulate text patterns. Python’s re module provides several functions for searching text patterns, including re.match(), re.search(), and re.fullmatch(). In the rest of this article, we will explore the differences between these functions and learn when to use them.

Regex Search vs. Match

re.search() and re.match() are two important functions in Python’s re module that allow developers to find specific text patterns in strings. Let’s understand the differences between these two functions.

How re.match() Works:

The re.match() function is used to match a regex pattern at the beginning of a string. It starts matching the pattern from the beginning of the string and returns a Match object if the pattern is found, otherwise it returns None.

Let’s see an example:


import re
# defining a regex pattern
pattern = r"hello"
# input string
string = "hello world"
# using re.match() method to find pattern
result = re.match(pattern, string)
# printing the result
if result:
    print("Match found!")
else:
    print("Match not found.")

Here, we defined a regex pattern to match the word ‘hello’. We then defined an input string ‘hello world’ and applied the re.match() function to find the pattern at the beginning of the string.

As the pattern is found at the beginning of the string, the function returns a Match object and prints “Match found!”.

How re.search() Works:

The re.search() function is used to search for a pattern anywhere inside a string.

It scans the entire string for the pattern and returns a Match object if the pattern is found, otherwise it returns None. Let’s see an example:


import re
# defining a regex pattern
pattern = r"world"
# input string
string = "hello world, world is beautiful"
# using re.search() method to find pattern
result = re.search(pattern, string)
# printing the result
if result:
    print("Match found!")
else:
    print("Match not found.")

Here, we defined a regex pattern to match the word ‘world’. We then defined an input string ‘hello world, world is beautiful’ and applied the re.search() function to find the pattern in the string.

As the pattern is found in the string, the function returns a Match object and prints “Match found!”.

Behavior of Search vs. Match with a Multiline String:

If the input string contains multiple lines, the re.match() function still matches the pattern only at the very beginning of the string, that is, at the start of the first line. The re.search() function, however, matches the pattern anywhere inside the string, even if it is in the middle of a line.

Let’s see an example:


import re
# defining a regex pattern
pattern = r"world"
# input string
string = "hello world,\n world is beautiful"
# using re.match() method to find pattern
result1 = re.match(pattern, string)
# using re.search() method to find pattern
result2 = re.search(pattern, string)
# printing the result
if result1:
    print("Match found with match()!")
else:
    print("Match not found with match().")
if result2:
    print("Match found with search()!")
else:
    print("Match not found with search().")

Here, we defined a regex pattern to match the word ‘world’. We then defined an input string with multiple lines and applied both re.match() and re.search() functions to find the pattern in the string.

As the string does not begin with ‘world’, the re.match() function returns None. The re.search() function finds the first ‘world’, in the middle of the first line, so it returns a Match object and the program prints “Match found with search()!”.
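The re.MULTILINE flag changes how the caret (^) behaves in multiline strings, and it interacts differently with the two functions. The following sketch, using an invented two-line string, shows that re.search() with ^ can find a match at the start of a later line, while re.match() still only looks at the start of the whole string.

```python
import re

text = "hello world,\nworld is beautiful"

# By default, ^ only matches at the very start of the string.
plain = re.search(r"^world", text)

# With re.MULTILINE, ^ also matches right after each newline.
multi = re.search(r"^world", text, flags=re.MULTILINE)

# re.match() is unaffected: it only tries the start of the string.
match_multi = re.match(r"world", text, flags=re.MULTILINE)

print(plain)         # None
print(multi.span())  # (13, 18)
print(match_multi)   # None
```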

re.fullmatch()

Python’s re module also provides a function called re.fullmatch(). This function checks whether the entire string matches a regex pattern.

It differs from the re.match() function as it requires a match with the entire string, not just the beginning. Let’s understand the syntax of re.fullmatch() function.

Syntax of re.fullmatch():

re.fullmatch(pattern, string, flags=0)

Here, the pattern is a regex pattern to be matched, and the string is the input string. There is an optional flags parameter that modifies the behavior of the regex pattern.

Why and When to Use re.match() and re.fullmatch():

The main difference between re.match() and re.fullmatch() is that re.match() only requires the pattern to match at the beginning of the string, whereas re.fullmatch() requires the pattern to match the entire string. Use re.match() when any text after the matched prefix is acceptable, and use re.fullmatch() when the whole string must conform to the pattern, as in input validation.

Let’s see an example:


import re
# defining a regex pattern
pattern = r"hello"
# input string
string = "hello world"
# using re.fullmatch() method to find pattern
result1 = re.fullmatch(pattern, string)
# using re.match() method to find pattern
result2 = re.match(pattern, string)
# using re.fullmatch() method to find pattern with a full string
full_string = "hello"
result3 = re.fullmatch(pattern, full_string)
# printing the result
if result1:
    print("Match found with fullmatch()!")
else:
    print("Match not found with fullmatch().")
if result2:
    print("Match found with match()!")
else:
    print("Match not found with match().")
if result3:
    print("Match found with fullmatch() and a full string!")
else:
    print("Match not found with fullmatch() and a full string.")

Here, we defined a regex pattern to match the word ‘hello’. We then defined two input strings: ‘hello world’, which contains the pattern at the beginning of the string, and ‘hello’, which consists of the pattern and nothing else.

We applied the re.fullmatch() function to both input strings and the re.match() function to the first input string. As expected, re.fullmatch() returns None for ‘hello world’ because of the extra text after ‘hello’, and returns a Match object for ‘hello’. The re.match() function returns a Match object for ‘hello world’ because the string begins with the pattern.
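A typical use of re.fullmatch() is input validation, where the whole string must fit the pattern. The sketch below checks for a five-digit code; the helper name is_five_digit_code and the sample inputs are hypothetical and chosen only for illustration.

```python
import re

def is_five_digit_code(text):
    """Return True only if the entire string is exactly five digits."""
    # Hypothetical helper: fullmatch() rejects any extra characters.
    return re.fullmatch(r"\d{5}", text) is not None

print(is_five_digit_code("12345"))       # True
print(is_five_digit_code("12345-6789"))  # False
print(is_five_digit_code("1234"))        # False

# re.match() would accept the second input, because only the start must match.
prefix_only = re.match(r"\d{5}", "12345-6789")
print(prefix_only is not None)           # True
```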

In conclusion, Python’s regex
