This post shows you how to use regular expressions in Python.
Step 1 Import
Regular expressions are not built into the core Python language. They are provided by the re module in the standard library, so the first step is to import that module.
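A minimal sketch of this step. The import makes the module's functions available under the re name. The search on the last lines is only a sanity check with a made-up string and is not part of the original post.

```python
import re

# After the import, the module's functions are reached through the "re" name.
# Sanity check (illustrative only): look for a single digit in a short string.
digit_match = re.search(r'\d', 'room 7')
print(digit_match.group())  # prints 7
```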
Step 2 Compile
telephone_regex = re.compile(r'\d\d\d\d-\d\d\d-\d\d\d\d')
The regular expression above matches a pattern of four digits, a hyphen, three digits, another hyphen and finally four digits. Prefixing the string literal with r makes Python treat it as a raw string, which turns off the normal special handling of the backslash character.
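The effect of the r prefix can be seen with a short experiment. This is a sketch for illustration: the names raw_pattern, escaped_pattern and sample_match, and the sample string, are assumptions and do not come from the post.

```python
import re

# In a normal string literal, '\n' is one character (a newline).
# In a raw string literal, r'\n' is two characters: a backslash and the letter n.
print(len('\n'))   # 1
print(len(r'\n'))  # 2

# The raw pattern is the same string as the version with doubled backslashes.
raw_pattern = r'\d\d\d\d-\d\d\d-\d\d\d\d'
escaped_pattern = '\\d\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d'
print(raw_pattern == escaped_pattern)  # True

# Hypothetical sample text, used only to show the pattern matching.
sample_match = re.search(raw_pattern, 'call 0141-321-3691')
print(sample_match.group())  # 0141-321-3691
```

Without the r prefix every backslash in the pattern has to be doubled, which is why raw strings are the usual choice for regular expressions.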
There is some debate about whether re.compile is required. The documentation for re.compile notes that programs using only a few regular expressions at a time need not worry about compiling them, because the module caches the most recently compiled patterns. If you would like to know more about the necessity of re.compile, this Stack Overflow question is a good place to start.
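To make the two options concrete, here is a small sketch that runs the same search with and without an explicit compile step. The variable names and the sample text are assumptions chosen for illustration.

```python
import re

pattern = r'\d\d\d\d-\d\d\d-\d\d\d\d'
sample_text = 'Office: 0141-321-3691'  # hypothetical sample text

# Option 1: compile once, then call methods on the pattern object.
telephone_regex = re.compile(pattern)
compiled_result = telephone_regex.search(sample_text)

# Option 2: skip the explicit compile and pass the pattern to re.search.
direct_result = re.search(pattern, sample_text)

print(compiled_result.group())  # 0141-321-3691
print(direct_result.group())    # 0141-321-3691
```

Both calls find the same text. Compiling mainly helps readability, and it saves repeating the pattern when the same expression is used in several places.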
Step 3 First match or All matches?
The regular expression will be used to match the telephone numbers within this string:
text_to_search = 'You can contact me on 0171-123-45678 or 0141-321-3691'
As there are two telephone numbers, do you want to match only the first one, or all of them? The search method scans the string and returns a match object for the first match only.
first_search_result = telephone_regex.search(text_to_search)
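The findall method returns all matches as a list of strings. In this example the list contains two telephone numbers: 0171-123-4567 and 0141-321-3691. Because the pattern ends with exactly four digits, the trailing 8 of 0171-123-45678 is not part of the first match.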
all_results = telephone_regex.findall(text_to_search)
What happens when nothing is found?
If the search method doesn't find a match, it returns None instead of a match object. This leads to an exception if you call a match object method on the result without checking it first. The example below illustrates this behavior.
# no telephone numbers within the string
text_to_search = 'You can contact me on or'

# No matches are found, so first_search_result is None,
# not a match object
first_search_result = telephone_regex.search(text_to_search)

# shows that first_search_result is of type NoneType
print(type(first_search_result))

# NoneType has no method called group, so this line
# raises an AttributeError
print(first_search_result.group())
In contrast to the search method, the findall method never returns None. When it fails to find a match it returns an empty list, which can be printed or looped over without raising an exception.
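One common way to avoid the exception is to test the result of search for None before using it. The sketch below assumes the same pattern and the same empty text as above. The message variable and its wording are made up for illustration.

```python
import re

telephone_regex = re.compile(r'\d\d\d\d-\d\d\d-\d\d\d\d')
text_to_search = 'You can contact me on or'  # no telephone numbers

first_search_result = telephone_regex.search(text_to_search)

# Check for None before calling group() to avoid an AttributeError.
if first_search_result is None:
    message = 'No telephone number found'
else:
    message = first_search_result.group()
print(message)  # No telephone number found

# findall needs no such guard: it returns an empty list.
all_results = telephone_regex.findall(text_to_search)
print(all_results)  # []
```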
Step 4 Viewing the results
One way to view the contents of the match object is to use the group method, which returns the matched text as a string. Passing first_search_result.group() to print outputs the result of the search to the console.
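A short, self-contained sketch of group combined with print, using the pattern and text from the earlier steps. Note that the pattern ends with four digits, so the match stops before the trailing 8 of the first number.

```python
import re

telephone_regex = re.compile(r'\d\d\d\d-\d\d\d-\d\d\d\d')
text_to_search = 'You can contact me on 0171-123-45678 or 0141-321-3691'

first_search_result = telephone_regex.search(text_to_search)

# group() with no arguments returns the whole matched text as a string.
print(first_search_result.group())  # 0171-123-4567
```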
The findall method returns a list, which can be passed straight to print to output its contents to the console.
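As a sketch, the list can be printed in one go or one item at a time. The loop variable telephone_number is a name chosen for illustration.

```python
import re

telephone_regex = re.compile(r'\d\d\d\d-\d\d\d-\d\d\d\d')
text_to_search = 'You can contact me on 0171-123-45678 or 0141-321-3691'

all_results = telephone_regex.findall(text_to_search)

# Printing the list shows every match at once.
print(all_results)  # ['0171-123-4567', '0141-321-3691']

# Looping over the list prints one telephone number per line.
for telephone_number in all_results:
    print(telephone_number)
```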
Putting it all together
This Python module brings together the code shown throughout this post and can be used as a starting point for your own experiments.
import re

text_to_search = 'You can contact me on 0171-123-45678 or 0141-321-3691'
telephone_regex = re.compile(r'\d\d\d\d-\d\d\d-\d\d\d\d')

# finds 0171-123-4567 (the pattern ends with four digits,
# so the trailing 8 is not part of the match)
first_search_result = telephone_regex.search(text_to_search)
print(first_search_result.group())

# finds 0171-123-4567 and 0141-321-3691
all_results = telephone_regex.findall(text_to_search)
print(all_results)