Python Regular Expressions (Regex)

Table Of Contents

1. Introduction to Regular Expressions
- Examples of Regex Usage:
2. Python’s Regex Module (re)
3. Basic Regex Patterns
- Commonly used regex characters and syntax:
4. Python Regex Functions
- Common methods of the re module:
5. Practical Examples with Regex
6. Regex Flags
7. Advanced Concepts: Grouping and Capturing
8. Tips & Best Practices

Regular Expressions (Regex) are powerful patterns used for searching, extracting, replacing, and manipulating text data. Python provides a built-in module called re for using regex effectively.

This tutorial covers:

What is a Regular Expression?
Python’s re module.
Basic Regex patterns and syntax.
Common Regex functions (match(), search(), findall(), sub(), etc.)
Advanced regex concepts with practical examples.
Regex flags for modifying matching behavior.

1. Introduction to Regular Expressions

Regular expressions are strings written in a special syntax that describes search patterns. They let you search text efficiently, validate data, parse logs, extract structured information, and more.

Examples of Regex Usage:

Validate email addresses or phone numbers.
Extract information from web pages.
Search through log files.
Replace or format strings.

2. Python’s Regex Module (`re`)

Python has a built-in module named re, specifically designed for handling regular expressions.

Importing the Module:

import re

3. Basic Regex Patterns

Commonly used regex characters and syntax:

Regex Pattern	Meaning	Example
`.`	Any single character except a newline	`a.b` matches “acb”, “adb”
`^`	Start of a string	`^hello` matches strings starting with “hello”
`$`	End of a string	`world$` matches strings ending with “world”
`\d`	Any digit (0-9)	`\d\d\d` matches “123”
`\D`	Any non-digit character	`\D\D` matches “ab”
`\w`	Any alphanumeric or underscore	`\w\w` matches “A1”
`\W`	Any character that is not a letter, digit, or underscore	`\W` matches “!”
`\s`	Any whitespace	`hello\sWorld` matches “hello World”
`\S`	Any non-whitespace	`\S\S` matches “ab”
`[abc]`	Any character a, b, or c	`[aeiou]` matches vowels
`[0-9]`	Any digit from 0 to 9	`[1-5]` matches “2” or “4”
`( )`	Grouping	`(ab)+` matches “ab”, “abab”, “ababab”
`+`	One or more occurrences	`a+` matches “a”, “aa”, “aaa”
`*`	Zero or more occurrences	`ab*` matches “a”, “ab”, “abb”, “abbb”
`?`	Zero or one occurrence	`ab?` matches “a” or “ab”
`{n}`	Exactly n occurrences	`a{3}` matches “aaa”
`{n,m}`	Between n and m occurrences	`a{2,4}` matches “aa”, “aaa”, or “aaaa”
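
The table entries can be checked directly in Python. The following is a minimal sketch, not part of the original examples: it uses re.fullmatch(), which succeeds only when the whole string matches the pattern, and the sample strings are chosen here purely for illustration.

```python
import re

# Each entry: (pattern, sample string, whether the whole sample should match).
# The samples are illustrative assumptions, not taken verbatim from the table.
checks = [
    (r"a.b", "acb", True),
    (r"a.b", "ab", False),        # "." needs exactly one character
    (r"\d\d\d", "123", True),
    (r"\w\w", "A1", True),
    (r"(ab)+", "ababab", True),
    (r"ab*", "abbb", True),
    (r"ab?", "abb", False),       # "?" allows at most one "b"
    (r"a{2,4}", "aaaaa", False),  # five "a"s is more than four
]

# fullmatch() returns a match object on success and None otherwise.
outcomes = [re.fullmatch(p, s) is not None for p, s, _ in checks]

for (pattern, sample, expected), matched in zip(checks, outcomes):
    print(f"{pattern!r:12} {sample!r:10} matched={matched} expected={expected}")
```

Using fullmatch() here keeps the check strict; with re.search(), a pattern such as `a{2,4}` would still find four of the five "a"s inside "aaaaa".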

4. Python Regex Functions

Common methods of the `re` module:

re.match() – Checks for a match only at the beginning of the string.
re.search() – Searches throughout the entire string and returns the first match.
re.findall() – Returns a list of all non-overlapping matches.
re.finditer() – Returns an iterator yielding match objects.
re.sub() – Replaces matches with specified text.
re.split() – Splits strings based on a regex pattern.
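
The difference between match(), search(), and finditer() is easiest to see side by side. A small sketch, using a made-up sample string:

```python
import re

# Sample text invented for this illustration.
text = "Order 66 shipped, order 77 pending"

# match() only succeeds if the pattern matches at position 0.
at_start = re.match(r"\d+", text)
print(at_start)  # None, because the text starts with "Order"

# search() scans forward and returns the first match anywhere.
first = re.search(r"\d+", text)
print(first.group(), first.span())  # 66 (6, 8)

# finditer() yields one match object per occurrence, so positions are available.
found = [(m.group(), m.start()) for m in re.finditer(r"\d+", text)]
print(found)  # [('66', 6), ('77', 24)]
```

Both match() and search() return None when nothing matches, so check the result before calling .group() on it.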

5. Practical Examples with Regex

Example 1: Searching for a pattern

import re

text = "Hello, my number is 123-456-7890."
pattern = r"\d{3}-\d{3}-\d{4}"

match = re.search(pattern, text)

if match:
    print("Phone number found:", match.group())
else:
    print("No match found.")

Explanation:

The regex pattern \d{3}-\d{3}-\d{4} looks for three digits, a hyphen, three digits, a hyphen, and four digits, which is a common US phone number format.
re.search() returns the first occurrence of this pattern.

Output:

Phone number found: 123-456-7890

Example 2: Extracting Multiple Matches (`findall()`)

import re

text = "Emails: alice@example.com, bob@gmail.com, carol@outlook.com"
pattern = r"\w+@\w+\.\w+"

emails = re.findall(pattern, text)
print(emails)

Explanation:

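Finds every substring in the text that looks like a simple email address. The pattern is deliberately loose: \w does not match dots or hyphens, so an address such as first.last@example.com would only be matched partially.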

Output:

['alice@example.com', 'bob@gmail.com', 'carol@outlook.com']

Example 3: Replacing Text (`sub()`)

import re

text = "Today is 2025-03-28"
pattern = r"\d{4}-\d{2}-\d{2}"

new_text = re.sub(pattern, "DATE", text)
print(new_text)

Explanation:

Replaces the date format with the word “DATE”.

Output:

Today is DATE
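
re.sub() can also reuse parts of the match in the replacement. The sketch below assumes capture groups (parentheses, covered in section 7) and reorders the date instead of masking it; the target format is just an example.

```python
import re

text = "Today is 2025-03-28"

# Capture year, month, and day, then reorder them in the replacement.
# \3, \2, and \1 refer to the third, second, and first groups.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)
print(reordered)  # Today is 28/03/2025

# The count argument limits how many matches are replaced.
masked = re.sub(r"\d", "#", text, count=1)
print(masked)  # Today is #025-03-28
```

The replacement is written as a raw string too, so that \1 is passed through as a group reference rather than being read as an escape sequence.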

Example 4: Splitting Text (`split()`)

import re

text = "one,two;three four"
pattern = r"[,;\s]"  # split on comma, semicolon, or whitespace

words = re.split(pattern, text)
print(words)

Explanation:

Splits the text at each comma, semicolon, or whitespace character.

Output:

['one', 'two', 'three', 'four']
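
One thing to watch for: a single-character class splits at every delimiter, so consecutive delimiters produce empty strings. A short sketch with a slightly messier input (made up for this illustration) shows the effect and one possible fix:

```python
import re

# Input with a comma followed by a space, and a semicolon followed by two spaces.
text = "one, two;  three four"

# Splitting on each single delimiter leaves empty strings between neighbours.
loose = re.split(r"[,;\s]", text)
print(loose)  # ['one', '', 'two', '', '', 'three', 'four']

# Adding "+" treats each run of delimiters as one separator.
tight = re.split(r"[,;\s]+", text)
print(tight)  # ['one', 'two', 'three', 'four']
```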

6. Regex Flags

Regex flags modify the behavior of regex matching:

re.I or re.IGNORECASE: Ignore uppercase/lowercase distinctions.
re.M or re.MULTILINE: ^ and $ match the start/end of each line.
re.S or re.DOTALL: . matches newline characters as well.

Example Using Flags:

import re

text = "Hello\nWorld"
pattern = r".+"

match = re.match(pattern, text, re.S)
print(match.group())

Output:

Hello
World
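
The other two flags work the same way and can be combined. A minimal sketch, with a sample string invented here:

```python
import re

text = "hello world\nHello again\nHELLO there"

# Without flags, "^hello" matches only at the very start, case-sensitively.
plain = re.findall(r"^hello", text)
print(plain)  # ['hello']

# re.M lets "^" match after each newline; re.I ignores case.
# Flags are combined with the bitwise OR operator.
flagged = re.findall(r"^hello", text, re.M | re.I)
print(flagged)  # ['hello', 'Hello', 'HELLO']
```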

7. Advanced Concepts: Grouping and Capturing

Using parentheses () in patterns creates groups that can be extracted separately.

Example: Grouping

import re

text = "My name is Alice and I am 30"
pattern = r"My name is (\w+) and I am (\d+)"

match = re.search(pattern, text)

if match:
    name = match.group(1)
    age = match.group(2)
    print(f"Name: {name}, Age: {age}")

Output:

Name: Alice, Age: 30
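
Groups can also be given names, which tends to read better than numeric indexes once a pattern has several groups. The following sketch reuses the text from the example above; the group names "name" and "age" are chosen here for illustration.

```python
import re

text = "My name is Alice and I am 30"

# (?P<label>...) creates a named group; it still counts as a numbered group.
pattern = r"My name is (?P<name>\w+) and I am (?P<age>\d+)"

match = re.search(pattern, text)
if match:
    print(match.groups())       # ('Alice', '30')
    print(match.group("name"))  # Alice
    print(match.groupdict())    # {'name': 'Alice', 'age': '30'}

    # Captured text is always a string, so convert it when a number is needed.
    age = int(match.group("age"))
    print(age + 1)              # 31
```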

8. Tips & Best Practices

Always test your regex patterns.
Use raw strings (r"pattern") so that backslashes reach the regex engine unchanged.
Complex patterns get hard to read; split them across lines and add comments with re.VERBOSE.
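
As a rough illustration of the last two tips, here is the phone number pattern from Example 1 rewritten with re.VERBOSE. The use of re.compile() and the comment labels are assumptions made for this sketch, not something the earlier examples rely on.

```python
import re

# With re.VERBOSE, most whitespace in the pattern is ignored
# and "#" starts a comment that runs to the end of the line.
phone = re.compile(
    r"""
    \d{3}   # area code
    -       # separator
    \d{3}   # next three digits
    -       # separator
    \d{4}   # last four digits
    """,
    re.VERBOSE,
)

found = phone.search("Hello, my number is 123-456-7890.")
print(found.group())  # 123-456-7890

# Raw strings keep the backslash: "\n" is one character, r"\n" is two.
print(len("\n"), len(r"\n"))  # 1 2
```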
