Regex With SQL Database Explained!

  • Post last modified:September 30, 2023
  • Reading time:4 mins read

Using Regex to query Postgres database with use cases

TLDR;

  • Discussed what is regex and why to use them.
  • Not all databases support it, Postgres does support regex.
  • We covered how to use regex with Postgres along with some use cases.

Regular Expression (a.k.a Regex)

  • Regex is a powerful tool that is widely used for pattern matching and text manipulation.
  • Almost all programming language has support for them and is often used for use cases such as text extraction, searching, and matching texts.
  • Regex matching starts with a ‘^’ character and ends with a ‘$’ character.

Example 

  • let’s say we want to validate that the given string is a valid Visa credit card number.
  • The input string is `4111111111111111`.
  • Our regex would be: ‘4[0–9]{12}(?:[0–9]{3})?’ 
    4[0–9]{12}:
    This part of regex basically says that the pattern starts with 4 followed by 12 digits that contain (0–9). Now that’s a total of 13 digits.
    (?:[0–9]{3}): Old visa card were 13 digits so this represents optional 3 digits.

Now that the basics are clear let’s discuss how we can use regex with database query.

Regex with Database

  • Using Regex makes SQL queries dynamic and improves performance as well.
  • In this article, we will be using the Postgres database which has support for regex.

We need to keep in mind that not all databases have support for regex.

Postgres

  • We can use the ~(TILDE) operator and .* (the wildcard operator) is used to implement regular expressions in PostgreSQL.

List of Operators

  1. ~: use for string matching regular expression with case sensitively
# search for all the email matching the pattern
postgres=# select email from accounts where email ~ '^\S+@\S+\.\S+$';

2. !~: use for string, not matching regular expression with case-sensitively.

# search for all the email not matching the pattern
postgres=# select email from accounts where email !~ '^\S+@\S+\.\S+$';

3. ~*: use for string matching regular expression with case-insensitively.

# search for all the email matching the pattern with case insensitively
postgres=# select email from accounts where email !~ '^\S+@\S+\.\S+$';

4. !~*: use for string not matching regular expression with case-insensitively.

# search for all the email not matching the pattern with case insensitively
postgres=# select email from accounts where email !~ '^\S+@\S+\.\S+$';

Use Cases: Find Credit Card By Their Brand Name

  • Our input table is credit_card which contains the card number and expiry date.

Records in Table

postgres=# select * from credit_card;
 id |   card_number    | expiry_date | customer_id
----+------------------+-------------+-------------
 11 | 2344323432112222 | 2023-04-10  |          10
 10 | 2344323432112422 | 2023-04-12  |          10
 13 | 4111111111111111 | 2023-04-11  |          10
 14 | 4111111131111111 | 2023-05-11  |          10
 15 | 4111111131119111 | 2023-05-12  |          10
 17 | 378282246310005  | 2023-05-09  |          10
 18 | 378282246710005  | 2024-05-09  |          10
 21 | 4111111131119    | 2025-05-12  |          10
(8 rows)

Querying all the credit cards 

With Visa brand

postgres=# select * from credit_card where card_number ~ '^4[0-9]{12}(?:[0-9]{3})?$';
  • All the credit cards for Visa starts with 4, old card being 13 digits and new cards are 16 digits.
  • Our query returns the expected 4 cards which starts with 4.
-[ RECORD 1 ]-----------------
id          | 13
card_number | 4111111111111111
expiry_date | 2023-04-11
customer_id | 10
-[ RECORD 2 ]-----------------
id          | 14
card_number | 4111111131111111
expiry_date | 2023-05-11
customer_id | 10
-[ RECORD 3 ]-----------------
id          | 15
card_number | 4111111131119111
expiry_date | 2023-05-12
customer_id | 10
-[ RECORD 4 ]-----------------
id          | 21
card_number | 4111111131119
expiry_date | 2025-05-12
customer_id | 10

With Amex brand

postgres=# select * from credit_card where card_number ~ '^3[47][0-9]{13}$';
  • All the Amex credit card starts with 3 and the second digit is either 4 or 7. The total digit is 15.
  • Our query returns the expected 3 results.
-[ RECORD 1 ]----------------
id          | 17
card_number | 378282246310005
expiry_date | 2023-05-09
customer_id | 10
-[ RECORD 2 ]----------------
id          | 18
card_number | 378282246710005
expiry_date | 2024-05-09
customer_id | 10

Use Cases: Finding all the emails that don’t match emailId pattern

Accounts Table

  • In Regex, we are looking for characters@characters.character ( ex. abc@gmail.com ) pattern.
  • We are filtering query with all the email which doesn’t follow the pattern.
select email from accounts where email !~ '^\S+@\S+\.\S+$';
  • Our output contains emails that do not have above defined pattern.
postgres=# select email from accounts where email !~ '^\S+@\S+\.\S+$';
    email
-------------
 abctest.com
(1 row)

Conclusion

  • In this article, we covered a basic explanation of regex with some easy examples.
  • We extended that discussion by including regex with SQL query and querying against the database.

Before You Leave

Leave a Reply