Using Generex to Generate Sample Data from Regex
Introduction
- Having test data available is a common requirement in most projects.
- However, getting realistic test data based on production is always tedious, and we often end up creating mock data that suffices for development needs.
- In the previous article, we discussed one approach to generating mock data.
- The goal of this article is to use another approach to generate mock data using regex.
Pre-requisite
- We need the generex dependency, which parses our regex and generates data that matches it.
- Make sure to add it to the pom.xml file:
<dependency>
<groupId>com.github.mifmif</groupId>
<artifactId>generex</artifactId>
<version>1.0.2</version>
</dependency>
How does it work?
- We initialize a Generex instance by passing our regex to the constructor.
- Once we have the instance, we can use helper methods such as random(), getFirstMatch(), and getAllMatchedStrings() to generate matching strings.
Generex generex = new Generex("[REGEX_HERE]");
generex.random(); // generate data
Fields
- username
- age
- zipcode
- phoneNumber
- cardNumber
Generate username
- To generate a username field, we use the regex "[a-zA-Z0-9]{18}", which produces an 18-character alphanumeric string.
Generex username = new Generex("[a-zA-Z0-9]{18}");
System.out.println(username.random());
// output : 991Pod3a3ZyB87c6ni
- We can also combine a fixed list of names with a regex, so that usernames have the form name + underscore + alphanumeric sequence.
List<String> names = List.of("sam", "jam", "tam");
String namesString = names.stream().map(n -> n + "_").collect(Collectors.joining("|"));
Generex username1 = new Generex("("+namesString+")([a-z0-9]{5})");
System.out.println(username1.random()); // jam_b64z5
Generate age
- For age, we use "(1[89]|[2-9]\\d)", which generates an age between 18 and 99: the first alternative covers 18 and 19, the second covers 20 to 99.
Generex age = new Generex("(1[89]|[2-9]\\d)");
System.out.println(age.random()); // 20
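To double-check the range claim, the same pattern can be tested against every candidate number with the standard java.util.regex package. This is only a sketch: the class and method names (AgeRegexCheck, isValidAge, countAccepted) are hypothetical, and it assumes Java's regex engine treats this simple pattern the same way Generex does.

```java
import java.util.regex.Pattern;

public class AgeRegexCheck {
    // Same pattern as the one passed to Generex above.
    private static final Pattern AGE = Pattern.compile("(1[89]|[2-9]\\d)");

    /** Returns true when the whole string is matched by the age pattern. */
    public static boolean isValidAge(String value) {
        return AGE.matcher(value).matches();
    }

    /** Counts how many integers in [0, 200) the pattern accepts. */
    public static int countAccepted() {
        int accepted = 0;
        for (int n = 0; n < 200; n++) {
            if (isValidAge(Integer.toString(n))) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(isValidAge("17"));  // false
        System.out.println(isValidAge("18"));  // true
        System.out.println(isValidAge("99"));  // true
        System.out.println(isValidAge("100")); // false
        System.out.println(countAccepted());   // 82, i.e. 18 through 99
    }
}
```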
Generate ZipCode
- We use a regex for the US ZIP+4 format: five digits, a hyphen, and four more digits.
Generex zipCode = new Generex("\\d{5}(-)\\d{4}");
System.out.println(zipCode.random());
// 84042-2198
Generate Phone Number
- We use a regex for US phone numbers in the 3-3-4 digit format.
Generex phoneNumber = new Generex("([0-9]{3})-([0-9]{3})-([0-9]{4})");
System.out.println(phoneNumber.random());
// 658-101-3783
Generate Card Number
- We generate card numbers using a 16-digit pattern in four groups of four. Not all cards follow this format, so there is room for improvement, but it is fine for demo purposes.
Generex cardNumber = new Generex("\\d{4}(-)\\d{4}(-)\\d{4}(-)\\d{4}");
System.out.println(cardNumber.random());
// 4737-4046-5951-7119
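One possible improvement, if the test data has to pass basic card validation: real card numbers end in a Luhn check digit, so 16 purely random digits will usually fail such a check. The sketch below is an assumption on my part rather than anything Generex provides. It takes 15 digits (which could come from a pattern like "\\d{15}" or any other source) and appends the matching check digit. The class and method names are hypothetical.

```java
public class LuhnCardNumber {

    /** Computes the Luhn check digit for a string of decimal digits. */
    public static int checkDigit(String payload) {
        int sum = 0;
        boolean doubleIt = true; // the rightmost payload digit is doubled first
        for (int i = payload.length() - 1; i >= 0; i--) {
            int d = payload.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9; // same as adding the two digits of the product
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return (10 - sum % 10) % 10;
    }

    /** Appends the check digit and formats the 16 digits as four dash-separated groups. */
    public static String toCardNumber(String fifteenDigits) {
        if (!fifteenDigits.matches("\\d{15}")) {
            throw new IllegalArgumentException("expected exactly 15 digits");
        }
        String digits = fifteenDigits + checkDigit(fifteenDigits);
        // Insert a dash after every group of four digits that is followed by another digit.
        return digits.replaceAll("(\\d{4})(?=\\d)", "$1-");
    }

    /** Returns true when the last digit is the Luhn check digit of the digits before it. */
    public static boolean isLuhnValid(String cardNumber) {
        String digits = cardNumber.replace("-", "");
        if (!digits.matches("\\d{2,}")) {
            return false;
        }
        String payload = digits.substring(0, digits.length() - 1);
        int last = digits.charAt(digits.length() - 1) - '0';
        return checkDigit(payload) == last;
    }

    public static void main(String[] args) {
        String card = toCardNumber("411111111111111");
        System.out.println(card);              // 4111-1111-1111-1111
        System.out.println(isLuhnValid(card)); // true
    }
}
```

Whether this is worth the extra step depends on the consumer of the data: if nothing downstream validates card numbers, the plain 16-digit pattern is simpler.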
Helper method to generate records
- Now that we have the regexes ready, let's put them all together and create the method generateRecord() that will be called to create one mock record.
public static List<String> generateRecord() {
    // Username: one of the fixed names, an underscore, then 5 lowercase alphanumerics
    List<String> names = List.of("sam", "jam", "tam");
    String namesString = names.stream().map(n -> n + "_").collect(Collectors.joining("|"));
    Generex username = new Generex("(" + namesString + ")([a-z0-9]{5})");
    Generex cardNumber = new Generex("\\d{4}(-)\\d{4}(-)\\d{4}(-)\\d{4}");
    Generex age = new Generex("(1[89]|[2-9]\\d)");
    Generex zipCode = new Generex("\\d{5}(-)\\d{4}");
    Generex phoneNumber = new Generex("([0-9]{3})-([0-9]{3})-([0-9]{4})");

    // Field order: username, age, zipcode, phoneNumber, cardNumber
    List<String> fields = new ArrayList<>();
    fields.add(username.random());
    fields.add(age.random());
    fields.add(zipCode.random());
    fields.add(phoneNumber.random());
    fields.add(cardNumber.random());
    return fields;
}
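As a sanity check on generated records, each field can be matched against its regex with the standard java.util.regex package. This is a minimal sketch, not part of Generex: RecordValidator and isValid are hypothetical names, the username pattern assumes the three names used in this article, and the field order is assumed to be username, age, zipcode, phoneNumber, cardNumber.

```java
import java.util.List;
import java.util.regex.Pattern;

public class RecordValidator {
    // One pattern per field, in the assumed order:
    // username, age, zipcode, phoneNumber, cardNumber.
    private static final List<Pattern> FIELD_PATTERNS = List.of(
        Pattern.compile("(sam_|jam_|tam_)([a-z0-9]{5})"),
        Pattern.compile("(1[89]|[2-9]\\d)"),
        Pattern.compile("\\d{5}(-)\\d{4}"),
        Pattern.compile("([0-9]{3})-([0-9]{3})-([0-9]{4})"),
        Pattern.compile("\\d{4}(-)\\d{4}(-)\\d{4}(-)\\d{4}")
    );

    /** Returns true when the record has five fields and each one fully matches its pattern. */
    public static boolean isValid(List<String> fields) {
        if (fields.size() != FIELD_PATTERNS.size()) {
            return false;
        }
        for (int i = 0; i < fields.size(); i++) {
            if (!FIELD_PATTERNS.get(i).matcher(fields.get(i)).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Sample values taken from the outputs shown earlier in this article.
        List<String> sample = List.of(
            "jam_b64z5", "20", "84042-2198", "658-101-3783", "4737-4046-5951-7119");
        System.out.println(isValid(sample)); // true
    }
}
```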
Generating records in bulk
- Now we can use IntStream from Java 8 streams to generate the range we need and invoke generateRecord() once per index; each call returns a list of fields.
- We then join each list of fields with commas to get a CSV line, but we could plug in a mapper for any other format, such as JSON, and return that instead.
- Here we also print each record to the console to verify the output.
public static void main(String[] args) {
    List<String> records = IntStream.range(1, 100) // upper bound is exclusive: 99 records
            .mapToObj(i -> generateRecord())
            .map(a -> String.join(",", a))
            .peek(System.out::println)
            .toList(); // Stream.toList() requires Java 16+
    // sink the records
}
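To illustrate swapping the CSV mapper for another format, here is a minimal sketch that turns one record into a JSON object string using only the standard library. The class name RecordJsonMapper is hypothetical, the keys come from the Fields list above, and every value is emitted as a JSON string. The escaping only handles quotes and backslashes, which is enough for the regex-generated values in this article; a real project would more likely use a JSON library.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class RecordJsonMapper {
    // Keys in the same order as the fields of a generated record.
    private static final List<String> FIELD_NAMES =
        List.of("username", "age", "zipcode", "phoneNumber", "cardNumber");

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    private static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Maps one record (a list of five field values) to a JSON object string. */
    public static String toJson(List<String> fields) {
        if (fields.size() != FIELD_NAMES.size()) {
            throw new IllegalArgumentException("expected " + FIELD_NAMES.size() + " fields");
        }
        return IntStream.range(0, FIELD_NAMES.size())
            .mapToObj(i -> quote(FIELD_NAMES.get(i)) + ":" + quote(fields.get(i)))
            .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "jam_b64z5", "20", "84042-2198", "658-101-3783", "4737-4046-5951-7119");
        System.out.println(toJson(sample));
        // {"username":"jam_b64z5","age":"20","zipcode":"84042-2198","phoneNumber":"658-101-3783","cardNumber":"4737-4046-5951-7119"}
    }
}
```

In the stream pipeline, this would replace the String.join step, for example .map(RecordJsonMapper::toJson).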
Output
- The generated records have the format we wanted for testing.
Conclusion
- In this article, we learned how to use regex to quickly generate test/mock data.
- Generex accepts a regex and provides helper methods to generate random matching strings.