Spring Data JPA: Replace N Queries with Single Query

  • Post last modified:March 26, 2023
  • Reading time:6 mins read

Understanding query improvement tricks as a software engineer

Introduction

  • As software developers, we usually tend to lack database knowledge and how to be efficient while executing database queries. ORM frameworks were built so that they can bring that abstraction where developers do not need to understand databases in detail.
  • But in reality, we are often interrupted by a database administrator where feedback is to optimize the query.
  • In this article, we will see such an opportunity where we will get a quick win on our naive query.

User Case

  • Our marketing users perform ad-hoc data analysis and have created a list of target users to whom they want to send an email about the upcoming campaigns. They have uploaded this file to a remote server using UI.
  • There is a scheduler that runs sendEmail job, this job is a spring boot rest endpoint that is invoked by cron.
  • Before sending an email to a customer, this job checks if the user_email exists in the database ( to avoid sending emails to the non_customer or customer who is not in the OLTP database or not subscribed to get an email marketing campaign).
  • Then we finally send an email to the customer who exists in the database and a file uploaded by the marketing team both.

Setup

Entity

  • Basically, we have one table account in the database, our entity looks like the one below.
@Entity
@Table(name="ACCOUNTS")
public class Account {
    @Id @GeneratedValue
    private int userId;
    private String username;
    private String password;
    private String email;
    private Date createdOn;
    private Date lastLogin;
     
    // setter, getter, constructor
}

Controller

  • The controller is simple to get an endpoint that invokes the email sender logic.
  • Here the response is void but in reality, we can return a boolean response about success or failure.
@GetMapping("/sendEmail")
public void sendEmail(){
    notificationLogic.sendEmail();
}

Repository

  • In the repository interface, we have added one method to check if the given email exists in the table or not.
  • All we have to do is to define the method that’s it, no implementation is needed, spring data JPA takes care of the implementation internally.
boolean existsByEmail(String emailIId);

Service

  • We have created a notificationLogic class which consists of sendEmail method. We are calling another private method that gets the list of emails that contains in the file and database both. 
  • Additionally, we are also measuring the total time it took to execute this method so that we can compare it with another approach for comparison.
@Service
public class NotificationLogic {

    @Autowired
    private AccountRepository repository;

    public void sendEmail(){
        var x = System.currentTimeMillis();
        getEmailExistsInDB();
        var y = System.currentTimeMillis();
        System.out.println(y-x);
   }
}
  • In the getEmailExistsInDB method, we are assuming that after reading the remote server file we have an immutable list of email addresses to the marketer who wants to send emails.
  • Now, before sending an email we would like to check if this email exists in the OLTP system, hence we iterate over each email address and execute a query on the OLTP database if they exist then we add them to the result in the list, and Finally, we return the list.
    private void getEmailExistsInDB() {
        List<String> targetEmailList = List.of("abc@gmail.com", "abc1@gmail.com", "abc2@gmail.com", "xyz@gmail.com");

        List<String> emailExists = targetEmailList.stream()
                .filter(e->repository.existsByEmail(e))
                .toList();
        System.out.println(emailExists);
    }

Execution

  • Now, let’s execute the rest endpoint by using curl.
curl localhost:8080/accounts/sendEmail
  • In order to see the executed query in logs, add the below property in the applications.properties file.
spring.jpa.show-sql=true=
  • Once we execute the query, we see the below logs, which clearly shows that it executes 4 query check for 4 emails. If the number of emails is larger we have to make more database roundtrips. It’s clearly inefficient.

Improvement

  • Now one of the modifications we can make to our query is to pass the list of email addresses as a parameter to the SQL query and use IN operator to check if the email exists in the table.
  • By doing so we save multiple round trips to the database and get a performance win, since now we just have to execute the query once.

Repository

  • We can use @query to write our custom query in spring data jpa. We are passing our input list as params and the output will be all the matches that we got from the database. All in a single query.
@Query("select a.email from Account a where a.email IN(:emails)")
 List<String> emailExists(@Param("emails") List<String> emails);

Service

  • Now let’s modify our send email method and now call the improvised getEmailExistsInDB method.
public void sendEmail(){    
        var x1 = System.currentTimeMillis();
        List<String> emailExistsInDB = getEmailExistsInDB();
        var y1 = System.currentTimeMillis();
        System.out.println(y1-x1);
  }
  • getEmailExistsInDB method now passes the entire email list to the repository method emailExisits and gets the matching emailAddress from the database.
private List<String> getEmailExistsInDB() {

    List<String> targetEmailList = List.of("abc@gmail.com", "abc1@gmail.com", "abc2@gmail.com", "xyz@gmail.com");

    List<String> emailExistsInDB = repository.emailExists(targetEmailList);
    System.out.println(emailExistsInDB);
    return emailExistsInDB;
}

Execution

  • Now let’s see the improvement by executing the query. Let’s invoke the controller by typing the below curl command.
curl localhost:8080/accounts/sendEmail
  • We can see that the total time it took is 9 ms compared to 126 ms in the previous case. It’s a significant improvement.

Conclusion

  • In this article, we learned how we can improve our JPA query as software developers and improve significant performance by reducing database roundtrips.
  • We replace executing the DB query for each record with passing the entire list in a single query by using a custom query in JPA.
  • As software engineers, we should understand how particular queries will impact the database and try to seek some help from database experts when necessary.

Before You Leave

Leave a Reply