How to Periodically Delete Obsolete Files in Java

  • Post last modified: April 9, 2023

Discussing Java regex, spring boot scheduler, quartz scheduler, and much more

Introduction

  • As software developers working in enterprises, we often need a job that periodically deletes obsolete files or zips log files on a server.
  • In this article, we will build a job that accepts a folder path on the server and a regular expression that identifies the files to delete in that folder.

File Deletion Logic

Option 1

  • In the example below, we list all the files in the folder by calling listFiles() on the File instance, keep only those whose names contain a match for the regex, and then stream over the filtered files and delete each one.
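import java.io.File;
import java.util.Arrays;
import java.util.regex.Pattern;

public class FileDeletion {

    public static void main(String[] args) {
        File folder = new File(args[0]); // folder to clean up
        String regex = args[1];          // regex matched against file names
        option1(folder, regex);
    }

    public static void option1(File folder, String regex) {
        // Compile the regex once rather than once per file
        Pattern pattern = Pattern.compile(regex);
        // listFiles() returns null if the path is not a readable directory
        File[] files = folder.listFiles();
        if (files == null) {
            System.out.println("Not a readable directory: " + folder);
            return;
        }

        System.out.println("#####Deleting Files#####");
        Arrays.stream(files)
                .filter(file -> pattern.matcher(file.getName()).find())
                .peek(System.out::println)
                .forEach(File::delete);
        System.out.println("#####Done#####");
    }
}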
  • On my desktop, I end up with lots of Screenshot files whenever I take screenshots. To delete them all, I can pass /Users/{username}/Desktop/ as the folder and "Screenshot.*.png" as the regex. Strictly speaking, the last dot should be escaped as \. to match a literal dot; left unescaped, it matches any character.
  • We can package the code and run it with Maven. Remember to pass the folder and the regex as arguments. Note that exec.mainClass takes the fully qualified class name, without the .java extension.
 mvn package
 mvn exec:java -Dexec.mainClass=misc.FileDeletion -Dexec.args="/Users/{username}/Desktop/screenshot Screenshot.*.png"
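  • Option 1 ignores the boolean returned by File.delete(), so a file that could not be deleted goes unnoticed. As a rough sketch of an alternative (not the article's code), the same logic can be written with java.nio.file, where Files.delete() throws an exception that carries the reason for a failure. The class name NioFileDeletion and the method deleteMatching are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of the original project.
final class NioFileDeletion {

    /**
     * Deletes the regular files directly inside folder whose names contain
     * a match for regex, and returns the paths that were deleted.
     */
    static List<Path> deleteMatching(Path folder, String regex) throws IOException {
        Pattern pattern = Pattern.compile(regex); // compiled once, reused for every file

        List<Path> matches;
        // Files.list throws (for example NotDirectoryException) instead of returning null
        try (Stream<Path> entries = Files.list(folder)) {
            matches = entries
                    .filter(Files::isRegularFile) // skip sub-directories
                    .filter(p -> pattern.matcher(p.getFileName().toString()).find())
                    .collect(Collectors.toList());
        }

        List<Path> deleted = new ArrayList<>();
        for (Path p : matches) {
            Files.delete(p); // throws IOException if the file cannot be deleted
            deleted.add(p);
        }
        return deleted;
    }
}
```

  • The method stops at the first file it cannot delete. Whether to stop or to log and continue is a design choice that depends on the job.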

Option 2

  • In this option, we use listFiles(FilenameFilter filter), which returns only the files accepted by the filter. Our filter checks each file name against the regex with String.matches(), which requires the whole name to match (Option 1 used find(), which accepts a match anywhere in the name). Once we have the matching files, we delete them with delete().
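 public static void main(String[] args) {
        File folder = new File(args[0]); // folder to clean up
        String regex = args[1];          // regex matched against file names
        option2(folder, regex);
 }

 public static void option2(File folder, String regex) {
        // java.io.FilenameFilter receives the directory and the file name;
        // String.matches() requires the whole name to match the regex
        FilenameFilter filenameFilter = (dir, name) -> name.matches(regex);
        // listFiles(...) returns null if the path is not a readable directory
        File[] files = folder.listFiles(filenameFilter);
        if (files == null) {
            System.out.println("Not a readable directory: " + folder);
            return;
        }

        System.out.println("#####Deleting Files#####");
        Arrays.stream(files).peek(System.out::println)
                .forEach(File::delete);
        System.out.println("#####Done#####");
 }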
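  • The two options do not select exactly the same files: Option 1 calls Matcher.find(), while Option 2 calls String.matches(). The small demo below illustrates the difference on a made-up file name. The class name FindVersusMatches and the sample names are assumptions for illustration only.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the file names are invented for illustration.
final class FindVersusMatches {

    // Option 1 style: true if the regex matches anywhere inside the name
    static boolean partial(String regex, String name) {
        return Pattern.compile(regex).matcher(name).find();
    }

    // Option 2 style: true only if the regex matches the entire name
    static boolean whole(String regex, String name) {
        return name.matches(regex);
    }

    public static void main(String[] args) {
        String regex = "Screenshot.*\\.png";
        String backup = "old-Screenshot 1.png.bak";

        System.out.println(partial(regex, backup));           // prints true
        System.out.println(whole(regex, backup));             // prints false
        System.out.println(whole(regex, "Screenshot 1.png")); // prints true
    }
}
```

  • With find(), anchoring the regex with ^ and $ gives the stricter whole-name behaviour, which is usually the safer default for a job that deletes files.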

Building API 

  • Now that we have the logic to find and delete files in a folder, how about wrapping it in a Spring Boot REST controller so that we can invoke it either periodically or ad hoc?
  • We create a HouseKeepingController with a delete/files endpoint that takes path and regex as request parameters.
  • The logic is the same as in Option 1 above.
@RestController
@RequestMapping("/housekeeping")
public class HouseKeepingController {

    @GetMapping("/delete/files")
    public void delete(@RequestParam String path, @RequestParam String regex) {
        File folder = new File(path);
        // Compile the regex once rather than once per file
        Pattern pattern = Pattern.compile(regex);
        // listFiles() returns null if the path is not a readable directory
        File[] files = folder.listFiles();
        if (files == null) {
            // org.springframework.web.server.ResponseStatusException
            throw new ResponseStatusException(HttpStatus.BAD_REQUEST,
                    "Not a readable directory: " + path);
        }

        System.out.println("#####Deleting Files#####");
        Arrays.stream(files)
                .filter(file -> pattern.matcher(file.getName()).find())
                .peek(System.out::println)
                .forEach(File::delete);
        System.out.println("#####Done#####");
    }
}
  • Once we start the application by running the Spring Boot main class, we can invoke the endpoint using Postman.
  • We can execute it periodically by putting the call in a shell script and creating a crontab entry that runs it hourly or daily, for example. We have discussed this approach here.
  • We can also write the scheduled job in Java. We discuss this in the next section.
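  • When the endpoint is called from code rather than from Postman, the regex has to be URL-encoded, because characters such as \ and + are not safe in a query string. The sketch below builds the request URI and calls the endpoint with the JDK's HttpClient (Java 11+). The base URL http://localhost:8080 and the class name HouseKeepingClient are assumptions for illustration; only the /housekeeping/delete/files path and the path and regex parameters come from the controller above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for illustration; not part of the original project.
final class HouseKeepingClient {

    /** Builds the endpoint URI with both query parameters URL-encoded. */
    static URI deleteFilesUri(String baseUrl, String path, String regex) {
        String query = "path=" + URLEncoder.encode(path, StandardCharsets.UTF_8)
                + "&regex=" + URLEncoder.encode(regex, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/housekeeping/delete/files?" + query);
    }

    /** Sends the GET request and returns the HTTP status code. */
    static int callDelete(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Assumed base URL: Spring Boot's default port on the local machine
        URI uri = deleteFilesUri("http://localhost:8080", "/tmp/screens", "Screenshot.*\\.png");
        System.out.println(uri);
    }
}
```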

Scheduling with Scheduler

  • We can schedule our job using a quartz scheduler or spring scheduler.

Using Quartz Scheduler

  • We need to add the Quartz dependency to the pom.xml file:
<dependency>
      <groupId>org.quartz-scheduler</groupId>
      <artifactId>quartz</artifactId>
      <version>2.3.0</version>
</dependency>
  • We need to implement the Job interface and put our logic in the overridden execute method.
public class FileDeletionJob implements Job {

    @Override
    public void execute(JobExecutionContext jobExecutionContext) throws JobExecutionException {
        // Values registered with usingJobData(...) when the job was configured
        JobDataMap jobData = jobExecutionContext.getJobDetail().getJobDataMap();
        Pattern pattern = Pattern.compile(jobData.getString("regex"));
        File folder = new File(jobData.getString("path"));

        // listFiles() returns null if the path is not a readable directory
        File[] files = folder.listFiles();
        if (files == null) {
            throw new JobExecutionException("Not a readable directory: " + folder);
        }

        System.out.println("#####Deleting Files#####");
        Arrays.stream(files)
                .filter(file -> pattern.matcher(file.getName()).find())
                .peek(System.out::println)
                .forEach(File::delete);
        System.out.println("#####Done#####");
    }
}
  • Once the logic is done, we need to configure the job. JobBuilder provides a builder for this: we register the job class and populate the job data (regex and path), which the job reads back in execute().
// Requires: import static org.quartz.TriggerBuilder.newTrigger;
//           import static org.quartz.SimpleScheduleBuilder.simpleSchedule;
public static void configureJob(String path, String regex) throws SchedulerException {
    // Register the job class and the data it reads in execute()
    JobDetail job = JobBuilder.newJob(FileDeletionJob.class)
            .usingJobData("regex", regex)
            .usingJobData("path", path)
            .build();

    // Repeat every minute, indefinitely
    Trigger trigger = newTrigger()
            .withSchedule(simpleSchedule()
                    .withIntervalInMinutes(1)
                    // .withIntervalInHours(1)
                    .repeatForever())
            .build();

    Scheduler scheduler = new StdSchedulerFactory().getScheduler();
    scheduler.start();
    scheduler.scheduleJob(job, trigger);
}
  • Now that we have written and configured the job, we can package and run it with the same Maven commands as before.
mvn package
mvn exec:java -Dexec.mainClass=misc.FileDeletion -Dexec.args="/Users/{username}/Desktop/screenshot Screenshot.*.png"
  • In this run, the target directory contains some screenshot files that we would like to delete.
  • Once the job executes, the log lists the files that were deleted, and the screenshots are gone from the directory.

Using Spring Scheduler

  • Spring provides a scheduler that we can configure to run tasks that we want to run periodically.
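  • Here we use a cron expression that runs the job every minute. Spring cron expressions have six fields, starting with seconds, so "0 * * * * *" fires at second 0 of every minute.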
@Service
public class FileDeletion {

    // Compile the regex once rather than on every run
    private static final Pattern SCREENSHOTS = Pattern.compile("Screenshot.*.png");

    @Scheduled(cron = "0 * * * * *")
    public void delete() {
        File folder = new File("/Users/surajmishra/Desktop/screenshot");
        // listFiles() returns null if the path is not a readable directory
        File[] files = folder.listFiles();
        if (files == null) {
            System.out.println("Not a readable directory: " + folder);
            return;
        }

        System.out.println("#####Deleting Files#####");
        Arrays.stream(files)
                .filter(file -> SCREENSHOTS.matcher(file.getName()).find())
                .peek(System.out::println)
                .forEach(File::delete);
        System.out.println("#####Done#####");
    }
}
  • We need to add @EnableScheduling to the main file to enable scheduling for the SpringBoot application.
@SpringBootApplication
@EnableScheduling
public class SpringPostgresApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringPostgresApplication.class, args);
    }

}
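  • Both schedulers above add a dependency or a framework. For a job this small, the JDK's own ScheduledExecutorService may be enough. The sketch below is an alternative to Quartz and Spring, not what the article's project uses, and it offers fixed-rate scheduling only, with no cron expressions. The class name JdkScheduledDeletion and its method names are hypothetical.

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

// Hypothetical JDK-only alternative; not part of the original project.
final class JdkScheduledDeletion {

    /** Deletes matching regular files in folder and returns how many were deleted. */
    static int deleteMatching(File folder, Pattern pattern) {
        File[] files = folder.listFiles((dir, name) -> pattern.matcher(name).find());
        if (files == null) {
            return 0; // not a readable directory
        }
        int deleted = 0;
        for (File file : files) {
            if (file.isFile() && file.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    /** Runs the deletion immediately and then once per period; the caller shuts the executor down. */
    static ScheduledExecutorService schedule(File folder, String regex, long period, TimeUnit unit) {
        Pattern pattern = Pattern.compile(regex);
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(() -> {
            try {
                System.out.println("Deleted " + deleteMatching(folder, pattern) + " file(s)");
            } catch (RuntimeException e) {
                // An exception escaping the task would cancel all later runs
                System.err.println("Deletion run failed: " + e);
            }
        }, 0, period, unit);
        return executor;
    }
}
```

  • A call such as JdkScheduledDeletion.schedule(new File(path), regex, 1, TimeUnit.HOURS) would mirror the hourly interval shown in the Quartz trigger.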

Conclusion

  • In this article, we built an application that deletes the files matching a regex in a given folder.
  • We also saw several options for executing it periodically.
  • This pattern can be extended to other housekeeping jobs, such as zipping files or moving files around the server.
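  • As a rough illustration of the zipping variant, the sketch below collects the files whose names match a regex and writes them into one archive using java.util.zip. It is an assumption about how such a job could look, not code from the article. The names LogArchiver and zipMatching are hypothetical, and the original files are left in place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of the original project.
final class LogArchiver {

    /**
     * Adds every regular file in folder whose name contains a match for regex
     * to zipFile, and returns the number of files archived.
     */
    static int zipMatching(Path folder, String regex, Path zipFile) throws IOException {
        Pattern pattern = Pattern.compile(regex);

        // Collect the matches before the archive is created, so that a new
        // archive written into the same folder is never added to itself
        List<Path> matches;
        try (Stream<Path> entries = Files.list(folder)) {
            matches = entries
                    .filter(Files::isRegularFile)
                    .filter(p -> pattern.matcher(p.getFileName().toString()).find())
                    .collect(Collectors.toList());
        }

        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(zipFile))) {
            for (Path p : matches) {
                zip.putNextEntry(new ZipEntry(p.getFileName().toString()));
                Files.copy(p, zip); // stream the file's bytes into the current entry
                zip.closeEntry();
            }
        }
        return matches.size();
    }
}
```

  • Such a method could be called from the Quartz job or the @Scheduled method in place of the delete logic, optionally followed by deleting the archived originals.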
