How to Build an RSS Feed Client in Java

  • Post last modified:August 16, 2022

Introduction

  • RSS feeds are a common way to fetch the latest articles from popular websites.
  • In this post we will use the Java library Rome to fetch articles from tech blogs through their RSS feeds.

Use Case

  • Our use case is to fetch RSS feeds from popular blogs such as the Google Cloud blog and the Uber Engineering blog.
  • If you look around, you will find an RSS feed on most blog websites, usually marked with the RSS logo.
  • If you can't find the logo on a website, right-click the page, view the page source, and search for the RSS link.
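
  • The manual search through the page source can also be sketched in code. Many sites advertise their feed with a <link> tag whose type is application/rss+xml or application/atom+xml. The helper below is a hypothetical FeedLinkFinder (not part of Rome) that scans an HTML string for such tags with regular expressions. It is a rough sketch that assumes reasonably well-formed markup, not a full HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Rome): finds feed URLs advertised in page HTML.
final class FeedLinkFinder {

    // Matches a whole <link ...> tag, case-insensitively.
    private static final Pattern LINK_TAG =
            Pattern.compile("<link\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Matches type="application/rss+xml" or type="application/atom+xml".
    private static final Pattern FEED_TYPE =
            Pattern.compile("type\\s*=\\s*[\"']application/(rss|atom)\\+xml[\"']",
                    Pattern.CASE_INSENSITIVE);

    // Captures the value of the href attribute.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns the href of every <link> tag that declares an RSS or Atom type.
    static List<String> findFeedLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher tags = LINK_TAG.matcher(html);
        while (tags.find()) {
            String tag = tags.group();
            if (FEED_TYPE.matcher(tag).find()) {
                Matcher href = HREF.matcher(tag);
                if (href.find()) {
                    links.add(href.group(1));
                }
            }
        }
        return links;
    }
}
```

  • Calling FeedLinkFinder.findFeedLinks(pageHtml) on the downloaded page source returns the candidate feed URLs. Relative href values would still need to be resolved against the page URL.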

Building Client

  • Now that we have the RSS feed URL, we can use it to pull the articles.
  • First we create a project named RSSReaderClient in the IDE. The first task is to add the Rome library to pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>RSSReaderClient</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>11</maven.compiler.source>
        <maven.compiler.target>11</maven.compiler.target>
    </properties>

    <dependencies>
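        <!-- The com.rometools artifact provides the com.rometools.rome.* packages used in the code -->
        <dependency>
            <groupId>com.rometools</groupId>
            <artifactId>rome</artifactId>
            <version>1.18.0</version>
        </dependency>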
    </dependencies>

</project>

POJO to Store NewsArticles

  • Let's create a NewsArticle POJO to store the article data. We will need it later.
import java.util.List;

public class NewsArticle {
    private String title;
    private String link;
    private String imgUrl;
    private List<String> categories;
    private String publishedDate;

    public String getPublishedDate() {
        return publishedDate;
    }

    public void setPublishedDate(String publishedDate) {
        this.publishedDate = publishedDate;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getLink() {
        return link;
    }

    public void setLink(String link) {
        this.link = link;
    }

    public String getImgUrl() {
        return imgUrl;
    }

    public void setImgUrl(String imgUrl) {
        this.imgUrl = imgUrl;
    }

    public List<String> getCategories() {
        return categories;
    }

    public void setCategories(List<String> categories) {
        this.categories = categories;
    }

    @Override
    public String toString() {
        return "NewsArticle{" +
                "title='" + title + '\'' +
                ", link='" + link + '\'' +
                ", imgUrl='" + imgUrl + '\'' +
                ", categories=" + categories +
                ", publishedDate='" + publishedDate + '\'' +
                '}';
    }
}

Utility Class for RSS Reader

  • Next, we create a utility class that reads the RSS feed and maps each entry to the NewsArticle POJO we created earlier.
import com.rometools.rome.feed.synd.SyndEntry;
import com.rometools.rome.feed.synd.SyndFeed;
import com.rometools.rome.io.FeedException;
import com.rometools.rome.io.SyndFeedInput;
import com.rometools.rome.io.XmlReader;

import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public class RSSReaderUtil {

    /**
     * Fetches the feed at feedUrl and maps every entry to a NewsArticle.
     */
    public static List<NewsArticle> read(String feedUrl) throws IOException, FeedException {
        URL feedSource = new URL(feedUrl);
        SyndFeedInput input = new SyndFeedInput();
        List<NewsArticle> results = new ArrayList<>();
        // try-with-resources closes the underlying stream once the feed is parsed
        try (XmlReader reader = new XmlReader(feedSource)) {
            SyndFeed feed = input.build(reader);
            for (SyndEntry syndEntry : feed.getEntries()) {
                results.add(mapToArticle(syndEntry));
            }
        }
        return results;
    }

    /**
     * Maps a Rome feed entry to a NewsArticle.
     *
     * @param syndEntry the feed entry to convert
     */
    private static NewsArticle mapToArticle(SyndEntry syndEntry) {
        NewsArticle newsArticle = new NewsArticle();
        newsArticle.setTitle(syndEntry.getTitle());
        // Some feeds omit the publication date, so guard against null
        Date publishedDate = syndEntry.getPublishedDate();
        newsArticle.setPublishedDate(publishedDate != null ? publishedDate.toString() : "");
        newsArticle.setImgUrl("");
        newsArticle.setLink(syndEntry.getLink());
        return newsArticle;
    }
}
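
  • The mapping above stores the published date as the output of Date.toString(), which depends on the JVM's default time zone. If a stable format is preferred, one option is to convert the date to an ISO-8601 string first. The PublishedDates class below is a hypothetical helper, shown only as a sketch of that idea.

```java
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper: renders a feed entry's date in a stable, UTC-based format.
final class PublishedDates {

    // Returns an ISO-8601 instant such as 2022-08-16T10:15:30Z, or "" when the date is missing.
    static String toIsoString(Date date) {
        if (date == null) {
            return "";
        }
        return DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
    }
}
```

  • With this helper, the mapper could call newsArticle.setPublishedDate(PublishedDates.toIsoString(syndEntry.getPublishedDate())).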

Reader Client

  • Once the utility is ready, all we have to do is use it in our client.
  • We have a list of feeds that we will target and pull articles from.
  • We pass each feed URL to our RSSReaderUtil.read() method, which returns the response as a List<NewsArticle>.
  • Once we have the response, all we have to do is print it.
import com.rometools.rome.io.FeedException;

import java.io.IOException;
import java.util.List;

public class RSSReader {

    public static void main(String[] args) throws IOException, FeedException {
        // Feeds to pull articles from
        List<String> targetFeedsList = List.of("https://cloudblog.withgoogle.com/rss/",
                "https://eng.uber.com/feed/",
                "https://eng.lyft.com/feed",
                "https://netflixtechblog.com/feed");

        for (String url : targetFeedsList) {
            List<NewsArticle> results = RSSReaderUtil.read(url);
            System.out.println("url : " + url);
            results.forEach(System.out::println);
            System.out.println("==========");
        }
    }
}
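
  • In the client above, an exception from any single feed (for example a URL that has moved) ends the whole run. One possible variation, sketched below, catches the failure per feed and carries on with the rest. SafeFeedLoop and its FeedReader interface are hypothetical names introduced here so that the sketch compiles without Rome on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: reads several feeds and keeps going when one of them fails.
final class SafeFeedLoop {

    // Stand-in for a method such as RSSReaderUtil.read(url), which may throw checked exceptions.
    interface FeedReader<T> {
        List<T> read(String url) throws Exception;
    }

    // Returns the results per URL in input order; a failed feed maps to an empty list.
    static <T> Map<String, List<T>> readAll(List<String> urls, FeedReader<T> reader) {
        Map<String, List<T>> resultsByUrl = new LinkedHashMap<>();
        for (String url : urls) {
            try {
                resultsByUrl.put(url, reader.read(url));
            } catch (Exception e) {
                // Log the failure and continue with the next feed
                System.err.println("Skipping " + url + ": " + e.getMessage());
                resultsByUrl.put(url, List.of());
            }
        }
        return resultsByUrl;
    }
}
```

  • In the client this could be called as SafeFeedLoop.readAll(targetFeedsList, RSSReaderUtil::read), after which each map entry is printed as before.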

Testing Client

  • Let's run our Java application. As per our logic, the reader will read each feed in the list and print its articles.
  • As you can see in the screenshot of the output, we have the latest articles from the Google Cloud blog feed.

Conclusion

  • In this article we used the Java library Rome to fetch RSS feeds from publicly available blogs. We can schedule the article pulls with any Java-based scheduler, or with cron on Unix running the jar, and write the content to a destination such as Firebase or a relational database.
  • We can use this logic to serve users by building mobile apps, web apps, or browser extensions, for example for Chrome.
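
  • As a rough sketch of the Java-based scheduling idea, the JDK's ScheduledExecutorService can run a polling task at a fixed rate. FeedPollScheduler is a hypothetical class name, and the pollTask argument stands in for whatever code reads the feeds and writes them to a destination.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a feed-polling task repeatedly on a background thread.
final class FeedPollScheduler {

    // Runs pollTask immediately and then once per period; the caller shuts the executor down.
    static ScheduledExecutorService startPolling(Runnable pollTask, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                pollTask.run();
            } catch (RuntimeException e) {
                // An uncaught exception would suppress all later runs, so log it and continue
                System.err.println("Poll failed: " + e.getMessage());
            }
        }, 0, period, unit);
        return scheduler;
    }
}
```

  • For example, FeedPollScheduler.startPolling(task, 30, TimeUnit.MINUTES) would run the task every half hour until shutdown() is called on the returned executor.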
