How to Build a Social Stats API with Java

  • Post last modified: August 7, 2022

Introduction

  • Most of us use several social media apps these days. As content creators, we often care about our subscriber or follower counts.
  • In this article, our goal as developers is to build backend logic that extracts follower and subscriber counts from public pages and serves them through an API endpoint that a UI can integrate with.

Tools Used

  • We will use the jsoup Java library to extract the information and Spring Boot to build an API on top of it.

Medium Followers Count

  • Let’s go to a Medium profile page and inspect it to find the follower count.

Followers page:

Inspect element

  • We can right-click the author’s follower count element to inspect its HTML structure and see whether we can extract it by class name, id, XPath, etc.
  • Since the element has the class name “pw-follower-count”, we will extract it by class name.
  • We will use jsoup to look up the element by the class name “pw-follower-count”. This gives us an Element object, and its text() method returns the element’s text.
public String getMediumFollowers(@PathVariable(value = "profilePath") String profilePath) throws IOException {
   // Fetch and parse the profile page
   Document document = Jsoup.connect("https://" + profilePath).get();
   // first() returns null when no element has this class
   Element first = document.getElementsByClass("pw-follower-count").first();
   if (first == null) {
      return "Could not get follower count";
   }
   return first.text();
}

YouTube Subscriber Count

  • To extract the YouTube subscriber count, we can again go to the channel page, right-click, and inspect the element to understand the HTML structure.
  • Scraping YouTube was not as simple, so I had to spend some time studying the HTML structure. I found a script variable (ytInitialData) that holds all the crucial data, so I read that and scraped the subscriber count from it.

Here is the code:

  • The filterAndGetSubscriber method extracts the subscriber count. The actual data is deeply nested, so it requires some cleaning.
public String getYTSubscribers(@PathVariable(value = "channelId") String channelId) throws IOException {
     Document document = Jsoup.connect(YT_PREFIX + channelId).get();
     // Find the first <script> element that assigns ytInitialData
     Elements script = document.getElementsByTag("script");
     Optional<Element> data = script.stream().filter(a -> a.html().contains("ytInitialData =")).findFirst();
     return filterAndGetSubscriber(data);
}


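    /**
     * Extracts the subscriber count from the script element that holds ytInitialData.
     * @param data the matching script element, if one was found
     * @return the subscriber count text, or a fallback message
     */
    private String filterAndGetSubscriber(Optional<Element> data) {
        String fallback = "Could not get subscriber count";
        if (data.isPresent()) {
            String html = data.get().data();
            int start = html.indexOf("subscriberCountText");
            int end = html.indexOf("tvBanner");
            // indexOf returns -1 when a marker is missing; substring would then throw
            if (start < 0 || end < start) {
                return fallback;
            }
            String resultText = html.substring(start, end);
            int textStart = resultText.indexOf("\"simpleText\":\"");
            if (textStart < 0) {
                return fallback;
            }
            // Strip the JSON punctuation and the trailing "subscribers" label
            String subscribers = resultText.substring(textStart)
                    .replace("\"simpleText\"", "")
                    .replace(":", "")
                    .replace("},", "")
                    .replace("\"", "")
                    .replace("subscribers", "")
                    .trim();

            System.out.println(subscribers);
            return subscribers;
        }
        return fallback;
    }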
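
The chain of replace() calls above depends on the exact punctuation around the value. As an alternative, here is a minimal sketch that pulls the same value out with a regular expression, using only the JDK. The class name SubscriberCountParser and the sample string in main are hypothetical, and the pattern assumes the script data still contains a "subscriberCountText" object with a "simpleText" field, as the method above does.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original article's code.
final class SubscriberCountParser {

    // Assumed shape: "subscriberCountText":{ ... "simpleText":"1.2K subscribers" ... }
    private static final Pattern SUBSCRIBER_TEXT = Pattern.compile(
            "\"subscriberCountText\"\\s*:\\s*\\{.*?\"simpleText\"\\s*:\\s*\"([^\"]*)\"",
            Pattern.DOTALL);

    // Returns the count text without the "subscribers" label, or empty if not found.
    static Optional<String> extract(String scriptData) {
        Matcher matcher = SUBSCRIBER_TEXT.matcher(scriptData);
        if (!matcher.find()) {
            return Optional.empty();
        }
        String text = matcher.group(1).replace("subscribers", "").trim();
        return Optional.of(text);
    }

    public static void main(String[] args) {
        // Made-up sample that imitates the assumed structure of ytInitialData
        String sample = "var ytInitialData = {\"header\":{\"subscriberCountText\":"
                + "{\"accessibility\":{\"label\":\"1.2 thousand subscribers\"},"
                + "\"simpleText\":\"1.2K subscribers\"},\"tvBanner\":{}}};";
        System.out.println(extract(sample).orElse("Could not get subscriber count"));
    }
}
```

Returning an Optional lets the endpoint decide what to send back when the page structure changes, instead of failing inside substring().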

Building API

  • Now we have two methods: one scrapes the Medium follower count and the other the YouTube subscriber count.
  • As a next step, we will wrap these methods in API endpoints so that a client (app or web) can consume them and build a UI on top.
  • Since I like Java, and Spring Boot is my go-to framework for building an API layer, I will use it.

Dependency

  • To get started, let’s add the bare minimum Spring Boot dependencies for building an API to the pom.xml file.

API Endpoints

package com.socialstats.api;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.springframework.web.bind.annotation.*;
import java.io.IOException;
import java.util.Optional;

@RestController
@RequestMapping("/resources")
public class SocialStatsEndpoint {
    private static final String YT_PREFIX = "https://www.youtube.com/c/";

    @CrossOrigin(origins = "*")
    @GetMapping("/medium/followers/{profilePath}")
    public String getMediumFollowers(@PathVariable(value = "profilePath") String profilePath) throws IOException {
        Document document = Jsoup.connect("https://" + profilePath).get();
        // first() returns null when no element has this class
        Element first = document.getElementsByClass("pw-follower-count").first();
        if (first == null) {
            return "Could not get follower count";
        }
        return first.text();
    }

    @CrossOrigin(origins = "*")
    @GetMapping("/youtube/subscribers/{channelId}")
    public String getYTSubscribers(@PathVariable(value = "channelId") String channelId) throws IOException {
        Document document = Jsoup.connect(YT_PREFIX + channelId).get();
        Elements script = document.getElementsByTag("script");
        Optional<Element> data = script.stream().filter(a -> a.html().contains("ytInitialData =")).findFirst();
        return filterAndGetSubscriber(data);
    }
}

Test

  • Now let’s test our API by calling its endpoints.

Medium API

  • Append the Medium profile URL to the endpoint path.

YouTube API

  • Append the YouTube channel name to the endpoint path.
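
The same call can also be made from code. Below is a minimal sketch of a client for the YouTube endpoint, using the HttpClient that ships with Java 11 and later. The class name SocialStatsClient is hypothetical, and the base URL assumes the app runs locally on Spring Boot’s default port 8080.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client, not part of the original article's code.
final class SocialStatsClient {

    // Assumption: the app runs locally on Spring Boot's default port 8080.
    private static final String BASE_URL = "http://localhost:8080/resources";

    // Builds the URI for /resources/youtube/subscribers/{channelId}.
    static URI youtubeSubscribersUri(String channelName) {
        // URLEncoder targets form data, so spaces become "+"; use %20 in a path instead
        String encoded = URLEncoder.encode(channelName, StandardCharsets.UTF_8)
                .replace("+", "%20");
        return URI.create(BASE_URL + "/youtube/subscribers/" + encoded);
    }

    // Sends a GET request and returns the response body as text.
    static String fetch(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

With the app running, SocialStatsClient.fetch(SocialStatsClient.youtubeSubscribersUri("SomeChannelName")) should return the same text the browser shows; SomeChannelName is a placeholder.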

It works, and that’s it!
Here is the entire code used in this article.

Endpoint & Logic

package com.socialstats.api;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.springframework.web.bind.annotation.*;
import java.io.IOException;
import java.util.Optional;

@RestController
@RequestMapping("/resources")
public class SocialStatsEndpoint {
    private static final String YT_PREFIX = "https://www.youtube.com/c/";

    @CrossOrigin(origins = "*")
    @GetMapping("/medium/followers/{profilePath}")
    public String getMediumFollowers(@PathVariable(value = "profilePath") String profilePath) throws IOException {
        Document document = Jsoup.connect("https://" + profilePath).get();
        // first() returns null when no element has this class
        Element first = document.getElementsByClass("pw-follower-count").first();
        if (first == null) {
            return "Could not get follower count";
        }
        return first.text();
    }

    @CrossOrigin(origins = "*")
    @GetMapping("/youtube/subscribers/{channelId}")
    public String getYTSubscribers(@PathVariable(value = "channelId") String channelId) throws IOException {
        Document document = Jsoup.connect(YT_PREFIX + channelId).get();
        Elements script = document.getElementsByTag("script");
        Optional<Element> data = script.stream().filter(a -> a.html().contains("ytInitialData =")).findFirst();
        return filterAndGetSubscriber(data);
    }

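    /**
     * Extracts the subscriber count from the script element that holds ytInitialData.
     * @param data the matching script element, if one was found
     * @return the subscriber count text, or a fallback message
     */
    private String filterAndGetSubscriber(Optional<Element> data) {
        String fallback = "Could not get subscriber count";
        if (data.isPresent()) {
            String html = data.get().data();
            int start = html.indexOf("subscriberCountText");
            int end = html.indexOf("tvBanner");
            // indexOf returns -1 when a marker is missing; substring would then throw
            if (start < 0 || end < start) {
                return fallback;
            }
            String resultText = html.substring(start, end);
            int textStart = resultText.indexOf("\"simpleText\":\"");
            if (textStart < 0) {
                return fallback;
            }
            // Strip the JSON punctuation and the trailing "subscribers" label
            String subscribers = resultText.substring(textStart)
                    .replace("\"simpleText\"", "")
                    .replace(":", "")
                    .replace("},", "")
                    .replace("\"", "")
                    .replace("subscribers", "")
                    .trim();

            System.out.println(subscribers);
            return subscribers;
        }
        return fallback;
    }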
}

pom.xml

<!-- JSOUP Dependency -->
<dependency>
   <groupId>org.jsoup</groupId>
   <artifactId>jsoup</artifactId>
   <version>1.15.2</version>
</dependency>

<!-- Spring Dependencies -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

<dependency>
     <groupId>org.springframework.boot</groupId>
     <artifactId>spring-boot-starter-test</artifactId>
     <scope>test</scope>
</dependency>

<build>
   <plugins>
      <plugin>
         <groupId>org.springframework.boot</groupId>
         <artifactId>spring-boot-maven-plugin</artifactId>
      </plugin>
   </plugins>
</build>

Conclusion

  • In this article, we used the jsoup library to extract subscriber and follower counts and build the SocialStatsEndpoint class. We then wrapped that logic in a consumable API using a Spring Boot app.
  • We can extend the same idea to Twitter, Facebook, or Instagram to broaden the scope of the API.
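
One way to keep such extensions manageable is to put each platform behind a small common interface. The sketch below is only an illustration of that idea: StatsProvider and StatsRegistry are hypothetical names that do not appear in the code above, and the scraping logic for each platform would still have to be written separately.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical abstraction: one implementation per platform.
interface StatsProvider {
    // Returns the follower or subscriber count text for a profile, if it can be found.
    Optional<String> fetchCount(String profileId) throws Exception;
}

// Hypothetical registry that maps a platform name to its provider.
final class StatsRegistry {
    private final Map<String, StatsProvider> providers = new LinkedHashMap<>();

    void register(String platform, StatsProvider provider) {
        providers.put(platform.toLowerCase(Locale.ROOT), provider);
    }

    // Looks up the provider case-insensitively and falls back to a message on failure.
    String countFor(String platform, String profileId) {
        StatsProvider provider = providers.get(platform.toLowerCase(Locale.ROOT));
        if (provider == null) {
            return "Unsupported platform: " + platform;
        }
        try {
            return provider.fetchCount(profileId).orElse("Could not get count");
        } catch (Exception e) {
            return "Could not get count";
        }
    }
}
```

A new platform then becomes a single register call, for example registry.register("twitter", id -> Optional.of("42")) with a stub provider while the real scraper is being written.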
