
We all know developers have a reputation for being lazy. I’m a developer, but I honestly don’t consider myself as lazy as people assume the average developer to be. I like to optimize and automate repetitive tasks, and this always involves some hacky stuff to reach my goal. That being said, I want you to know that the story I’m going to tell you is not about laziness… it’s about optimization 😎

At MOLO17, we take our lunch break every day from 1PM to 2PM. We usually go to a pub in a town near our office. It offers a casual setting and a good choice of courses, and the owner is a crazy nice person. A great place to enjoy a meal with the whole team!

The pub’s menu changes every day. The owner (let’s call him Bob) shares a daily picture on Facebook and WhatsApp of the blackboard on which he writes down the menu. This way, customers know which courses will be served for lunch.

Every day, I opened his Facebook timeline and posted the daily menu to our dedicated Slack channel (yes, we even have a #lunch channel).

The blackboard Bob uses for the daily menu. Yum!

And here comes my laziness... sorry, optimization-ness. The main “problem” with this process was that, starting from 11AM, I had to manually check every day whether Bob had uploaded the menu to his Facebook timeline. Since checking other people’s Facebook pages is not my job, I started a fun weekend project to automate this.

Requirements

What I needed was a Slack bot. The bot would check one of Bob’s social pages and, if it found new menu pictures, post them to our #lunch Slack channel. The process would be triggered every 5 minutes between 11AM and 1PM, which is the time window in which Bob usually uploads the blackboard pics.
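The time window from the requirements can be expressed as a small predicate. This is only a sketch: the function and constant names are made up for illustration, and the weekday restriction is taken from the cron schedule described later in the post.

```kotlin
import java.time.DayOfWeek
import java.time.LocalDateTime
import java.time.LocalTime

// Hypothetical constants: the bounds come from the requirements (11AM to 1PM).
val WINDOW_START: LocalTime = LocalTime.of(11, 0)
val WINDOW_END: LocalTime = LocalTime.of(13, 0)

// Returns true when a check for a new menu picture should run.
fun isWithinPollingWindow(now: LocalDateTime): Boolean {
    val isWeekday = now.dayOfWeek != DayOfWeek.SATURDAY && now.dayOfWeek != DayOfWeek.SUNDAY
    val time = now.toLocalTime()
    // Start is inclusive, end is exclusive: 11:00 counts, 13:00 does not.
    return isWeekday && time >= WINDOW_START && time < WINDOW_END
}
```

Calling `isWithinPollingWindow(LocalDateTime.now())` before each run would make the bot ignore triggers that arrive outside the window.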

Quick feasibility studies

Bob uploads the menu pictures to his Facebook timeline, and often to his WhatsApp status. Hence, I explored how to integrate with one of these systems to get the freshly uploaded picture.

Obviously, I chose Kotlin as the language to bring my idea to life.

WhatsApp

I decided to start with WhatsApp. Although I knew that the company doesn’t expose a public API, I wanted to explore its implementation a bit. At least I would learn something new!

I started checking out how WhatsApp works under the hood. I found an interesting and well-documented project that reverse engineers its chat protocol, so I started implementing my own version.

After an intensive Sunday afternoon, I was able to connect to the WebSocket that WhatsApp exposes and to log in by rendering a QR code from a given payload (as the Web version does), which I then scanned with my mobile phone. It was a bit painful, due to the RSA key manipulation that the WhatsApp client performs. Once logged in, I fetched the ID and the name of some chats, just for testing purposes.

After investigating the WebSocket events received when opening the user status page, I ended up throwing away the idea of using this channel to get the pictures. All the media that WhatsApp shares are (obviously) encrypted, and I would have needed to apply further transformations to the key pair obtained from the handshake in order to decrypt them.

This would have required a lot of effort. I wanted something simpler and easier to maintain as a weekend project, so I moved my focus to Facebook.

Facebook

Facebook exposes certain data through a public API, called Graph API. This API can be accessed using a token obtained after performing a login with a set of predefined UI components. Integrating such components inside a mobile app or a web app is pretty simple. However, I needed to authenticate without a user interface, which, it seems, isn’t directly possible.

Thus, I chose to read Facebook data by scraping it directly from the HTML code. I used a helpful tool called SeleniumHQ, built for web page automation and testing. It’s a server that comes with a bunch of drivers written in almost all of the most popular languages. The Java version was my way to go.

A bit of structure

Software that uses the Selenium driver needs a Selenium server up and running. Hence, I looked for a Docker image to set up my instance quickly and without worrying too much about infrastructure. In particular, I chose the StandaloneChrome image.

Docker

I wrote down some ideas on how to organize my bot, and I ended up choosing Docker Compose to orchestrate 3 containers, each one serving a specific purpose:

  • selenium: obviously, an instance of the SeleniumHQ server. It exposes port 4444 so that drivers can connect;
  • lunch-bot: a REST server written in Kotlin that exposes an endpoint for checking the Facebook page for blackboard pictures. It exposes port 8080 for accepting requests;
  • cron: a simple Alpine Linux container which executes an HTTP GET call to the lunch-bot. The call is scheduled by a cron job to run every weekday from 11AM to 1PM.

Slack’s Incoming Webhooks

For interacting with Slack, I read about Incoming Webhooks. They are a simple API which allows us to post messages and more into a Slack channel, just by performing an HTTP POST request.

We can configure a Webhook to post to a specific channel of a specific workspace. For each Webhook we set up, Slack provides a different URL, which is the target of the POST request.
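Posting to an Incoming Webhook boils down to sending a JSON body with a `text` field to the Webhook URL. The sketch below uses the JDK 11 HTTP client rather than whatever the bot actually uses; the function names are hypothetical and the Webhook URL is whatever Slack generated for your channel.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Escapes the characters that would break a JSON string literal.
fun jsonEscape(value: String): String = buildString {
    for (c in value) {
        when (c) {
            '"' -> append("\\\"")
            '\\' -> append("\\\\")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\t' -> append("\\t")
            else -> if (c < ' ') append("\\u%04x".format(c.code)) else append(c)
        }
    }
}

// Builds the request body: a JSON object with a single "text" field.
fun slackPayload(text: String): String = "{\"text\":\"${jsonEscape(text)}\"}"

// Sends the message and returns the HTTP status code of Slack's response.
fun postToSlack(webhookUrl: String, text: String): Int {
    val request = HttpRequest.newBuilder(URI.create(webhookUrl))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(slackPayload(text)))
        .build()
    val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    return response.statusCode()
}
```

Since the Webhook URL is the only secret involved, it should be kept out of source control, like any other credential.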

Firebase Remote Config

To get Bob’s pictures, Facebook obviously requires a logged-in browsing session. Hence, the most immediate way to get a valid one was to inject my credentials into the HTML form using SeleniumHQ, and then submit it.

This solution would have worked, but it would have required me to redeploy my Kotlin application just to change the Facebook credentials.

Firebase to the rescue!

I set up an application project on the Firebase Dev Console and enabled the Remote Config feature. This way the Bot can retrieve the credentials on each run, and I am able to swap them at any time if needed.

In addition, I implemented a pretty dumb encryption method, just to avoid storing a plaintext password in Remote Config.
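The post does not say which scheme was used, so the following is purely an assumption: a repeating-key XOR followed by Base64, which is about as “dumb” as reversible obfuscation gets. It hides the password from a casual glance at the console, but it is not real encryption. All names are hypothetical.

```kotlin
import java.util.Base64

// XORs every byte with a repeating key; applying it twice restores the input.
fun xorWithKey(data: ByteArray, key: ByteArray): ByteArray {
    require(key.isNotEmpty()) { "key must not be empty" }
    return ByteArray(data.size) { i -> (data[i].toInt() xor key[i % key.size].toInt()).toByte() }
}

// Produces the Base64 string that would be stored in the remote configuration.
fun obfuscate(plain: String, key: String): String =
    Base64.getEncoder().encodeToString(
        xorWithKey(plain.toByteArray(Charsets.UTF_8), key.toByteArray(Charsets.UTF_8))
    )

// Reverses obfuscate() when given the same key.
fun deobfuscate(encoded: String, key: String): String =
    String(xorWithKey(Base64.getDecoder().decode(encoded), key.toByteArray(Charsets.UTF_8)), Charsets.UTF_8)
```

The key lives with the application while the obfuscated value lives in the remote configuration, so neither side alone reveals the password at first sight.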

With this setup in mind, I started writing the bot.

The Lunch bot

The bot is a Kotlin REST server, written using Ktor. It exposes a single endpoint, which accepts an HTTP GET request.

Little architecture

The architecture is pretty simple. There is a main layer which contains the logic that the Bot needs to fulfill my requirements.

Then, there is a storage layer used for persisting some values, like the last run timestamp and the identifier of the last obtained picture. This layer is an abstraction, so I can swap my implementation for a proper database if I really need to (I’m using plain text files at the moment ¯\_(ツ)_/¯).
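A storage abstraction of this kind might look like the sketch below. The interface, the class and the file names are illustrative guesses, not the bot’s actual code; only the two persisted values come from the description above.

```kotlin
import java.io.File

// Hypothetical abstraction over the two values the bot persists.
interface BotStorage {
    var lastRunTimestamp: Long?
    var lastPictureId: String?
}

// Plain-text implementation: one small file per value inside a directory.
class FileBotStorage(private val directory: File) : BotStorage {
    init { directory.mkdirs() }

    override var lastRunTimestamp: Long?
        get() = read("last_run")?.toLongOrNull()
        set(value) = write("last_run", value?.toString())

    override var lastPictureId: String?
        get() = read("last_picture")
        set(value) = write("last_picture", value)

    // Returns the stored text, or null when nothing has been saved yet.
    private fun read(name: String): String? =
        File(directory, name).takeIf { it.exists() }?.readText()?.trim()

    // Writes the value, or deletes the file when the value is cleared.
    private fun write(name: String, value: String?) {
        val file = File(directory, name)
        if (value == null) file.delete() else file.writeText(value)
    }
}
```

Because callers only see `BotStorage`, a database-backed class could replace `FileBotStorage` without touching the bot logic.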

Finally, I made a small networking layer used for the Slack communication, just to be clean.

I set up the Ktor server in the main method, together with the whole class hierarchy.

PS: I always apply Inversion of Control. Apart from the benefits you get when testing, I consider it a cleaner way to write code, since it decouples class instantiation from instance usage. Honestly, you should use it too, even for simple projects!
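In practice, this can be as simple as constructor injection. The sketch below is a stripped-down stand-in for the bot, with hypothetical interfaces reduced to one method each; the real class certainly has more going on.

```kotlin
// Hypothetical collaborators, each reduced to a single method.
interface MenuSource {
    fun latestPictureUrl(): String?
}

interface Notifier {
    fun notify(message: String)
}

// The bot receives its dependencies instead of creating them itself.
class LunchBot(private val source: MenuSource, private val notifier: Notifier) {
    // Returns true when a picture was found and forwarded.
    fun start(): Boolean {
        val url = source.latestPictureUrl() ?: return false
        notifier.notify(url)
        return true
    }
}
```

The main method wires the real implementations together, while a test can pass in fakes that record what was sent, with no browser or Slack workspace involved.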

Quick overview

The REST call entry point is represented by a suspending lambda thanks to Ktor. When the GET handler is invoked, the execution flow passes to the LunchBot class, which contains the logic we need to apply.

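// Start an embedded Netty server on port 8080, the port exposed by the lunch-bot container.
embeddedServer(Netty, port = 8080) {
  // Answer HEAD requests automatically for every GET route.
  install(AutoHeadResponse)
  routing {
    // Single GET endpoint: the suspending handler delegates to the LunchBot.
    get {
      val result = bot.start(this)
      ...
    }
  }
}.start(wait = true) // block the main thread while the server runs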

The Bot uses a Configuration class for fetching the Facebook credentials from Firebase Remote Config.

Then, the Bot reads the last execution timestamp from the storage layer. This timestamp is stored only when a run obtains valid results, so if the last successful run completed on the current day, the execution is skipped.
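The “already done for today” check amounts to comparing two calendar dates in the same time zone. A minimal sketch, assuming the timestamp is stored as epoch milliseconds (the post does not specify the format) and using a hypothetical function name:

```kotlin
import java.time.Instant
import java.time.ZoneId

// Returns true when the last successful run happened on the same calendar day as `now`.
fun alreadySucceededToday(lastSuccessEpochMillis: Long?, now: Instant, zone: ZoneId): Boolean {
    // No stored timestamp means no successful run yet.
    if (lastSuccessEpochMillis == null) return false
    val lastDay = Instant.ofEpochMilli(lastSuccessEpochMillis).atZone(zone).toLocalDate()
    val today = now.atZone(zone).toLocalDate()
    return lastDay == today
}
```

Passing the clock and the zone as parameters keeps the function deterministic, which makes the skip logic easy to test.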

After that, the Bot uses a FacebookFacade class, which wraps all the operations performed through SeleniumHQ. With this class, it logs in with the given credentials by injecting them into the text inputs of the HTML page and submitting the form.

For scraping the data, I use the mobile version of Facebook instead of the desktop one, mainly for two reasons. First, the desktop version contains plenty of JavaScript magic, which makes reliable parsing difficult. Second, Facebook has a mechanism for recognizing a user’s browsing behavior, so it can determine whether a visitor is a bot rather than a real person and then block all the network traffic. After some research I found a blog post whose author had the same problem I experienced (God bless the internet), and following his suggestion I changed the implementation to use the mobile website.

Finally, the Bot lands on Bob’s pictures page. It relies on the chronological order in which Facebook displays the photos, from the most recent to the oldest. This way, I can state that if a picture appears before the last obtained one, then it must be a new one. Also, an image is discarded if it was not uploaded during the current day.
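The selection rule can be sketched as a filter over the scraped list. The `Picture` model and the function name are hypothetical, and the list is assumed to be ordered newest-first, as on the photos page.

```kotlin
import java.time.LocalDate

// Hypothetical model of one scraped photo.
data class Picture(val id: String, val uploadedOn: LocalDate)

// `pictures` is expected newest-first. Everything listed before the last
// obtained picture is new; anything not uploaded today is discarded.
fun newPictures(pictures: List<Picture>, lastSeenId: String?, today: LocalDate): List<Picture> =
    pictures
        .takeWhile { it.id != lastSeenId }
        .filter { it.uploadedOn == today }
```

When nothing has been stored yet (`lastSeenId` is null), `takeWhile` keeps the whole list and the date filter alone decides what gets posted.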

From the picture’s HTML, the Bot extracts the Facebook internal URI (like https://scontent-mrs2-1.xx.fbcdn.net/..) and posts it to Slack using the networking layer.

Conclusions

This weekend project allowed me to learn a lot of concepts. I deepened my knowledge of building a REST server, which Ktor makes very straightforward. I also sharpened my Docker skills. How cool is Docker? I have no doubt about why it has become a standard. Even mobile app developers like me can get servers up and running in no time.

As an improvement, I could use AutoML (Google’s brand new machine-learning-as-a-service platform) to filter out pictures that do not contain something recognized as a blackboard. But man, it sounds a bit overkill 😛

I hope you got some inspiration for building something similar on your own! Thanks for reading!

Stay tuned: @damianogiusti – molo17.com