Integrating Rollbar with Scrapy

Scrapy logs

For the past three years, I’ve worked on multiple Scrapy projects with lots of spiders. Most of them are scheduled via scrapyd, a service that exposes a JSON API for scheduling spiders. Sometimes these spiders go kaput for various reasons, such as a change in layout, a change in URL, or being blacklisted. Checking their logs one by one can be time-consuming, and that’s where Rollbar comes in.

Rollbar is an error monitoring service that groups similar errors and shows you which ones occur the most. It can even help you track which commit or version introduced a bug. This way, you discover bugs sooner and can fix them faster.

Installing Rollbar for Python

Fortunately, there’s pyrollbar, Rollbar’s official Python client, published on PyPI as rollbar. You can install it via pip.
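The install step comes down to a single command. This assumes pip points at the same Python environment your Scrapy project runs in:

```shell
pip install rollbar
```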

Integrating pyrollbar into your Scrapy spider

Then, in your base spider (the spider that the rest of your spiders extend), hook an instance of RollbarHandler to the loggers of scrapy, twisted, and the spider itself.
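Here is a minimal sketch of that hookup, not necessarily the exact code from my projects. RollbarHandler and rollbar.init come from pyrollbar. The names build_rollbar_handler, hook_handler, RollbarSpiderMixin, and rollbar_token are made up for illustration, and the WARNING threshold is an assumption you may want to change.

```python
import logging


def build_rollbar_handler(access_token, environment="production",
                          level=logging.WARNING):
    """Return a RollbarHandler that forwards records at `level` and above."""
    # Imported lazily so this module still imports where pyrollbar is missing.
    import rollbar
    from rollbar.logger import RollbarHandler

    rollbar.init(access_token, environment)
    handler = RollbarHandler()
    handler.setLevel(level)
    return handler


def hook_handler(handler, spider_name):
    """Attach `handler` to the scrapy, twisted, and spider loggers."""
    for logger_name in ("scrapy", "twisted", spider_name):
        # Logger.addHandler ignores a handler that is already attached.
        logging.getLogger(logger_name).addHandler(handler)


class RollbarSpiderMixin:
    """Hypothetical mixin; list it before scrapy.Spider in the base spider."""

    rollbar_token = None  # assumption: set by the project, e.g. from an env var

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if self.rollbar_token:
            # self.name is the spider name, which Scrapy also uses as the
            # name of the spider's own logger.
            hook_handler(build_rollbar_handler(self.rollbar_token), self.name)
```

A base spider would then be declared as class BaseSpider(RollbarSpiderMixin, scrapy.Spider) with rollbar_token set to your project's access token. Spiders that extend it report warnings and errors to Rollbar without any per-spider code.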

Enjoy!

Written by

I'm Mikko, an overseas Filipino worker based in Hong Kong.
