Portia lets you scrape websites without any programming knowledge. Create a template by clicking the elements on the pages you would like to scrape, and Portia will create a spider to scrape similar pages from the website. There is no need to download or install anything, as Portia runs in your web browser!
Portia is completely open source! Please check out the Portia GitHub repository. Portia projects in Scrapinghub can be exported and used with the open source project, giving you all the freedom and benefits of open source.
Portia is free to use and is provided as a visual spider editor for Scrapy Cloud.
I have been using Scrapinghub for more than a year for personal projects. The interface provided saves about 95% of the time normally needed for extracting information from webpages. With about an hour of learning, this tool vastly simplified my development efforts, and runs consistently and reliably. If you have a need to gather data from a series of webpages, I could not recommend Scrapinghub enough.
Portia allows us to expedite crawling simple sites. The visual spider editor and Scrapy Cloud's scheduling functionality enable us to create crawlers and run scraping jobs without touching a line of code, meaning less development time and less maintenance time. Since we began using the platform, Explore has become even better and more exhaustive in its monitoring process.
Portia allows me to get the data I want on a constant basis without needing to code or download a desktop client. A fun extra is that some people I work with think I’m some kind of genius - because I’m able to extract all this data from different sites, and they're not realizing I’m doing so by merely clicking around.