Portia

Scrape websites visually. No code required!

Portia lets you scrape websites without any programming knowledge. Create a template by clicking the elements on the pages you would like to scrape, and Portia will create a spider to scrape similar pages from the website. There is no need to download or install anything, as Portia runs in your web browser!

Meet Portia

Open Source

Portia is completely open source! Please check out the Portia GitHub repository. Portia projects in Scrapinghub can be exported and used with the open source project, giving you all the freedom and benefits of open source.

Pricing

Portia is free to use and provided as a visual spider editor for Scrapy Cloud. This means that:

  • Building spiders with Portia is free
  • Running Portia spiders is free for low volumes
  • If you need to run Portia spiders at large volumes (i.e., more than one spider at a time), you will need to purchase Scrapy Cloud units. See Scrapy Cloud pricing.

See how it works!

1. Annotate a web page

2. Run your spider

3. Watch your spider run

4. Review the extracted data

5. Explore the extracted data

See it in action

Using Portia to scrape an e-commerce website

Vastly simplified my development efforts

I have been using Scrapinghub for more than a year for personal projects. The interface provided saves about 95% of the time normally needed for extracting information from webpages. With about an hour of learning, this tool vastly simplified my development efforts, and runs consistently and reliably. If you have a need to gather data from a series of webpages, I could not recommend Scrapinghub enough.

Bill Frischling

Founder / CantyMedia

Allows us to expedite crawling simple sites

Portia allows us to expedite crawling simple sites. The visual spider editor and Scrapy Cloud's scheduling functionality enable us to create crawlers and run scraping jobs without touching a line of code, meaning less development time and less maintenance time. Since we began using the platform, Explore has become even better and more exhaustive in its monitoring process.

Cédric Pegorier

Middleware Manager / Explore.fr

Lets me extract data by clicking around

Portia allows me to get the data I want on a constant basis without needing to code or download a desktop client. A fun extra is that some people I work with think I’m some kind of genius - because I’m able to extract all this data from different sites, and they're not realizing I’m doing so by merely clicking around.

Christian Eder

Founder / Quantoras