# | Name | URL | Price | Decision | Notes | Features (Crawling, Type, Scraping, Executes JS, Interface/Language, Schedulable, IP Proxies, Anti-blocking)
1 | Diffbot | https://www.diffbot.com/ | Free plan up to 10k pages; $299/month for 250k pages | Worth checking out. | Seems like one of the top products in the crawling and parsing space. Crawling is not supported until the Plus plan, which costs $899 per month. Provides nice JSON payloads for product pages but requires us to handle all site crawling, which significantly limits the value of this product. | Yes, SaaS, Yes, HTTP API, Partial on non-enterprise plans
2 | Apify | https://apify.com/ | Free; $49/month for Starter | Worth checking out. | Sounds like more than we need. Well rated on G2. The Actor templates made it extremely easy to set up a site crawler and get started. Feels like this would be extremely expensive to use at scale: it cost $0.286 to scrape just 100 items from the Midland product catalog. That run took ~10 minutes, but I believe this was due more to the slow response times of Midland's site. | SaaS, Web UI, Yes, Yes
3 | Browse AI | https://www.browse.ai/ | Free trial; Starter plan is $20/month | Worth checking out. | Really slick and easy-to-use "Robot Training" workflow. Makes it easy to set up a basic scraper and quickly see the data it will produce. No built-in crawling support. Recommends workarounds in https://help.browse.ai/article/192-one-robot-to-extract-them-all, but they all involve directly providing/configuring the set of URLs to be scraped. Traversing a large product catalog automatically is a critical requirement for our use case, so this seems like a deal breaker. Also seems expensive. | SaaS, Yes, Yes, Web UI/API, Yes, Yes
4 | Scraping Bee | https://www.scrapingbee.com/ | $49/month for 30k to 150k pages | Maybe worth a look if other options fall through | Full-featured and reasonably priced. No free trial/free tier. | Yes, SaaS, Yes, Yes, Web UI/API, Yes
5 | Web Scraper | https://webscraper.io/ | Free browser extension; $50/month for cloud execution | Maybe worth a look if other options fall through | Simple to use but limited functionality. | Chrome Extension, Yes
6 | Crawlee | https://crawlee.dev/ | Free; Open Source | Worth consideration if building a JS app is not a non-starter. | Can be built and deployed to Apify | Yes, Yes, Supported, Yes
7 | Cheerio.js | https://cheerio.js.org/ | Free; Open Source | Worth consideration if building a JS app is not a non-starter. | Does HTML parsing and data extraction very well. Doesn't do much else. | No, Library, Yes, No, Node.js, No, No
8 | OxyLabs | https://oxylabs.io/products/scraper-api/web | $75/month for 5 GB | Worth checking out if blocked sites become an issue for us. | Provides IP addresses and other utilities to avoid being blocked, along with scraping support. | SaaS, Yes, Yes, Web UI, Yes
9 | Import.io | https://www.import.io/ | $399/month for Starter | No; Too expensive | Point-n-click interface. Sounds like its UI is not as good as those of its competitors (e.g. Apify). | Yes, SaaS, Yes, No/Not Well, Web UI
10 | Parsehub | https://www.parsehub.com/ | Limited free trial; Paid plans starting at $189/month | No; Too expensive | | SaaS, Yes, Yes, Web UI/API
11 | Clay.com | https://www.clay.com/ | Free trial; Starter plan is $134/month | No; Too expensive | Focused on non-technical users (point-n-click interface). | SaaS, Web UI, Yes
12 | Scraping Pros | https://scrapingpros.com/ | $450/month with limited features | No; Too expensive | |
13 | NetNut | https://netnut.io/ | $300/month for 20 GB | No; Too expensive | Focused on search engine and social media crawling. |
14 | Octoparse | https://www.octoparse.com/ | Free plan; Then $99/month | No; Too expensive | Focused on non-technical users (point-n-click interface). | Yes, Yes, Yes, Yes
15 | Bright Data | https://brightdata.com/ | $10/month for micro package | No; Too limited focus | Focused on providing IP addresses and other utilities to avoid being blocked while scraping. | Yes, Yes
16 | ScrapeHero Cloud | https://www.scrapehero.com/marketplace/ | $200 per month per site according to https://www.scrapehero.com/pricing/ | No; Not flexible enough | Offers a collection of out-of-the-box crawlers. | Yes, SaaS, Yes, Yes, Web UI/API, Yes, Yes
17 | Selenium | https://medium.com/@datajournal/web-scraping-with-selenium-955fbaae3421 | Free; Open Source | No; Only crawling/scraping HTML | Web automation library. Support for multiple languages. Steep learning curve. | Library, Yes, Multiple
18 | Puppeteer | https://pptr.dev/ | Free; Open Source | No; Only crawling/scraping HTML | Node.js web automation library. | Library, Yes, Node.js
19 | Playwright | https://playwright.dev/ | Free; Open Source | No; Only crawling/scraping HTML | Web automation library. | Library, Node.js
20 | MechanicalSoup | https://mechanicalsoup.readthedocs.io/en/stable/ | Free; Open Source | No; Only crawling/scraping HTML | Written in Python. Web automation library. | Library, No, No, Python
21 | Mozenda | https://www.mozenda.com/ | Limited free trial; Paid plans | No; Prices not listed | | Yes, SaaS, Yes, Web UI
22 | Scrapy | https://scrapy.org/ | Free; Open Source | No | Written in Python. | Library, Yes, No, Python
23 | Pyspider | https://github.com/binux/pyspider | Free; Open Source | No | | Library, Python
24 | Beautiful Soup | https://realpython.com/beautiful-soup-web-scraper-python/ | Free; Open Source | No | Python library. Focuses exclusively on HTML parsing. | No, Library, Python
25 | Apache Nutch | https://nutch.apache.org/ | Free; Open Source | No | Written in Java. Steep learning curve. | No, Library, Java
26 | Heritrix | https://github.com/internetarchive/heritrix3 | Free; Open Source | No | Written in Java. | Yes, Library, Java
27 | Web Harvest | https://github.com/janih/web-harvest | Free; Open Source | No | Looks antiquated. | Java
28 | Web Magic | https://webmagic.io/en/ | Free; Open Source | No | | Yes, Java
29 | Common Crawl | https://commoncrawl.org/ | Free | No | A collection of pre-scraped sites made available for free. Looks like this collection includes some data for both https://www.worldwidefittings.com/ and https://midlandindustries.com/, but getting access to the HTML data from these sites is not a problem we need assistance with. | No, Dataset, No, No, HTTP API, No, No, No
30 | Web Robots | https://webrobots.io/ | $99/month/source; Free browser extension | No; Looks half-baked | | Yes, SaaS, Web UI & JS
31 | Priceva | https://priceva.com/ | Free; $99/month for more features | No; Only price tracking | Focused specifically on price tracking. |
32 | Scrapebox | https://www.scrapebox.com/ | One time purchase | No; Too SEO focused | Focuses on SEO-related tasks. | Desktop App, Yes w/ added cost
33 | ScreamingFrog | https://screamingfrog.co.uk/ | Yearly subscription | No; Too SEO focused | Focuses on SEO-related tasks. | Desktop App
34 | Web Content Extractor | https://www.webcontentextractor.com/ | One time purchase | No; Too antiquated | | Desktop App, Yes w/ added cost
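
The Apify note above worries about cost at scale based on a single 100-item trial. A rough way to put numbers on that concern is to extrapolate linearly from the trial figures. The sketch below assumes cost and runtime scale linearly with item count, which may not hold (pricing tiers, concurrency, and Midland's response times could all change the picture). The catalog sizes are hypothetical and chosen only for illustration.

```python
# Figures taken from the Apify trial recorded in the table.
TRIAL_COST_USD = 0.286
TRIAL_ITEMS = 100
TRIAL_MINUTES = 10  # approximate ("~10 minutes")


def extrapolate(items: int) -> tuple[float, float]:
    """Return (cost in USD, runtime in hours) for `items`.

    Assumes both cost and runtime scale linearly with the item count.
    """
    cost_per_item = TRIAL_COST_USD / TRIAL_ITEMS
    minutes_per_item = TRIAL_MINUTES / TRIAL_ITEMS
    return items * cost_per_item, items * minutes_per_item / 60


if __name__ == "__main__":
    # Hypothetical catalog sizes, not taken from the evaluation.
    for size in (1_000, 10_000, 100_000):
        cost, hours = extrapolate(size)
        print(f"{size:>7,} items: ~${cost:,.2f}, ~{hours:,.1f} h")
```

The runtime estimate is likely pessimistic, since the trial's duration was attributed mostly to the target site's slow responses rather than to Apify itself.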