WHAT IS A WEB SCRAPER?
A web scraper is a tool that can extract data from the web.
As long as the website is public - you can scrape it.
The internet is a sea of opportunities, but only for those who see it.
E-commerce use-cases:
SEO use-cases:
B2B lead generation use-cases:
01
User Interface & Experience (UI/UX)
Rtila is a desktop app.
It has an old school vibe about it (love that spider logo & yellow/blue color scheme).
Go ahead and geek out over the CPU usage at the bottom.
The flow is straightforward: "New project > Inspect > Run > Results".
Different ways to create a new project:
Then inspect the page.
That's when things start to get more complicated.
This is just a Chrome window with:
Rtila can be overwhelming for people without a technical background.
There is quite a learning curve to it.
The good news is that the extra complexity comes with greater power for customization.
I would estimate that you can scrape up to 80% of websites with this tool.
Hexomatic
Hexomatic is a web-based tool that runs on the cloud.
It comes with a minimalistic & modern UI (and a night mode).
The UX is super intuitive - just build workflows.
The ingredients of workflows are:
Automations are like pre-built recipes
That you can use for scraping known platforms (e.g. Google search, Google Maps, Amazon)
Or other actions (e.g. extracting emails, detecting tech stack, capturing screenshots).
When there's no automation for the job,
Scraping recipes can be built for any website in minutes.
It's similar to Rtila (= inspect),
Except it's much, much less overwhelming. (and less customizable)
It's perfect for less tech-savvy people who just want to point & shoot.
It provides automatic field matching and automatic pagination.
(But still has an advanced mode using CSS selectors for more complicated websites.)
Once you've got all the recipes you need,
Just chunk them up into Zapier-like automations.
Texau
Texau has both a desktop and web-based app.
The platform is my least favorite in terms of UI/UX.
(Misaligned elements tickle me in an unpleasant way.)
With Texau you can:
Recipes are just a combination of spices,
But you can run a spice alone as well.
The spices page is actually well designed,
With quick filters by platform that make a smooth UX.
UI/UX score
Hexomatic has the best combo of UI and UX.
Every element is aligned pixel-perfect.
The tool is very intuitive to use,
And it doesn't overwhelm.
Bonus - you can switch to dark mode (crypto hacker feeling)
Rtila - 7/10
Hexomatic - 9/10 (Winner)
Texau - 5/10
Features
Nothing beats a side-by-side features comparison
(Check the table below...)
02
Rtila | Hexomatic | Texau | |
---|---|---|---|
Desktop App | |||
Cloud app | |||
Proxy | 3rd party | 3rd party | |
Proxy Rotation | |||
Email Scraper | |||
Extract SEO Meta Tags | A BIT MORE WORK | ||
Image Conversion | |||
Currency/Crypto Conversion | |||
Discover Tech Stack | |||
Discover WHOIS | |||
Extract Internal/External Links | |||
Take Screenshot | |||
Extract Schema | |||
Extract Files | |||
Translate Text | |||
Sitemap Extract | |||
Traffic Insights | |||
URL Status Checker | |||
Web Scraper | |||
Page Visualizer | |||
Attribute Selection | |||
Attribute Filtering | |||
Attribute Conditions | |||
Page Events | ADVANCED | SIMPLE | |
Page Actions | ADVANCED | SIMPLE | |
Pre-built Scraping Templates | COMING NEXT WEEK | ||
Google Sheets | |||
Email Finder | COMING NEXT MONTH | ||
Verify Email | COMING NEXT MONTH | ||
Enrich Data | |||
Social Media Scraping | |||
Social Media Automations | |||
API | |||
Webhooks | COMING SOON (Q3/2021) | ||
Google Sheet | |||
Slack Notifications | |||
Email Notifications | |||
Desktop vs. Cloud
Desktop means you download an app and scrape data using your PC and internet connection.
Cloud-based platforms let you run automations 24/7 without consuming resources on your computer or keeping it switched on all the time. There is no software to maintain or configure.
Both Texau and Hexomatic offer a cloud solution.
It's perfect for recurring workflows - schedule once and forget about it.
You can also integrate Slack or add email notifications,
So when your workflow is complete,
The results will be waiting for you in Google Sheets or another SaaS.
You can use API integrations or webhooks.
Cloud is very convenient for automation as everything happens in the background.
No constant human intervention is needed.
But cloud consumes server resources.
So it's easy to run out of credits/time for big scrapes.
Since desktop consumes only your PC's resources you get unlimited usage.
At least with Rtila.
That should also be the case for Texau's desktop app,
But for some bizarre reason,
They decided to put time limits both on desktop and cloud.
Which doesn't make any sense.
Desktop also makes sense for social automations (Texau only),
Because for cloud you'll need a proxy to do any social stuff.
Proxy
A proxy is like a VPN, but for servers.
It allows the server to have a different IP address than its own.
There are two main reasons to use a proxy:
Hexomatic comes with free datacenter IP rotation so every request comes from a different IP address to reduce chances of being blocked. They also offer an optional residential proxy add-on using premium credits.
With Rtila you can configure multiple proxies,
But you still have to configure IP rotation manually.
This requires you to monitor while scraping and fine-tune when necessary.
With Texau you can configure a 3rd party proxy,
But there's no way to rotate IP addresses.
So it's pretty useless for big scraping projects.
It's still useful for social automations though.
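To make "IP rotation" less abstract, here is a minimal sketch of the round-robin idea in Python. It is not how any of the three tools implement it. The proxy addresses and URLs are made up for illustration, and the function name `assign_proxies` is my own.

```python
from itertools import cycle

# Hypothetical proxy addresses (documentation IP range), not real servers.
PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


def assign_proxies(urls, proxies):
    """Pair each URL with the next proxy in round-robin order."""
    rotation = cycle(proxies)
    return [(url, next(rotation)) for url in urls]


# Hypothetical pages to scrape.
urls = [f"https://example.com/page/{n}" for n in range(1, 6)]
plan = assign_proxies(urls, PROXIES)

for url, proxy in plan:
    print(f"{url} via {proxy}")
```

Each request goes out through a different address, and the list wraps around once it runs out. If you script this yourself with the `requests` library, the chosen proxy can be passed through its `proxies` argument. Deciding when to drop a blocked proxy is the part that needs monitoring.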
Page Scraper
A page scraper is an interface where you can:
Rtila has an advanced web scraper.
It is highly customizable, but also difficult to learn.
It's best suited for web-developers or technical users.
That doesn't mean you can't learn it, just be ready to invest time in it.
Here are some of the unique Rtila features:
Hexomatic has a simple web scraper.
It is much easier to use - just point & shoot.
It's designed for non-technical users.
However, they also provide the ability to edit the selectors manually and use custom CSS selectors if you are a developer (or use Chrome dev tools to find these on a page).
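If CSS selectors are new to you, the sketch below shows roughly what a selector such as `h2.title` picks out of a page. It uses only Python's standard library and a tiny hand-rolled matcher that understands `tag.class` and nothing else. The class `SelectorScraper`, the helper `scrape` and the sample HTML are all made up for illustration. The real tools rely on a full browser engine instead.

```python
from html.parser import HTMLParser


class SelectorScraper(HTMLParser):
    """Collects the text of elements matching a simple `tag.class` selector.

    Sketch only: no nested selectors, and void tags such as <br> inside a
    match are not handled.
    """

    def __init__(self, selector):
        super().__init__()
        self.tag, _, self.css_class = selector.partition(".")
        self.depth = 0      # greater than 0 while inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1  # a nested tag inside a match
        elif tag == self.tag and self.css_class in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


def scrape(html, selector):
    """Return the stripped text of every element matching the selector."""
    parser = SelectorScraper(selector)
    parser.feed(html)
    return [text.strip() for text in parser.results]


# Hypothetical product listing.
SAMPLE_HTML = """
<div class="product">
  <h2 class="title">Blue Widget</h2>
  <span class="price">$19.99</span>
</div>
<div class="product">
  <h2 class="title">Yellow Widget</h2>
  <span class="price">$24.50</span>
</div>
"""

titles = scrape(SAMPLE_HTML, "h2.title")
prices = scrape(SAMPLE_HTML, "span.price")
rows = list(zip(titles, prices))
print(rows)
```

One selector per field, one row per product. That is the same idea the point & shoot mode hides from you.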
Texau has no web scraper at all.
Automations
Automations can be:
Rtila doesn't have automations (at least not in the sense described here).
Instead they have public templates for page scraping. (screenshot below)
Which is equivalent to some automations in Hexomatic or spices in Texau.
Common functions like extracting emails can be done via regular expressions.
The same tasks can be solved with Rtila - automations just make it a lot easier.
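For the curious, this is roughly what "extracting emails via regular expressions" looks like. It is a sketch in Python, not Rtila's own syntax. The pattern is deliberately simple, so it will miss some valid addresses and accept some invalid ones, and the sample text is invented.

```python
import re

# Deliberately simple pattern, good enough for a first pass over page text.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return unique email-like strings, lowercased, in order of appearance."""
    seen = []
    for match in EMAIL_PATTERN.findall(text):
        email = match.lower()
        if email not in seen:
            seen.append(email)
    return seen


# Hypothetical text scraped from a contact page.
page_text = """
Contact sales at Sales@example.com or support@example.org.
Press enquiries: press@example.com, sales@example.com
"""

emails = extract_emails(page_text)
print(emails)
```

A pre-built email automation saves you from writing and debugging patterns like this one.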
Automations exclusive to Hexomatic:
Automations exclusive to Texau:
Disclaimer: Social automations and scraping go against terms of service (ToS) of social media platforms. Use with caution, you can get your profile banned.
Workflows
Workflows don't exist in Rtila.
You can use the output of one project as input for the next one.
There is some manual work involved in that.
In Hexomatic and Texau workflows are built the same way.
You can chain multiple automations and map attributes between them.
In Hexomatic you can also add scraping recipes to workflows,
Which don't exist in Texau.
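Chaining is easier to picture in code. The sketch below is a conceptual model in Python, not either platform's API. Every function name and every row of data is hypothetical. Each step takes a list of rows and returns a list of rows, so the output of one step becomes the input of the next.

```python
def scrape_listing(_rows):
    """Stand-in for a scraping recipe: returns hard-coded sample rows."""
    return [
        {"name": "Acme", "website": "https://acme.example"},
        {"name": "Globex", "website": "https://globex.example"},
    ]


def add_domain(rows):
    """Stand-in for an automation: maps 'website' to a new 'domain' attribute."""
    return [{**row, "domain": row["website"].split("//", 1)[1]} for row in rows]


def run_workflow(steps, rows=None):
    """Feed the output of each step into the next one."""
    rows = rows or []
    for step in steps:
        rows = step(rows)
    return rows


results = run_workflow([scrape_listing, add_domain])
print(results)
```

In Rtila you would do the hand-off between steps yourself, which is the manual work mentioned above.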
Features Score
Hexomatic is the most versatile platform.
It's a mix of Rtila (page scraper) and Texau (automations & workflows).
The product is still in beta - some features are still to come (e.g. webhooks).
It will mature well over time, just like other Hexo products.
Rtila is "just" a page scraper.
The scraping features are more advanced in Rtila,
But there's also more configuration work to it (E.g. proxy & IP rotation).
Stuff that Hexomatic can do out of the box.
Also the learning curve may be too steep - especially if you don't have a lot of free time to invest.
It's also not the best tool for automated processes. (Runs on desktop & not many integrations)
Texau doesn't have a page scraper.
But to compensate they have more automations, especially for social media.
Also they have useful integrations for cold outreach. (e.g. Lemlist)
Features score:
Rtila - 7/10
Hexomatic - 8/10 (Winner)
Texau - 7/10
03
Pricing
Price varies for cloud and desktop.
Desktop only consumes your PC resources and internet connection.
So it will be cheaper.
Rtila offers unlimited usage for a one-time fee of $79.
For some bizarre reason Texau desktop has the same time limits as cloud.
They could offer unlimited desktop (like Rtila does)
It would cost them nothing,
But then they wouldn't have anything to upsell you for $500.
Cloud pricing is higher because there are server costs.
So unlimited isn't reasonable.
Texau limits the automation time - 20 minutes / code.
Hexomatic limits automation runs - 10,000 credits / code.
It's impossible to compare without knowing how fast the server runs.
Also the amount of time to complete an automation can vary.
For example, if Texau's server gets overloaded,
You'll see fewer results for the same "time".
This is why I prefer Hexomatic,
Their pricing is more transparent.
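Here is the problem with time-based limits in numbers. This is a back-of-the-envelope sketch in Python. The 20-minute budget is the Texau figure quoted above, while the seconds-per-result speeds are invented purely to show the effect, and the function name is my own.

```python
TIME_BUDGET_MINUTES = 20  # Texau limit quoted above


def results_within_budget(seconds_per_result, budget_minutes=TIME_BUDGET_MINUTES):
    """How many results fit in the time budget at a given speed."""
    return int(budget_minutes * 60 // seconds_per_result)


# Hypothetical speeds: a quiet server versus an overloaded one.
fast = results_within_budget(2)
slow = results_within_budget(6)
print(fast, slow)
```

Same 20 minutes, three times fewer results on the slow day. A per-run credit does not change with server load.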
Hexomatic provides free hosting (no need to host your Rtila instance on a VPS or computer),
Runs 24/7 without using your electricity, and includes free datacenter IP rotation (no need for a datacenter proxy).
That said, Hexomatic also has premium credits to access optional premium automations and residential proxies.
Some users complain that this is not a true lifetime deal (LTD).
However, if you buy Rtila you still need to pay for proxy services.
Then you need to configure IP rotation and monitor so you don't get blocked.
Premium credits represent external costs which Hexomatic needs to cover for you.
It only applies to managed automations (where external costs exist).
The rate at which premium credits are consumed is completely different too.
For example:
1 Google search = 100 results = 0.04 premium credits.
1,250 Google searches = 125,000 results = 50 premium credits = $5.
In short, $5/month would be more than enough.
Which is what you would be paying for a cheap datacenter proxy anyway,
But in this case you get access to premium residential proxies worth a lot more.
Plus it's just plug & play without any hassles.
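The Google Search example is easy to check yourself. This sketch in Python uses only the figures quoted above. The dollar rate per credit is not an official price, it is simply what "50 premium credits = $5" implies, and the function name `search_cost` is my own.

```python
CREDITS_PER_SEARCH = 0.04    # from the example above
RESULTS_PER_SEARCH = 100     # from the example above
DOLLARS_PER_CREDIT = 5 / 50  # implied by "50 premium credits = $5"


def search_cost(searches):
    """Results, premium credits and dollars for a number of Google searches."""
    credits = searches * CREDITS_PER_SEARCH
    return {
        "results": searches * RESULTS_PER_SEARCH,
        "credits": round(credits, 2),
        "dollars": round(credits * DOLLARS_PER_CREDIT, 2),
    }


cost = search_cost(1250)
print(cost)
```

Change the number of searches to match your own monthly volume and see where you land.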
The best value/$ is still with Rtila - you get full unlimited for $74.
Then Hexomatic with reasonable limits, since it runs on the cloud.
Last comes Texau, with ambiguous time "credits" and a sketchy $500 upsell for unlimited desktop.
Pricing score
Rtila - 10/10 (Winner)
Hexomatic - 7/10
Texau - 5/10
Battle Decision
HEXOMATIC
Here's why I bought Hexomatic:
Also got Rtila as backup, in case:
But keep in mind, Rtila requires basic HTML/CSS knowledge. (I'm a web-developer).
You should at least be able to inspect the page structure in the developer console.
If not, you can always hire my bud Roberto Porcar to scrape for you (his page is in Spanish, sorry).
Thought about Texau for social automations:
Ended up refunding it, read below why...
Not customizable enough
The ProductHunt spice seemed interesting.
But it doesn't return the company website.
I'll try my own scraping recipe in Hexomatic or Rtila
Good on paper, but doesn't work
Found an interesting social automation in Texau,
It can scrape a Facebook group and find matching LinkedIn profiles.
Got 10/10 results.
But after comparing photos from both platforms - the results were 0/10.
Founder is high risk...
Rtila >> https://www.rtila.net/
Hexomatic >> https://hexomatic.com/
Texau >> https://texau.app/