Isoxya

Web Crawler & Data Processing System


Isoxya is a web crawler and data processing system. It is designed to process websites with tens of millions of pages, and to extract and transform that data in myriad ways. This allows it to power many different types of software, in many different industries.

Although it can be considered an SEO web crawler, Isoxya's potential is too broad to be reduced to one purpose or one industry. Rather, it defines a plugin system that abstracts away the complexities of running a large-scale web crawler and solves many of the challenges of building such a system robustly and scalably, whilst providing a straightforward, considered interface.

Websites can be checked for SEO, e-commerce data can be extracted, content can be audited, and human language can be analysed—all using the same crawling system. If it’s possible to write a small script to process data from a single webpage, it’s likely possible to use Isoxya to process data from millions of pages, with minimal or no code changes.
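
To make the idea of 'a small script to process data from a single webpage' concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: it does not use or describe Isoxya's actual plugin interface, and the names `PageSummary` and `process_page` are hypothetical. It shows the general shape of a per-page processor: one page's HTML goes in, one structured record comes out.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Collects the <title> text and the href of every <a> tag on one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            # attrs is a list of (name, value) pairs; <a> tags without href are skipped
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def process_page(url, html):
    """Hypothetical per-page processor (not an Isoxya API): one page in, one record out."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return {"url": url, "title": parser.title.strip(), "links": parser.links}


# Illustrative input only; a crawler would supply the URL and body of each fetched page.
sample = (
    "<html><head><title> Example </title></head>"
    '<body><a href="/about">About</a><a>no href</a></body></html>'
)
record = process_page("http://example.com/", sample)
```

Because a function like this depends only on the page it is given, the same logic could in principle be applied to each page of a crawl in turn, which is the property the paragraph above relies on.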


Benefits

Crawling as a Service

You concentrate on your core product; we concentrate on processing the data and streaming it to you.

Scalable

Multi-computer, designed for close to 24/7 operation, with automated error recovery and backlog queues.

Fast

Crawls typically start and begin streaming data within seconds; no ‘crawl finalisation’ stage; analyse data immediately.

Large Crawls

Tested with sites with millions of pages; designed to scale to sites with tens of millions of pages.

Tiny Crawls

Supports many-tiny-site workloads; able to process tiny sites end-to-end within seconds, cost-effectively.

Flexible

Not just an SEO crawler: multi-industry, multi-purpose; Spellchecking, Data Mining, Machine Learning…


How it works


Isoxya is currently in private beta, but we’ve recently opened registration to allow interested companies or individuals to get a sneak preview of what we’re planning to release publicly later in 2020.


Our spiders would like to visit you

Check what we can find while crawling through your website! We’re offering 1,000 pages for free, with data available for 7 days.