Rocket Validator is a commercial project I recently built with Phoenix. It helps web developers check the HTML of thousands of pages with a single click.
Basically, it’s a web crawler that, given a starting URL, finds the internal links (up to 5,000 pages per report) and validates the HTML on each one, showing you the issues it finds.
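To make the crawling step concrete, here is a minimal sketch of the idea: follow same-host links breadth-first and stop at a page cap. It is written in Python for illustration only and is not the app's Elixir code. The names `LinkCollector`, `internal_links`, `crawl`, and the `fetch` callback are hypothetical, and the sketch assumes "internal" simply means "same host".

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url, html):
    """Return absolute, fragment-free http(s) links on the same host as page_url."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in parser.hrefs:
        # Resolve relative links and drop any "#fragment" part
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            if absolute not in links:
                links.append(absolute)
    return links


def crawl(start_url, fetch, max_pages=5000):
    """Breadth-first crawl from start_url, collecting at most max_pages URLs.

    `fetch` is any callable that takes a URL and returns its HTML as a string.
    """
    found = [start_url]      # discovery order, used as the result
    seen = {start_url}       # fast membership checks
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for link in internal_links(url, fetch(url)):
            if len(found) >= max_pages:
                return found
            if link not in seen:
                seen.add(link)
                found.append(link)
                queue.append(link)
    return found
```

Passing `fetch` in as a parameter keeps the sketch free of network code, so the traversal can be tried against an in-memory dictionary of pages before pointing it at a real HTTP client.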
The main app is a standard Phoenix app, and I’ve extracted and released two libraries from it:
- jaimeiniesta/funkspector (on GitHub): a web scraper to extract data from web pages and XML sitemaps, which is what I use to find the internal links of a URL.
- rocketvalidator/funchaku (on GitHub): an Elixir client for the Nu HTML Checker, which I use to validate each page against an instance of the checker hosted on my servers.
This is really a remake of Site Validator, a Rails project I have maintained for years and decided to rebuild using Phoenix. The result is that the new code is much simpler, the app is much faster and more lightweight, and I’ve dramatically reduced the server costs.
The Rails version used background queues; the new version doesn’t. Instead, I launch concurrent processes for crawling and validation that politely respect rate limits (ExRated is great).
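The general pattern of concurrent workers sharing one rate limit can be sketched roughly as follows. This is an illustration in Python, not the app's code: it uses threads rather than Elixir processes and a hand-rolled limiter rather than ExRated. `RateLimiter`, `validate_all`, and the `validate` callback are hypothetical names, and the limiter assumes a simple "at most one request every `min_interval` seconds" policy.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimiter:
    """Spaces calls at least `min_interval` seconds apart across all threads."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free time slot under the lock...
        with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self.min_interval
        # ...then sleep outside it, so other threads can reserve theirs
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def validate_all(urls, validate, max_workers=8, min_interval=0.1):
    """Run `validate(url)` for every URL concurrently, sharing one rate limit.

    Returns a dict mapping each URL to whatever `validate` returned for it.
    """
    limiter = RateLimiter(min_interval)

    def worker(url):
        limiter.wait()  # block until this worker is allowed to proceed
        return url, validate(url)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(worker, urls))
```

The workers run in parallel, but each one has to pass through the shared limiter first, so the target server sees a steady, bounded request rate however many workers are running.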