Scraping a JS-heavy website

I want to develop a web scraper that can read HTML generated client-side by JS. From what I’ve read, I won’t be able to do this with just a regular HTML parser like Floki. Which libraries can I use to get the HTML of a web page that is rendered client-side?

See this thread. It has a lot of useful info.

You might also take a look at:

The HTML-parser portion of the scraper doesn’t need to change to handle HTML generated by JS. After all, it’s still just HTML. Meeseeks or Floki will work fine.
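To illustrate that the parsing step is the same wherever the markup came from, here is a minimal sketch using Floki. It assumes Floki is listed as a dependency in mix.exs; the `RenderedHtml` module name and the sample markup are made up for the example.

```elixir
# Assumes Floki is available as a dependency (e.g. via mix.exs).
defmodule RenderedHtml do
  @doc "Returns the trimmed text of every node matching `selector`."
  def texts(html, selector) do
    {:ok, document} = Floki.parse_document(html)

    document
    |> Floki.find(selector)
    |> Enum.map(fn node -> node |> Floki.text() |> String.trim() end)
  end
end

# Hypothetical markup standing in for whatever the browser ended up rendering.
html = ~s(<div id="app"><h2 class="title">First</h2><h2 class="title">Second</h2></div>)

RenderedHtml.texts(html, "h2.title")
# => ["First", "Second"]
```

The function only ever sees a string, so it works the same whether that string came from an HTTP response body or from a browser session.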

What does need to be different is how you fetch the HTML. A plain HTTP client like HTTPoison won’t work, because it only returns the markup the server sends and never runs the page’s scripts. You need something that drives a real or headless browser so the JS can execute before you read the page. Hound and Wallaby are common suggestions.
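As a rough sketch of the fetching side, something like the following should work with Wallaby. It assumes Wallaby is a dependency, chromedriver is installed and on the PATH, and Wallaby is using its default Chrome driver. The `RenderedFetcher` module name and the `ready_selector` argument are hypothetical. Waiting for a selector is just one way to decide the JS has finished rendering, so treat it as a starting point rather than the definitive approach.

```elixir
# Assumes Wallaby is a dependency and chromedriver is installed (both assumptions).
defmodule RenderedFetcher do
  alias Wallaby.{Browser, Query}

  @doc """
  Loads `url` in a browser session, waits until at least one element matching
  `ready_selector` is present, and returns the rendered page source as a string.
  """
  def fetch(url, ready_selector) do
    {:ok, _apps} = Application.ensure_all_started(:wallaby)
    {:ok, session} = Wallaby.start_session()

    try do
      session
      |> Browser.visit(url)
      # find/3 retries until the query matches (or raises after the timeout),
      # then returns the session so the pipeline can continue.
      |> Browser.find(Query.css(ready_selector, minimum: 1), fn _found -> :ok end)
      |> Browser.page_source()
    after
      # Always close the browser session, even if the wait above raised.
      Wallaby.end_session(session)
    end
  end
end
```

The returned string can then be handed to Floki or Meeseeks exactly as you would with a response body from an HTTP client.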
