ErlangSolutions
How to do web crawling in Elixir - New webinar
Following up on his talk at ElixirConf EU Virtual, our colleague Oleg Tarasenko will be joining us for a webinar to dive deeper into Crawly, the web scraping framework he created in Elixir.
In this webinar he will discuss what web scraping is, why it is valuable and how Crawly makes it easy.
The webinar will demonstrate a real example using the Elixir Radar job board.
Register at https://www2.erlang-solutions.com/crawlywebinar2
joddm
I would surely welcome some more intermediate/advanced guides on web scraping. Almost all blogs/tutorials on this topic consist of: 1. install lib 2. basic XPath selectors 3. save to CSV.
Things I am wondering about:
Persistence strategies - do we save the HTML to an object store, then scrape it and save the data we need to a database?
Recurrent scraping - how to scrape the same pages over a period of time? Strategies for good logging for error detection when a page has changed? How do we handle incremental updates on a field or web page?
Spider structuring - do you write a more general spider that can work for general fields across many web sites, and have more custom spiders to get “special” data from each page, or do we write a custom spider for each page?
Spider orchestration - how do we monitor and schedule these x number of spiders? How do we avoid DDoS'ing a site and getting banned?
There is probably more that I don't even know I don't know about.
If anyone has any available resources, please share.
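On the recurrent scraping question, one common starting point is to fingerprint each fetched page and compare it with the fingerprint from the previous run, so a changed page can be logged or re-parsed. The following is only a rough, language-agnostic sketch written in Python rather than Elixir. The names `fingerprint` and `detect_change` and the in-memory `store` dict are hypothetical and are not part of Crawly; a real setup would keep the digests in a database.

```python
import hashlib


def fingerprint(html: str) -> str:
    """Return a SHA-256 hex digest of the page body."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def detect_change(store: dict, url: str, html: str) -> str:
    """Classify a fetch as 'new', 'changed' or 'unchanged'.

    `store` maps URL -> last seen digest (hypothetical in-memory
    stand-in for a database table) and is updated on every call.
    """
    digest = fingerprint(html)
    previous = store.get(url)
    store[url] = digest
    if previous is None:
        return "new"
    return "unchanged" if previous == digest else "changed"


if __name__ == "__main__":
    seen = {}
    url = "https://example.com/jobs"  # placeholder URL
    for body in ["<li>Job A</li>", "<li>Job A</li>", "<li>Job B</li>"]:
        print(url, detect_change(seen, url, body))
```

Hashing the whole body flags any difference, including ads or timestamps, so in practice it is usually better to hash only the extracted fields that matter.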
ErlangSolutions
Hey Joddm,
Thanks for the reply. I will pass this on to Oleg from our team who is hosting the webinar. He will likely have some valuable information on the above.
ErlangSolutions
Sorry for the mix-up.
The webinar is tomorrow, July 1st.
The date on the website was temporarily out of date.
The webinar will be recorded and all registrants will receive a copy via email.







