I am building a web scraper that logs in to a few different sites. I am currently using Hound and PhantomJS to scrape.
The bot can currently log users in, but I want to store each user's session in my Postgres database so that I can reuse it later to log in and perform the bot actions again. One thing I need to consider is that this bot will start out with a couple hundred people using it concurrently.
Is it possible to run one PhantomJS server for 100 users, with each user spinning up a GenServer and using the bot concurrently? I just read that GhostDriver allows isolated sessions but PhantomJS doesn’t, and that every session shares the same cookie jar. Does anyone have experience working with PhantomJS and its cookie jar?
Will I need to find a different solution than PhantomJS if I need several users logged in at the same time, performing different web scraping tasks on the same PhantomJS server?