ChromeRemoteInterface seems to behave differently outside of iex

I am using Chroxy and ChromeRemoteInterface to fetch HTML and then parse it with Floki. When I get the outer HTML inside iex, I can fetch the page and find the elements I want with Floki. Outside of iex, however, Floki does not find anything.

ws_addr = Chroxy.connection()
{:ok, page} = ChromeRemoteInterface.PageSession.start_link(ws_addr)
ChromeRemoteInterface.RPC.Page.enable(page)
ChromeRemoteInterface.PageSession.subscribe(page, "Page.loadEventFired", self())
ChromeRemoteInterface.RPC.Page.navigate(page, %{url: url})
{:ok, dom} = ChromeRemoteInterface.RPC.DOM.getDocument(page)
nodeId = dom["result"]["root"]["backendNodeId"]
{:ok, %{"result" => result}} = ChromeRemoteInterface.RPC.DOM.getOuterHTML(page, %{backendNodeId: nodeId})
pre_selected_content = Floki.find(result["outerHTML"], "div.productBoxTop")

Any ideas about what might be happening?


I created an issue in the repo and already got an answer: https://github.com/andrewvy/chrome-remote-interface/issues/35


To summarize: if you read the page from Chrome too soon, before it has finished loading, you don't get all the data, because it isn't there yet. CRI is an asynchronous interface, so you need to wait for the appropriate lifecycle messages before querying the DOM. Quite good to know. :)
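For anyone landing here later, a minimal sketch of what "waiting for the message" could look like. This is not the exact code from the issue. `PageLoad`, `await_event/2`, and `outer_html/3` are hypothetical names, and the sketch assumes subscribed events arrive in the calling process as `{:chrome_remote_interface, event_name, payload}`, which is how I understand the library's README. The timing would also explain the iex behaviour: typing the calls one by one probably leaves the page enough time to load.

```elixir
defmodule PageLoad do
  # Hypothetical helper module, not part of ChromeRemoteInterface.

  # Blocks the calling process until the given event arrives in its mailbox,
  # or returns {:error, :timeout} after `timeout` milliseconds.
  # Assumption: events are delivered as {:chrome_remote_interface, name, payload}.
  def await_event(event, timeout \\ 10_000) do
    receive do
      {:chrome_remote_interface, ^event, payload} -> {:ok, payload}
    after
      timeout -> {:error, :timeout}
    end
  end

  # Same calls as in the question, but the DOM is only read after
  # "Page.loadEventFired" has been received.
  def outer_html(page, url, timeout \\ 10_000) do
    alias ChromeRemoteInterface.{PageSession, RPC}

    RPC.Page.enable(page)
    PageSession.subscribe(page, "Page.loadEventFired", self())
    RPC.Page.navigate(page, %{url: url})

    with {:ok, _event} <- await_event("Page.loadEventFired", timeout),
         {:ok, dom} <- RPC.DOM.getDocument(page),
         node_id = dom["result"]["root"]["backendNodeId"],
         {:ok, %{"result" => result}} <-
           RPC.DOM.getOuterHTML(page, %{backendNodeId: node_id}) do
      {:ok, result["outerHTML"]}
    end
  end
end
```

With a `page` started as in the question, `{:ok, html} = PageLoad.outer_html(page, url)` followed by `Floki.find(html, "div.productBoxTop")` should then see the loaded document. The timeout keeps the process from hanging forever if the event never arrives.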
