Using Tesla to perform a lookup on a public website

I’m trying to perform a lookup of a public-facing government web site to find licensing information on certain professionals. (See https://secure.utah.gov/liv/search/index.html )

I can load the page using tesla with its mint adaptor but need to now perform a POST, which appears to read part of the form data using JQuery.

The submit button is:

<input type="submit" value="Search" onclick="pageTracker._trackEvent('Search', 'By Name', 'Free'); setProfessions();">

The setProfessions function is:

function setProfessions() {
  $('input:checkbox:checked.licenseType').each(function () {
    $('<input>').attr('type','hidden').attr('value', $(this).val()).attr('name', 'professions').appendTo('form');
  });
}

(It appears the pageTracker._trackEvent script is part of the Google Analytics on the page.)

The request sent over the wire is:

Request
POST /llv/search/search.html HTTP/1.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Origin: https://secure.utah.gov
Cookie: JSESSIONID=fa808df4bb17cee802c57bb3c7d8; TS01bdb7d2=0143bf517010277adab28f63a21b4b94ccd906dc9560f6e061893b1cde8cf4bb926ae1270d0d55201c82143ef6534344f655d9588c6f4af6af8ce1ded94e9257c77dfbee6c; __utma=128287630.1742368486.1591802595.1591810615.1591816490.4; __utmb=128287630.2.9.1591816594569; __utmc=128287630; __utmz=128287630.1591802595.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); _ga=GA1.2.1742368486.1591802595; _gat_UA-103830962-11=1; _gid=GA1.2.731830168.1591802595; TS01959f26=0143bf5170cddb06b2628645b42b516df797f7f4f171eee6ee7ff0066afb789be3cef48b9fd7318eeff57bb54412d734fe511bd2f2; __utmt=1; TS01959f26_77=08f156120cab2800bcd2801cf30c2b76e6ce0d21176587d514fd5c1bdd2dae547dca300aea61a5577e55dc465a5d1fa90860c69dd38240003ea72082daf55dd9e6126a7c4ea30f412fef20a05a890c3a4819469b6c38866480328690ce775fcfb40e66968958ccc12126549da8133a87eb122ba58bf2eb55; fontsize=90%25
Accept-Encoding: gzip, deflate, br
Host: secure.utah.gov
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.2 Safari/605.1.15
Accept-Language: en-us
Referer: https://secure.utah.gov/llv/search/index.html
Connection: keep-alive

So it appears the button sends a post to a different URL with the selection from the form.

My question is:
Is there a way to synthetically create the form-data and encode it to be able to send the POST method from within my Phoenix application and then get a response back from the site for parsing?

They do not have a public API, so I’m stuck trying this round-about approach.

If there is a better set of tools to meet this need (other than Tesla), I’m happy to hear recommendations.

Thanks,

:wave:

Is there a way to synthetically create the form-data and encode it to be able to send the POST method from within my Phoenix application and then get a response back from the site for parsing?

Yes, at least with httpoison, here’s a test case that submits a form.

I haven’t used tesla, but Tesla.Middleware.FormUrlencoded and Tesla.Middleware.FormUrlencodedTest (for examples) might be relevant.

I can do a regular post with Tesla. The problem is that the POST is not url-encoded. It appears the JQuey function appends hidden input fields into the form for posting, and then applies gzip to the form, along with the session token from the original page. I’m looking for any assistance in the best way to mimic the functionality of the JQuey function that is appending the data to the form on submission. I think I need to:

  1. Load the index.html page to get session tokens
  2. Append hidden fields with selection values to the form
  3. Post the form to index.html, along with the session token from #1
  4. Receive the response (which is actually the page search.html)
  5. Parse the response for the items I need.

It is step #2 that I’m having problems with. If anyone has done something similar, I’d appreciate some pointers.

Thanks

Not sure if anyone can help unless they used the exact some code though. If you can paste some code that does this hidden fields addition to the form plus how are they [not] encoded then we could have something to work with.

You might be better off with a browser driver like https://github.com/elixir-wallaby/wallaby or https://github.com/HashNuke/hound.

Thanks, I’ll check them out…still digging out from 10-days with no connectivity on vacation.

So I’ve made progress using https://github.com/elixir-wallaby/wallaby but am now getting a Wallaby.StaleReferenceError.

The code sample:

def search_for_mp(mp_name) do

    Application.put_env(:wallaby, :base_url, "https://secure.utah.gov/llv/search")
    {:ok, _} = Application.ensure_all_started(:wallaby)

    opts = %{ javascriptEnabled: true,
      loadImages: false,
      version: "",
      rotatable: false,
      takesScreenshot: true,
      cssSelectorsEnabled: true,
      nativeEvents: false,
      platform: "ANY",
      unhandledPromptBehavior: "accept",
      loggingPrefs: %{
        browser: "DEBUG"
      }
    }

    {:ok, session} = Wallaby.start_session([%{capabilities: opts}])

    IO.inspect(session)
    # IO.inspect(@submit_button)
    searchName = Wallaby.Query.xpath("//div[@id='searchName']")

    session
    |> Wallaby.Browser.visit("/index.html")
    |> Wallaby.Browser.find(searchName)
    |> Wallaby.Browser.fill_in(@name_field, with: mp_name)
    |> Wallaby.Browser.click(@submit_button)

  end

When I run the code I get the following error message:

%Inspect.Error.{ message: “got Wallaby.StaleReferenceError with message "The element you are trying to reference is stale or no longer attached to the\nDOM. The most likely reason is that it has been removed with Javascript.\n\nYou can typically solve this problem by using find to block until the DOM is in a\nstable state. …}”

It looks like my use of javascriptEnabled: true in the capabilities map passed to the session is not being picked up, so I suspect that is causing part of the problem, but I can’t find any good references in how to set the session to allow javascript (which the submit button invokes).

Any suggestions appreciated.