While parsing an HTML table, I ran across a class attribute that I cannot figure out how to select. Specifically:
defmodule Embedded do
  @doc """
  Can Floki parse class selectors with embedded spaces?
  """
  def embedded() do
    html = """
    <tr class="team">
      <td class="name">
        Boston Bruins
      </td>
      <td class="ot-losses">
      </td>
      <td class="diff text-success">
        35
      </td>
    </tr>
    """

    {:ok, parsed} = Floki.parse_document(html)
    IO.puts("parsed=\n#{inspect(parsed)}")

    parsed_td = Floki.find(parsed, "td")
    IO.puts("\nparsed_td=#{inspect(parsed_td)}")

    tds = Floki.find(parsed, "td") |> Floki.attribute("class")
    IO.puts("tds=#{inspect(tds)}")
    # tds=["name", "ot-losses", "diff text-success"]

    name = Floki.find(parsed, "td.name") |> Floki.text()
    IO.puts("\nname=#{inspect(name)}")
    # name="\n Boston Bruins\n " (whitespace abbreviated)

    # The class is "ot-losses" (hyphen), so the selector uses a hyphen too.
    ot_losses = Floki.find(parsed, "td.ot-losses") |> Floki.text()
    IO.puts("\not_losses=#{inspect(ot_losses)}")
    # ot_losses is empty or whitespace only (the cell has no value)

    # FAILS: the HTML clearly shows a value of 35
    diff = Floki.find(parsed, "td.diff text-success") |> Floki.text()
    IO.puts("\ndiff=#{inspect(diff)}")
    # diff=""
  end
end
The name and ot-losses cells get selected OK.
Which Floki call do I use to pull out the text for the "diff text-success" cell?
The same call should also work for all the other classes in the table.
I have no control over the HTML; it was read from a website.
The HTML fragment is presented straight from the source page.
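
For reference, here is a minimal sketch of how the selector might be written, assuming standard CSS semantics: a class attribute containing a space lists two separate classes (diff and text-success), and a space inside a selector is the descendant combinator, so "td.diff text-success" looks for a <text-success> element inside td.diff. The Mix.install call, the Floki version requirement, and the shortened one-row table are assumptions made only to keep the sketch self-contained.

```elixir
# Assumption: run as a standalone script; the version requirement is illustrative.
Mix.install([{:floki, "~> 0.36"}])

# Shortened stand-in for the scraped row (not the full source page).
html = """
<table>
  <tr class="team">
    <td class="name">Boston Bruins</td>
    <td class="diff text-success">35</td>
  </tr>
</table>
"""

{:ok, doc} = Floki.parse_document(html)

# Chain the classes with dots and no space: the td must carry both classes.
diff =
  doc
  |> Floki.find("td.diff.text-success")
  |> Floki.text()
  |> String.trim()

# Matching on a single class also selects the same cell.
diff_single =
  doc
  |> Floki.find("td.diff")
  |> Floki.text()
  |> String.trim()

IO.inspect(diff, label: "diff")
IO.inspect(diff_single, label: "diff_single")
```

The same pattern (one dot per class, no spaces) should carry over to any other multi-class cell in the table; String.trim/1 is only there to drop the surrounding whitespace that Floki.text/1 preserves.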