The HTML is very plain, with almost no attributes to select on:
<!doctype html>
<html>
<body>
<section class="body-copy">
<h2>Topic 1</h2>
<p>data-a</p>
<p>data-b</p>
<p>data-c</p>
<h2>Topic 2</h2>
<p>data-d</p>
<p>data-e</p>
<p>data-f</p>
<h2>Topic 3</h2>
<p>data-g</p>
<p>data-h</p>
<p>data-i</p>
</section>
</body>
</html>
I’m using Floki, and I’m trying to parse this into a list of maps like so:
%{ topic: "Topic 1", data: "data-a" }
%{ topic: "Topic 1", data: "data-b" }
%{ topic: "Topic 1", data: "data-c" }
%{ topic: "Topic 2", data: "data-d" }
%{ topic: "Topic 2", data: "data-e" }
%{ topic: "Topic 2", data: "data-f" }
I’m struggling to get all the p tags under each h2 with Floki.
# Load the HTML from disk (this raises a MatchError if the file cannot be read)
path = "/Users/Foo/Desktop/test.html"
{:ok, local_file} = File.read(path)
# This returns all the h2 elements
Floki.find(local_file, "h2")
[{"h2", [], ["Topic 1"]}, {"h2", [], ["Topic 2"]}, {"h2", [], ["Topic 3"]}]
# This returns the first p after a specific h2, but not all of them
Floki.find(local_file, "h2:nth-of-type(1) + p")
[{"p", [], ["data-a"]}]
# This returns the first p for Topic 2, but I need the other 2 p tags as well (data-e, data-f)
Floki.find(local_file, "h2:nth-of-type(2) + p")
[{"p", [], ["data-d"]}]
# This returns the first p for Topic 3, but I need the other 2 p tags as well (data-h, data-i)
Floki.find(local_file, "h2:nth-of-type(3) + p")
[{"p", [], ["data-g"]}]
Question
I cannot figure out how to get all of the p tags that belong to each h2, and only those.
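One possible direction, as a rough sketch rather than a confirmed solution: the `+` combinator only matches the element immediately after the h2, so instead of a selector, the children of the section could be walked in document order, remembering the most recent h2 and tagging each p with it. The module name `TopicGrouper` and its `group/1` function are hypothetical helpers, not part of Floki. The input is assumed to be a list of nodes in the `{tag, attrs, children}` tuple shape shown in the output above.

```elixir
defmodule TopicGrouper do
  # Hypothetical helper, not part of Floki.
  # Walks sibling nodes in order and pairs every <p> with the most
  # recent <h2> seen before it. A <p> that appears before any <h2>
  # gets `topic: nil`.
  def group(children) do
    {_topic, rows} =
      Enum.reduce(children, {nil, []}, fn
        # A new heading: remember its text as the current topic.
        {"h2", _attrs, _kids} = node, {_topic, rows} ->
          {text(node), rows}

        # A paragraph: emit a map tagged with the current topic.
        {"p", _attrs, _kids} = node, {topic, rows} ->
          {topic, [%{topic: topic, data: text(node)} | rows]}

        # Anything else (whitespace text, comments, other tags) is skipped.
        _other, acc ->
          acc
      end)

    # Rows were prepended, so restore document order.
    Enum.reverse(rows)
  end

  # Concatenates the text inside a node, descending into nested tags.
  defp text({_tag, _attrs, kids}), do: Enum.map_join(kids, "", &text/1)
  defp text(bin) when is_binary(bin), do: bin
  defp text(_other), do: ""
end

# Sample children of <section class="body-copy">, written by hand in the
# tuple shape that Floki returns (assumed here, not read from the file).
children = [
  {"h2", [], ["Topic 1"]},
  {"p", [], ["data-a"]},
  {"p", [], ["data-b"]},
  {"h2", [], ["Topic 2"]},
  {"p", [], ["data-d"]}
]

IO.inspect(TopicGrouper.group(children))
```

With Floki itself, the children could presumably be obtained by parsing with `Floki.parse_document/1` and pattern matching on the result of `Floki.find(document, "section.body-copy")`, for example `[{"section", _attrs, children}]`. That part is not exercised in the sketch above.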