Awesome! Glad you got it figured out.
A minor suggestion: use IO data instead of concatenating strings directly. IO data is a sort of composite data type meant for this exact use case, where the result is modeled as an arbitrarily nested list of strings/characters/etc. In the example below, I’m using IO.chardata_to_string/1, but if you’re just writing it back out to a file and have no use for further string processing, you can actually pass the chardata directly to most (all?) IO functions!
For a super small example the runtime will be essentially the same, but using IO data should be faster for large files, since the pieces are only collected into a list instead of being copied into intermediate strings (and I think the resulting code is a bit cleaner).
Mix.install([
  {:saxy, "~> 1.4.0"}
])

defmodule ExampleHandler do
  @behaviour Saxy.Handler

  # Parses the stream and returns the rewritten XML as a single string.
  def parse_stream!(xml_stream) do
    {:ok, rev_chardata} = Saxy.parse_stream(xml_stream, __MODULE__, [])

    # Each event prepends to the state, so reverse before flattening.
    rev_chardata
    |> Enum.reverse()
    |> IO.chardata_to_string()
  end

  # Builds an opening tag as nested chardata; nothing is concatenated.
  def build(:open, tag, attrs) do
    encoded_attrs = Enum.map(attrs, fn {name, val} -> [" ", name, "=\"", val, "\""] end)
    ["<", tag, encoded_attrs, ">"]
  end

  # Builds a closing tag as chardata.
  def build(:close, tag) do
    ["</", tag, ">"]
  end

  # Upcase the value of <b> elements whose only attribute is "name".
  @impl true
  def handle_event(:start_element, {"b", [{"name", name}]}, state) do
    {:ok, [build(:open, "b", [{"name", String.upcase(name)}]) | state]}
  end

  # Every other element is passed through unchanged.
  def handle_event(:start_element, {tag, attrs}, state) do
    {:ok, [build(:open, tag, attrs) | state]}
  end

  def handle_event(:end_element, tag, state) do
    {:ok, [build(:close, tag) | state]}
  end

  def handle_event(:characters, cdata, state) do
    {:ok, [cdata | state]}
  end

  # Ignore the remaining events (document start/end, etc.).
  def handle_event(_, _, state), do: {:ok, state}
end

[xmlfile | _] = System.argv()
IO.puts("Processing #{xmlfile}")

ExampleHandler.parse_stream!(File.stream!(xmlfile))
|> IO.puts()
Works fine with Unicode too =)
> elixir saxy_example.exs example.xml
<a>
<b name="SAM">π</b>
<c name="sal">text</c>
<b title="bob">text</b>
</a>
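To illustrate the point about skipping the string conversion, here is a minimal sketch that builds a small document as nested chardata and hands it straight to IO.write/1 and File.write!/2. It is standalone and does not use Saxy; the element names and the temp file name chardata_demo.xml are made up for the illustration.

```elixir
# Standalone sketch, no dependencies. The data and the file name are
# hypothetical and only serve to illustrate the idea.

# Build the document as nested chardata; no strings are concatenated.
rows =
  for {name, text} <- [{"sam", "π"}, {"sal", "text"}] do
    ["<b name=\"", String.upcase(name), "\">", text, "</b>\n"]
  end

chardata = ["<a>\n", rows, "</a>\n"]

# IO functions accept the nested list as-is.
IO.write(chardata)

# File.write!/2 takes iodata; a nested list of binaries qualifies.
path = Path.join(System.tmp_dir!(), "chardata_demo.xml")
File.write!(path, chardata)

# Flatten only when an actual string is needed.
flat = IO.chardata_to_string(chardata)
```

One caveat: chardata may also contain integer code points above 255, which are not valid iodata. As long as the list only holds binaries (as it does here and in the handler above), writing it directly to a file should be fine.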