How to convert XML to SQL in Elixir?

I am working on a project that has to convert XML data to SQL which turned out to be a surprisingly difficult task in Elixir.

Is there anything beyond Ecto when it comes to SQL in Elixir?

My current approach is the use Saxy.SimpleForm.parse_string() to generate a tuple that I cannot pattern match easily on, so it has to be converted to a list and processing that list I need a function that converts that to SQL insert statement.

As far as I can see this is all going to be custom implementation and there is very little support in Elixir (or Erlang) to create a SerDe.

Am I missing something?

Can you elaborate a little more on that, please?

Also I’d use Erlang’s xmerl, it’s very powerful though harder to start with (learning curve).

Yeah I found SweetXml that is a wrapper for xmerl.

This works surprisingly well:

  def user_report_to_sql(correlation_id, xs) do
    xs
    |> xpath(~x"//UserRprt/Usr"e,
      user_report_session_id: ~x"@sessionId"s,
      user_report_mbr_id: ~x"@mbrId"s,
      user_report_mbr_name: ~x"@mbrName"s,
      user_report_user_id: ~x"@usrId"s,
      user_report_state: ~x"@state"s
    )
    |> Map.put(:user_report_correlation_id, correlation_id)
  end
%{
  user_report_correlation_id: "123",
  user_report_mbr_id: "CORP",
  user_report_mbr_name: "COMPANY",
  user_report_session_id: "10154924313",
  user_report_state: "ENU",
  user_report_user_id: "2321"
}

Now this Map could be turned into SQL. However, this is a bit more complex.

def map_to_sql(table_name, m) do
    "INSERT INTO " <>
      table_name <>
      " ( " <>
      (m |> Map.keys() |> Enum.map(&to_string/1) |> Enum.join(", ")) <>
      " ) " <>
      " VALUES ( " <>
      (m
       |> Map.values()
       |> Enum.map(&to_string/1)
       |> Enum.join(", ")) <>
      " );"
  end

This I consider tragic. :slight_smile: Looking for a much better solution.

1 Like

It seems I need to deep dive into Ecto’s code try to rip out some of the insert* functions.

It seems there is a project in Erlang: sqerl/sqerl.erl at master · devinus/sqerl · GitHub What a nice read! :slight_smile:

1 Like

I’d recommend starting with the guides, personally. Here’s a short one on using insert_all without an Ecto schema.

1 Like

You don’t need to build query string yourself. You can just use something like

defmodule UserReport do
  use Ecto.Schema
  schema "user_reports" do
    belongs_to :correlation, Correlation
    belongs_to :user, User
    belongs_to :session, Session
    field :mbr_id, :string
    field :mbr_name, :string
    field :state, :string
  end
end

rows = [
  %{
    correlation_id: "123",
    mbr_id: "CORP",
    mbr_name: "COMPANY",
    session_id: "10154924313",
    state: "ENU",
    user_id: "2321"
  },
  %{
    correlation_id: "124",
    mbr_id: "CORP",
    mbr_name: "COMPANY",
    session_id: "10154924314",
    state: "ENU",
    user_id: "2322"
  },
  ...
]

Repo.insert_all(UserReport, rows)

Is there a way to have a repo without a database?

My problem is that we create databases from Genserver calls. Could I use Ecto Repo wo/ an actual database?

Thanks for the tip. See below. Can I use Ecto insert_all wo/ a database?

What do you mean by “use Ecto” then?

Not directly, Repo.insert_all is responsible for actually doing the INSERT.

The function that ultimately generates the SQL statement used by insert_all is adapter-specific; it implements the Ecto.Adapters.SQl.Connection.insert/7 callback.

It’s not normally something most users of Ecto would ever call directly, and so it has minimal documentation and a complicated set of inputs. But it can be used with some experimentation:

iex(4)> Ecto.Adapters.Postgres.Connection.insert(nil, "things", [:foo, :bar], [[:foo, :bar], [:bar, :foo]], {:raise, [], []}, [], []) |> IO.iodata_to_binary()

"INSERT INTO \"things\" (\"foo\",\"bar\") VALUES ($1,$2),($3,$4)"

Some of the arguments need explanation:

  • the values in the nested lists in the 4th argument (“rows”) need to be atoms (or more complicated structures - see the code) but their values are irrelevant as long as they aren’t nil. A nil will become a literal DEFAULT in the VALUES clause instead of a placeholder
  • the inner lists in the 4th argument need to be the same length. Mismatching them will produce invalid SQL like VALUES ($1,$2,$3),($4,$5)
  • the {:raise, [], []} in the fifth argument could be more complex shapes if you want an ON CONFLICT clause
2 Likes

Could you clarify what you mean by “without a database”?

It sounds like you’re suggesting that a Genserver is responsible for dynamically choosing which database to use by way of creating the database, but there is some persistence layer to you want to direct that query to.

If my impression is correct, you might find the answer to your problem in ecto’s dynamic repository documentation (Replicas and dynamic repositories — Ecto v3.9.5). Then you could use the schema + insert_all, so long as your genserver can act as a proper registry.

If my impression of what you meant is incorrect, then I’m at a loss for what you’re trying to achieve from end to end, so your clarification would be helpful for me (and maybe others) to try to assist you.

1 Like

Wow thank you.

The function that ultimately generates the SQL statement used by insert_all is adapter-specific;

Yes this is what I realised while I was trying to incorporate Ecto into the project. It was not designed in a way that individual parts can be used without the rest. At least some of the modules are like that.

My solution is way easier. We would like to support ANSI SQL with very simple tables, not even using the primary key feature (based on the domain problem we have).

If Ecto had generate_sql(:flavor, :statement_type, data) it would be suitable for us, but as you pointed out it a bit different design, which is understandable.

The current solution is this:

  defp map_to_sql(m, table_name) when is_map(m) do
    Logger.warn("Map: #{inspect(m, pretty: true)}")

    fields =
      m
      |> Map.keys()
      |> Enum.map(&to_string/1)
      |> Enum.join(", ")

    values =
      m
      |> Map.values()
      |> Enum.map(&value_to_sql/1)
      |> Enum.join(", ")

    EEx.eval_string(
      """
      INSERT INTO <%= table_name %>
        (<%= fields %>)
      VALUES
        (<%= values %>);
      """,
      table_name: table_name,
      fields: fields,
      values: values
    )
  end

And one small function:

defp value_to_sql(x) when is_binary(x), do: "'" <> to_string(x) <> "'"
defp value_to_sql(x), do: x

With this we can generate ANSI SQL inserts and it works surprisingly well. I might rewrite this using the binding syntax though.

Yes, this is exactly what I was looking for!

Thank you. I think I am going to create a new version with dynamic Ecto schemas. Thanks a lot!

Elixir → SQL. This is what I need the most. Right now we implemented it with Eex. Other than that a struct would be nice so that we have a simple definition to look at when need to modify (add or remove fields) the schema.

It will work surprisingly less well when your first record with last_name set to O'Brien arrives, and causes a syntax error. :stuck_out_tongue: In general, it’s preferred to use the placeholder syntax to avoid escaping altogether.

1 Like

We do not have a customer O’Brien yet!! :slight_smile:

I think we can just use the binding feature of the SQL lib to make this work.

Btw. what is the placeholder syntax and how would you use it in this case?