Wondering if there are any compiler gurus out there who might be able to help me. I’ve been building a dynamic query system that lets a user compose a relatively complex Ecto query from a JSON schema, with dynamic naming for keys, joins, and so on. Since Ecto requires join aliases to be compile-time atoms, we’ve had to pivot to a system that assigns each join a name generated at compile time, then maps those aliases back to the user-defined keys when we select the values, so the client can identify them.
I’ve created a list of atoms (using a simple naming pattern, :column_x, where x is a number) as a config variable to serve as identifiers. Each key in the incoming JSON schema is assigned an identifier when the request comes in, and I later use the following macro to implement the individual joins dynamically:
defmacro join_column(query, qual, binding, expr, column_id, clause \\ true) do
  column_branches =
    Enum.map(@column_ids, fn c ->
      {
        :->,
        [],
        [
          [Macro.escape(c, unquote: true)],
          Builder.Join.build(query, qual, binding, expr, nil, clause, Macro.escape(c, unquote: true), nil, nil, __CALLER__) |> elem(0)
        ]
      }
    end)

  column_fn = {:fn, [], column_branches ++ [{:->, [], [[{:_, [], nil}], query]}]}

  quote do
    unquote(column_fn).(unquote(column_id))
  end
end
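For context, the identifier assignment that happens before the macro is called can be sketched roughly like this. It is a minimal sketch, not our actual code: the module name `ColumnIdSketch`, the functions `assign/1` and `invert/1`, and the error tuple are hypothetical, and it assumes the incoming keys are unique.

```elixir
defmodule ColumnIdSketch do
  # Hypothetical stand-in for the configured list: :column_1 .. :column_50.
  @column_ids for n <- 1..50, do: :"column_#{n}"

  def column_ids, do: @column_ids

  # Pairs each client-supplied key with the next identifier, in order.
  # Assumes keys are unique; duplicates would collapse into one map entry.
  def assign(keys) when is_list(keys) do
    if length(keys) > length(@column_ids) do
      {:error, :too_many_columns}
    else
      {:ok, keys |> Enum.zip(@column_ids) |> Map.new()}
    end
  end

  # Reverse lookup, for translating selected values back to client keys.
  def invert(assignments) do
    Map.new(assignments, fn {key, id} -> {id, key} end)
  end
end
```

The identifiers here are ordinary runtime values, which is exactly why the macro has to enumerate every possible atom at compile time.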
The goal of the macro is to keep the syntax close to Ecto’s, and because I’m able to Macro.escape the atom, I can use the identifiers to reference the joins and satisfy Ecto’s compile-time atom requirement. This, however, balloons compile time and memory. We allow users to add up to 50 columns per report, so every macro call expands into an anonymous function with 51 clauses, each holding a fully built join. The problem is further compounded by the fact that this macro is used together with a parent macro that links the join functions to individual Ecto schemas, of which we potentially have hundreds because our database is organized as a star schema.
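To show the shape of the expansion without pulling in Ecto, here is a stripped-down sketch of the same dispatch technique. The names `DispatchSketch`, `tag/2`, and `DispatchSketch.Demo` are hypothetical, the list is cut to three identifiers, and a plain tuple stands in for the built join.

```elixir
defmodule DispatchSketch do
  # Shortened, hypothetical identifier list (the real one has 50 entries).
  @column_ids [:column_1, :column_2, :column_3]

  defmacro tag(value, column_id) do
    # One clause per identifier; `id` appears as a literal atom in the
    # generated code, and the `value` AST is copied into every clause body.
    branches =
      Enum.map(@column_ids, fn id ->
        {:->, [], [[id], quote(do: {unquote(id), unquote(value)})]}
      end)

    # Fallback clause: an unknown identifier returns the value untouched.
    fallback = {:->, [], [[{:_, [], nil}], value]}
    dispatch_fn = {:fn, [], branches ++ [fallback]}

    quote do
      unquote(dispatch_fn).(unquote(column_id))
    end
  end
end

defmodule DispatchSketch.Demo do
  require DispatchSketch

  # The identifier is only known at runtime, yet each clause body
  # was generated with its atom fixed at compile time.
  def run(value, column_id), do: DispatchSketch.tag(value, column_id)
end
```

Calling `DispatchSketch.Demo.run("total", :column_2)` returns `{:column_2, "total"}`, and an unknown identifier falls through to the bare value. Every call site carries one clause body per identifier, so the generated code grows with the number of identifiers multiplied by the number of call sites.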
My boundaries seem to be:
- The number of columns/schemas is a hard requirement, so I can’t simply reduce the number of columns to reduce overhead.
- The compile-time requirement makes it so that I can’t use any kind of lookup to store the identifiers, since escaping the return value would still be a reference to a runtime variable.
- We can’t use keys to reference the schemas, since we might want to include the same schema more than once with different parameters (filters) around the join.
Does anyone know of a way to make this more performant with respect to compilation time and memory overhead? I’m wondering whether there is a different way to write this, or some other construct I’m missing, such as case, that might compile faster to the same bytecode. I’m trying not to expand too much and to keep the scope narrow enough to pin down the problem. Hopefully this is clear enough for y’all to get the basic gist. I appreciate any help!