How to produce executable code using yecc?

maybe it’s not the way to go, but this was my initial guess.

I have been writing a parser for a reduced query dialect that I partially inherited and have much expanded, allowing users to write things like:

plant where accession.code='2018.0047'

it’s not ready, but the missing intermediate steps are clear, except the final one: how do I have the result executed?

As the result, I am targeting the quoted representation (the AST) of the equivalent Ecto.Query.from query. For the above example, the equivalent as far as I am concerned would be:
from(p in "plant", select: [:id], join: a in "accession", on: a.id==p.accession_id, where: a.code=="2018.0047")

I have been looking into the structures returned by the __schema__ functions, and all looks quite doable, I mean I know how to extract the table name from the modules, and owner and related modules and keys from the association given its name, so let’s assume that my parser does return this value:

{:from, [context: Elixir, import: Ecto.Query],
 [
   {:in, [context: Elixir, import: Kernel], [{:p, [], Elixir}, "plant"]},
   [
     select: [:id],
     join: {:in, [context: Elixir, import: Kernel],
      [{:a, [], Elixir}, "accession"]},
     on: {:==, [context: Elixir, import: Kernel],
      [
        {{:., [], [{:a, [], Elixir}, :id]}, [], []},
        {{:., [], [{:p, [], Elixir}, :accession_id]}, [], []}
      ]},
     where: {:==, [context: Elixir, import: Kernel],
      [{{:., [], [{:a, [], Elixir}, :code]}, [], []}, "2018.0047"]}
   ]
 ]}

how do I get Ecto to execute it?

The result of quote is not something you execute. Instead of building Elixir AST and then trying to get it compiled, you’d be much better off just dynamically constructing the ecto query data structure itself.

I understand your individual words, but I don’t see what they mean. “the ecto query data structure”?

you mean there’s a proper way to produce Elixir code from a yecc parser?

(added this as a question on StackOverflow)

There’s no quick way to transform an abstract syntax tree made with yecc into something executable.

One possible option would be to write a function that converts the yecc AST into Elixir AST (the data format returned by quote) that does what you want, and then using that to create an Elixir function at compile time.

It’s been a while since I dived into the Ecto source, but my recollection is that at the lowest level, Ecto does actually interpret an Elixir (or very close to Elixir) AST as it builds the SQL query. Therefore it may be possible to dive into what AST is produced by the Ecto DSL and mirror that. Not a trivial exercise and there’s no guarantee that the AST used by Ecto will stay the same, but I suspect it’s a reasonable approach to take if that’s what you really need to do.

my parser does return values according to the data format returned by quote.
“compile time” is too early. the queries are user input, built dynamically.

you don’t remember a few keywords to look for in the code. something to grep?

Then build a function that takes the output of yecc and walks it, calling the relevant Ecto.Query functions along the way to build an ecto query data structure. Trying to “execute” the AST is not something you should do dynamically based on user input.

it’s me who defines the output of yecc, and I’m trying to understand which would be the most obvious choice for this output. I thought that producing an AST, reproducing the quote of a from query, was an obvious thing to do. I counted on the ability to unquote my output, and I wonder if the limitation “(CompileError) unquote called outside quote” is circumventable.

I fail to see any no go in executing an AST based on user input in this case. there’s a yecc grammar validating it mostly. then if it fails, it fails, where’s the problem?

what do you mean by “an Ecto query data structure”?

Elixir code is compiled, not interpreted. Dynamically compiling code all the time is not going to perform well, and may lead to unbounded memory growth since you’d be generating compilation artifacts all the time.

By an ecto query data structure I mean a %Ecto.Query{} struct, which is what you get when you call the functions:

SomeSchema |> where(foo: ^whatever)

ok. this is a no-go, for Elixir.

thank you.

I need to let users enter complex queries, either guided, or by typing, or by editing what the guided step produced. then I need to execute them. in Python, I have been using pyparsing and ply, and the result is an object which implements the two functions count and select. so I can tell the user how many records would match the query, and which they are. that’s what I’m trying to do in Elixir as well. this is a very hard requirement.

Right, this is entirely possible in Elixir. Build a data structure in yecc that represents the input. Build a recursive function that builds an ecto query by walking the data structure yecc returns. If this seems like something you aren’t sure how to do, break the problem down. Play around with the ecto functions so that you get familiar with how to use them. Play around with walking data structures with an accumulator so that you can do simple things like count how many parts to the input there are. Then combine.
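
As a warm-up for the "walk it with an accumulator" step, here is a minimal sketch in plain Elixir, no Ecto involved. The tuple shapes ({:and, l, r}, {:or, l, r}, {:eq, field, value}) and the module name TreeWalk are assumptions made up for illustration; your yecc grammar can return whatever shape you choose.

```elixir
defmodule TreeWalk do
  # Hypothetical intermediate representation (not from any library):
  #   {:and, left, right} | {:or, left, right} | {:eq, field, value}

  # Public entry point: start the accumulator at zero.
  def count(tree), do: count(tree, 0)

  # Binary operators: count this node, then walk both branches,
  # threading the accumulator through.
  defp count({op, left, right}, acc) when op in [:and, :or] do
    acc = count(left, acc + 1)
    count(right, acc)
  end

  # Leaf comparison: counts as one part.
  defp count({:eq, _field, _value}, acc), do: acc + 1
end

tree = {:and, {:eq, :code, "2018.0047"}, {:or, {:eq, :id, 1}, {:eq, :id, 2}}}
IO.inspect(TreeWalk.count(tree))
```

Once counting nodes like this feels natural, building a query is the same walk with a different accumulator.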

We’re happy to help with these steps, but you need to break the problem down into pieces you can work on incrementally.

all very reasonable, but I fail to see what you mean in practice, and the Elixir code is still far too opaque for me to browse through ecto and figure out what happens. for example: the various clauses defining from all end with an evaluation of another from, except one which ends with a quoted expression. it makes no sense to me.

to refine my vocabulary when speaking in Elixir/Ecto, I proposed a page to be included in the documentation, which did help me understand how migrations work, but I was hoping for reviews. I received hints on how to write better SQL though.

You shouldn’t need to care how ecto works internally. Unless you’re doing some very complex stuff you should be able to recursively build up an ecto query just using the public api of it.

Sure, so let’s do some practice. What I don’t recommend is trying to look at the ecto functions and see how they work, because they’re pretty complicated internally. Technically they’re all macros, so it’s just a bit of a pain to figure out what is going on. There’s a reason for this (compile time protection from sql injection, yay!) but it’s definitely complicated.

Fortunately, using the ecto query macros isn’t that bad, but there’s no substitute for practice. I highly recommend working through the Programming Ecto book, or any of the other Elixir books that use ecto (Programming Phoenix, the GraphQL book, etc). If you want to go for it just based on docs that’s fine too, just build some simple exercises for yourself. Here’s an example of a function from the GraphQL book that takes this kind of input:

%{filter: %{category: "drinks", priced_below: 10.00}, order: :desc}

And then builds an ecto query that filters / orders an Item schema accordingly:

def list_items(args) do
  args
  |> items_query
  |> Repo.all
end

def items_query(args) do
  Enum.reduce(args, Item, fn
    {:order, order}, query ->
      query |> order_by({^order, :name})
    {:filter, filter}, query ->
      query |> filter_with(filter)
  end)
end

defp filter_with(query, filter) do
  Enum.reduce(filter, query, fn
    {:name, name}, query ->
      from q in query, where: ilike(q.name, ^"%#{name}%")
    {:priced_above, price}, query ->
      from q in query, where: q.price >= ^price
    {:priced_below, price}, query ->
      from q in query, where: q.price <= ^price
    {:added_after, date}, query ->
      from q in query, where: q.added_on >= ^date
    {:added_before, date}, query ->
      from q in query, where: q.added_on <= ^date
    {:category, category_name}, query ->
      from q in query,
        join: c in assoc(q, :category),
        where: ilike(c.name, ^"%#{category_name}%")
    {:tag, tag_name}, query ->
      from q in query,
        join: t in assoc(q, :tags),
        where: ilike(t.name, ^"%#{tag_name}%")
  end)
end

If we pass the example input into the items_query function we see that it returns an ecto query:

iex(2)> query = PlateSlate.Menu.items_query(%{filter: %{category: "drinks", priced_below: 10.00}, order: :desc})
#Ecto.Query<from i in PlateSlate.Menu.Item, join: c in assoc(i, :category),
 where: ilike(c.name, ^"%drinks%"), where: i.price <= ^10.0,
 order_by: [desc: i.name]>

This query can then be executed by our repo:

Repo.all(query)

I’m suggesting that you should build a function similar to my items_query function that recursively walks through your yecc output and matches on various sub parts, reducing on to the ecto query data structure.

imagine I have the value corresponding to from(a in "accession"), do you think you can help me with going from this value to the one corresponding to from(a in "accession", where: a.code=="12345")?

this would help me for simple queries, where all clauses are at the same level, “multiplied” by and. it’s a start though, so thank you.
I need to first represent the complete where clause, then to fire that in one step into the query, or I keep the above limitation.
and you just answered the question which I was typing.

(my where clause include things like count(plants.images)>1, where plants is a has_many association from accession, and images likewise from plant. count is the obvious aggregation function. and this is text typed by the user.)

If you use named bindings this shouldn’t be a problem:

base = from a in "accession", as: :accession
# later
binding = :accession
updated = from [{^binding, a}] in base, where: a.code == "12345"

If you cannot use named bindings you might need to keep track of the positions of bindings, which then lets you do [{a, ^position}] to pull out the correct binding.

If you have complex where parts you might want to consider using dynamic to build up the condition and only attach the final condition to the query.
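
A minimal sketch of that dynamic approach, under some loud assumptions: the tuple shapes ({:and, l, r}, {:or, l, r}, {:eq, field, value}) and the module name QuerySketch are hypothetical, every field is assumed to live on the root table, and the Mix.install line is only there so the snippet runs as a standalone script. It builds the whole condition recursively and attaches it to the query in one step, so nested and/or combinations are not flattened into a chain of where clauses.

```elixir
# Assumption: run as a standalone script; in a Mix project, ecto is
# already a dependency and this line is not needed.
Mix.install([{:ecto, "~> 3.0"}])

defmodule QuerySketch do
  import Ecto.Query

  # Hypothetical parser output (not an Ecto format):
  #   {:and, left, right} | {:or, left, right} | {:eq, field_atom, value}

  # Combine two sub-conditions with AND.
  def condition({:and, left, right}) do
    l = condition(left)
    r = condition(right)
    dynamic(^l and ^r)
  end

  # Combine two sub-conditions with OR.
  def condition({:or, left, right}) do
    l = condition(left)
    r = condition(right)
    dynamic(^l or ^r)
  end

  # Leaf: compare a field of the first binding with a user-supplied value.
  # Both the field name and the value are interpolated, never spliced in as code.
  def condition({:eq, field, value}) when is_atom(field) do
    dynamic([a], field(a, ^field) == ^value)
  end

  # Attach the finished condition to the query in a single where clause.
  def build(table, tree) when is_binary(table) do
    where_clause = condition(tree)
    from(a in table, where: ^where_clause, select: a.id)
  end
end

tree = {:and, {:eq, :code, "2018.0047"}, {:or, {:eq, :id, 1}, {:eq, :id, 2}}}
query = QuerySketch.build("accession", tree)
IO.inspect(query)
```

The result is an ordinary %Ecto.Query{} struct that can be handed to Repo.all. Conditions on joined tables (such as plant or images) would need the named bindings shown above; this sketch leaves that part out.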

I think I finally get your point. an AST (the value returned if I quote an expression) is a compile-time artifact, which I can compile, but not evaluate.

so instead of targeting the quote/AST format, I should target the Ecto.Query format. and since I cannot do this from yecc (or can I? yecc is in Erlang), I need an intermediate representation. yet, for each production in the grammar (have you seen my grammar?), that is, each production which combines simpler elements, I need a combining function/clause.

I will take a pause and read again in a few hours, but I have the impression that all the examples/hints I’ve been reading, they contain hard coded names or hard coded patterns.
