SQL - Brings an extensible SQL parser and sigil to Elixir, so you can confidently write SQL with automatic parameterized queries.

Hey there, I recently wrote this low-level library. Its goal is simply to lower the barrier between Elixir and SQL, and it does that by providing a SQL sigil.

I’ll list a few highlights compared to raw SQL and Ecto.Query, although the goal with this library is not to replace either, but to provide fundamental, safe building blocks for Ecto SQL adapters.

  • Lower the barrier for DBAs to contribute to your codebase without having to translate SQL to Ecto.Query.
  • Composable queries: no need to remember whether to start with select or from.
  • Interpolated queries: don’t fiddle with fragments and ? (see the sketch after this list).
  • Prevents SQL injection by design.
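
To make those points concrete, here is a rough sketch of the kind of usage I have in mind. The sigil and #{} interpolation match the examples later in this thread; the to_sql/1 helper used to show the statement and parameter list is hypothetical and only there for illustration.

# Assuming the sigil is imported from the SQL module.
import SQL

email = "bobby@example.com"

# Interpolated values never end up in the statement text; they become parameters.
query = ~SQL[select id, name from users where email = #{email}]

# Hypothetical helper, shown only to illustrate the parameterized output:
# to_sql(query)
# => {"select id, name from users where email = ?", ["bobby@example.com"]}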

You might have noticed that I listed preventing SQL injection, and you may be thinking: wait a minute, Ecto.Query already prevents SQL injection by design.

You would be correct. However, Ecto.SQL also provides the query/2 and query!/2 functions, which are vulnerable to SQL injection since they take a string and a list of params; the intention with this API is to give you an escape hatch for executing parameterized queries.
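
To make that escape hatch concrete, here is a small example against a Postgres-backed repo (MyApp.Repo is just a placeholder name). The first call splices the value into the statement and is injection-prone; the second passes it as a parameter, which is the shape the sigil is meant to generate for you automatically.

name = "Bobby';drop table users;--"

# Vulnerable: the value becomes part of the statement text itself.
MyApp.Repo.query!("select * from users where name = '#{name}'")

# Safe: the statement and the values travel separately as a parameterized query.
MyApp.Repo.query!("select * from users where name = $1", [name])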

This library is a result of a thread on the mailing list: https://groups.google.com/g/elixir-ecto/c/8MOkRFAdLZc

As this is an MVP, I’m looking for your feedback.

17 Likes

Very interesting work! It’s on my list of things to try out with EctoFDB.

My use case: FDB has no SQL, so I’m thinking I could make a CLI tool to query EctoFDB (similar to psql for Postgres) that uses the Ecto adapter and your SQL library.

1 Like

Pure serendipity: I was looking at a very similar challenge with nimble parsec just yesterday.

Namely, splitting a file full of AWS Athena DDL/SQL statements (it only accepts one statement at a time) while interpolating some variables in the DDL statements.

I’m eager to read your implementation. There are some challenging edge cases, such as the user ‘O\’Malley’ and the ubiquitous ’Little Bobby;drop tables’.

1 Like

I would love to see that file. I did start out with a version in nimble parsec, but I was pushing the limits of how I wanted to write the MVP, and I didn’t want to try to retrofit the grammar to make it composable.

So I ended up hand-writing it (this could change in the future, though).

If I understand parser terminology right, what I have written is a recursive descent parser, single pass: the AST is built as I read the string. nimble parsec doesn’t fit that model well, but it would be great for spitting out tokens, and then you could write some logic to analyze them and build your AST.
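
To illustrate what I mean by single pass, here is a toy in the same spirit; it is not the library’s parser, only handles a comma-separated ident list after select, and is just meant to show that the node list is built while the binary is being consumed, with no separate tokenize step.

defmodule ToyParser do
  # ToyParser.parse("select a, b") => [{:select, [{:ident, "a"}, {:ident, "b"}]}]
  def parse("select " <> rest), do: [{:select, idents(rest, "", [])}]

  # Each clause consumes one byte and either finishes an ident or extends it.
  defp idents(<<?,, rest::binary>>, acc, nodes), do: idents(rest, "", add(acc, nodes))
  defp idents(<<?\s, rest::binary>>, acc, nodes), do: idents(rest, acc, nodes)
  defp idents(<<c, rest::binary>>, acc, nodes), do: idents(rest, <<acc::binary, c>>, nodes)
  defp idents(<<>>, acc, nodes), do: Enum.reverse(add(acc, nodes))

  defp add("", nodes), do: nodes
  defp add(acc, nodes), do: [{:ident, acc} | nodes]
end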

Are the edge cases you listed interpolated values or string literals?

If the latter, then I’m not sure my parser could handle those cases, as I only built it with the sigil in mind, not for migrating files of exotic SQL statements. Furthermore, I have yet to implement DDL keywords in the parser, and any keyword not tokenized by the parser ends up as an ident token and is passed through.

2 Likes

Did you try with Erlang’s yecc and leex?

1 Like

Yes, I’ve used those in the past, and I might use them for this in the future. But generated parsers don’t come without their flaws.

For this project I wanted to see if we could get all the benefits of Ecto.Query and Elixir with its composability but with less abstraction and more control.

2 Likes

I looked closely at this, and my parser did not support backticks yet, but it was pretty simple to add:

iex(18)> ~SQL[select `Little Bobby;drop tables`].query
[
  {:select, [line: 0, column: 6, end_line: 0, end_column: 32],
   [
     {:"``", [line: 0, column: 8, end_line: 0, end_column: 32],
      [
        {:ident, [line: 0, column: 8, end_line: 0, end_column: 32],
         [~c"Little Bobby;drop tables"]}
      ]}
   ]}
]

For what it’s worth, it looks like your examples got botched. I believe only quotes and backticks are valid SQL syntax, and with that in mind, my parser would have trouble with O\Malley, but that also smells like invalid SQL and should be 'OMalley’:

iex(26)> ~SQL[select  'O`Malley'].query
[
  {:select, [line: 0, column: 6, end_line: 0, end_column: 17],
   [
     {:"''", [line: 0, column: 9, end_line: 0, end_column: 17],
      [
        {:ident, [line: 0, column: 9, end_line: 0, end_column: 17],
         [~c"O`Malley"]}
      ]}
   ]}
]
1 Like

Awesome library, thank you for sharing this!

After a cursory look, my 2c:

Currently we have:

iex> ~SQL"SELECT 42"
select 42

I’d consider using ~SQL[...] or #SQL[...] in the inspect implementation:

iex> ~SQL"SELECT 42"
~SQL[select 42]
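
Something along these lines should do it, assuming the sigil returns a %SQL{} struct with a String.Chars implementation; both of those are guesses about the internals, so treat this as a sketch of the idea only:

defimpl Inspect, for: SQL do
  # Render the query back in its sigil form; relies on to_string/1
  # being defined for the struct, which is an assumption here.
  def inspect(sql, _opts) do
    "~SQL[" <> to_string(sql) <> "]"
  end
end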

Currently we have:

x = 1 ; ~SQL"SELECT #{x}"

By convention we use interpolation with lowercase sigils, but (to my dismay) we can’t have a lowercase multi-letter ~sql, so another idea would be to have a slightly different interpolation syntax, say:

x = 1 ; ~SQL"SELECT {x}"

Besides following convention, I think doing this would actually be a security feature: never, ever allow users to use string interpolation anywhere near SQL. Once users feel comfortable using this library that does the right thing, they will inevitably (ok, I’m projecting, I definitely will) call sql = "select #{x}" ; Repo.query!(sql) and that can turn into a bug pretty fast. Using a distinct syntax like {...} prevents this. In hindsight, Repo.query! accepting only sanitized strings (akin to, say, Phoenix.HTML.Safe) might have been a better default.
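
For what it’s worth, that better default could be as small as a protocol in the spirit of Phoenix.HTML.Safe. The name and shape below are purely hypothetical; nothing like this exists today, it just sketches the idea:

defprotocol SQL.Safe do
  # Anything passed to Repo.query!/2 would have to implement this,
  # so a plain interpolated string would be rejected by design.
  @doc "Returns {statement, params} for values known to be safely parameterized."
  def to_query(value)
end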

3 Likes

Thanks for the feedback. I’m currently working my way through the SQL grammar; once I can generate a parser from the BNF file, we can make a decision based on our options. From what I can tell, braces are valid SQL syntax, but I couldn’t find any use of #.