Request for feedback: does an Elixir sqlite3 driver need a connection pool?

Hello all,

As part of my ongoing efforts to make an extremely fast sqlite3 Elixir driver (that steps on a Rust library) I am stumbling upon interesting problems regarding connections pools. Mind you, the technical problems have been solved but I am currently struggling with semantics and surprising mechanics. And I need an external feedback.

Description of the current state of the library

(None of what is described here is in the GitHub repo yet)

  • I made the Rust part of the code take care of connection pooling. Passing options will be possible either directly or through an Ecto.Repo configuration.
  • sqlite3 doesn’t really have a notion of a “connection” per se; what it returns to you is basically a 32/64 bit integer – a handle a la the UNIX file descriptors.
  • sqlite3 can and will return N different handles to the same DB file as long as N different consumers of the same DB request opening it.
  • sqlite3 doesn’t encourage sharing the same handle between different OS threads (it does support it but with caveats so I decided to dodge that potentially dangerous path).
  • This nicely maps to connection pools. I already have Rust tests that prove that opening the same file DB – or the same named in-memory DB – in multiple threads works fine and all threads see what the others did to the DB.
  • Here lies the main problem: when people want to open an anonymous in-memory DB (by specifying a DB name of :memory: or file::memory:?<various_URL_formatted_parameters>) then the connection pool will be giving you handles to a different in-memory anonymous DB every time you request a connection from the pool. In other words, a connection pool opened to the anonymous in-memory DB is basically a global pool responsible for all anonymous in-memory DBs your app might need, which IMO breaks the concept of an individual connection pool which should only be responsible for giving connections to the same DB.

(This is a limitation of the Rust connection pool library that pertains to sqlite3; it doesn’t handle that special case.)

My questions to the community

  • Do you feel that an app using sqlite3 database(s) needs connection pooling at all? I feel that it does; after all, if somebody were to request 100K opens of the same DB then you will likely starve your system for file descriptors (one example; another is that you might run out of RAM). But my concerns here might be wrong and I am open to hear your thoughts.
  • Should I leave in the surprising case of opening a connection pool to an anonymous in-memory DB that actually yields a handle to a different DB every time a connection is requested from that pool? I feel that’s an absolutely awful idea and it’s what prompted me to start this thread. However, I am not sure how to alleviate the problem.
    • One way is to do away with connection pooling altogether (this introduces other tough technical challenges on the Rust side of the code).
    • Trying to detect if the user wants to open an anonymous in-memory DB and skip the connection pooling in that case is a huge can of worms and a rabbit-hole – already attempted it and gave up because it started to take way too much time and energy. It’s doable, mind you, but I am not sure if that will not be barking up the wrong tree.
    • Or I can just leave a big fat warning in the docs. But I am pretty sure somebody will make the wrong assumption and not read the docs and will then angrily open a GitHub issue. :grinning_face_with_smiling_eyes:

I am open to your thoughts and will be grateful for a discussion.

// Dimi.

1 Like

Out of curiosity, what are the slow parts in the current SQLite driver?

Oh, I didn’t mean that sqlitex is slow. I meant my statement more along the lines of “I want my competing implementation to not be slower than the others”.

I’m aiming at these selling points with my library:

  • The lower-level library is in Rust. In my eyes this reduces friction for compiling it on more platforms easily (we all know how difficult it could be to compile C/C++ code on Windows even for programmers; obviously it’s doable but with a nontrivial amount of friction nonetheless).
  • More features and finer control over sqlite’s operations (f.ex. allowing Rust stored procedures of sorts, or using the backup/restore API, or the changeset/history API, etc.)
  • Integration with Ecto 3.
  • Making sure it doesn’t break the BEAM’s guarantees of responsiveness (to that end I’m still pondering using an asynchronous sqlite3 library).

integration with ecto3 is definitely needed, but, having looked into it, it’s an elixir problem, not a nif problem, almost all of it is re-implementing the glue between ecto api and the lower layer, and even using e.g. postgrex as a template, it’s incredibly dense code that made my eyes cross, and there’s no deep “operating manual” for creating an Ecto3 adapter yet… So many things I would like to have ecto adapters for… I gave up pretty early on. That said, it would be extremely useful if someone did implement an ecto3 layer, but if you do, I recommend doing without regard to the underlying sqlite driver, since that would help everyone.

I’d thought about doing an sqlite api in zigler - I suspect it would be pretty easy, though I don’t have compilation in windows right now for I don’t know what reason. Unfortunately, too many other things on my plate.

That’s true. But when I tried to use sqlitex a while back I wasn’t happy how many sqlite3 features aren’t there so I figured I’ll combine adding features with Ecto3 support with an easy cross-OS compilation (Rust).

I also pondered using async Rust for the sqlite3 underlying library (the sqlx crate) but found no good arguments for it. There’s no point in returning futures to the Elixir and then awaiting them.

I’d likely go for just trying to recognize if the user wants an anonymous in-memory DB and provide special treatment there.

Still very interested if somebody has an opinion on the original topic though.