Hello all,
As part of my ongoing efforts to make an extremely fast sqlite3 Elixir driver (that steps on a Rust library) I am stumbling upon interesting problems regarding connections pools. Mind you, the technical problems have been solved but I am currently struggling with semantics and surprising mechanics. And I need an external feedback.
Description of the current state of the library
(None of what is described here is in the GitHub repo yet)
- I made the Rust part of the code take care of connection pooling. Passing options will be possible either directly or through an
Ecto.Repo
configuration. - sqlite3 doesn’t really have a notion of a “connection” per se; what it returns to you is basically a 32/64 bit integer – a handle a la the UNIX file descriptors.
- sqlite3 can and will return N different handles to the same DB file as long as N different consumers of the same DB request opening it.
- sqlite3 doesn’t encourage sharing the same handle between different OS threads (it does support it but with caveats so I decided to dodge that potentially dangerous path).
- This nicely maps to connection pools. I already have Rust tests that prove that opening the same file DB – or the same named in-memory DB – in multiple threads works fine and all threads see what the others did to the DB.
-
Here lies the main problem: when people want to open an anonymous in-memory DB (by specifying a DB name of
:memory:
orfile::memory:?<various_URL_formatted_parameters>
) then the connection pool will be giving you handles to a different in-memory anonymous DB every time you request a connection from the pool. In other words, a connection pool opened to the anonymous in-memory DB is basically a global pool responsible for all anonymous in-memory DBs your app might need, which IMO breaks the concept of an individual connection pool which should only be responsible for giving connections to the same DB.
(This is a limitation of the Rust connection pool library that pertains to sqlite3; it doesn’t handle that special case.)
My questions to the community
- Do you feel that an app using sqlite3 database(s) needs connection pooling at all? I feel that it does; after all, if somebody were to request 100K opens of the same DB then you will likely starve your system for file descriptors (one example; another is that you might run out of RAM). But my concerns here might be wrong and I am open to hear your thoughts.
- Should I leave in the surprising case of opening a connection pool to an anonymous in-memory DB that actually yields a handle to a different DB every time a connection is requested from that pool? I feel that’s an absolutely awful idea and it’s what prompted me to start this thread. However, I am not sure how to alleviate the problem.
- One way is to do away with connection pooling altogether (this introduces other tough technical challenges on the Rust side of the code).
- Trying to detect if the user wants to open an anonymous in-memory DB and skip the connection pooling in that case is a huge can of worms and a rabbit-hole – already attempted it and gave up because it started to take way too much time and energy. It’s doable, mind you, but I am not sure if that will not be barking up the wrong tree.
- Or I can just leave a big fat warning in the docs. But I am pretty sure somebody will make the wrong assumption and not read the docs and will then angrily open a GitHub issue.
I am open to your thoughts and will be grateful for a discussion.
// Dimi.