How do I dynamically create, alter tables at runtime in Elixir?

apoorv-2204 · December 17, 2024, 6:37am

I need to dynamically create and populate tables at runtime, as well as modify existing tables and populate them while data is being read concurrently. Additionally, I want the ability to create indexes at runtime, triggered, for example, by a click event. This functionality is intended for an admin dashboard that handles large volumes of data.

katafrakt · December 17, 2024, 2:49pm

You will need to connect as a user with sufficient permissions to your database server and then execute raw SQL queries. I don’t think there is a better way. Perhaps you can try to bend Ecto.Migrator to do that, but I’m not sure it’s a good idea.

while data is being read concurrently

Well, this is more of a database problem, not Elixir problem, and you didn’t even mention which database you use. In PostgreSQL if you modify (ALTER) a table, you won’t be able to read the data concurrently.

sbuttgereit · December 17, 2024, 8:37pm

A couple caveats before I get into the meat of thing:

There are sufficient numbers of footguns when doing this kind of thing that it shouldn’t really be anyone’s first choice for many operations to which these kinds of techniques can be applied.
There are security ramifications for any application that has definitional control over the database.

I strongly suggest that anyone contemplating these kinds of techniques have a solid understanding of the database server software and its administration. A similar understanding Elixir, too, is probably needed to apply this sort of thing appropriately and safely. If you’ve got those boxes checked, understand the risks and consequences, there are places where administrative manipulation of the database server at runtime can be useful. I do this sort of stuff myself.

Finally, I’m assuming PostgreSQL. Other databases will be similar, at least from the Elixir side, assuming that the capabilities you want from your application are supported by your database vendor.

Basically, @katafrakt is correct. You’re going to be bypassing any database querying abstractions like the query builder and other Ecto niceties (e.g. Schema structs) and you’re going to send raw SQL queries to the database server from Elixir.

If you are using Ecto, you’ll be sending your database queries using Ecto SQL functionality… start with Ecto.Adapters.SQL — Ecto SQL v3.12.1. The advantage to this is you can use Ecto Repos for configuration/connection management and it’s convenient if you’re already using Ecto for other purposes. Alternatively, you can also use Postgrex directly, especially if the only kind of operations of your application are going to be doing these direct queries… and the interface there looks very similar (Postgrex — Postgrex v0.19.3).

That’s about as simple as it gets. You’re sending SQL database commands to the database.

Naturally, it’s all on you to get the correct SQL and to do it in a safe (injection free/illicit manipulation free) way. There’s huge power over the database possible here with all the consequences of that.

A couple more tips…

I’m assuming you’re working with PostgreSQL here. While PostgreSQL DDL is transactional, you must be very aware of PostgreSQL locking strategies when executing DDL against a running system; for example…

… if this really means what it says PostgreSQL itself may well make you sad. A fair number of ALTER TABLE commands will take an ACCESS EXCLUSIVE lock which means:

This mode guarantees that the holder is the only transaction accessing the table in any way.

(PostgreSQL: Documentation: 17: 13.3. Explicit Locking)

If you’re less familiar with PostgreSQL locking (or just following along from home) a short, decent little primer on this can be got at: PostgreSQL rocks, except when it blocks: Understanding locks - Citus Data

Of course, you need to be aware of things like pros/cons of building indexes concurrently: PostgreSQL: Documentation: 17: CREATE INDEX, etc.

The query/4 API calls mentioned above want to work with single PostgreSQL commands in auto-commit mode (at least as I recall). To get multiple commands into a single transactional command for the purposes of Ecto SQL/Postgrex, you can wrap them in a PostgreSQL DO block. This gets treated as a single command by Ecto SQL/Postgrex and gives you transactional guarantees for the whole block.

There was mention of trying to use the migrator… I don’t recommend that. You’re going to be fighting the fact that it really wants to apply migrations, not ad hoc SQL, and you’re not really going to gain anything from it over just issuing the raw queries. Also, my expectation is that it’s not going to be possible to achieve the use case you’ve described.

Finally, as pointed out by katafrakt, many of these ALTER and CREATE operations require database privileges that you often times don’t want given to the application database connections. One technique I use to avoid giving the application high/broad database privileges is to write database functions /stored procedures which are owned by a suitably privileged user and allow the function to be executed by the low privileged application database connection via the SECURITY DEFINER execution context. This way the database function logic can limit the privileged operation to a very narrow subset that is required for the application without granting the higher database privileges to the application which could allow unwanted changes to be permitted. You can read more about this here: PostgreSQL: Documentation: 17: CREATE FUNCTION Look for SECURITY DEFINER and be sure that you read all the references on that page (like the warning) if you’re not already well versed in this technique.