Hoping to use Rustler to create Elixir bindings for LanceDB - can you help/guide me?

Really appreciate this and I think it just further confirms that my next attempt is going to be to take @dimitarvp’s approach which doesn’t require async.

Quick follow-up: I don’t like evmap. It requires an explicit .refresh() call after each write before that write becomes visible to reads – and immediate visibility is exactly what we need. We write to the map when (1) opening a connection and (2) closing one. In both cases we want the change to be immediately visible: (1) after opening, a follow-up SQL statement should be able to fetch the actual connection from the map right away, and (2) after closing, the :ok confirmation should be real and not a pretense – once you get :ok, a subsequent lookup should return an :error, i.e. “that connection ID is no longer valid and was not found”.

So yeah, evmap is for cases where you want to delay the visibility of some writes until slightly later. Likely useful for certain caches, maybe? In any case, it’s not a good fit for our purpose of using it as a global registry of connection pools.

@eileennoonan So it’s either DashMap or ConcurrentMap. I’ll check the latter and report back.


Follow-up of the follow-up: ConcurrentMap requires us to wrap our values in Arc to achieve veeeery sliiiiiightly better speed.

Frak that. DashMap was apparently the best choice from the start. I am sticking to it.

Wow! Thanks for checking all this out. I have to say I’m planning to move slow and steady on this so I might not be able to keep up with all your posts as they come in, but thanks to you I now have a clear strategy in mind for managing connections.


Yeah I am done hijacking your thread now. :smiley:

Will ping you after I merge my PR because by then the “blessed” ways will be fleshed out. (Currently working on code that transforms Erlang/Elixir terms to SQLite-native types.)


Do not worry about hijacking! This has been so directly useful!

I’ve been getting some help from the LanceDB Discord, and have successfully passed a LanceDB connection reference from Rust to the BEAM! Using a Rustler ResourceArc and Tokio to turn the async function into a blocking one.

Passing test here:

:green_square:

Rust code here:

There’s also a more fleshed out but still pretty naive elixir-nodejs implementation in the repo, which I’ll be using until the Rust integration is caught up. Very exciting to see that test turn green :slight_smile:


Nice, congratulations on v0.1! :smiley: Always super satisfying to have something that works.

FWIW, I started the whole thing with my SQLite library exactly because I didn’t want a Mutex.

In theory the Mutex shouldn’t be a problem since I heard from the Lance folks that I really only need / want one open connection at a time per DB / table.

On the BEAM side all my workers will hit that one connection, and throughput should still roar. If it does somehow become a bottleneck then I think it will mean this has been a massive success.

That’s exactly the use-case for my SQLite library. Get one handle, use it in a number of OS threads / Erlang processes.

However, the authors of the Rust SQLite library made the connection handle implement Send but not Sync, which means it cannot be freely shared between OS threads (i.e. Erlang / Elixir processes).

My initial / older version of the library (some files in that PR have not been deleted yet) still uses an Arc<Mutex<...>> because of that choice by the rusqlite library authors. But I am not OK with it, because if you have 20 processes hitting the same connection at the same time, they will never work in parallel; they will always take turns because of the Mutex. It will bottleneck your Elixir processes under those conditions. :confused: If they never actually contend for access then yeah, your current implementation is quite enough.

But I was not OK with the idea that I can’t have truly parallel usage – and still am not. I want an actual physical parallelism, that’s why I started the PR to upgrade my library.

I am simply making you aware of which engineering tradeoff you are picking. What use is it to be able to share that object with 20 Elixir processes if they all wait for each other and take turns? Might as well just use one process in that case. :person_shrugging:

Yeah I mean it’s a really incredible implementation you’ve built! I learned a ton from reading through it.

I put your post there into ChatGPT and Claude and have been debating pros and cons all morning.

It’s true that it would be some time before the Mutex becomes a bottleneck.

But it also looks like I might be able to get a huge optimization by just swapping the Mutex for a RwLock. Gonna check in with the folks on the Lance Discord, but this is very promising.


Yep, that would be your best option if you don’t want to go my route. A RwLock is hard to beat in almost any circumstance.

For an LLM I recommend Gemini Pro btw. It has an insanely long context window, conversations are much longer than with ChatGPT or Claude. It pointed out some interesting things that I have also been acting on.

Houston, we have contact. I’ve now shared a persistent Lance database connection reference from Rust to the BEAM, sent it back from the BEAM to Rust, and used it to send some LanceDB metadata from Rust back to the BEAM. As far as I can tell, that’s a world first :tada:

After some help from the LanceDB discord, here’s where I’ve settled for managing connections for now:

use lancedb::Connection;
use once_cell::sync::OnceCell;
use rustler::{Env, ResourceArc, Term};
use std::sync::{Arc, Mutex};
use tokio::runtime::{Builder, Runtime};

// One shared Tokio runtime for the whole NIF library, created lazily on first use.
static RUNTIME: OnceCell<Runtime> = OnceCell::new();

fn get_runtime() -> &'static Runtime {
    RUNTIME.get_or_init(|| {
        Builder::new_current_thread()
            .enable_all()
            .build()
            .expect("Failed to create ElixirLanceDB runtime")
    })
}

// Rustler resources are shared across scheduler threads, hence the Arc<Mutex<...>>.
struct DbConnResource(Arc<Mutex<Connection>>);

#[rustler::nif(schedule = "DirtyCpu")]
fn connect(uri: String) -> ResourceArc<DbConnResource> {
    let conn = get_runtime()
        .block_on(async { lancedb::connect(&uri).execute().await })
        .unwrap();
    ResourceArc::new(DbConnResource(Arc::new(Mutex::new(conn))))
}

#[rustler::nif(schedule = "DirtyCpu")]
fn table_names(conn: ResourceArc<DbConnResource>) -> Vec<String> {
    let conn = db_conn(conn);
    get_runtime().block_on(async { conn.table_names().execute().await.unwrap() })
}

// Clone the Connection out from under the Mutex; the lock guard is a
// temporary, so it is released as soon as the clone completes.
fn db_conn(conn_resource: ResourceArc<DbConnResource>) -> Connection {
    conn_resource.0.lock().unwrap().clone()
}

#[allow(unused, non_local_definitions)]
fn load(env: Env, _: Term) -> bool {
    rustler::resource!(DbConnResource, env);
    true
}

rustler::init!("Elixir.ElixirLanceDB.Native", load = load);

As per the Lance team, the Mutex shouldn’t be a bottleneck as long as I don’t hold the lock for any significant period of time. If I just lock the mutex, clone the connection, and immediately release the lock, it won’t be a bottleneck at all, and I don’t even have to mess around with a RwLock or worry about starvation. It might be something in Lance’s design that allows this – I can’t guarantee it will apply to other embedded DBs.

The other neat thing is the shared static Tokio runtime. The once_cell crate lets me instantiate a single Tokio runtime on first use, which I can then re-use to make every function call sync. Lance’s Python library does something similar.

@bgoosman - it might be worth checking that get_runtime bit out for Kuzu? The current implementation is not a ton of code. I will probably end up needing to expand on it but for now it’s working.

I think this connection / Mutex / sync stuff will turn out to have been the hardest part. From here on out I’m expecting it’s just a lot of fleshing out function implementations. Most of that work will be in encoding/decoding various parameter and config structs.

After that it’s OTP/poolboy and then … the world! dun dun dunnnnn mwahaahahaaaa I’m stoked.

Thanks everyone for your help and encouragement! I’m going to try to keep future convos on the repo so as to avoid spamming this forum too much.


Awesome! I haven’t been keeping up on implementing features, as I no longer have a business use case, but I spent some time a while ago getting an async runtime working in Rustler using tokio to manage Rust WebRTC from Elixir. I’ve kept it compiling and updating dependencies, so maybe it can be helpful to you as another example: GitHub - synchronal/specter: Headless webrtc client

I went about it in the way that it sounds like you are:

  • Get a basic initialization of state that can be passed back to Elixir as a ResourceArc.
  • Use the reference from Elixir to send a message back to the Rust runtime.
  • Rustler functions that mutate state in the Rust runtime. In my case I have WebRTC peer connections, which when initialized get saved into a HashMap with a UUID as the key. I send the UUID back to Elixir.
  • Add a NIF function that starts a tokio task with a loop, returning a reference to the channel sender back to Elixir.
  • Send a message from Elixir to the channel sender, receive the message in the task and handle (or log) the message.
  • Get a pid from Elixir into the Rust runtime. Send a message to that pid from Rust, handled via handle_info.
  • Send a message from Elixir to a tokio task via a channel, keeping track of the current Elixir pid, and returning immediately. When the task finishes its work, send a message from the task back to the Elixir pid.

I found that for my library, most of the tests could be written with ExUnit. There are definitely places where I would write unit tests in Rust, but I would personally let those emerge from specific issues and problems, or where individual functions become complex and difficult to implement with tests as verification. In my case, the integration with Elixir was the more important thing and so became the focus of my tests… if I could get the Rust to compile, I could write a test in Elixir.


This is great, thanks a lot. Will definitely look very closely into your code.