UUIDv7 - A UUID v7 implementation and Ecto.Type for Elixir - based on Rust

Hello everybody :wave:

Recently, some of my colleagues talked about database ids and uuids and their problems, and I remembered the pain of working with randomly distributed primary keys. They’re nice at first but then you have to index and order by a different field like created_at, while serial and bigserial already have an order and are indexed as the primary key.

UUIDv6, UUIDv7 and UUIDv8 are new standards to deal with issues found in UUIDv4 and earlier. I especially liked this post that analyzes the new standards: https://blog.devgenius.io/analyzing-new-unique-identifier-formats-uuidv6-uuidv7-and-uuidv8-d6cc5cd7391a

My favourite is UUIDv7 because it’s like UUIDv4, except that the first 48 bits (the first 12 hex characters) hold a millisecond timestamp; it seems like a very small change.
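
To make that layout concrete, here is a minimal pure-Elixir sketch of how a UUIDv7 can be assembled: a 48-bit millisecond timestamp, a 4-bit version, 12 random bits, a 2-bit variant, and 62 more random bits. The module name `UUIDv7Sketch` is hypothetical, and this is not how the Rust-backed library is implemented; it only illustrates the bit layout.

```elixir
defmodule UUIDv7Sketch do
  @moduledoc "Illustrative UUIDv7 bit layout; NOT the library's implementation."

  # Returns a UUIDv7 as a 36-character lowercase hex string.
  def generate do
    # 48 bits: Unix time in milliseconds, so later IDs sort after earlier ones.
    timestamp_ms = System.system_time(:millisecond)

    # 74 random bits are needed (12 + 62); take 10 bytes and discard 6 bits.
    <<rand_a::12, rand_b::62, _unused::6>> = :crypto.strong_rand_bytes(10)

    # The 4-bit version (7) and the 2-bit variant (binary 10) sit at fixed positions.
    format(<<timestamp_ms::48, 7::4, rand_a::12, 2::2, rand_b::62>>)
  end

  # Splits the 16 bytes into the usual 8-4-4-4-12 hex groups.
  defp format(
         <<a::binary-size(4), b::binary-size(2), c::binary-size(2), d::binary-size(2),
           e::binary-size(6)>>
       ) do
    Enum.map_join([a, b, c, d, e], "-", &Base.encode16(&1, case: :lower))
  end
end

IO.puts(UUIDv7Sketch.generate())
```

The third group always starts with `7` (the version), and the fourth group starts with `8`, `9`, `a` or `b` (the variant); everything before the version nibble is the timestamp.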

Elixir doesn’t have a common implementation of UUIDv7 yet, and Ecto’s built-in Ecto.UUID generates UUIDv4. So I decided to build one based on a Rust package https://crates.io/crates/uuid, which is relatively mature.

The new library is called UUIDv7 and is available on Hex (UUIDv7 - Hex).

Because it’s based on Rust, the UUID generation is a whopping 72% faster than the default Ecto.UUID version 4 generator. NIFs are precompiled and generated for most platforms.

It’s easy to set up; you only have to change one line:

defmodule App.Schemas.User do
  use Ecto.Schema
  @primary_key {:id, UUIDv7, autogenerate: true}
end

You can verify the UUIDs are ordered by running a small test:

uuid1 = UUIDv7.generate()
uuid2 = UUIDv7.generate()
uuid3 = UUIDv7.generate()
uuid4 = UUIDv7.generate()

assert uuid1 < uuid2
assert uuid2 < uuid3
assert uuid3 < uuid4

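Though you may have to add Process.sleep(1) between the generations so that each one lands in a different millisecond: UUIDv7 starts with a millisecond timestamp and ends with random bits, so two UUIDs generated within the same millisecond are not guaranteed to be ordered.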
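
As a rough illustration of why the ordering works across milliseconds, the sketch below reads the timestamp back out of a UUIDv7 string. The module name `UUIDv7Inspect` is hypothetical (not part of the library), and the two sample values are IDs posted in this thread.

```elixir
defmodule UUIDv7Inspect do
  # Reads the 48-bit millisecond timestamp from a UUIDv7 string:
  # the first 12 hex characters once the dashes are removed.
  def timestamp_ms(uuid) when is_binary(uuid) do
    uuid
    |> String.replace("-", "")
    |> String.slice(0, 12)
    |> String.to_integer(16)
  end
end

first = "0188f846-191e-7f32-81f1-871f64b71d6b"
second = "0188f846-1ae4-7205-8cf4-ac0be8a620a1"

# Different milliseconds: plain string comparison agrees with timestamp order.
IO.inspect(UUIDv7Inspect.timestamp_ms(first) < UUIDv7Inspect.timestamp_ms(second))
IO.inspect(first < second)

# Creation time of the first ID as a DateTime.
first
|> UUIDv7Inspect.timestamp_ms()
|> DateTime.from_unix!(:millisecond)
|> IO.inspect()
```

If two IDs share the same 12-character prefix, the comparison falls through to the random bits, which is why a short sleep can be needed in a test.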

Since the performance difference between Rust-based UUID generation and Ecto.UUID is so large, maybe it could be a motivation to write other, more commonly used functions as NIFs in more compute-efficient languages?

You can check the benchmark here: UUIDv7 - Benchmark

GitHub: UUIDv7 - GitHub
Hex: UUIDv7 - Hex


Thank you for making this!


New proposed standards. While it looks like UUID v6+ have made a lot of progress, they haven’t been adopted as standards yet: draft-ietf-uuidrev-rfc4122bis-14 - Universally Unique IDentifiers (UUID)

This isn’t to say that what ends up being adopted will be much different than has appeared in the various drafts, but for some the formality can matter.

As an aside, it looks like the scope of the proposed standard has expanded as they’ve progressed through the process to actually revising the UUID v1 through v5 standards as well: Revise Universally Unique Identifier Definitions (uuidrev)… the revisions to previous standards are very limited to doing things like correcting errata (unsurprisingly).


Also there is uniq for UUIDv7 generation - pure Elixir afaik.


I wonder what’s the difference between that and ULID?


First of all, ULID is not a valid UUID. It has a compatible length (128 bits), but not a compatible format.
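
A small sketch of the format difference, assuming the canonical text forms: a UUIDv7 is 8-4-4-4-12 hex digits with fixed version and variant positions, while a ULID is 26 Crockford Base32 characters with no such fields. The module name `IdFormat` is hypothetical, and the ULID sample is the example value from the ULID spec.

```elixir
defmodule IdFormat do
  # Classifies an ID string by its textual format.
  def classify(id) when is_binary(id) do
    cond do
      uuid_v7?(id) -> :uuid_v7
      ulid?(id) -> :ulid
      true -> :unknown
    end
  end

  # 8-4-4-4-12 hex digits; version nibble is 7, variant nibble is 8, 9, a or b.
  defp uuid_v7?(id) do
    Regex.match?(
      ~r/\A[0-9a-f]{8}-[0-9a-f]{4}-7[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}\z/i,
      id
    )
  end

  # 26 Crockford Base32 characters (no I, L, O, U); the first is at most 7
  # because 26 * 5 = 130 bits must fit into 128.
  defp ulid?(id) do
    Regex.match?(~r/\A[0-7][0-9A-HJKMNP-TV-Z]{25}\z/, id)
  end
end

IO.inspect(IdFormat.classify("0188f846-191e-7f32-81f1-871f64b71d6b"))
IO.inspect(IdFormat.classify("01ARZ3NDEKTSV4RRFFQ69G5FAV"))
```

Both formats start with a 48-bit millisecond timestamp, but a ULID uses all of the remaining 80 bits for randomness, so its bytes generally do not carry the version and variant markers a UUID parser expects.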


Yup, UUIDv7 and Uniq are compatible because of the shared standard, but uniq is a bit slower because it’s 100% Elixir. Its speed is comparable to Ecto.UUID.

iex(1)> UUIDv7.generate()
"0188f846-191e-7f32-81f1-871f64b71d6b"
iex(2)> Uniq.UUID.uuid7()
"0188f846-1ae4-7205-8cf4-ac0be8a620a1"

Benchmark:

Name                     ips        average  deviation         median         99th %
uuidv7                1.75 M      570.22 ns  ±3940.19%         500 ns         667 ns
uniq (uuid v7)        1.07 M      937.20 ns  ±1852.78%         916 ns        1000 ns
ecto (uuid v4)        1.02 M      978.17 ns  ±1593.54%         958 ns        1042 ns

Comparison:
uuidv7                1.75 M
uniq (uuid v7)        1.07 M - 1.64x slower +366.98 ns
ecto (uuid v4)        1.02 M - 1.72x slower +407.95 ns

Are there any plans for this to be made part of the Ecto standard library?

I can’t say for everyone, but I don’t think so.

You can always send a proposal, but since it can be used easily as an external library, I don’t think there’s a valid reason.


I think using UUIDv7 can make it easier to do horizontal scaling (using multiple DB servers, i.e. sharding). Correct? Is there any tutorial on the topic? Thanks.

A quote from https://uuid7.com/

  • Concurrency and Distribution: In distributed systems, generating unique, sequential IDs can be a challenge. UUIDv7 can be generated concurrently across multiple nodes without the risk of collisions, making it suitable for distributed architectures.
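
A minimal sketch of that claim, with concurrent processes on one machine standing in for separate nodes: each worker builds UUIDv7-shaped binaries without any coordination, and the result is checked for duplicates. The module `ConcurrentIds` and its functions are hypothetical, and the generator is a simplified pure-Elixir stand-in rather than the library. Collisions are extremely unlikely rather than impossible, since uniqueness within a millisecond rests on 74 random bits.

```elixir
defmodule ConcurrentIds do
  # Builds one UUIDv7-shaped 16-byte binary:
  # 48-bit ms timestamp, version 7, 12 random bits, variant 10, 62 random bits.
  def generate do
    <<rand_a::12, rand_b::62, _unused::6>> = :crypto.strong_rand_bytes(10)
    <<System.system_time(:millisecond)::48, 7::4, rand_a::12, 2::2, rand_b::62>>
  end

  # Generates `per_worker` IDs in each of `workers` concurrent processes,
  # with no shared counter or lock between them.
  def generate_concurrently(workers, per_worker) do
    1..workers
    |> Task.async_stream(fn _worker -> for _ <- 1..per_worker, do: generate() end)
    |> Enum.flat_map(fn {:ok, ids} -> ids end)
  end
end

ids = ConcurrentIds.generate_concurrently(8, 1_000)
IO.puts("generated: #{length(ids)}, unique: #{ids |> Enum.uniq() |> length()}")
```

Note that IDs coming from different workers are only roughly time-ordered: clocks can differ between nodes, and IDs created within the same millisecond have no defined order.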

Many will probably stick with UUIDv4 for years to come, but this looks great :slight_smile:


@martinthenth Could you please make ecto an optional dependency? I don’t see why it’s required since not every project that would use your package would really need ecto stuff and so there would be 3 dependencies less (which btw. is half of all deps).

Required: uuidv7 -> rustler_precompiled -> castore
Not required: ecto -> decimal + telemetry

Also one question … Previous comments mention that it’s just a proposal, so I wonder if it’s possible that the output of the generate/0 function could change in the future?


UUID v6-v8 were accepted in May this year: RFC 9562: Universally Unique IDentifiers (UUIDs)


Thanks! In that case I would ask you to update the documentation, as I did not find any mention of the RFC either on hexdocs or in GitHub’s readme file.

Thanks for the suggestion. You’re right that ecto is not required when using it without the Ecto.Type, and I’ll add that to the roadmap for v1.0.0. When I find some time to work on that, I’ll also remove the dependency on rustler and make it a full Elixir-based UUID v7 generator. I’ll extend the test suite to make sure it won’t break anything and remains fully compliant with the spec.


In my case I need to find a time when it would not be so hot :hot_face:

Oh, really? I really liked that your solution is focused on performance. I have already used it and, as you mentioned, I needed to call Process.sleep(1). Bearing in mind that I’m also doing some metaprogramming on top of it, that shows how fast the whole code is, and I’m really counting on it since performance is what I really need. :see_no_evil:

Just to be clear, making ecto optional is a completely different case from rustler and rustler_precompiled: it’s completely fine and even desirable to have them as dependencies, given the performance results in the benchmarks. :muscle:

Will definitely wait for the 1.0.0 release. Hopefully it will not be so hot, as I have a real-world use case for fast UUID generation … :thinking:

Thank you very much for your work and time. Having such packages is indisputably important and very helpful. :heart:
