FeistelCipher, AshFeistelCipher - Encrypted integer IDs using Feistel cipher

I’m excited to share FeistelCipher and AshFeistelCipher, PostgreSQL-based libraries that provide encrypted integer IDs using the Feistel cipher algorithm.

The Problem

Sequential IDs (1, 2, 3…) expose sensitive business information:

  • Competitors can estimate your growth rate
  • Users can enumerate resources (/posts/1, /posts/2…)
  • Total record counts are revealed

Common solutions have their own issues:

  • UUIDs: Fixed 36 characters for everything - overkill for most use cases
  • Random integers: Collision risks and complex generation logic

Our Solution

FeistelCipher provides a different approach:

  • Store sequential integers internally
  • Expose encrypted integers externally (non-sequential, unpredictable)
  • Adjustable bit size per column: User ID = 40 bits, Post ID = 52 bits
  • Automatic encryption via PostgreSQL triggers

Key Features

  • Deterministic & Collision-free: One-to-one mapping within the bit range
  • Fast: ~4.4μs per encryption (benchmarked on Apple M3 Pro)

Usage

FeistelCipher (Ecto)

Migration:

defmodule MyApp.Repo.Migrations.CreatePosts do
  use Ecto.Migration

  def up do
    create table(:posts) do
      add :seq, :bigserial
      add :title, :string
    end

    execute FeistelCipher.up_for_trigger("public", "posts", "seq", "id")
  end

  def down do
    execute FeistelCipher.down_for_trigger("public", "posts", "seq", "id")
    drop table(:posts)
  end
end

Schema:

defmodule MyApp.Post do
  use Ecto.Schema

  schema "posts" do
    field :seq, :id, read_after_writes: true
    field :title, :string
  end
  
  @derive {Jason.Encoder, except: [:seq]}  # Hide seq in API responses
end

Usage:

%Post{title: "Hello"} |> Repo.insert()
# => %Post{id: 8234567, seq: 1, title: "Hello"}

The seq column auto-increments, and the trigger automatically encrypts it into the id column.

AshFeistelCipher (Ash Framework)

For Ash Framework users, AshFeistelCipher provides a cleaner, declarative syntax:

defmodule MyApp.Post do
  use Ash.Resource,
    data_layer: AshPostgres.DataLayer,
    extensions: [AshFeistelCipher]

  postgres do
    table "posts"
    repo MyApp.Repo
  end

  attributes do
    integer_sequence :seq
    encrypted_integer_primary_key :id, from: :seq
    
    attribute :title, :string, allow_nil?: false
  end
end

Run mix ash.codegen to generate migrations with automatic trigger configuration.

Links

14 Likes

Implementing the cipher in SQL is equal parts horrifying and brilliant. I see it comes right from the Postgres wiki so I won’t argue with that!

In the example it looks like the sequential id is stored but the random id is the one used as the primary key. Am I understanding that correctly? If so you are losing all of the benefits of a sequential primary key, no?

Also, I think providing a default salt is a dangerous footgun. Force the user to generate a random salt (like Phoenix’s secret_key_base).

Thank you for the excellent feedback! You’ve identified some important points that deserve clarification.

On Primary Key Performance Trade-offs

You’re absolutely right that using the encrypted id as a primary key loses the benefits of a sequential primary key. This is intentional - it’s the same trade-off that UUIDv4 has (random ordering causes B-tree page splits).

However, the library supports an alternative pattern: Keep id as a sequential primary key and encrypt a separate disp_id column for public display:

create table(:posts, primary_key: false) do
  add :id, :bigserial, primary_key: true    # Sequential, internal
  add :disp_id, :bigint                      # Encrypted, external
  add :title, :string
end

execute FeistelCipher.up_for_trigger("public", "posts", "id", "disp_id")

This gives you sequential PK performance while still hiding growth patterns externally.

Regarding encryption overhead: The encryption takes microseconds while typical INSERT/UPDATE operations involving disk writes (WAL, index updates) take milliseconds, making the encryption overhead negligible. For high-volume inserts or frequent sequential scans over large datasets, this library may not be the optimal choice.

This library targets typical web applications where security/privacy outweighs marginal insert/update performance. I’ve added a “Performance Considerations” section to the README to make these trade-offs explicit.

On Default Salt

You’re 100% correct - this is a security issue. Having all projects share the same default salt means analyzing one project’s encryption could compromise others.

I’ve just released v0.13.0 that automatically generates a unique random salt during installation. Each project now gets its own salt without any manual intervention.

Thanks again for taking the time to review this thoroughly!

8 Likes

BTW UUIDs are not “36 characters”, they are 16 bytes. Nobody is storing the base16 encoding in the DB. It’s for humans.

Wouldn’t we still want/need a secondary index for disp_id to lookup – assuming growing number of rows – the record on an outside request?

So that’s still more work on top of the serial PK?

2 Likes

Generally when you encrypt sequential ids like this you decrypt them before doing a lookup. But since the encryption takes place in the database that might be a little more difficult!

At least internal queries and joins would hit the good index, though.

For the record, in Postgres the problem is less Btree fragmentation and more the stupid visibility map thing which is fundamentally a consequence of their very bad storage engine. Not that fragmenting Btrees is a good thing either, but it kinda happens.

@byu Yeah, disp_id needs an index. I’ve updated the README just now.

Non-PK disp_id can be annoying with tools like TablePlus though. If you have a user’s disp_id from a URL and want to see their posts, you’d need to look up the user table first to get the real id. Using the encrypted id as PK (seq->id pattern) avoids this.

1 Like

The problem is that if you use the random id as the PK the only remaining benefit of this approach is that it avoids collisions. With a random 64 bit id there is a 50% chance of having had a single collision after roughly 4 billion inserts. So avoiding collisions alone is not very useful.

A third approach would be to implement the encryption in Elixir instead and use an Ecto Type to perform the conversion so that you can perform reads against the sequential key. This is an old trick, but I’m not sure if we have a well-maintained Elixir library for it.

I prefer systems with mathematical guarantees over probabilistic ones.

Random IDs require more bits to keep collision probability acceptable. Feistel’s collision-free guarantee allows fewer bits for human-friendly short IDs.

The deterministic nature also provides reproducible seed data with stable URLs, which random IDs can’t offer.

Regarding the Ecto Type approach: that would create a mismatch between DB values and URL values, making debugging with tools like TablePlus more difficult since you’d need to decrypt IDs to query the database.

2 Likes

The database will prevent a collision using the primary key index. So there is no probabilistic risk of corrupted data, even with small keys.

But nobody is going down to 4-byte ids (we have learned our lesson there), so it’s 8 bytes either way.

I am not saying your approach is wrong, what I’m saying is that you want to use the sequential id as the primary key. If you use the display id as the primary key you are getting the same performance characteristics as a random id, and you’re storing 16 bytes worth of ids! That’s the same as a UUID! The values in the index are random, so you fragment the Btree and mess with the visibility map. By using the random id as the PK you are losing all of the benefits of your approach.

Sure they can! You can seed the PRNG.

Yes, it’s the same tradeoff. But the main advantage of your library is the performance benefits, so that’s the side of the tradeoff you should be taking.

Of course I know the PK index prevents collisions. But you need to handle the retry logic in your application when collisions do occur. All that just to save 8 bytes.

If we’ve learned any lesson, it’s that “one size fits all” doesn’t work. A 62 bits (8 bytes) Feistel cipher produces 19-digit IDs—way too long for user-friendly URLs. A B2B SaaS doesn’t need that. With 32 bits (4 bytes) for company IDs, you get 10-digit numbers with 4 billion capacity. Salesforce, one of the largest B2B SaaS companies, has around 150,000 customers—that’s 0.004% of 32-bit capacity. Different scales need different solutions. That’s the actual lesson.

You can’t say “your approach isn’t wrong” and then tell me what I “want” to do in the same sentence.

P.S. Yes, 32 bits still gets stored in an 8-byte bigint. If someone really wants to save those 4 bytes, they can send a PR.