Blink - Fast bulk seeding for Ecto/PostgreSQL with clean, declarative syntax

Blink is a library for fast bulk data insertion into PostgreSQL databases using the COPY command. It provides a clean, declarative syntax for defining seeders.

Features:

  • Uses PostgreSQL’s COPY for fast bulk inserts
  • Tables inserted in declaration order to respect foreign key constraints
  • Access data from previously defined tables when building subsequent tables
  • Store auxiliary context data that won’t be inserted into the database
  • Load data from CSV/JSON files with Blink.from_csv/2 and Blink.from_json/2 (sketched after the example below)
  • :transform option for type conversion when loading from files
  • Integrates nicely with ExMachina
  • Rollback on errors
  • Adapter pattern for supporting other databases

Example:

defmodule MyApp.Seeder do
  use Blink

  def call do
    new()
    |> add_table(:users)
    |> add_table(:posts)
    |> insert(MyApp.Repo)
  end

  def table(_store, :users) do
    [
      %{id: 1, name: "Alice", email: "alice@example.com"},
      %{id: 2, name: "Bob", email: "bob@example.com"}
    ]
  end

  def table(store, :posts) do
    users = store.tables.users

    # Build posts referencing the users defined above
    Enum.map(users, fn user ->
      %{id: user.id, title: "Post by #{user.name}", user_id: user.id}
    end)
  end
end
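
For the file-loading feature, a minimal sketch. The file path is hypothetical, and the exact shape of :transform is an assumption (here, a per-row function over maps with string keys):

def table(_store, :users) do
  Blink.from_csv("priv/seeds/users.csv",
    # Assumed shape: a per-row function used for type conversion
    transform: fn row -> Map.update!(row, "id", &String.to_integer/1) end
  )
end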

11 Likes

Good library.

I’ve read the code and found a couple of fairly obvious bugs (like unescaped strings in the generated CSV) and limitations (like reading everything into memory), so I made a PR with fixes.

I also offer fairly cheap consultancy services if you want this kind of review and contribution in your private projects.

2 Likes

Great. Ty.

I was aware of the memory issue and had a fix in mind similar to the one in your PR. I’ll have a closer look when I have time.

v0.5.0 Released

Version 0.5.0 is now available. This release marks a big step toward 1.0.0 — it covers all the major changes I had planned. Now the focus shifts to gathering feedback, fixing bugs, and addressing any remaining breaking changes before 1.0.0 (though I don’t have any in mind).

The headline feature is stream support, which enables memory-efficient seeding of large datasets.

Both table/2 clauses return streams in the example below, but returning lists still works as before.

defmodule Blog.Seeder do
  use Blink

  def call do
    new()
    |> with_table("users")
    |> with_table("posts")
    |> run(Blog.Repo, timeout: :infinity)
  end

  def table(_seeder, "users") do
    Stream.map(1..200_000, fn i ->
      %{
        id: i,
        name: "User #{i}",
        email: "user#{i}@example.com",
        ...
        inserted_at: ~U[2024-01-01 00:00:00Z],
        updated_at: ~U[2024-01-01 00:00:00Z]
      }
    end)
  end
  
  def table(seeder, "posts") do
    users_stream = seeder.tables["users"]

    Stream.flat_map(users_stream, fn user ->
      for i <- 1..20 do
        %{
          id: (user.id - 1) * 20 + i,
          title: "Post #{i} by #{user.name}",
          body: "This is the content of post #{i}",
          user_id: user.id,
          ...
          inserted_at: ~U[2024-01-01 00:00:00Z],
          updated_at: ~U[2024-01-01 00:00:00Z]
        }
      end
    end)
  end
end

Other highlights

  • JSONB support — nested maps are automatically JSON-encoded during insertion (sketched below)
  • Configurable timeout — :timeout option for long-running transactions
  • Configurable batch size — :batch_size option controls stream chunking (default: 10,000 rows)
  • Performance improvement — CSV encoding executes significantly faster
  • Bug fix — CSV escaping now correctly handles pipes, quotes, newlines, and backslashes
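
To illustrate the first three items, a hedged sketch: the posts table and its metadata jsonb column are assumptions here, while :timeout and :batch_size are the documented options.

def call do
  new()
  |> with_table("posts")
  |> run(Blog.Repo, timeout: :infinity, batch_size: 50_000)
end

def table(_seeder, "posts") do
  [
    %{
      id: 1,
      title: "Hello",
      # Assuming a jsonb column named metadata: this nested map
      # is JSON-encoded automatically during insertion
      metadata: %{tags: ["elixir", "postgres"], draft: false}
    }
  ]
end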

Breaking changes

  • Blink.Store → Blink.Seeder
  • insert/3 → run/3
  • add_table/2 → with_table/2
  • add_context/2 → with_context/2
  • Return values simplified to :ok (raises on failure)
  • Adapter call/4 callback now receives table_name as a string
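
For illustration, a typical pipeline migrates like this under the renames above (note that table names are strings in the new API):

# Before
new()
|> add_table(:users)
|> insert(Blog.Repo)

# After (0.5.0)
new()
|> with_table("users")
|> run(Blog.Repo)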

Full changelog: v0.5.0 release

2 Likes

You missed a couple of other important things from my PR:

  1. Doing

    try do
      adapter.call(...)
    rescue
      UndefinedFunctionError ->
        raise "Module #{inspect adapter} must implement call/4"
    end
    

    is a strange approach. Removing the try completely would result in a more readable and meaningful exception.

    Plus, it is a buggy approach. Take, for example, a situation where the call function itself calls an undefined function. This try clause would hide that error, making debugging a nightmare.

  2. Your new approach opens and parses a CSV file twice in stream mode: once to get the headers and a second time to stream the data. This is not an issue when there is one huge file, but it is an issue when there are a lot of small files. Opening a file is a more expensive operation than reading from one.
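
For reference, a single pass can be sketched with Stream.transform, carrying the header row as accumulator state so the file is opened only once. This is an illustration, not the actual fix from the PR, and the naive comma split ignores quoting:

def stream_csv(path) do
  path
  |> File.stream!()
  |> Stream.transform(nil, fn
    # First line: keep the headers as state, emit no rows
    line, nil ->
      {[], line |> String.trim() |> String.split(",")}

    # Later lines: zip values with the stored headers
    line, headers ->
      values = line |> String.trim() |> String.split(",")
      {[headers |> Enum.zip(values) |> Map.new()], headers}
  end)
end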

2 Likes
  1. Yes, I see. That does make sense. I removed the try-rescue block.
  2. Also changed this now.

Together these changes make up version 0.5.1 (see the changelog).

Thank you.

I’m currently exploring concurrent database connections for faster seeding while keeping the API clean.

1 Like