Blink is a library for fast bulk data insertion into PostgreSQL databases using the COPY command. It provides a clean, declarative syntax for defining seeders.
Features:
Uses PostgreSQL’s COPY for fast bulk inserts
Tables inserted in declaration order to respect foreign key constraints
Access data from previously defined tables when building subsequent tables
Store auxiliary context data that won’t be inserted into the database
Load data from CSV/JSON files with Blink.from_csv/2 and Blink.from_json/2 (see the sketch after this list)
:transform option for type conversion when loading from files
Integrates nicely with ExMachina (see the sketch after the full example below)
Rollback on errors
Adapter pattern for supporting other databases
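For the file loaders, a table/2 clause can simply delegate to them. A minimal sketch, assuming Blink.from_csv/2 takes a file path plus an options list and that :transform receives each parsed row; the path, column name, and exact option shape are illustrative assumptions rather than confirmed API:

def table(_store, :countries) do
  Blink.from_csv("priv/repo/seeds/countries.csv",
    # Assumed option shape: cast the "population" column from string to integer.
    transform: fn row ->
      Map.update!(row, "population", &String.to_integer/1)
    end
  )
end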
Example:
defmodule MyApp.Seeder do
  use Blink

  def call do
    new()
    |> add_table(:users)
    |> add_table(:posts)
    |> insert(MyApp.Repo)
  end

  def table(_store, :users) do
    [
      %{id: 1, name: "Alice", email: "alice@example.com"},
      %{id: 2, name: "Bob", email: "bob@example.com"}
    ]
  end

  def table(store, :posts) do
    users = store.tables.users
    # Build posts referencing users...
  end
end
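The ExMachina integration mentioned in the feature list boils down to building plain maps inside a table/2 clause. A minimal sketch, assuming a MyApp.Factory module defined with ExMachina; the factory name, fields, and the Map.from_struct/Map.drop cleanup are illustrative assumptions, not Blink API:

def table(_store, :users) do
  import MyApp.Factory

  Enum.map(1..50, fn i ->
    # build/2 comes from ExMachina; convert the struct to a plain map and
    # drop Ecto's metadata field before handing it to Blink.
    build(:user, id: i, email: "user#{i}@example.com")
    |> Map.from_struct()
    |> Map.drop([:__meta__])
  end)
end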
I’ve read the code and found a couple of fairly obvious bugs (such as non-escaped strings in the generated CSV) and limitations (such as reading everything into memory), so I made a PR with fixes.
I also offer fairly cheap consultancy services if you’d like this kind of review and contribution in your private projects.
Version 0.5.0 is now available. This release marks a big step toward 1.0.0 — it covers all the major changes I had planned. Now the focus shifts to gathering feedback, fixing bugs, and addressing any remaining breaking changes before 1.0.0 (though I don’t have any in mind).
The headline feature is stream support, which enables memory-efficient seeding of large datasets.
Both table/2 clauses return streams in the example below, but returning lists still works as before.
defmodule Blog.Seeder do
  use Blink

  def call do
    new()
    |> with_table("users")
    |> with_table("posts")
    |> run(Blog.Repo, timeout: :infinity)
  end

  def table(_seeder, "users") do
    Stream.map(1..200_000, fn i ->
      %{
        id: i,
        name: "User #{i}",
        email: "user#{i}@example.com",
        ...
        inserted_at: ~U[2024-01-01 00:00:00Z],
        updated_at: ~U[2024-01-01 00:00:00Z]
      }
    end)
  end

  def table(seeder, "posts") do
    users_stream = seeder.tables["users"]

    Stream.flat_map(users_stream, fn user ->
      for i <- 1..20 do
        %{
          id: (user.id - 1) * 20 + i,
          title: "Post #{i} by #{user.name}",
          body: "This is the content of post #{i}",
          user_id: user.id,
          ...
          inserted_at: ~U[2024-01-01 00:00:00Z],
          updated_at: ~U[2024-01-01 00:00:00Z]
        }
      end
    end)
  end
end
Other highlights
JSONB support — nested maps are automatically JSON-encoded during insertion (see the sketch after this list)
Configurable timeout — :timeout option for long-running transactions
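To illustrate the JSONB point, a table/2 clause can return rows whose values include nested maps, which are JSON-encoded automatically during insertion. A sketch where the "profiles" table and its settings column are invented for illustration:

def table(seeder, "profiles") do
  Stream.map(seeder.tables["users"], fn user ->
    %{
      id: user.id,
      user_id: user.id,
      # Nested map destined for a jsonb column; it is JSON-encoded on insert.
      settings: %{theme: "dark", notifications: %{email: true, push: false}},
      inserted_at: ~U[2024-01-01 00:00:00Z],
      updated_at: ~U[2024-01-01 00:00:00Z]
    }
  end)
end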