Arcana - Embeddable RAG library for Elixir/Phoenix

georgeguimaraes · December 30, 2025, 6:22pm

I’m excited to share Arcana, a RAG (Retrieval Augmented Generation) library I’ve been building for Elixir/Phoenix applications.

What is it?

Arcana lets you add vector search, document retrieval, and AI-powered Q&A to any Phoenix app. It’s designed to be embeddable - it uses your existing Ecto Repo and PostgreSQL with pgvector, no separate vector database needed.

Key Features

Simple API for basic RAG:

  {:ok, doc} = Arcana.ingest("Your content", repo: MyApp.Repo)
  results = Arcana.search("query", repo: MyApp.Repo, mode: :hybrid)
  {:ok, answer} = Arcana.ask("What is Elixir?", repo: MyApp.Repo, llm: "openai:gpt-4o")

Agentic RAG Pipeline for complex questions:

  ctx =
    Agent.new("Compare Elixir and Erlang")
    |> Agent.select(collections: ["elixir-docs", "erlang-docs"])
    |> Agent.expand()
    |> Agent.decompose()
    |> Agent.search()
    |> Agent.rerank(threshold: 7)
    |> Agent.answer(self_correct: true)

The pipeline includes query rewriting, collection routing, query expansion, question decomposition, re-ranking, and self-correcting answers, which reduce hallucinations by verifying that each answer is grounded in the retrieved context.

Embeddings with Nx/Bumblebee:

Local embeddings with bge-small-en-v1.5 - no API keys needed
Or use OpenAI embeddings
Or bring your own via the Arcana.Embedder behaviour

Pluggable Everything:

Every pipeline step has a behaviour you can implement:

  defmodule MyApp.CrossEncoderReranker do
    @behaviour Arcana.Agent.Reranker

    @impl true
    def rerank(_question, chunks, _opts) do
      # Your cross-encoder logic goes here.
      # This placeholder returns the chunks unchanged.
      scored_chunks = chunks
      {:ok, scored_chunks}
    end
  end
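To show the shape of this pattern without depending on Arcana itself, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: the RerankerSketch behaviour is hypothetical and is not Arcana’s actual callback spec, chunks are assumed to be maps with a :text key, and the 0 to 10 score scale is only a guess by analogy with the threshold: 7 option above. A naive keyword-overlap score stands in for a real cross-encoder.

```elixir
defmodule RerankerSketch do
  # Hypothetical behaviour for illustration; Arcana's real callback may differ.
  @callback rerank(question :: String.t(), chunks :: [map()], opts :: keyword()) ::
              {:ok, [map()]} | {:error, term()}
end

defmodule KeywordOverlapReranker do
  @behaviour RerankerSketch

  @impl true
  def rerank(question, chunks, opts) do
    # Chunks scoring below the threshold are dropped (assumed 0..10 scale).
    threshold = Keyword.get(opts, :threshold, 0)
    terms = tokenize(question)

    scored =
      chunks
      |> Enum.map(fn chunk -> Map.put(chunk, :score, score(terms, chunk.text)) end)
      |> Enum.filter(&(&1.score >= threshold))
      |> Enum.sort_by(& &1.score, :desc)

    {:ok, scored}
  end

  # Fraction of question terms found in the chunk, scaled to 0..10.
  defp score(terms, text) do
    if MapSet.size(terms) == 0 do
      0.0
    else
      matched = MapSet.intersection(terms, tokenize(text)) |> MapSet.size()
      10 * matched / MapSet.size(terms)
    end
  end

  # Lowercase the text and split it into a set of alphanumeric tokens.
  defp tokenize(text) do
    text
    |> String.downcase()
    |> String.split(~r/[^a-z0-9]+/, trim: true)
    |> MapSet.new()
  end
end
```

Calling KeywordOverlapReranker.rerank("elixir beam", chunks, threshold: 5) keeps only the chunks that mention at least half of the question terms. A real reranker would swap the score/2 helper for a model call and keep the same return shape.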

Other highlights:

Hybrid search (vector + fulltext with Reciprocal Rank Fusion)
In-memory HNSWLib backend for testing or smaller apps
Built-in telemetry for all operations
Evaluation metrics (MRR, Recall@k, Precision@k)
Optional LiveView dashboard
File ingestion (text, markdown, PDF)
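
For readers new to the terms in the list above, here is a rough sketch of Reciprocal Rank Fusion and of the three retrieval metrics, written in plain Elixir. This is not Arcana’s implementation: the module names and function signatures are made up for illustration, k = 60 is just the constant commonly used for RRF, and the retrieved lists are assumed to contain unique IDs ordered best first.

```elixir
defmodule RRFSketch do
  # Commonly used RRF constant; treat it as a tunable assumption.
  @k 60

  # Takes a list of ranked ID lists (best first) and returns {id, score}
  # pairs, highest fused score first. Each list contributes 1 / (k + rank).
  def fuse(rankings, k \\ @k) do
    rankings
    |> Enum.flat_map(fn ranking ->
      ranking
      |> Enum.with_index(1)
      |> Enum.map(fn {id, rank} -> {id, 1.0 / (k + rank)} end)
    end)
    |> Enum.reduce(%{}, fn {id, score}, acc ->
      Map.update(acc, id, score, &(&1 + score))
    end)
    |> Enum.sort_by(fn {_id, score} -> score end, :desc)
  end
end

defmodule RetrievalMetricsSketch do
  # Precision@k: share of the top-k results that are relevant.
  def precision_at_k(retrieved, relevant, k) when k > 0 do
    hits(retrieved, relevant, k) / k
  end

  # Recall@k: share of all relevant items that appear in the top k.
  def recall_at_k(_retrieved, [], _k), do: 0.0

  def recall_at_k(retrieved, relevant, k) when k > 0 do
    hits(retrieved, relevant, k) / length(relevant)
  end

  # Reciprocal rank: 1 / position of the first relevant result, else 0.0.
  def reciprocal_rank(retrieved, relevant) do
    case Enum.find_index(retrieved, &(&1 in relevant)) do
      nil -> 0.0
      index -> 1.0 / (index + 1)
    end
  end

  # MRR: mean reciprocal rank over a list of {retrieved, relevant} pairs.
  def mrr([]), do: 0.0

  def mrr(queries) do
    total =
      queries
      |> Enum.map(fn {retrieved, relevant} -> reciprocal_rank(retrieved, relevant) end)
      |> Enum.sum()

    total / length(queries)
  end

  defp hits(retrieved, relevant, k) do
    retrieved |> Enum.take(k) |> Enum.count(&(&1 in relevant))
  end
end
```

For example, RRFSketch.fuse([["a", "b", "c"], ["b", "d", "a"]]) ranks "b" first, because it sits near the top of both lists, followed by "a", "d", and "c". The point of fusing on ranks rather than raw scores is that vector distances and fulltext scores are not on comparable scales.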

Links

GitHub: georgeguimaraes/arcana (Embeddable RAG library for Elixir/Phoenix with agentic pipelines and dashboard)
Hex: arcana
Demo app with Doctor Who corpus: georgeguimaraes/arcana-adept on GitHub (Example Phoenix app demonstrating Arcana RAG toolkit)

Would love feedback from the community!

georgeguimaraes · December 30, 2025, 6:30pm

Dashboard example here:

SyntaxSorcerer · January 22, 2026, 8:14pm

I’m curious how members of the community are using this. I have a small collection of PDFs I’d like to use in a RAG pipeline for some agentic workflows. I’m not sure what the best way to get started is, but Arcana looks like it could be useful.

georgeguimaraes · January 23, 2026, 12:45pm

Check out georgeguimaraes/arcana-adept on GitHub, an example Phoenix app demonstrating the Arcana RAG toolkit.

It uses JSON files rather than PDFs, but it should give you a good overview of the whole workflow.

Also, Arcana has PDF “scraping” capabilities using Poppler (https://poppler.freedesktop.org/). It’s nothing fancy, and the results aren’t exceptional. I want to experiment with kreuzberg-dev/kreuzberg, a polyglot document intelligence framework with a Rust core that extracts text, metadata, and structured information from PDFs, Office documents, images, and 50+ formats and is available for Elixir, to see how we can do better.

But remember, you can define your own PDF parser and give it to Arcana.
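
As a starting point for a custom parser, here is a minimal sketch that shells out to Poppler’s pdftotext command-line tool. The module name PdfTextSketch and its return shape are hypothetical, and how you register a parser with Arcana isn’t shown in this thread, so check the docs for the expected interface. It assumes pdftotext is installed and on your PATH.

```elixir
defmodule PdfTextSketch do
  # Hypothetical helper, not part of Arcana: extracts text from a PDF
  # by calling Poppler's `pdftotext` binary.
  def extract(path) do
    cond do
      not File.exists?(path) ->
        {:error, :enoent}

      is_nil(System.find_executable("pdftotext")) ->
        {:error, :pdftotext_not_found}

      true ->
        # "-layout" keeps the page layout; "-" writes the text to stdout.
        case System.cmd("pdftotext", ["-layout", path, "-"], stderr_to_stdout: true) do
          {text, 0} -> {:ok, text}
          {output, status} -> {:error, {:pdftotext_failed, status, output}}
        end
    end
  end
end
```

With that in place, PdfTextSketch.extract("docs/manual.pdf") (an example path) returns {:ok, text} on success, and the text can then be passed to Arcana.ingest/2 as shown in the first post.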