State of developing agents with Elixir (not coding agents)

Hey. Is there anyone here who creates agents in their apps? Not talking about using agents, but creating them. I’m finding it pretty difficult. The LLM models themselves don’t actually seem very intelligent; the tooling is crucial to getting a useful agent out of these text generators.

I decided to write down my findings and experiences so people don’t have to reinvent the wheel, but at the same time I’m hoping someone will tell me that I’m doing it all wrong and give me a recipe for success.

Meanwhile, a friend says they use the Claude Agent SDK and get their agents running without too much effort and with good results. That’s what I want. But all the official tooling is for the JS ecosystem, obviously. Anyway, here’s the current state.

Features an agent needs to be useful

  1. Plan-Act-Verify Loop: Explicit verification step after actions.
  2. State Machine Persistence: Durable storage of agent “thought” and “status” (Checkpoints).
  3. Context Compaction: Auto-summarizing history to prevent token bloat/model drift.
  4. Memory Layering: Separating ephemeral “Working Memory” from “Long-term Knowledge.”
  5. Human-in-the-Loop (HITL): Breakpoints for manual approval on sensitive actions.
  6. Model agnostic: Switching from one model to another shouldn’t require rewrites in app code.

This is not an exhaustive list for agents, just what I know is needed atm.
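To make point 1 concrete, here is a minimal sketch of a plan-act-verify loop in plain Elixir. All module and function names are mine; in a real agent, `plan/1` would call the LLM, `act/1` would execute a tool, and `verify/2` would check the result (via a validator or another model call). The stubs just make the control flow visible.

```elixir
defmodule AgentLoop do
  @moduledoc """
  Illustrative plan-act-verify loop. The stubs at the bottom stand in
  for LLM and tool calls; only the loop structure is the point here.
  """

  @max_steps 5

  def run(goal), do: loop(%{goal: goal, history: [], step: 0})

  # Hard cap so a confused model can't spin forever.
  defp loop(%{step: step}) when step >= @max_steps, do: {:error, :max_steps}

  defp loop(state) do
    plan = plan(state)
    result = act(plan)

    case verify(plan, result) do
      :done ->
        {:ok, result, state.history}

      :retry ->
        loop(%{state | history: [{plan, result} | state.history], step: state.step + 1})
    end
  end

  # Stubs standing in for LLM/tool calls.
  defp plan(%{goal: goal, step: step}), do: {:compute, goal, step}
  defp act({:compute, goal, step}), do: String.length(goal) + step
  defp verify({:compute, _goal, step}, _result), do: if(step >= 2, do: :done, else: :retry)
end
```

The explicit `verify/2` step is what distinguishes this from a bare tool-calling loop: the loop only terminates when verification passes, not when the model claims it is finished.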

Industry tools

Here’s the tooling non-Elixir ecosystems have:

  1. Vercel AI SDK (Typescript)
  2. LangGraph (JS/Python)
  3. PydanticAI (Python)
  4. Claude Agent SDK (Python/TS)
  5. OpenAI Agents SDK (Python/TS)

Elixir tools

  1. LangChain (Elixir): An Elixir-native implementation of the LangChain framework. It focuses on “Chaining” processes and providing modular components for prompt management and model integration.
  2. Jido: OTP-native state-machine framework for autonomous agents (Action/Signal/Runner). This is the “power-user” choice for complex, long-running agent logic.
  3. Legion: A dedicated harness library specifically for building agentic loops and tool execution.
  4. AshAi / AshBaml: Crucial bridges for turning your existing Ash Resources and Actions into LLM-accessible tools.
  5. Instructor Ex: The go-to for type-safe data extraction and validation using Ecto schemas.
  6. ReqLLM: An LLM-specific client for the Req library.
  7. LLMAgent: A signal-based library designed for managing conversation flows and tool handlers.
  8. Oban: Essential for job persistence, concurrency control, and “Resume from failure” logic.
  9. Elixir AI SDK: A community port that brings the Vercel AI SDK’s maxSteps and streaming patterns to Elixir.
  10. edit: just found Whisperer, looks like a good first layer of such a system.

My experience

I started with LangChain, but switching models breaks the app code, which makes the lib kind of pointless. And they don’t really accept PRs, but that’s understandable tbh. I switched to OpenRouter.ai, and that works great with only a ~5% surcharge on top of the models’ cost.
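One reason OpenRouter makes model switching painless is that it exposes an OpenAI-compatible chat completions API, so swapping providers is often just swapping the model string. A sketch of keeping the request as plain data (helper names are mine; the model id is illustrative, and the actual HTTP call is only shown as a comment):

```elixir
defmodule RouterClient do
  @moduledoc """
  Builds an OpenAI-compatible chat request body as plain data.
  Because OpenRouter routes many providers behind one API shape,
  changing models means changing one string, not the app code.
  Helper names here are illustrative.
  """

  def chat_body(model, messages, opts \\ []) do
    %{
      "model" => model,
      "messages" =>
        Enum.map(messages, fn {role, content} ->
          %{"role" => to_string(role), "content" => content}
        end),
      "temperature" => Keyword.get(opts, :temperature, 0.2)
    }
  end
end

# Same code path regardless of provider; only the string differs:
body = RouterClient.chat_body("anthropic/claude-sonnet", [{:user, "Hi"}])

# Sending it (not run here) could be e.g.:
# Req.post!("https://openrouter.ai/api/v1/chat/completions",
#   json: body,
#   auth: {:bearer, System.fetch_env!("OPENROUTER_API_KEY")})
```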

I’ve currently built my agents on Oban: a custom loop of messages, context pruning, planning, etc. But it’s brittle, and now I have to rebuild my architecture to be more deterministic (because the models just aren’t very smart on their own).

The libraries look to me like each does one part of the puzzle, or, if one tries to do it all, it’s just not very deep. And since the docs aren’t deep either, it’s one of those things where you spend a week trying them all out and then realize none of them does what you need. Of all of them, Legion seems to be the most all-inclusive option, with orchestrators and an agent loop, but apparently no plan → exec → verify, which is kind of core, so I don’t really want to jump into it.

Anyway, these are just my thoughts. I’m probably misunderstanding a bunch.

My question

My point is not to complain or point at missing pieces. Rather, I have a feeling I’m missing the big picture. So I ask: how do you build agents? What libraries, techniques, or approaches do you use?

3 Likes

See also Sagents:

(Launched very recently.)

3 Likes

@brainlid :waving_hand:

1 Like

I just use GenServers - no libraries. Each agent is a GenServer. I have one orchestrator agent which holds a list of all the subagents. All of them have tools available.

I did try LangChain… but honestly it was more of a learning curve than just doing your own thing. Jido I looked at but never tried… not sure of the benefits (for my use case at least).

I would strongly suggest looking at GenServers and not using Oban for this.
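For anyone who hasn’t tried this, the GenServer-per-agent setup described above can be sketched in a few lines (all names are mine, and the subagent body is a stub where real LLM/tool calls would go):

```elixir
defmodule SubAgent do
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, name)
  def run(pid, task), do: GenServer.call(pid, {:run, task})

  @impl true
  def init(name), do: {:ok, %{name: name}}

  @impl true
  def handle_call({:run, task}, _from, state) do
    # A real subagent would call the LLM / execute tools here.
    {:reply, {:done, state.name, task}, state}
  end
end

defmodule Orchestrator do
  use GenServer

  def start_link(agent_names), do: GenServer.start_link(__MODULE__, agent_names)
  def dispatch(pid, task), do: GenServer.call(pid, {:dispatch, task})

  @impl true
  def init(names) do
    # The orchestrator holds a map of subagent pids, one per name.
    agents =
      Map.new(names, fn name ->
        {:ok, pid} = SubAgent.start_link(name)
        {name, pid}
      end)

    {:ok, %{agents: agents}}
  end

  @impl true
  def handle_call({:dispatch, {agent, task}}, _from, state) do
    {:reply, SubAgent.run(state.agents[agent], task), state}
  end
end
```

In a real app you would put these under a Supervisor and a Registry rather than raw `start_link`, which is what makes the "roll your own" approach viable in Elixir in the first place.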

How do you handle persistence through app restarts?

Persistence of what?

I presume you mean chat history? I save to the DB.

I tend to agree with OPs reasons for using Oban.

Oban: Essential for job persistence, concurrency control, and “Resume from failure” logic.

In fact, I have found myself using Oban as my state machine library of choice, since its job states, arg persistence, and retry handling cover pretty much all my use cases. Before, I kept adding a “status” enum with only minor variation in values, plus repetitive transition logic, to various contexts/schemas. Bringing in a new dep just to handle that kind of thing seemed unnecessary, but as I went to abstract it myself I realized I was recreating a bunch of APIs I already had at my disposal in Oban.

So I started to lean on it more, and most of that stuff just became worker config, leaving only the need to enqueue jobs with the appropriate args. The logic has much better separation of concerns because the schemas now only express states that are actually meaningful to the domain, while all the generic “error”/“pending” stuff is hidden away and protected with much better guarantees than I ever managed to maintain myself. And that’s before using any of the Pro features like workflows. So it’s hard for me to imagine a downside to using Oban for something like this.
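As a sketch of what this pattern might look like: the function below could be the body an Oban worker’s `perform/1` delegates to, returning Oban’s standard result tuples so Oban itself handles scheduling and retries, while only domain-meaningful statuses live in your own args. Everything here is illustrative, not a real worker.

```elixir
defmodule AgentStep do
  @moduledoc """
  Decision logic a worker's perform/1 could delegate to. Oban interprets
  the return value: {:snooze, s} reschedules the job in s seconds,
  {:error, reason} triggers a retry with backoff, and {:ok, _} completes
  it. Generic lifecycle states ("retrying", "scheduled", "discarded")
  stay in Oban; only domain statuses appear in our own data.
  """

  # Human-in-the-loop: park the job and check again in a minute.
  def handle(%{"status" => "awaiting_approval"}), do: {:snooze, 60}

  # Out of budget: surface a terminal error so Oban discards after retries.
  def handle(%{"status" => "tool_failed", "attempts_left" => 0}), do: {:error, :gave_up}

  # Transient tool failure: let Oban's retry/backoff machinery handle it.
  def handle(%{"status" => "tool_failed"} = args), do: {:error, {:retry, args["tool"]}}

  # Happy path: mark the domain status done and complete the job.
  def handle(%{"status" => "ready"} = args), do: {:ok, Map.put(args, "status", "done")}
end
```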

3 Likes

Are you saying use Oban to build out AI agents?

I am in the same boat as OP, actually likely quite a bit less experienced. I have only just started building out my first LLM backed features and given what I said above Oban is a key component so far. My experience with using Oban in FSMs has been excellent, and given the implicit need for FSMs in many “agentic” workflows, it seems a natural fit to me. But I’m open to arguments to the contrary.

1 Like

That, or maybe GitHub - nshkrdotcom/flowstone: Asset-first data orchestration for Elixir/BEAM, Dagster-inspired, with OTP fault tolerance, LiveView dashboard, lineage tracking, checkpoint gates, and distributed execution via Oban.

Both are excellent for persisting and enriching a “god”-like object as a mission progresses. Granted, it would be easier and quicker with Oban Pro, but not all companies agree to a subscription for that.

I have found the free Oban to be amazing at orchestration. Sure, it’s a bit more manual, but as @tfwright says: a lot of stuff is already taken care of for you, so why reinvent it? The fact that something doesn’t give you 100% of what you need (but, say, 60-80%) doesn’t mean it should be ignored for the task.

1 Like

I’ve used GenServers for a lot of things, and 99% of the cases end up being jobs. An agent loop is ultimately a job: when it’s done its task, the job is done. So Oban is a natural fit.

Okay, this is the most fully featured thing I’ve seen so far. Need to investigate further, thanks.

1 Like

@Dmk, the Sagents library is built on the Elixir LangChain library, does not use Oban, and the agents_demo project sets up DB persistence for the agent and the user-side conversation.

2 Likes

I had missed @typesend’s post originally, but caught it this weekend and watched the demo video. Very impressive. Saw that it was built on top of LangChain.

My previous comment about not using LangChain in my projects was just that, with Elixir, rolling your own solution is very simple, making an external library (for me anyway) somewhat unnecessary.

2 Likes

I’d like to share that I started building a hobby coding agent using Elixir/OTP called Opal:

It’s nowhere near finished, but it’s currently bootstrapping itself. The core agent loop uses :gen_statem for the state machine, tools are executed in parallel and subagents basically come for free due to OTP primitives. I’ve found Elixir an awesome language to build this in.
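For anyone curious what a `:gen_statem`-based agent loop looks like, here is a minimal sketch (this is not Opal’s actual code; all names and the stubbed verification are mine). The states mirror the plan → act → verify cycle, looping back to planning until verification passes:

```elixir
defmodule AgentFSM do
  @moduledoc """
  Illustrative :gen_statem agent loop: :planning -> :acting -> :verifying,
  cycling back to :planning until a (stubbed) verification succeeds.
  """
  @behaviour :gen_statem

  # Public API
  def start_link(goal), do: :gen_statem.start_link(__MODULE__, goal, [])
  def step(pid), do: :gen_statem.call(pid, :step)
  def state(pid), do: :gen_statem.call(pid, :which_state)

  @impl true
  def callback_mode, do: :handle_event_function

  @impl true
  def init(goal), do: {:ok, :planning, %{goal: goal, rounds: 0}}

  @impl true
  def handle_event({:call, from}, :which_state, state, _data),
    do: {:keep_state_and_data, [{:reply, from, state}]}

  def handle_event({:call, from}, :step, :planning, data),
    do: {:next_state, :acting, data, [{:reply, from, :planned}]}

  def handle_event({:call, from}, :step, :acting, data),
    do: {:next_state, :verifying, data, [{:reply, from, :acted}]}

  def handle_event({:call, from}, :step, :verifying, %{rounds: r} = data) do
    # Stub verification: succeed after two full rounds.
    if r >= 1 do
      {:next_state, :done, data, [{:reply, from, :verified}]}
    else
      {:next_state, :planning, %{data | rounds: r + 1}, [{:reply, from, :retry}]}
    end
  end
end
```

Because each agent is just a named process, the observability mentioned below falls out for free: `:sys.get_state/1` or a remote iex shell can inspect any running session.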

It’s a hobby project so please adjust expectations :slight_smile: I started building this to learn both OTP and how agents work. I’ve been using AI to accelerate my work but keeping a close eye + manually reviewing the agent line by line.

So far, I’ve found that using Erlang distribution gives unprecedented observability into the state of running agents. Being able to iex into any Opal session to dump state, poke around, and issue calls: it’s just insane.

Drop a star and let me know any feedback!

3 Likes

I think BullMQ would be quite useful, particularly the “flows” feature, if you want to break down a task into sub-agents: Flows & Parent-Child Jobs — BullMQ v1.2.6

1 Like

Did you try Jido at all? I see it listed above - Jido has a sophisticated “ReAct” reasoning loop with tools powered by ReqLLM here - models switch pretty easily:

4 Likes

Here’s code to walk you through a tutorial on how to build agents of varying sophistication with ReqLLM as well:

1 Like

I get what you’re saying. If you only need to make a one-off LLM call, then rolling your own and doing it direct is a great option. Sagents is designed to help when it gets more complex than that. It helps handle the extra complexity of things like HITL (Human in the Loop), sub-agents, uninterruptible sub-agents, introspection, telemetry, managing a different view of the conversation from the user’s perspective vs the agent’s perspective, etc.

1 Like

If you are working on agents you may want to check out my libs:

2 Likes

Related conversation?