State of developing agents with Elixir (not coding agents)

Hey. Is there anyone here who creates agents in their apps? Not talking about using agents, but creating them. I’m finding it pretty difficult. The LLM models themselves don’t actually seem very intelligent; the tooling is crucial to getting a useful agent out of these text generators.

I decided to write down my findings and experiences so people don’t have to reinvent the wheel, but at the same time I’m hoping someone will tell me that I’m doing it all wrong and give me a recipe for success.

Meanwhile, a friend says they use the Claude Agent SDK and get their agents running without too much effort and with good results. That’s what I want. But all the official tooling is for the JS ecosystem, obviously. Anyway, here’s the current state.

Features an agent needs to be useful

  1. Plan-Act-Verify Loop: Explicit verification step after actions.
  2. State Machine Persistence: Durable storage of agent “thought” and “status” (Checkpoints).
  3. Context Compaction: Auto-summarizing history to prevent token bloat/model drift.
  4. Memory Layering: Separating ephemeral “Working Memory” from “Long-term Knowledge.”
  5. Human-in-the-Loop (HITL): Breakpoints for manual approval on sensitive actions.
  6. Model agnostic: Switching from one model to another shouldn’t require rewrites in app code.

This is not an exhaustive list for agents, just what I know is needed atm.
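To make point 1 concrete, here is a minimal sketch of a plan-act-verify loop in plain Elixir. All module and function names are mine; in a real agent, `plan/1` would call the LLM, `act/1` would execute a tool, and `verify/2` would check the result (via a validator or another model call). The stubs just make the control flow visible.

```elixir
defmodule AgentLoop do
  @moduledoc """
  Illustrative plan-act-verify loop. The stubs at the bottom stand in
  for LLM and tool calls; only the loop structure is the point here.
  """

  @max_steps 5

  def run(goal), do: loop(%{goal: goal, history: [], step: 0})

  # Hard cap so a confused model can't spin forever.
  defp loop(%{step: step}) when step >= @max_steps, do: {:error, :max_steps}

  defp loop(state) do
    plan = plan(state)
    result = act(plan)

    case verify(plan, result) do
      :done ->
        {:ok, result, state.history}

      :retry ->
        loop(%{state | history: [{plan, result} | state.history], step: state.step + 1})
    end
  end

  # Stubs standing in for LLM/tool calls.
  defp plan(%{goal: goal, step: step}), do: {:compute, goal, step}
  defp act({:compute, goal, step}), do: String.length(goal) + step
  defp verify({:compute, _goal, step}, _result), do: if(step >= 2, do: :done, else: :retry)
end
```

The explicit `verify/2` step is what distinguishes this from a bare tool-calling loop: the loop only terminates when verification passes, not when the model claims it is finished.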

Industry tools

Here’s the tooling non-Elixir ecosystems have:

  1. Vercel AI SDK (Typescript)
  2. LangGraph (JS/Python)
  3. PydanticAI (Python)
  4. Claude Agent SDK (Python/TS)
  5. OpenAI Agents SDK (Python/TS)

Elixir tools

  1. LangChain (Elixir): An Elixir-native implementation of the LangChain framework. It focuses on “Chaining” processes and providing modular components for prompt management and model integration.
  2. Jido: OTP-native state-machine framework for autonomous agents (Action/Signal/Runner). This is the “power-user” choice for complex, long-running agent logic.
  3. Legion: A dedicated harness library specifically for building agentic loops and tool execution.
  4. AshAi / AshBaml: Crucial bridges for turning your existing Ash Resources and Actions into LLM-accessible tools.
  5. Instructor Ex: The go-to for type-safe data extraction and validation using Ecto schemas.
  6. ReqLLM: An LLM-specific client for the Req library.
  7. LLMAgent: A signal-based library designed for managing conversation flows and tool handlers.
  8. Oban: Essential for job persistence, concurrency control, and “Resume from failure” logic.
  9. Elixir AI SDK: A community port that brings the Vercel AI SDK’s maxSteps and streaming patterns to Elixir.
  10. edit: just found Whisperer, looks like a good first layer of such a system.

My experience

I started with LangChain, but switching models breaks the app code, which makes the lib kind of pointless. And they don’t really accept PRs, but that’s understandable tbh. I switched to OpenRouter.ai, and that works great with only a ~5% surcharge on top of the models’ cost.
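One reason OpenRouter makes model switching painless is that it exposes an OpenAI-compatible chat completions API, so swapping providers is often just swapping the model string. A sketch of keeping the request as plain data (helper names are mine; the model id is illustrative, and the actual HTTP call is only shown as a comment):

```elixir
defmodule RouterClient do
  @moduledoc """
  Builds an OpenAI-compatible chat request body as plain data.
  Because OpenRouter routes many providers behind one API shape,
  changing models means changing one string, not the app code.
  Helper names here are illustrative.
  """

  def chat_body(model, messages, opts \\ []) do
    %{
      "model" => model,
      "messages" =>
        Enum.map(messages, fn {role, content} ->
          %{"role" => to_string(role), "content" => content}
        end),
      "temperature" => Keyword.get(opts, :temperature, 0.2)
    }
  end
end

# Same code path regardless of provider; only the string differs:
body = RouterClient.chat_body("anthropic/claude-sonnet", [{:user, "Hi"}])

# Sending it (not run here) could be e.g.:
# Req.post!("https://openrouter.ai/api/v1/chat/completions",
#   json: body,
#   auth: {:bearer, System.fetch_env!("OPENROUTER_API_KEY")})
```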

I’ve currently built my agents on Oban: a custom loop of messages, context pruning, planning, etc. But it’s brittle, and now I have to rebuild my architecture to be more deterministic (because the models just aren’t very smart on their own).

The libraries look to me like each does one part of the puzzle, or, if one tries to do it all, it’s just not very deep. And since the docs aren’t deep either, it’s one of those things where you spend a week trying them all out and then realize none of them does what you need. Of all of them, Legion seems to be the most all-inclusive option, with orchestrators and an agent loop, but apparently no plan → exec → verify, which is kind of core, so I don’t really want to jump into it.

Anyway, these are just my thoughts. I’m probably misunderstanding a bunch.

My question

My point is not to complain or point at missing pieces. Rather, I have a feeling I’m missing the big picture. So I ask: how do you build agents? What libraries, techniques, or approaches do you use?

3 Likes

See also Sagents:

(Launched very recently.)

3 Likes

@brainlid :waving_hand:

1 Like

I just use GenServers - no libraries. Each agent is a GenServer. I have one orchestrator agent which holds a list of all the subagents. All of them have tools available.

I did try LangChain… but honestly it was more of a learning curve than just doing your own thing. Jido I looked at but never tried… not sure of the benefits (for my use case at least).

I would strongly suggest looking at GenServers and not using Oban for this.
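For anyone who hasn’t tried this, the GenServer-per-agent setup described above can be sketched in a few lines (all names are mine, and the subagent body is a stub where real LLM/tool calls would go):

```elixir
defmodule SubAgent do
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, name)
  def run(pid, task), do: GenServer.call(pid, {:run, task})

  @impl true
  def init(name), do: {:ok, %{name: name}}

  @impl true
  def handle_call({:run, task}, _from, state) do
    # A real subagent would call the LLM / execute tools here.
    {:reply, {:done, state.name, task}, state}
  end
end

defmodule Orchestrator do
  use GenServer

  def start_link(agent_names), do: GenServer.start_link(__MODULE__, agent_names)
  def dispatch(pid, task), do: GenServer.call(pid, {:dispatch, task})

  @impl true
  def init(names) do
    # The orchestrator holds a map of subagent pids, one per name.
    agents =
      Map.new(names, fn name ->
        {:ok, pid} = SubAgent.start_link(name)
        {name, pid}
      end)

    {:ok, %{agents: agents}}
  end

  @impl true
  def handle_call({:dispatch, {agent, task}}, _from, state) do
    {:reply, SubAgent.run(state.agents[agent], task), state}
  end
end
```

In a real app you would put these under a Supervisor and a Registry rather than raw `start_link`, which is what makes the "roll your own" approach viable in Elixir in the first place.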

How do you handle persistence through app restarts?

Persistence of what?

I presume you mean chat history? I save to the DB.

I tend to agree with OPs reasons for using Oban.

Oban: Essential for job persistence, concurrency control, and “Resume from failure” logic.

In fact, I have found myself using Oban as my state machine library of choice, since its job states, arg persistence, and retry handling cover pretty much all my use cases. Before, I kept adding a “status” enum with only minor variation in values, plus repetitive transition logic, to various contexts/schemas. Bringing in a new dep just to handle that kind of thing seemed unnecessary, but as I went to abstract it myself I realized I was recreating a bunch of APIs I already had at my disposal in Oban.

So I started to lean on it more, and most of that stuff just became worker config, leaving only the need to enqueue jobs with the appropriate args. The logic has much better separation of concerns because the schemas now only express states that are actually meaningful to the domain, while all the generic “error”/“pending” stuff is hidden away and protected with much better guarantees than I ever managed to maintain myself. And that’s before using any of the Pro features like workflows. So it’s hard for me to imagine a downside to using Oban for something like this.
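As a sketch of what this pattern might look like: the function below could be the body an Oban worker’s `perform/1` delegates to, returning Oban’s standard result tuples so Oban itself handles scheduling and retries, while only domain-meaningful statuses live in your own args. Everything here is illustrative, not a real worker.

```elixir
defmodule AgentStep do
  @moduledoc """
  Decision logic a worker's perform/1 could delegate to. Oban interprets
  the return value: {:snooze, s} reschedules the job in s seconds,
  {:error, reason} triggers a retry with backoff, and {:ok, _} completes
  it. Generic lifecycle states ("retrying", "scheduled", "discarded")
  stay in Oban; only domain statuses appear in our own data.
  """

  # Human-in-the-loop: park the job and check again in a minute.
  def handle(%{"status" => "awaiting_approval"}), do: {:snooze, 60}

  # Out of budget: surface a terminal error so Oban discards after retries.
  def handle(%{"status" => "tool_failed", "attempts_left" => 0}), do: {:error, :gave_up}

  # Transient tool failure: let Oban's retry/backoff machinery handle it.
  def handle(%{"status" => "tool_failed"} = args), do: {:error, {:retry, args["tool"]}}

  # Happy path: mark the domain status done and complete the job.
  def handle(%{"status" => "ready"} = args), do: {:ok, Map.put(args, "status", "done")}
end
```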

3 Likes

Are you saying use Oban to build out AI agents?

I am in the same boat as OP, actually likely quite a bit less experienced. I have only just started building out my first LLM backed features and given what I said above Oban is a key component so far. My experience with using Oban in FSMs has been excellent, and given the implicit need for FSMs in many “agentic” workflows, it seems a natural fit to me. But I’m open to arguments to the contrary.

1 Like

That, or maybe GitHub - nshkrdotcom/flowstone: Asset-first data orchestration for Elixir/BEAM, Dagster-inspired, with OTP fault tolerance, LiveView dashboard, lineage tracking, checkpoint gates, and distributed execution via Oban.

Both are excellent for persisting and enriching a “god”-like object as a mission progresses. Granted, it would be easier and quicker with Oban Pro, but not all companies agree to a subscription for that.

I have found the free Oban to be amazing at orchestration. Sure, it’s a bit more manual, but as @tfwright says: a lot of stuff is already taken care of for you, so why reinvent it? The fact that something doesn’t give you 100% of what you need (but, say, 60-80%) doesn’t mean it should be ignored for the task.

1 Like

I’ve used GenServers for a lot of things, and 99% of the cases end up being jobs. An agent loop is ultimately a job: when it’s done its task, the job is done. So Oban is a natural fit.

Okay, this is the most fully featured thing I’ve seen so far. Need to investigate further, thanks.

1 Like

@Dmk, the Sagents library is built on the Elixir LangChain library, does not use Oban, and the agents_demo project sets up DB persistence for the agent and the user-side conversation.

2 Likes

I had missed @typesend’s post originally, but caught it this weekend and watched the demo video. Very impressive. Saw that it was built on top of LangChain.

My previous comment about not using LangChain in my projects was just that, with Elixir, rolling your own solution is very simple, making an external library (for me anyway) somewhat unnecessary.

2 Likes

I’d like to share that I started building a hobby coding agent using Elixir/OTP called Opal:

It’s nowhere near finished, but it’s currently bootstrapping itself. The core agent loop uses :gen_statem for the state machine, tools are executed in parallel and subagents basically come for free due to OTP primitives. I’ve found Elixir an awesome language to build this in.
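For anyone curious what a `:gen_statem`-based agent loop looks like, here is a minimal sketch (this is not Opal’s actual code; all names and the stubbed verification are mine). The states mirror the plan → act → verify cycle, looping back to planning until verification passes:

```elixir
defmodule AgentFSM do
  @moduledoc """
  Illustrative :gen_statem agent loop: :planning -> :acting -> :verifying,
  cycling back to :planning until a (stubbed) verification succeeds.
  """
  @behaviour :gen_statem

  # Public API
  def start_link(goal), do: :gen_statem.start_link(__MODULE__, goal, [])
  def step(pid), do: :gen_statem.call(pid, :step)
  def state(pid), do: :gen_statem.call(pid, :which_state)

  @impl true
  def callback_mode, do: :handle_event_function

  @impl true
  def init(goal), do: {:ok, :planning, %{goal: goal, rounds: 0}}

  @impl true
  def handle_event({:call, from}, :which_state, state, _data),
    do: {:keep_state_and_data, [{:reply, from, state}]}

  def handle_event({:call, from}, :step, :planning, data),
    do: {:next_state, :acting, data, [{:reply, from, :planned}]}

  def handle_event({:call, from}, :step, :acting, data),
    do: {:next_state, :verifying, data, [{:reply, from, :acted}]}

  def handle_event({:call, from}, :step, :verifying, %{rounds: r} = data) do
    # Stub verification: succeed after two full rounds.
    if r >= 1 do
      {:next_state, :done, data, [{:reply, from, :verified}]}
    else
      {:next_state, :planning, %{data | rounds: r + 1}, [{:reply, from, :retry}]}
    end
  end
end
```

Because each agent is just a named process, the observability mentioned below falls out for free: `:sys.get_state/1` or a remote iex shell can inspect any running session.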

It’s a hobby project so please adjust expectations :slight_smile: I started building this to learn both OTP and how agents work. I’ve been using AI to accelerate my work but keeping a close eye + manually reviewing the agent line by line.

So far, I’ve found that using Erlang distribution gives unprecedented observability into the state of running agents. Being able to iex into any Opal session to dump state, poke around, and issue calls: it’s just insane.

Drop a star and let me know any feedback!

3 Likes

I think BullMQ would be quite useful, particularly the “flows” feature, if you want to break down a task into sub-agents: Flows & Parent-Child Jobs — BullMQ v1.2.6

1 Like

Did you try Jido at all? I see it listed above - Jido has a sophisticated “ReAct” reasoning loop with tools powered by ReqLLM here - models switch pretty easily:

4 Likes

Here’s code to walk you through a tutorial on how to build agents of varying sophistication with ReqLLM as well:

1 Like

I get what you’re saying. If you only need to make a one-off LLM call, then rolling your own and doing it direct is a great option. Sagents is designed to help when it gets more complex than that. It helps handle the extra complexity of things like HITL (Human in the Loop), sub-agents, uninterruptible sub-agents, introspection, telemetry, managing a different view of the conversation from the user’s perspective vs the agent’s perspective, etc.

1 Like

If you are working on agents you may want to check out my libs:

2 Likes

Related conversation?