Is anyone working on "AI Agents" in Elixir?

For those who are not aware, “AI agents” are, for the most part, commodity LLMs which are given access to “tools” and prompted to complete tasks, possibly in some sort of loop.

The tool use is facilitated by a program which scans the output text of the LLM and looks for a “tool call” request (in some standard format), and then executes that call. For example, you might give the model access to a “calculator” tool which enables it to do math, or a “weather API” tool to check the weather. And so on. The model is given a prompt which tells it what tools it has access to, and I believe most models coming out nowadays are trained to some degree on tool use so that they get the general idea.
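
A minimal sketch of that scan-and-dispatch step in Elixir (the tool-call JSON shape and both tools here are hypothetical, and Jason is assumed for JSON decoding):

```elixir
# A toy dispatcher: decode the model's output and, if it is a tool call,
# execute the matching function. The JSON shape and the tools themselves
# are made up for illustration; real APIs each define their own format.
defmodule Tools do
  def maybe_dispatch(output) do
    case Jason.decode(output) do
      {:ok, %{"tool" => name, "args" => args}} -> run(name, args)
      _ -> :no_tool_call
    end
  end

  defp run("calculator", %{"op" => "add", "a" => a, "b" => b}), do: {:ok, a + b}
  defp run("weather", %{"city" => city}), do: {:ok, "18°C and clear in #{city}"}
  defp run(name, _args), do: {:error, {:unknown_tool, name}}
end
```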

The “agentic” behavior here is somewhat arbitrary, but the idea is that you have some sort of feedback loop. The model generates a tool call, receives the result, and then perhaps generates more calls based on that result. People have been using this to write code, for example, with (so far) limited success.
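
And the loop itself, sketched on top of the hypothetical Tools module above (chat/1 is a placeholder for whatever LLM API call you use):

```elixir
defmodule AgentLoop do
  # Generate → maybe execute a tool → append the result → generate again,
  # until the model stops requesting tools (or we hit a turn limit).
  def run(messages, turns_left \\ 10)
  def run(_messages, 0), do: {:error, :too_many_turns}

  def run(messages, turns_left) do
    output = chat(messages)

    case Tools.maybe_dispatch(output) do
      :no_tool_call ->
        {:ok, output}

      {:ok, result} ->
        run(messages ++ [%{role: "tool", content: to_string(result)}], turns_left - 1)

      {:error, reason} ->
        {:error, reason}
    end
  end

  # Placeholder: call your LLM API here and return the assistant's text.
  defp chat(_messages), do: raise("not implemented")
end
```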

The current emerging “killer app” for agents is the “deep research” model, which has been adopted by Google, OpenAI, Perplexity, Twitter (lol), and so on. The basic idea here is that you give the model a “search engine” tool and then just prompt it to run in a loop: searching, reading results, and then coming up with more searches. Then it generates a nice summary (“report”) at the end for human consumption. It goes without saying that this task is a lot easier than writing code, and as a result agents seem to be actually “catching on” for the first time.

Due to the autoregressive nature of current LLMs, which has proved to be quite sticky thus far, they perform extremely poorly for “local” use. Current autoregressive models require every weight to be streamed out of GPU memory on every forward pass just to generate one token. As a result, “local” inference is completely bottlenecked by memory bandwidth. If you have a 30GB model (on the low end of “useful”) and a GPU with 600GB/s memory bandwidth (that’s pretty good), you would expect at most 600 ÷ 30 = 20 tokens/sec (fairly usable). Unfortunately GPU memory bandwidth is expensive, and 30GB is not enough for a top-tier model.

However, this problem vanishes with batching. GPUs are built for parallel compute, and deep nets are built to utilize it. Because the weights only need to be read from memory once per forward pass regardless of batch size, if you batch, say, 10 requests at a time, all of a sudden you are getting 200 tokens/sec in aggregate on the same hardware (flops permitting). The point being: there is a forcing function towards multitenancy. This is why everyone is using cloud APIs instead of running their own models - the cost reduction is enormous.

What this means is that “AI agents” are actually just glue code for interacting between LLM APIs and “tool” APIs. And that’s where Elixir comes in: we are very good at soft-realtime. Elixir and the BEAM are the ideal ecosystem for this. LiveView is the perfect tool for server-side realtime UI. If you were going to build some sort of “agentic” app, this would be the platform.

So I’m curious, is anyone doing something in that space?

6 Likes

Hey, so I know that there are these libraries, which I haven’t tried yet: jido, swarm_ex

And I have these blog posts open in a tab but haven’t found the time to read them yet:

I’m sure there is more going on

10 Likes

I have been throwing together a list of Elixir-based OpenAI-style clients as well with some of the aforementioned libraries (in addition to some others): GitHub - druyang/awesome-elixir-llm-genai: A list of LLM and GenAI Elixir Resources/Tools

Like you said, technically all you need for an agentic library is the ability to function call (which is really just structured outputs) and to feed the results back into the model. Many of the libraries work; the amount of extra lifting needed to create the loops varies. I think the aforementioned Jido seems the most promising (not to pick favorites).
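
To make the “structured outputs” point concrete, here is roughly what an OpenAI-style tools payload looks like as an Elixir map (the endpoint, model, and tool are illustrative):

```elixir
# An OpenAI-style chat request declaring one tool. The "tools" entry is
# just a JSON Schema the model is asked to produce structured output for.
body = %{
  model: "gpt-4o-mini",
  messages: [%{role: "user", content: "What's the weather in Lagos?"}],
  tools: [
    %{
      type: "function",
      function: %{
        name: "get_weather",
        description: "Get the current weather for a city",
        parameters: %{
          type: "object",
          properties: %{city: %{type: "string"}},
          required: ["city"]
        }
      }
    }
  ]
}

resp =
  Req.post!("https://api.openai.com/v1/chat/completions",
    json: body,
    auth: {:bearer, System.fetch_env!("OPENAI_API_KEY")}
  )

# If the model decided to call the tool, the call arrives as structured
# data under "tool_calls" rather than as free text.
resp.body["choices"]
```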

Since most agentic frameworks are bottlenecked by IO requests rather than pure computation, I also believe that BEAM/Elixir is an extremely underrated choice for GenAI. This advantage is amplified by the batching of LLM calls at scale for further cost reduction.

I’m still new to Elixir, but in my free time I want to work on some OSS projects in this area :slight_smile:

1 Like

I gave them a quick scan and they seem to be making roughly the same point I was, which is encouraging to me :slight_smile:

It seems like “agents” in general are a broad enough concept as to be effectively Turing complete - i.e. there is no way to make an “agent framework”, because agents are just code. And not only that, but at least for the time being running large models on your own servers is not cost-effective unless you have massive scale, and even then you would want as much multitenancy as possible within your own systems, so that forcing function doesn’t go away. In practice I think even large orgs will have to disaggregate their “AI compute” from their application compute, like we do with databases and storage, for the foreseeable future.

It seems like the one bit of tooling we still need is, well, tooling for tool calls. The impetus for this post was another post on here about a Ruby library called rubyllm, which I thought looked like a fantastic model for how we could implement that functionality.
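
Purely as a sketch of what a rubyllm-flavored tool API might look like in Elixir - every name here is hypothetical, nothing like this exists as a library yet:

```elixir
# Hypothetical: a behaviour a tool-calling library might expose. Each tool
# describes itself (for the prompt/schema) and implements call/1.
defmodule LLM.Tool do
  @callback name() :: String.t()
  @callback description() :: String.t()
  @callback parameters() :: map()
  @callback call(args :: map()) :: {:ok, term()} | {:error, term()}
end

defmodule MyApp.Tools.Weather do
  @behaviour LLM.Tool

  @impl true
  def name, do: "get_weather"

  @impl true
  def description, do: "Get the current weather for a city"

  @impl true
  def parameters,
    do: %{type: "object", properties: %{city: %{type: "string"}}, required: ["city"]}

  @impl true
  def call(%{"city" => city}), do: {:ok, "18°C and clear in #{city}"}
end
```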

I am curious, though: is anyone on here working on an application (an agent) in this space, as opposed to libraries and frameworks? I am curious to hear people’s experiences with this.

2 Likes

So, I really liked the blog post, especially that it boils down to this:

I hope these examples showcase how we can build sophisticated LLM workflows using simple Elixir mechanisms. While our implementation might seem basic - just pattern matching and with statements - don’t be fooled by the simplicity. This approach of composing small, focused functions and passing message chains through them can take us surprisingly far.

The beauty of this approach lies in its transparency - we can see exactly how our LLM interactions flow, what context is being maintained, and where we might want to add error handling or logging. No magic, no hidden state, just clear functional transformations.
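
In code, the style the post describes is roughly this kind of pipeline (every step function here is a placeholder):

```elixir
defmodule Workflow do
  # The "no magic" style the post describes: each step is a small function
  # taking and returning {:ok, messages}, composed with a single `with`.
  def answer(question) do
    with {:ok, messages} <- build_prompt(question),
         {:ok, messages} <- fetch_context(messages),
         {:ok, reply} <- call_llm(messages) do
      {:ok, reply}
    else
      {:error, reason} -> {:error, reason}
    end
  end
end
```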

Libraries can still be useful when they add convenience and structure.

I also like that this article follows the structure of Anthropic’s article as it’s a very good reminder that you can probably solve most problems without “agents”, even if you make use of LLMs.

Yes, I agree. We have databases and storage as building blocks, so I could imagine that in the same way many applications will have a “reasoning” block that makes use of LLMs for certain innovative features. Depending on how performant and cheap they become, I could also imagine using LLMs to avoid implementing some features in code, using a call to an LLM as a shortcut.
Recently, I’ve been thinking of software-defined networking as an analogy (no expert here, so could be totally off :sweat:). Before SDN, you’d do all networking in hardware. As software got faster and cheaper at the same time, it suddenly made sense to do networking in software, because you could work far more flexibly.
In the same way, I could see LLMs enabling “flexible” code. I also like this analogy because there is still a lot of hardware involved in networking, it’s just that you can do innovative things with software defined networking.

Not an application in that sense, but I want to try to build an LLM-based program that converts transformer models written in PyTorch to Elixir and Bumblebee.
Partially as a learning project for building such projects, and I’m just curious how far I can get.
I also want to take it step by step following Anthropic’s categories and only move to agents if I really have to.
As a sidenote, here are people from Hugging Face saying that it’s more effective to give LLMs the ability to write and execute code to do something instead of using JSON tools.
I also think that when your problem allows it, it’s generally more effective to let LLMs write code instead of asking the LLM to perform the operation directly.
In my case I guess there is a lot of Python/PyTorch code that follows a structure that can simply be parsed and transformed to Elixir, and I don’t really need an LLM to perform that operation but rather code that parses and transforms. On the other hand, for arbitrary functions or control flow I might need an LLM and potentially a feedback loop.

I guess for real agents the core issue is that you give up control to LLMs, which might limit their use cases. Or, you’d need other ways to exercise control over the results and actions that will be performed.

Are you thinking about building any sort of agentic application?

1 Like

The reason databases and storage are cheaper disaggregated is that multitenancy has inherent efficiency gains. A common example is S3, where high-storage low-bandwidth customers and high-bandwidth low-storage customers can share the same physical drives and use the capacity more effectively. LLMs currently exhibit a dramatic cost reduction in a multitenant environment because they are autoregressive.

If trends were to reverse, this could just as easily end up not being the case. For example, if diffusion LLMs caught on, and CPUs gained on-die accelerators and much larger memory bandwidth (e.g. AMD Strix Halo), then you could imagine it being simpler to just run your zero-shot “AI” tasks directly on CPU (for us, this would mean Nx/Axon/Bumblebee). Models have also been shrinking, which helps.

But thus far, autoregressive models have kept winning. I have no deep technical understanding of why - maybe nobody does?

Nobody has solved prompt injection yet, so giving your models the ability to run arbitrary code seems unwise. My bet is that for “real” products this will remain a very bad idea for a while.

I am not. I didn’t see any value in the paradigm at all until these “deep research” tools came out, but it’s the first use case that actually makes sense to me and I thought it was interesting that Elixir/BEAM mesh really well with that type of product.

Has anyone looked at the Google A2A agent protocol (Announcing the Agent2Agent Protocol (A2A) - Google Developers Blog)? Initially I am thinking an Elixir GenServer would be a good fit to model agent-to-agent communication, but with these standardized protocols being pushed by big companies, what are some of the advantages of using Elixir instead of Python in this domain?
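
For illustration, here is a minimal GenServer along those lines - the task/reply shapes loosely echo A2A’s task idea but are entirely made up here:

```elixir
defmodule AgentPeer do
  use GenServer

  # One process per agent; peers send it tasks and get structured replies.
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: opts[:name])

  def send_task(agent, task), do: GenServer.call(agent, {:task, task}, 30_000)

  @impl true
  def init(opts), do: {:ok, %{name: opts[:name], history: []}}

  @impl true
  def handle_call({:task, task}, _from, state) do
    # A real agent would call an LLM here, and possibly delegate to peers.
    result = %{status: "completed", output: "handled #{inspect(task)}"}
    {:reply, {:ok, result}, %{state | history: [task | state.history]}}
  end
end

# Usage: AgentPeer.send_task(:researcher, %{skill: "search", input: "elixir agents"})
```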

1 Like

I find myself agreeing with this post by antirez (redis guy) about MCP and other “AI protocols”.

Coming up with “structured” protocols for LLMs is quite possibly the least interesting thing you can do with a paradigm which is revolutionary solely because it can interact with unstructured data. We have, for the first time in the history of our field, finally found a way to interact with computers in the way that “regular people” expect and of course the first thing programmers do is try to find a bunch of ways to get rid of the uncertainty and reassert structure.

Of course, our ability to reason about computing in a structured way is why we are programmers and everyone else is not, so this is not surprising. But I don’t think these things are going to last - they are a product of hype IMO.

Of course I cannot write this comment without linking the XKCD.

1 Like

Protocols aside, the big advantage for Elixir here is that our ecosystem was practically made for this. There is no better platform for writing soft-realtime glue code between different models/services/APIs.

If you take a step back and think about this it makes perfect sense: Erlang/OTP were literally designed for telecommunications. This problem, facilitating communication between models/services/users, is telecom; it’s just that the scope has expanded far beyond phone calls. It is a testament to the wisdom and creativity of those who built these systems that they can still be so relevant today.

“Agentic” apps built with Elixir will scale better with much lower latency (especially tail latency) than anything built with Python, and (IMO) developer ergonomics are much better too, though Python is far from the worst.

If anything, our biggest “competitor” will probably be JS simply because a couple of large companies (e.g. Cloudflare) have committed to in-process multitenancy which will drive costs down significantly for those who aren’t serving enough to saturate a VM core. On the other hand, one might argue those customers aren’t very valuable.

3 Likes

Author of Jido here

I’m actively wrestling with these ideas. I’ve implemented several “Applications” with Jido now that … after finishing them … I really struggle to answer whether they are better with Jido or not. Long term - a few GenServers that wrap Req calls to LLM APIs are better.
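
Something like this minimal sketch is what I mean (endpoint and model are just OpenAI-style placeholders):

```elixir
defmodule LLMClient do
  use GenServer

  # The whole "framework": a process that owns the API key and makes calls.
  # Endpoint and model are placeholders for whatever provider you use.
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  def chat(messages), do: GenServer.call(__MODULE__, {:chat, messages}, 60_000)

  @impl true
  def init(opts), do: {:ok, Map.new(opts)}

  @impl true
  def handle_call({:chat, messages}, _from, state) do
    resp =
      Req.post!("https://api.openai.com/v1/chat/completions",
        json: %{model: "gpt-4o-mini", messages: messages},
        auth: {:bearer, state.api_key}
      )

    {:reply, {:ok, resp.body}, state}
  end
end
```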

I have this overwhelming feeling that Jido is the right direction - but has not arrived at a sensible destination.

A few other thoughts to share:

  • There’s a complexity vector here - a simple LLM wrapper doesn’t need a sophisticated agent framework - Oban works great.
  • Most agent implementations are really, really simplistic - any dreams of massive swarms of agents demonstrating collective intelligence are still dreams - Elixir is more suited to larger swarms of agents due to OTP
  • While implementing Jido, I learned a lot about OTP - the educational journey was amazing. I found Joe Armstrong’s blog while on this journey - and realized that some of the features of Jido are simply constrained implementations of OTP - I don’t think that’s bad necessarily - but probably not the best implementation
  • While it’s easy to write and run 10,000 agents with Jido - it’s not that useful - for all of the normal distributed systems problems that come from running and coordinating 10,000 GenServers (quick sketch of the easy part below)
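
To be clear about which part is easy - spawning is trivial, coordination is not (ToyAgent here is a stand-in, not Jido code):

```elixir
defmodule ToyAgent do
  use GenServer
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: opts[:name])
  @impl true
  def init(opts), do: {:ok, opts}
end

# Spawning 10,000 of these takes seconds on the BEAM...
{:ok, _} = Registry.start_link(keys: :unique, name: ToyAgent.Registry)
{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

for i <- 1..10_000 do
  name = {:via, Registry, {ToyAgent.Registry, {:agent, i}}}
  {:ok, _pid} = DynamicSupervisor.start_child(sup, {ToyAgent, name: name})
end

# ...but now you have 10,000 processes to coordinate, supervise, and
# reason about - the usual distributed systems problems, not agent ones.
```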

There are more questions than answers right now - but I do think LLMs are here to stay, so it’s better to wrestle with them

This particular space in our industry is evolving a lot right now - so I’m content to just continue wrestling and playing with the ideas. A few “first principles” I’ve collected so far:

  • Agents will be a new buzzword for “LLM Applications”
  • Agentic workflows are just workflows - low volume ETL pipelines with more variety
  • Multi-agent is simply a new variant on distributed systems problems
  • Elixir is well-suited for this - but suffers from a slower iteration cycle - so I’ve been following Python and TS “agent” frameworks closely and pulling in patterns to Jido that I feel will be durable over time
7 Likes

Thanks for the reply! It’s good to hear from someone working on this directly. Unlike you I am not currently working on anything in this area, so I am operating almost entirely on intuition. But with that said I do have some suspicions as to how things will play out, and I’ve voiced some of them already.

I think it’s more important that we provide “building blocks” in Elixir rather than an “agentic framework” per se, since the BEAM itself provides that framework. Not even OTP necessarily - the scheduler and runtime are just so well-suited to the task IMO. Providing tools for tool calling, monitoring, etc. is what will be important. From a cursory look I see Jido is providing some of those things, so I think you’re probably on to something there.

Yeah, I don’t see the value proposition for “swarms of agents” or anything like that (seems almost anthropomorphic tbh). But Elixir is great because it will scale out so well if you deploy a real app with real users.

Of course one can get fantastic value out of this model. The best example would be LiveView, which is “just” a GenServer if you squint and yet provides much more than the sum of its parts in practice.

I think it’s best not to get distracted here. There is a lot of hype nonsense going around (see my above comment about protocols) which is probably not going to stick. The LLMs are the only interesting thing about LLMs (imagine that) - so it’s best to focus on them. APIs will come and go (and can be trivially implemented with Req and so on).

1 Like

I love talking and thinking about this stuff! The depth of the conversations I’ve had with other community members makes me appreciate Elixir even more

I think it’s more important that we provide “building blocks” in Elixir rather than an “agentic framework” per se, since the BEAM itself provides that framework.

Yes, I’m 100% with you here - this is why I was opinionated about keeping any specific LLMs out of Jido core. LLMs are implemented as specific Agents, Skills and Actions in the jido_ai package - I was pretty happy with how well this worked

Yeah, I don’t see the value proposition for “swarms of agents” or anything like that

Short term - no - I don’t know of any examples deploying many unique agents right now. That said, I see a natural evolution where a product implements a 1-to-many agent pattern where one agent orchestrates the work of other agents on behalf of a user. This was the breakthrough that Manus demonstrated so well.

I’m not convinced this is good though - as you said it gets very anthropomorphic and too magical. Interestingly, I have asked my kids about this (ages 15 & 17) and they have less of an issue - so I’m acknowledging some personal bias here.

I think it’s best not to get distracted here.

Solid point - I pushed out the Jido weather example and went on holiday for Spring Break - which has prompted a lot of this reflection. I’ve been soaking up other example frameworks, reading everything I can get my hands on etc

This was a great read as well.

It’s been easy to get lost in the weeds - I’m back to building agents this week with some fresh perspectives and will be shipping more soon!

2 Likes

Really enjoying this thread—so many insightful takes. I’d love to add a perspective from someone currently building a service on top of Ash and Ash AI (GitHub - ash-project/ash_ai: Structured outputs, vectorization and tool calling for your Ash application), where we’re integrating an agent-style chatbot into a real workflow.

One thing I believe strongly:

An agent’s job is to elevate and clarify user intent, then communicate and act on it to drastically simplify the UX.

In the app I’m building, the flow looks something like this:

A user uploads a contract file and just says “process this”.

The agent then:

  • reads the file and extracts key data
  • checks if the referenced client is already in the system
  • creates the client if necessary
  • continues to create an invoice

→ all while mixing automation with UI-driven confirmation, so the user stays in control.

Internally, this feels incredibly natural and surprisingly fluid.

Where Elixir shines in this setup:

  • Ash Framework: The declarative power of Ash is hard to overstate. It lets me define tools cleanly and declaratively, which plug seamlessly into AshAI. This means I can expose my application logic as “tools” with almost no extra effort.
  • LiveView or Channels: Real-time interaction with the user is crucial. For example, once the agent identifies what kind of data needs to be confirmed, we bring the user to a LiveView-powered form to review/edit inputs. Once confirmed, the agent resumes and continues the workflow. This kind of multi-step, multi-view interactivity is where LiveView or channels make things incredibly smooth (rough sketch below).
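
To make the confirm-and-resume step concrete, here is a trimmed-down sketch of the LiveView side (module, event names, and the waiting-agent protocol are simplified stand-ins for what we actually do; template omitted):

```elixir
defmodule MyAppWeb.ConfirmLive do
  use Phoenix.LiveView

  # The agent process sends us extracted data and waits; the user reviews
  # it in a form; on submit we message the agent so it resumes its workflow.
  @impl true
  def handle_info({:confirm_request, agent_pid, data}, socket) do
    {:noreply, assign(socket, agent: agent_pid, form: to_form(data))}
  end

  @impl true
  def handle_event("confirm", params, socket) do
    send(socket.assigns.agent, {:confirmed, params})
    {:noreply, put_flash(socket, :info, "Resuming workflow…")}
  end
end
```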

I’m still early in the journey, but this combination of structured domain logic and real-time agent orchestration feels like a powerful direction.

3 Likes