Omni - a universal Elixir client for LLM APIs


Hey everyone - I’ve been building with Elixir on and off for over 8 years, but somehow have never posted on the actual Elixir forum. Time to fix that…

Also, I’d love to share with you Omni - a library for working with LLM APIs across multiple providers through a unified interface. Anthropic, OpenAI, Google Gemini, Ollama, OpenRouter, and OpenCode Zen are supported out of the box.

# Resolve model
{:ok, model} = Omni.get_model(:anthropic, "claude-sonnet-4-6")

# Simple text generation
{:ok, response} = Omni.generate_text(model, "Hello!")

# Stream with composable callbacks
{:ok, stream} = Omni.stream_text(model, "Tell me a story")

{:ok, response} =
  stream
  |> Omni.StreamingResponse.on(:text_delta, &IO.write(&1.delta))
  |> Omni.StreamingResponse.complete()

Tool use and structured outputs are supported. Pass tools in the context and Omni handles the execution loop automatically - calling the model, executing tool handlers, feeding results back, and repeating until the model is done. Structured output uses JSON Schema constraints with validation:

# Tool use - Omni manages the tool execution loop
{:ok, response} = Omni.generate_text(
  model,
  Omni.context(
    messages: [Omni.message(role: :user, content: "What's the weather in London?")],
    tools: [weather_tool]
  )
)

# Structured output
alias Omni.Schema
{:ok, response} = Omni.generate_text(
  model,
  "Extract the contact details: Reach me at jane@example.com or call 01234 567890",
  output: Schema.object(%{
    email: Schema.string(description: "Email address"),
    phone: Schema.string(description: "Phone number")
  }, required: [:email, :phone])
)

Omni also offers a lightweight take on agents. Omni.Agent is a GenServer that manages its own conversation context and tool execution, and communicates with callers via standard process messages. You control behaviour through lifecycle callbacks. It’s a building block, not a framework - what you build on top (planning, memory, multi-agent orchestration) is your concern.
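For a feel of the shape, here's a rough sketch of driving an agent process - the function names and message format below are assumptions for illustration, not the documented API:

# Hypothetical sketch - start_link/1 options, ask/2 and the reply tuple
# are assumed names/shapes, not verbatim Omni.Agent API
{:ok, agent} = Omni.Agent.start_link(model: model, tools: [weather_tool])

# The agent holds its own conversation context between turns
Omni.Agent.ask(agent, "What's the weather in London?")

# Replies arrive as standard process messages
receive do
  {:omni_agent, ^agent, response} -> IO.puts(response.text)
end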

I know req_llm covers similar ground, which - slightly annoyingly - I didn’t realise existed until I was 90% of the way done with Omni :man_facepalming:t2:. On the surface they have quite similar APIs, and both use Req, but they differ in how providers are implemented. Omni separates providers (the endpoint, configuration and auth) from dialects (wire format translation). The dialect does the heavy lifting, and as most providers share a dialect, adding a new provider is typically a small, mostly-declarative module. Everything is streaming-first - generate_text is built on top of stream_text, so there’s one code path through each dialect.
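To make that split concrete, here's what a provider module could roughly look like - the behaviour and dialect module names below are illustrative assumptions, not Omni's actual definitions:

# Illustrative sketch only - the behaviour and callback names are assumed
defmodule MyApp.Providers.MyGateway do
  @behaviour Omni.Provider

  # The provider describes the endpoint, configuration and auth...
  def base_url, do: "https://llm.example.com/v1"
  def auth(config), do: {:bearer, config[:api_key]}

  # ...and delegates wire-format translation to a shared dialect
  def dialect, do: Omni.Dialects.OpenAI
end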

Anyway, please check it out. Let me know if you have any questions.


How does this compare to (or where does it sit in relation to) Langchain (Elixir)?

They’re both focused on text generation, so fundamentally they do the same thing - just a different style and take.

Langchain has a few things Omni does not: multimodal support, RAG text splitting, EEx prompt templates - and probably some other stuff. Omni is lightweight, with only two dependencies.

The main difference for users is the surface API. The mental model for Omni is: build a request, get a stream, consume it. For Langchain it’s: build a chain, add some messages, run the chain. Omni’s style is functional and data-oriented; Langchain’s is stateful structs, callbacks, and more framework-y.

Internally the big difference is how Omni splits Providers and Dialects into two things, which should make it relatively painless to add more providers over time. Langchain has one big fat module per provider which I think looks hard to maintain. In theory Omni could sit underneath Langchain and be that provider translation layer.


Thanks a lot @aaronrussell for the explanation and for the library. Could you please compare/contrast Omni with ReqLLM?

Regarding req_llm, they are honestly very similar. Both libraries attempt to solve the same problem and approach it in very similar ways. Both libraries are clearly influenced by the Vercel AI SDK, but in Omni I’ve taken a lot of inspiration from the Pi codebase, which I think steers it in a slightly leaner direction.

  • Both have an almost identical top-level API: generate_text(model, context, inference_opts) and stream_text(model, context, inference_opts)
  • Both attempt to establish a canonical data model that works across all LLM provider APIs. There are some minor differences in the shape of those data models, but essentially they’re doing the same thing.
  • Both source model data from the models.dev API.
  • Both can generate text and structured objects, stream responses, and track usage tokens and costs.
  • Both use Req under the hood.

Some gaps (some of these will narrow over time):

  • req_llm supports 45 providers with ~665 models; Omni supports 6 providers with ~300 models
  • req_llm supports image generation; Omni does not
  • Omni takes a pragmatic view of more obscure inference options like top_p, logprobs etc. and doesn’t currently attempt to support every option for every provider - req_llm appears to have more complete coverage
  • Omni provides a simple `Omni.Agent` GenServer as a building block for agents - req_llm doesn’t have anything like this (though it is part of the wider Jido ecosystem)

Lower level differences:

  • Omni splits a provider into two behaviours - the Provider and the Dialect (the wire format). This should make adding new providers much quicker and easier to maintain, as most providers in the wild share dialects. This also makes it easy for users to create their own providers.
  • Omni is streaming-first - even generate_text/3 is a streaming request accumulated in one call. This means a dialect only needs to care about streaming requests, resulting in simpler implementations (see the sketch below).
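To illustrate the streaming-first point, you can think of generate_text as stream_text plus accumulation - a conceptual sketch, not Omni's actual source:

# Conceptual sketch only - shows why a dialect sees a single
# (streaming) code path. Not the real implementation.
defmodule Sketch do
  def generate_text(model, context, opts \\ []) do
    with {:ok, stream} <- Omni.stream_text(model, context, opts) do
      # complete/1 consumes the stream and accumulates the full response
      Omni.StreamingResponse.complete(stream)
    end
  end
end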

Thanks for your reply! There may be a couple other distinctions. With ReqLLM, Ollama integration is not an out-of-the-box option, but Omni docs show Ollama support. Also: I believe ReqLLM is integrated with Jido and Ash.

I’ll give Omni a try with Ollama!

Yep, there is an Ollama provider. A little config is needed:

# Ollama isn't a default provider, so load it in the config
config :omni, :providers, [:ollama]

# You need to configure your installed models
config :omni, Omni.Providers.Ollama, models: ["mistral:7b", "qwen3.5:4b"]

Oh, and tool calling, reasoning etc. are model-dependent. That little 4b qwen model is pretty good for testing.
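Once that's configured, resolving and using a local model should work like any other provider - a sketch, assuming :ollama (the provider key from the config above) is also the atom get_model expects:

# Assumes the config above; :ollama as the provider atom is an assumption
{:ok, model} = Omni.get_model(:ollama, "qwen3.5:4b")
{:ok, response} = Omni.generate_text(model, "Hello!")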


Omni updates this week…

Omni v1.2.0

  • Extracted Omni.Agent and associated modules into its own package. omni lives on as a stateless LLM API layer for any LLM provider. omni_agent becomes its own thing (see below).
  • Model store updated, including the latest GPT 5.4 mini and nano models.
  • Minor under-the-hood bits and bobs.

Omni Agent v0.1.0

  • Extracted from the above; now its own package for creating stateful, multi-turn, GenServer-powered agent processes.
  • API and lifecycle simplified, documentation cleaned up.
  • Consider this a more experimental package - expect things to change and break.

Links

Omni - GitHub | Docs
Omni Agent - GitHub | Docs


Thanks for releasing Omni. I like the approach, especially the low number of deps.

Question: With Omni, is it possible to receive the model’s tool selections without having Omni execute the tool itself? I wasn’t able to tell in the docs; they focus on the auto-execution loop, which makes sense for many cases, but I would like to have full control over the tool execution and resulting context additions.

Yep, there are two ways to do that. The first and simplest is to pass max_steps: 1, e.g.:

Omni.stream_text(model, context, max_steps: 1)

Also, tools themselves can be schema-only - they don’t need a function handler attached. So if any of your tools don’t have a function handler, it will just stop there with a stop_reason: :tool_use, and then it’s up to you to handle the rest and feed the results back.
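Roughly like this - a hypothetical sketch, since the tool constructor below (Omni.tool/1) and its options are assumed names rather than verbatim from the docs:

# Hypothetical sketch - Omni.tool/1 and its options are assumptions
weather_tool =
  Omni.tool(
    name: "get_weather",
    description: "Look up the current weather for a city",
    parameters: Omni.Schema.object(%{city: Omni.Schema.string()})
    # no handler attached, so this tool is schema-only
  )

{:ok, response} =
  Omni.generate_text(
    model,
    Omni.context(
      messages: [Omni.message(role: :user, content: "What's the weather in London?")],
      tools: [weather_tool]
    )
  )

# With no handler, the loop stops with stop_reason: :tool_use -
# inspect the tool calls in the response, run them yourself, and
# feed the results back on the next request.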
