I’ve been developing Ash resources and noticed that while AI tools work well when I provide existing examples, they struggle when creating new resources or tackling less-documented use cases. I’m exploring ways to improve AI responses by incorporating the Ash documentation.
One idea is to use the newly released Model Context Protocol (MCP) from Anthropic, combined with a Retrieval-Augmented Generation (RAG) system, AWS Bedrock Knowledge Bases, or tools like Cline. This would involve downloading the Hex docs for the Ash libraries, feeding them into a RAG system or knowledge base, and making them accessible for AI queries.
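For the download step, I'm assuming the built-in Mix task would be a reasonable starting point (the package list below is just the subset I'd care about):

```sh
# Fetch docs for offline use; by default they land under your Hex home
# directory (e.g. ~/.hex/docs/hexpm/<package>/<version>).
mix hex.docs fetch ash
mix hex.docs fetch ash_postgres
mix hex.docs fetch ash_phoenix
```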
However, there are some challenges:
Ash documentation is extensive, covering multiple libraries within the ecosystem.
The docs are constantly updated, so I’d need a way to keep them in sync.
Alternatively, I’m considering whether the Model Context Protocol could directly access Ash documentation without requiring the setup and maintenance of a separate RAG system.
While this problem is most noticeable with Ash, it's not unique to it: the same applies to any Hex library. However, Ash is where I've experienced the most issues, as LLMs tend to wildly make things up when attempting to assist with it.
Before diving in, I’d like to ask:
Has anyone tried a similar approach?
Are there simpler solutions or existing tools for effectively integrating Hex library documentation with LLMs?
Any insights or suggestions would be much appreciated!
You’ll want to use the hex2context notebook, which tries to find and include only the most relevant sections of documentation. (I just shipped this ten minutes ago, and there are tons of opportunities for improvement. Note that the notebook computes all embeddings locally, but swapping in paid models should be relatively trivial. I haven’t tested different embedding models yet; any real-world feedback would be super useful.)
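For anyone curious what the local-embedding piece looks like in Elixir, here's a minimal Bumblebee sketch; the model and options are illustrative placeholders, not necessarily what the notebook ships with:

```elixir
# Minimal local-embedding sketch with Bumblebee; model choice and
# options here are illustrative placeholders.
Mix.install([{:bumblebee, "~> 0.5"}, {:exla, "~> 0.7"}])

repo = {:hf, "sentence-transformers/all-MiniLM-L6-v2"}
{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)

# Mean-pool the hidden state, the usual setup for sentence-transformers models.
serving =
  Bumblebee.Text.text_embedding(model_info, tokenizer,
    output_attribute: :hidden_state,
    output_pool: :mean_pooling,
    defn_options: [compiler: EXLA]
  )

%{embedding: embedding} = Nx.Serving.run(serving, "How do I define an Ash resource?")
```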
Thanks for sharing! Just curious, how are you using this in your workflow? Do you have an approach that works well with any LLM tooling?
My first thought on how I’d use this is to integrate it into an MCP server, so an MCP client could easily call `Hex2context.ingest_docs/2` and `Hex2context.retrieve_docs/2`.
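Purely hypothetical, but I'm picturing call shapes along these lines (only the function names and arities come from the notebook; the arguments and return shapes are my guesses):

```elixir
# Hypothetical usage; argument and return shapes are guesses for illustration.
:ok = Hex2context.ingest_docs("ash", "3.0.0")
{:ok, chunks} = Hex2context.retrieve_docs("ash", "How do I define a calculation?")
```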
The workflow part is definitely a work-in-progress.
I have a (very simple) example session using Aider on the hex2txt homepage. The workflow for hex2context is similar, but you need to swap in the URL of your Livebook proxy instance (and change the URL structure slightly, pass the query in a query-string param, etc.). Clunky for sure.
I think exposing an MCP server is the next step here. I haven’t reviewed the spec in detail yet, but I would imagine it wouldn’t be too hard to build a proof-of-concept MCP server directly in the Livebook.
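Very roughly, I'd expect the core of it to look something like this (not spec-complete: initialization, tool listing, transport details, and error handling are all omitted, and the `Hex2context` call is a placeholder):

```elixir
# Rough proof-of-concept shape, not a spec-complete MCP server: a tiny
# Plug router that answers a JSON-RPC "tools/call" request by delegating
# to the (hypothetical) Hex2context.retrieve_docs/2.
defmodule Hex2contextMCP do
  use Plug.Router

  plug Plug.Parsers, parsers: [:json], json_decoder: Jason
  plug :match
  plug :dispatch

  post "/mcp" do
    case conn.body_params do
      %{"method" => "tools/call", "id" => id, "params" => %{"arguments" => %{"query" => query}}} ->
        {:ok, chunks} = Hex2context.retrieve_docs("ash", query)

        response = %{
          jsonrpc: "2.0",
          id: id,
          result: %{content: [%{type: "text", text: Enum.join(chunks, "\n\n")}]}
        }

        conn
        |> put_resp_content_type("application/json")
        |> send_resp(200, Jason.encode!(response))

      _other ->
        send_resp(conn, 400, "unsupported request")
    end
  end
end
```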
(Unfortunately Aider doesn’t have MCP support (yet?). I know Cline does, but I’m not personally planning on using it because of the VS Code dependency.)
I’m always curious about stuff like this. Is there a way for us to provide structured context to the large LLM companies as well? I know this is slightly off topic, but the MCP stuff seems interesting, and I’m curious whether they’ve documented a way for people to submit information so that the larger models trained by big companies could include it in their training sets.
Or a way to improve how the Ash docs, or HexDocs in general, are used to provide answers.
It seems like a missed opportunity, since documentation in Elixir is so centralized, consistent, and high quality compared to the mishmash of documentation in other ecosystems.