I’ve been developing Ash resources and noticed that while AI tools work well when I provide existing examples, they struggle when creating new resources or tackling less-documented use cases. I’m exploring ways to improve AI responses by incorporating the Ash documentation.
One idea is to use the newly released Model Context Protocol (MCP) from Anthropic, combined with a Retrieval-Augmented Generation (RAG) system, AWS Bedrock Knowledge Bases, or tools like Cline. This would involve downloading the Hex docs for the Ash libraries, feeding them into a RAG system or knowledge base, and making them accessible for AI queries.
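For the download step, I'm assuming the built-in Mix task would be a reasonable starting point (the package list below is just the subset I'd care about):

```sh
# Fetch docs for offline use; by default they land under your Hex home
# directory (e.g. ~/.hex/docs/hexpm/<package>/<version>).
mix hex.docs fetch ash
mix hex.docs fetch ash_postgres
mix hex.docs fetch ash_phoenix
```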
However, there are some challenges:
Ash documentation is extensive, covering multiple libraries within the ecosystem.
The docs are constantly updated, so I’d need a way to keep them in sync.
Alternatively, I’m considering whether the Model Context Protocol could directly access Ash documentation without requiring the setup and maintenance of a separate RAG system.
While this problem is most noticeable with Ash, it's not unique to it: the same applies to any Hex library. However, Ash is where I've experienced the most issues, as LLMs tend to wildly make things up when attempting to assist with it.
Before diving in, I’d like to ask:
Has anyone tried a similar approach?
Are there simpler solutions or existing tools for effectively integrating Hex library documentation with LLMs?
Any insights or suggestions would be much appreciated!
You’ll want to use the hex2context notebook, which tries to find and include only the most relevant sections of documentation. (I just shipped this ten minutes ago, and there are tons of opportunities for improvement. Note that the notebook computes all embeddings locally, but swapping in paid models should be relatively trivial. I haven’t tested different embedding models yet; any real-world feedback would be super useful.)
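For anyone curious what the local-embedding piece looks like in Elixir, here's a minimal Bumblebee sketch; the model and options are illustrative placeholders, not necessarily what the notebook ships with:

```elixir
# Minimal local-embedding sketch with Bumblebee; model choice and
# options here are illustrative placeholders.
Mix.install([{:bumblebee, "~> 0.5"}, {:exla, "~> 0.7"}])

repo = {:hf, "sentence-transformers/all-MiniLM-L6-v2"}
{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)

# Mean-pool the hidden state, the usual setup for sentence-transformers models.
serving =
  Bumblebee.Text.text_embedding(model_info, tokenizer,
    output_attribute: :hidden_state,
    output_pool: :mean_pooling,
    defn_options: [compiler: EXLA]
  )

%{embedding: embedding} = Nx.Serving.run(serving, "How do I define an Ash resource?")
```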
Thanks for sharing! Just curious, how are you using this in your workflow? Do you have an approach that works well with any LLM tooling?
My first thought on how I’d use this is to integrate it into an MCP server, so an MCP client could easily call `Hex2context.ingest_docs/2` and `Hex2context.retrieve_docs/2`.
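Purely hypothetical, but I'm picturing call shapes along these lines (only the function names and arities come from the notebook; the arguments and return shapes are my guesses):

```elixir
# Hypothetical usage; argument and return shapes are guesses for illustration.
:ok = Hex2context.ingest_docs("ash", "3.0.0")
{:ok, chunks} = Hex2context.retrieve_docs("ash", "How do I define a calculation?")
```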
The workflow part is definitely a work-in-progress.
I have a (very simple) example session using Aider on the hex2txt homepage. The workflow for hex2context is similar, but you need to swap in the URL of your Livebook proxy instance (and change the URL structure slightly, pass the query in a query-string param, etc.). Clunky for sure.
I think exposing an MCP server is the next step here. I haven’t reviewed the spec in detail yet, but I would imagine it wouldn’t be too hard to build a proof-of-concept MCP server directly in the Livebook.
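Very roughly, I'd expect the core of it to look something like this (not spec-complete: initialization, tool listing, transport details, and error handling are all omitted, and the `Hex2context` call is a placeholder):

```elixir
# Rough proof-of-concept shape, not a spec-complete MCP server: a tiny
# Plug router that answers a JSON-RPC "tools/call" request by delegating
# to the (hypothetical) Hex2context.retrieve_docs/2.
defmodule Hex2contextMCP do
  use Plug.Router

  plug Plug.Parsers, parsers: [:json], json_decoder: Jason
  plug :match
  plug :dispatch

  post "/mcp" do
    case conn.body_params do
      %{"method" => "tools/call", "id" => id, "params" => %{"arguments" => %{"query" => query}}} ->
        {:ok, chunks} = Hex2context.retrieve_docs("ash", query)

        response = %{
          jsonrpc: "2.0",
          id: id,
          result: %{content: [%{type: "text", text: Enum.join(chunks, "\n\n")}]}
        }

        conn
        |> put_resp_content_type("application/json")
        |> send_resp(200, Jason.encode!(response))

      _other ->
        send_resp(conn, 400, "unsupported request")
    end
  end
end
```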
(Unfortunately Aider doesn’t have MCP support (yet?). I know Cline does, but I’m not personally planning on using it because of the VS Code dependency.)
I’m always curious about stuff like this. Is there a way for us to provide structured context to the large LLM companies as well? I know this is slightly off topic, but the MCP stuff seems interesting, and I’m curious whether they’ve documented a way for people to submit information so that the larger models trained by big companies could include it in their training sets.
Or a way to improve how the Ash docs, or HexDocs in general, are used to provide answers.
It seems like a missed opportunity, since documentation in Elixir is so centralized, consistent, and high quality compared to the mishmash of documentation in other ecosystems.