Aider + Claude 3.5 Sonnet works really well with Elixir

I just want to tell you that I started using aider, and since pairing it with Claude 3.5 Sonnet it has been a game changer.

In just minutes, I was able to accomplish tasks that I had put off for ~2 years.

One of these tasks involved an old modal that still used vanilla JavaScript from the days before LiveView had the JS module.
I linked the docs to aider with the `/web` command:
/web https://hexdocs.pm/phoenix_live_view/Phoenix.LiveView.JS.html

Then, after adding the necessary files and giving it a short prompt, I waited for the result, and it was done.

This combination fully deserves the hype.
(I’m not affiliated with any of the above.)


Can anyone here compare the performance of Claude 3.5 vs OpenAI GPT-4o for Elixir tasks?


The top model on the Aider LLM leaderboard is claude-3.5-sonnet.

I switched from GPT-4o, and in my experience with Elixir, Claude 3.5 Sonnet is way ahead.


Thanks for the recommendation, but I have some questions.

It seems that aider auto-commits requested changes; is that worth it? Currently I commit my changes every time I fix a bug, or when I want a safe point I can come back to after doing something I may later want to undo. Making a commit every time I ask for changes doesn’t seem like the best approach (at least at first).

Wouldn’t it be more convenient to ask for changes, and then have the option to accept, discard, or adjust the generated code? (That’s how Cursor works.)

PS: I also use Claude 3.5, it’s awesome :sunglasses:


Auto-commits were annoying for me too, so I have this in my `.aider.conf.yml`:

  ## Enable/disable auto commit of LLM changes (default: True)
  auto-commits: false
  
  ## Attribute aider code changes in the git author name (default: True)
  attribute-author: false
  
  ## Attribute aider commits in the git committer name (default: True)
  attribute-committer: false

@preciz thanks for the tips on Aider and Claude. Already helpful.


I use continue.dev + claude 3.5 sonnet.

+1 to Claude Sonnet!


From my experience, Claude 3.5 Sonnet outperforms OpenAI 4o in handling Elixir tasks. Having worked with Elixir for eight months, I’ve consistently found Claude to be superior in generating Elixir code. Previously, Opus was quite good, but Sonnet has significantly surpassed it.

Do you have more examples? Can it generate a module that was not already in the training data? I can imagine it has seen many examples of a modal with JS functions.

Maybe you could ask it to create a module that records interactions with a LiveView and can then generate tests based on those recordings. All the database queries could be mocked out with what is found in the recordings. Think ExVCR for LiveView.

If it works, I would love to use that code!

Is Claude 3.5 really that good?


Claude 3 Opus was already good; 3.5 Sonnet is good too, and apparently quicker.


I have used both GPT-4o and Claude 3.5 for writing Elixir, and in my opinion Claude is better. I’m considering starting a paid plan to get access to Claude 4.

I’ve been using Claude 3 Opus, and more recently 3.5 Sonnet, via Sourcegraph Cody. It’s pretty good, though I wish the in-editor completion suggestions allowed partial acceptance. I find that often the first bits are pretty good, but it gets worse the longer the suggestion is.

I do find that the Elixir support is pretty good, and it’s helpful mostly for refactoring and documenting things (though the style can get inconsistent). But for me, the killer-app aspect is certain tasks that are not really about Elixir. For example, I hate writing shell scripts (mostly because I’m bad at it): AI to the rescue. My shell scripts are infrequent and small enough that they hit a sweet spot where an LLM just produces a completely competent final result (though never trust that blindly). Same with regular expressions. Those things alone are worth the subscription.

I’ve tried OpenAI’s GPT-4o and simply found I preferred Claude’s results.


That modal was just a simple example, but of course I have used it for more complex things. One of my favorites is to /add a file I just created and ask it to write tests for it. Then I edit the tests it generated if necessary. This way I write more tests and save a lot of time.

It is capable of doing metaprogramming correctly, changing code from runtime to compile time, etc. It has done a lot of things for me.

Overall, I would say it’s good enough to save time and boost productivity for most programmers.
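To illustrate the runtime-to-compile-time kind of refactor I mean, here is a minimal sketch (the module and map are hypothetical, not actual code it wrote for me):

```elixir
defmodule StatusLabels do
  # Before: the map literal lived inside the function body,
  # so it was rebuilt on every call at runtime:
  #
  #   def label(code), do: Map.get(%{200 => "ok", 404 => "not found"}, code, "unknown")

  # After: hoist the map into a module attribute, which is
  # evaluated once at compile time and inlined where it is used.
  @labels %{200 => "ok", 404 => "not found"}

  def label(code), do: Map.get(@labels, code, "unknown")
end
```

Spotting opportunities like this across a file is exactly the kind of mechanical-but-tedious change it handles well.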


Coding by voice is my dream :slight_smile:

Do you pay for Anthropic? I can see this pricing:

Input $3 / MTok
Output $15 / MTok

How much do you pay every month, and/or how far can you get with the free plan?

edit: I’m trying to claim the free $5 credits to try with an Anthropic API key, but it does not work …


I know, and I’m glad it can do these things. I was just wondering whether Claude was able to do more complex stuff.

Are these simple refactors deterministic? If it is refactoring multiple files, will it do it the same way each time?

I tried it with the example question from my earlier post, and it did not seem to quite grasp what I meant.
ChatGPT seems to be going in the right direction: https://chatgpt.com/share/3fac846c-9a7c-45f0-9a2b-4e64920194df


Determinism is a matter of the seed value and temperature settings of the inference, which I don’t configure myself. But I believe aider already handles these optimally.

https://github.com/search?q=repo%3Apaul-gauthier%2Faider+temperature&type=code

I don’t think we can expect proprietary models to be deterministic, but if you use aider with a local one like DeepSeek Coder V2, you have more control.

However, I should mention that it no longer matters to me whether the model always returns the same code. Through usage, that turned out to be less important than it seems at first.

Currently, on average, I see ~150K tokens in and ~20K tokens out per day.
This is mixed with my teammates’ aider usage, and we also use the API through Open WebUI for chatting.

So personal usage when coding with Elixir should be under $10 per month. Of course, that’s an estimate.
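As a rough sanity check on that estimate, the daily token volumes above against the prices quoted earlier in the thread ($3/MTok in, $15/MTok out) work out like this (integer cent arithmetic; the volumes are the team-wide averages from my post):

```elixir
# $3/MTok input and $15/MTok output, expressed in cents per MTok.
in_cents_per_day  = div(150_000 * 300, 1_000_000)   # 45 cents/day for input
out_cents_per_day = div(20_000 * 1_500, 1_000_000)  # 30 cents/day for output

# ~30 days per month, converted back to dollars.
monthly_dollars = (in_cents_per_day + out_cents_per_day) * 30 / 100
IO.puts(monthly_dollars)
```

That is roughly $22.50/month for the whole team’s mixed usage, which is consistent with a personal share under $10.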


Of course!

Thank you :slight_smile:


I can echo what people in this thread have already said: Claude 3.5 Sonnet outperforms 4o by far when it comes to Elixir, and it even comes with syntax highlighting.
I have cancelled my OpenAI subscription and will go with an Anthropic subscription instead.
