Is anyone using AI assistance in coding Elixir?

It will direct you to your particular Dunning-Kruger minima as fast as possible and once there assure you that a better solution is inconceivable.

7 Likes

LOL :003: I am stealing your quote!

1 Like

I’ve been using Cursor AI for the last six or eight weeks. Been very helpful and a lot of fun!

An LLM can definitely lead you down a rabbit hole. I found that when I go back to good solid techniques, working in small steps with tests, it really, really shines.

2 Likes

I’ve been lurking on this discussion and it’s all very interesting.

Does anyone have a link to a video of someone live coding proficiently with a coding assistant? I still haven’t used one myself, and whenever I see someone using one (be it in person or online) it always kind of looks like hell, lol. Code is flying in their face that they constantly have to stop and read to see if it’s right. Sometimes they accidentally accept it and then have to undo… in fact, there really seems to be a lot of rejecting the wrong thing, be it parts of the suggestion or the whole thing.

It’s neat when it can write a function for you, though I fall into the camp of this taking away one of the most fun things about programming: I enjoy the mental exercise of figuring out a solution on my own. For boilerplate I still use snippets, and for simple refactoring tasks it seems that, so long as you have a certain level of editor mastery, there isn’t much improvement to be had (and again, this is another fun thing about programming for me).

I can absolutely see value in having it write a function for you when you have no idea what it is you’re doing. Having it right in the editor would be nice, though whenever I’ve used ChatGPT I still find myself looking up other solutions to compare results, thus I’d be switching to my browser anyway.

I may just be holding it wrong, though, and it would be great to see someone using it really proficiently.

3 Likes

I don’t really “use” it so much deliberately, but MacVim/VimR has a GitHub Copilot thing going and I haven’t disabled it. While I experienced quite a bit of hallucinating in Python (e.g. code that looks good but doesn’t work in the slightest), the Elixir ideas Copilot comes up with are much more usable.
I don’t know if that is because I tend to code cleaner in Elixir or what else might be the reason.

I use ChatGPT 4o canvas for targeted small tasks and high-level architecture brainstorming. I don’t use LLMs for in-editor code completions. I have done enough prompt engineering and worked with enough DNN models to see an LLM as a glorified sentence transformer.

For posterity, I’m sharing how I structure my prompts, in case it helps anyone. I simply start a chat with the sentence “Use your knowledge on elixir [programming language and coding patterns] to advise me.”, where the bracketed part is optional. I follow with a short blurb about the code I’m about to dump, then dump all the relevant code, and finally write my request at the end, like “complete the TODO in the code”.

Here is the chat for that “check to see if two lists have the same members” task: ChatGPT - List Member Comparison in Elixir. Since I asked it to advise me, the LLM tends to provide options, and I believe the consensus in this thread was to use Enum.frequencies due to “the overhead of sorting”; that’s option 2.

2 Likes

I’m using Copilot and it’s kinda convenient for PR reviews and for rewriting something simple to the standard you want, and sometimes its suggestions are exactly what you really need. But sometimes not)

1 Like

This is incorrect: Enum.sort(a) == Enum.sort(b) is faster than Enum.frequencies(a) == Enum.frequencies(b). At least that’s true for lists of numbers, it seems.
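For context, the two approaches to “same members” being compared are one-liners. The sample lists here are made up; any comparable elements behave the same way:

```elixir
# Hypothetical sample lists (duplicates included on purpose).
a = [3, 1, 2, 2]
b = [2, 3, 2, 1]

# Option 1: sort both lists and compare. O(n log n) on paper, but the
# :lists.sort/1 underneath is heavily optimized, so the constant factor
# is small.
IO.inspect(Enum.sort(a) == Enum.sort(b))

# Option 2: build frequency maps and compare. O(n) on paper, but
# constructing and comparing maps carries a larger constant factor.
IO.inspect(Enum.frequencies(a) == Enum.frequencies(b))
```

Both return true exactly when the lists contain the same elements with the same multiplicities; the disagreement in this thread is only about which one is faster in practice.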

This is a bit messy, but here’s a bunch of varieties: benchmarking some list comparisons · GitHub

My output looks like this (M2 MacBook Air, 8 GB):

Operating System: macOS
CPU Information: Apple M2
Number of Available Cores: 8
Available memory: 8 GB
Elixir 1.17.3
Erlang 27.1
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 2 min 55 s

Benchmarking freq diff length ...
Benchmarking freq different ...
Benchmarking freq list_reversed ...
Benchmarking freq list_sorted ...
Benchmarking freq same ...
Benchmarking sort by diff length ...
Benchmarking sort by different ...
Benchmarking sort by list_reversed ...
Benchmarking sort by list_sorted ...
Benchmarking sort by same ...
Benchmarking sort desc diff length ...
Benchmarking sort desc different ...
Benchmarking sort desc list_reversed ...
Benchmarking sort desc list_sorted ...
Benchmarking sort desc same ...
Benchmarking sort diff length ...
Benchmarking sort different ...
Benchmarking sort list_reversed ...
Benchmarking sort list_sorted ...
Benchmarking sort reduce while diff length ...
Benchmarking sort reduce while different ...
Benchmarking sort reduce while list_reversed ...
Benchmarking sort reduce while list_sorted ...
Benchmarking sort reduce while same ...
Benchmarking sort same ...
Calculating statistics...
Formatting results...

Name                                      ips        average  deviation         median         99th %
sort list_reversed                    97.38 K       10.27 μs   ±241.33%        9.25 μs       25.29 μs
sort desc list_reversed               68.78 K       14.54 μs    ±49.31%       13.25 μs          25 μs
sort list_sorted                      66.85 K       14.96 μs    ±32.73%       13.83 μs       25.29 μs
sort desc list_sorted                 50.79 K       19.69 μs    ±23.54%       17.71 μs       33.17 μs
sort reduce while list_reversed       46.92 K       21.31 μs    ±31.00%       18.21 μs       37.96 μs
sort reduce while list_sorted         38.17 K       26.20 μs    ±25.56%          23 μs          41 μs
sort desc same                        14.45 K       69.20 μs    ±13.73%       67.17 μs      100.51 μs
sort same                             14.08 K       71.03 μs    ±12.33%       69.00 μs      110.38 μs
sort diff length                      13.54 K       73.85 μs    ±12.99%       70.54 μs      109.81 μs
sort desc different                   13.33 K       75.03 μs    ±13.30%       72.08 μs      100.96 μs
sort desc diff length                 13.08 K       76.42 μs    ±48.78%       72.63 μs      123.55 μs
sort different                        12.97 K       77.11 μs    ±15.20%          74 μs      117.58 μs
sort reduce while same                12.91 K       77.47 μs    ±20.59%       73.04 μs      140.75 μs
sort reduce while different           12.54 K       79.77 μs    ±20.90%       74.92 μs      142.12 μs
sort reduce while diff length         11.37 K       87.94 μs    ±20.62%       84.50 μs      156.51 μs
sort by list_reversed                 11.10 K       90.07 μs    ±14.71%       88.25 μs      138.04 μs
sort by list_sorted                    9.96 K      100.44 μs     ±3.35%      100.13 μs      113.71 μs
sort by same                           5.29 K      189.03 μs    ±17.91%      184.25 μs      299.25 μs
sort by diff length                    4.95 K      202.01 μs    ±14.98%      191.29 μs      304.10 μs
sort by different                      4.78 K      209.37 μs    ±14.98%      204.42 μs      317.11 μs
freq different                         4.48 K      223.43 μs    ±14.48%      227.25 μs      290.60 μs
freq same                              4.39 K      227.63 μs    ±14.08%      229.25 μs      288.85 μs
freq diff length                       4.35 K      229.71 μs    ±16.38%      230.17 μs      301.77 μs
freq list_sorted                       2.92 K      342.31 μs    ±11.63%      337.21 μs      426.22 μs
freq list_reversed                     2.83 K      352.81 μs    ±11.30%      359.88 μs      431.96 μs

Comparison: 
sort list_reversed                    97.38 K
sort desc list_reversed               68.78 K - 1.42x slower +4.27 μs
sort list_sorted                      66.85 K - 1.46x slower +4.69 μs
sort desc list_sorted                 50.79 K - 1.92x slower +9.42 μs
sort reduce while list_reversed       46.92 K - 2.08x slower +11.05 μs
sort reduce while list_sorted         38.17 K - 2.55x slower +15.93 μs
sort desc same                        14.45 K - 6.74x slower +58.93 μs
sort same                             14.08 K - 6.92x slower +60.76 μs
sort diff length                      13.54 K - 7.19x slower +63.58 μs
sort desc different                   13.33 K - 7.31x slower +64.76 μs
sort desc diff length                 13.08 K - 7.44x slower +66.15 μs
sort different                        12.97 K - 7.51x slower +66.84 μs
sort reduce while same                12.91 K - 7.54x slower +67.20 μs
sort reduce while different           12.54 K - 7.77x slower +69.50 μs
sort reduce while diff length         11.37 K - 8.56x slower +77.67 μs
sort by list_reversed                 11.10 K - 8.77x slower +79.80 μs
sort by list_sorted                    9.96 K - 9.78x slower +90.17 μs
sort by same                           5.29 K - 18.41x slower +178.77 μs
sort by diff length                    4.95 K - 19.67x slower +191.74 μs
sort by different                      4.78 K - 20.39x slower +199.10 μs
freq different                         4.48 K - 21.76x slower +213.16 μs
freq same                              4.39 K - 22.17x slower +217.36 μs
freq diff length                       4.35 K - 22.37x slower +219.44 μs
freq list_sorted                       2.92 K - 33.33x slower +332.04 μs
freq list_reversed                     2.83 K - 34.36x slower +342.54 μs

The specifics aren’t too important, but it’s pretty clear that Enum.frequencies/1 is the slowest option. This isn’t an exhaustive benchmark, and you should always profile your own case. The point is, ChatGPT does not (and I would argue cannot) know the answer, so anything it spits out needs to be assumed false until verified.

Unfortunately, a broken clock is correct twice a day. The same is, of course, true for LLMs, but very few people know if they can read a clock (to continue the analogy).

I am using Cursor and it is pretty epic. Works best if there are good examples in the codebase. 100% recommend.

3 Likes

+1 for Cursor and its autocomplete on steroids

1 Like

For the benefit of the community, I would strongly advise you all not to use ChatGPT with Elixir. Their models simply do not understand the language. I burnt through easily 10k tokens today getting nowhere on a medium-difficulty problem. Out of frustration I reached for the Claude free tier and set it to “concise.” It solved the problem in 1k tokens, no sweat! Before today I believed that the “frontier” LLMs were all roughly comparable when writing Elixir, but it’s clear to me now that there are order-of-magnitude differences in capabilities.

Use Claude!

4 Likes

I was about to write that I broke my LLM virginity lately. I needed help with an obscure-ish library that assumes 100% proficiency in its domain to be used. Claude taught me how to use it and helped me troubleshoot a problem with the input data.

I still don’t view LLMs as killing programming – they are VERY long way from that – but I can’t deny that with the main search engines being near-useless, the LLMs fill some gaps well.

7 Likes

A useful trick is to periodically ask the LLM to refactor your modules for human readability and easy extensibility by LLMs. Usually this process is divergent for Python/React spaghetti code, but for Elixir and Claude I usually find that I understand the code better and the next pass Claude takes on the module generates better results. With AI code-assistant editors it’s easy to generate new code endlessly, but it’s important when using these tools to pare back on occasion.

3 Likes

I haven’t used any LLMs for coding, but I’ve found ChatGPT to be very useful as a sounding board and an (occasionally mistaken) source of information for project planning, etc.

LLMs are brilliant at managing bureaucratic correspondence for you. All these prompts like “please rewrite ‘Are you goddamn bonkers?—I paid all my bills for electricity in the last year’ to sound polite and confident in three pages of Shakespeare-like language.”

2 Likes

I was very against using LLMs for coding a few months ago and now use Claude often.

Maybe I misplaced some of my “work identity” in being able to write “complex” stuff myself…? It’s okay; there are a lot of people quicker than me at algorithms. So if I know precisely what I need to do, and a bot has captured that kind of knowledge for me to leverage, why not?

Elixir lends itself to writing 100% pure, functionally focused modules for a lot of logic, and those are the perfect thing for both humans and bots to work on: self-contained, no side effects.
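To sketch what I mean, here is a made-up module of that shape (the name and functions are illustrative, not from any real codebase): every function depends only on its arguments and touches no outside state, so a human or an LLM can reason about it in isolation.

```elixir
defmodule Pricing do
  # A hypothetical pure module: no processes, no ETS, no I/O.
  # Each function is a plain mapping from inputs to outputs.

  @doc "Applies a percentage discount to a price in cents."
  def discount(price_cents, percent) when percent >= 0 and percent <= 100 do
    div(price_cents * (100 - percent), 100)
  end

  @doc "Totals a list of {price_cents, quantity} line items."
  def total(line_items) do
    Enum.reduce(line_items, 0, fn {price, qty}, acc -> acc + price * qty end)
  end
end
```

Because there are no side effects, verifying generated code for a module like this is just a matter of calling the functions with known inputs, which is exactly the small-steps-with-tests workflow mentioned earlier in the thread.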

Being solo I now find this tool useful for repetitive stuff, one-off things that will not make it to prod, and to lighten cognitive load when the day starts to feel long.

I also use it for discovery of other ecosystems, with two windows: one where I ask questions about how the common stuff is done, another with Google to quickly cross-check.

On the other hand, I see how it can be a source of confusion for new developers. It routinely generates errors when I’m running it on very specific business things “outside of the set”.

In my (niche) hobby that involves scripts with math and optics I never got a single good result, equations came out mostly wrong and “reasoning” was very bad too.

Just don’t get fooled about thinking/reasoning. It’s sophisticated imitation, and parts of our work that are not unique can benefit from imitation.

I also feel that it’s quite healthy and human to have strong and polarized opinions about those new things.

Edit: removed parts that went off topic and made the post more concise… manually (-; .

6 Likes

Hello, sorry if this has been asked/discussed recently; I couldn’t find it.

tl;dr: What AI setup (which IDE, which AI and how it’s integrated, what hardware) do you have supporting your Elixir development?

Recently Chris McCord teased some AI integration for development. x.com

Personally I use OpenAI (the $20 version) through the browser. Sometimes it helps, but I don’t see a huge life/game-changing benefit. I tried to link it with Zed, but Zed requires an API key, which (if I’m not mistaken) I can only get with another pricing plan from OpenAI.

Just a few days ago DeepSeek was released. If you were buying a dev machine today, would you consider a local AI model like DeepSeek a viable alternative to the cloud ones and buy a maxed-out Mac/PC?

I tried Zed for the last couple of days to test out the AI integration. I don’t think I’ll continue using it, though. I’ve tried the AI integration with DeepSeek, but it seems like the reasoning models are not properly integrated yet. It will unfortunately write the whole thinking process into your source file.

Also, I don’t think you have to buy a maxed-out machine to run DeepSeek locally and get decent results, but if you have the budget you might as well go all out :wink:

You could set up a cluster of 8 Mac Minis with 64GB each…

DeepSeek is really exciting!

I’ve been using Cursor for months with Claude Sonnet 3.6 and o1 mini, and it’s great.

Not having to write all the code is the best

2 Likes