I let LLMs write an Elixir NIF in C; it mostly worked

This post documents how I built a cross-platform Elixir NIF in C to get on-demand, up-to-date disk-usage stats without relying on os_mon and its disksup service. I had Grok 3 generate the initial C code and Makefile, then iterated through multiple code reviews by Gemini 2.5 Flash and GPT-5 to make it work on Linux, macOS, Windows, and the BSDs (except DragonFlyBSD).

Along the way, I ran into typical LLM hiccups that speak volumes about the breathless hyperbole often peddled by LLM vendors, compute providers, and over-enthusiastic consultants, middle managers and executives on LinkedIn. Nevertheless, the result is a working, cross-platform Elixir package on Hex.pm, plus a real-world case study in where LLMs shine, where they fail, and what “human-in-the-loop” can mean in practice.

Spoiler alert: the hype is exactly that; even so, we ended up with working code that is, at the very least, a solid starting point for further improvements by actual general intelligence.


“It mostly worked” means it didn’t work. Good job getting it there though.


It was an experiment that I could have avoided, had I looked more closely at :disksup.get_disk_info/1, which immediately returns the disk-usage info (regardless of the update interval).

Some comments on HackerNews were adamantly against letting LLMs write C, and after running splint on the C file, I don’t blame them… An excerpt:

Splint 3.1.2 --- 21 Feb 2021
...
disk_space.c: (in function make_errno_error_tuple)
disk_space.c:74:6: Unrecognized identifier: strerror_r
  Identifier used in code has not been declared. (Use -unrecog to inhibit
  warning)
...
disk_space.c: (in function is_valid_utf8)
disk_space.c:108:7: Operands of < have incompatible types (unsigned char, int):
                       data[i] < 0x80
  To make char and int types equivalent, use +charint.
disk_space.c:111:14: Operands of == have incompatible types (unsigned char,
                        int): (data[i] & 0xE0) == 0xC0
disk_space.c:113:25: Operands of != have incompatible types (unsigned char,
                        int): (data[i + 1] & 0xC0) != 0x80
...
disk_space.c: (in function get_path_from_term)
disk_space.c:142:37: Passed storage bin contains 4 undefined fields:
                        size, data, ref_bin, __spare__
...
disk_space.c:144:11: Null storage returned as non-null: NULL
  Function returns a possibly null pointer, but is not declared using
  /*@null@*/ annotation of result.  If function may return NULL, add /*@null@*/
  annotation to the return value declaration. (Use -nullret to inhibit warning)
disk_space.c:144:16: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
  A storage leak due to incomplete deallocation of a structure or deep pointer
  is suspected. Unshared storage that is reachable from a reference that is
  being deallocated has not yet been deallocated. Splint assumes when an object
  is passed as an out only void pointer that the outer object will be
  deallocated, but the inner objects will not. (Use -compdestroy to inhibit
  warning)
disk_space.c:144:16: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:147:8: Operand of ! is non-boolean (int):
                       !is_valid_utf8(bin.data, bin.size)
disk_space.c:148:11: Null storage returned as non-null: NULL
disk_space.c:148:16: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:148:16: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:152:11: Null storage returned as non-null: NULL
disk_space.c:152:16: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:152:16: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:156:15: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:156:15: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:160:7: Operand of ! is non-boolean (int): !enif_is_list(env, term)
disk_space.c:161:10: Null storage returned as non-null: NULL
disk_space.c:161:15: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:161:15: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:166:10: Null storage returned as non-null: NULL
disk_space.c:166:15: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:166:15: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:171:10: Null storage returned as non-null: NULL
disk_space.c:171:15: Fresh storage path not released before return
  A memory leak has been detected. Storage allocated locally is not released
  before the last reference to it is lost. (Use -mustfreefresh to inhibit
  warning)
   disk_space.c:164:31: Fresh storage path created
disk_space.c:171:15: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:171:15: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:174:21: Function enif_alloc expects arg 1 to be size_t gets int:
                        len
  To allow arbitrary integral types to match any integral type, use
  +matchanyintegral.
disk_space.c:174:3: Fresh storage path (type char *) not released before
                       assignment: path = enif_alloc(len)
   disk_space.c:164:31: Fresh storage path created
disk_space.c:176:11: Null storage returned as non-null: NULL
disk_space.c:176:16: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:176:16: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:178:40: Function enif_get_string expects arg 4 to be unsigned int
                        gets int: len
  To ignore signs in type comparisons use +ignoresigns
disk_space.c:180:11: Null storage returned as non-null: NULL
disk_space.c:180:16: Fresh storage path not released before return
   disk_space.c:174:3: Fresh storage path created
disk_space.c:180:16: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:180:16: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:184:10: Null storage returned as non-null: NULL
disk_space.c:184:15: Fresh storage path not released before return
   disk_space.c:164:31: Fresh storage path created
disk_space.c:184:15: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:184:15: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:186:14: Only storage bin.data (type unsigned char *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c:186:14: Only storage bin.ref_bin (type void *) derived from
    variable declared in this scope is not released (memory leak)
disk_space.c: (in function stat_fs)
...
disk_space.c:414:51: Fresh storage path not released before return
   disk_space.c:273:48: Fresh storage path created
disk_space.c:417:46: Fresh storage path not released before return
   disk_space.c:273:48: Fresh storage path created
disk_space.c: (in function load)
disk_space.c:422:40: Parameter priv_data not used
  A function parameter is not used in the body of the function. If the argument
  is needed for type compatibility or future plans, use /*@unused@*/ in the
  argument declaration. (Use -paramuse to inhibit warning)
disk_space.c:422:64: Parameter load_info not used
disk_space.c: (in function nif_init)
disk_space.c:450:200: Initial value of entry.num_of_funcs is type arbitrary
    unsigned integral type, expects int: sizeof((nif_funcs)) /
    sizeof((*nif_funcs))
disk_space.c:450:261: Local entry.reload initialized to null value:
                         entry.reload = NULL
  A reference with no null annotation is assigned or initialized to NULL.  Use
  /*@null@*/ to declare the reference as a possibly null pointer. (Use
  -nullassign to inhibit warning)
...

Finished checking --- 93 code warnings
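
The recurring “Fresh storage path not released before return” warnings suggest early returns that skip freeing an allocated buffer. One common C remedy is a single exit point that owns the cleanup. The following is only a minimal sketch of that pattern, not the package's actual code: copy_path and copy_path_demo are hypothetical names, and malloc/free stand in for enif_alloc/enif_free so that it compiles without erl_nif.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from disk_space.c): copies `len` bytes of `src`
 * into a freshly allocated, NUL-terminated buffer. Returns NULL on failure.
 * Assumption: malloc/free stand in for enif_alloc/enif_free. */
static char *copy_path(const unsigned char *src, size_t len)
{
    char *path = NULL;    /* buffer we own until it is handed to the caller */
    char *result = NULL;  /* return value; stays NULL on every error path */

    if (src == NULL || len == 0) {
        goto done;
    }
    path = malloc(len + 1);
    if (path == NULL) {
        goto done;
    }
    if (memchr(src, '\0', len) != NULL) {
        goto done;  /* embedded NUL: reject; `done` frees the buffer */
    }
    memcpy(path, src, len);
    path[len] = '\0';

    result = path;  /* success: ownership moves to the caller */
    path = NULL;    /* so the cleanup below has nothing left to free */

done:
    free(path);     /* free(NULL) is a no-op */
    return result;
}

/* Usage example: the caller frees a successful result exactly once. */
static void copy_path_demo(void)
{
    char *ok = copy_path((const unsigned char *)"/tmp", 4);
    assert(ok != NULL && strcmp(ok, "/tmp") == 0);
    free(ok);

    /* The embedded NUL takes the error path after allocation; nothing leaks. */
    assert(copy_path((const unsigned char *)"a\0b", 3) == NULL);
}
```

Every error path funnels through the same label, so adding a new check later cannot forget the free. Whether splint would accept this without extra annotations is a separate question that I haven't verified.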

Version 1.0.0 with the NIF in Rust coming soon.

Also, I really hate the entire “vibe-coding” process; it turns me into a stupid part of a slow loop. I feel the need to understand what I’m doing, but I also know that I have no time to dig into Rust or OS internals. Nevertheless, I feel compelled to finish the “re-vibe-coding” in Rust, to at least release something that isn’t as broken as the prior versions.

Should have asked me in a DM, would have told you to use Rust from the get go. :smiley:

Now if I can only fix the strange out-of-memory error in the GitHub CI runners, I’d be able to finally release my library… (It’s not because of Rust, though; my tests so far point to somewhere in the native build stack at an earlier stage of the process, or to a compile-time switch.)

If you have questions about how best to work with Rustler, just ask.


Thanks @dimitarvp! I looked at what mix rustler.new created and saw that it was different from what many older articles online were talking about.

From then on, it was pretty smooth sailing (if you exclude how much I hate vibe-coding lol).

Here it is: disk_space | Hex v1.0.0

It now works on Linux, macOS, and Windows with Elixir 1.14 to 1.18 (probably also 1.18.4) and Erlang/OTP 25 to 27 (probably also 28), and on the 4 BSDs with the Elixir and Erlang/OTP versions from their respective package repositories.

And geez, HackerNews really dislikes C…

As they should. Are you surprised?

Are you kidding me? After seeing the output of splint, I’m in awe of Linux, if anything :slight_smile: I’m not surprised. I’m very positively surprised by Rust, though.

But I’ll stick with Elixir!


Why not actively use both, like me? :smiley:

Making stuff work with C is nearly the same as making stuff work with Python: it is working despite the language. Not because of it.


I’d have to learn Rust, and since 2023 I’ve decided to spend at least 5 years non-stop on Elixir. It brings me joy to write Elixir code. I know it sounds corny, but it’s true.

Edit: And I really enjoy writing Elixir-related books. I learn, I figure things out, I document my learning for others. Triple win. Brings me back to my development-engineer days at ABB when I was doing the same for every design/development project (develop the thing, document the results, the process and the pitfalls); that company had an incredible documentation culture at the time.

I love Elixir and so far to me it is irreplaceable (9.5 years after discovering it). It covers aspects that so many other languages / runtimes don’t even dare begin tackling (Rust and Go are exceptions but they just don’t take the problem seriously enough and as such kind of accidentally try to invent their own BEAM, badly).

I am saying that my attention can be on multiple things. Focus is hugely important and it’s been a lackluster part of my methods (one that I am actively fixing lately) but I also find it extremely important to have more tools in my bag so I can practice the “right tool for the job” confidently.


I hear you. I dabbled in Go for a while, and I generally like it, most of all its ability to produce a single binary that can bundle frontend code, too.

If I had the time, I’d learn Go and Rust next to Elixir. Zig would be the next one on my list.

Knowing many programming languages and their ecosystems also helps with cross-pollination of ideas.

For example, on a plain-PHP backend I’ve been maintaining, I got tired of the same query-params and request-body validations in every route handler, so I wrote two classes that make it possible to do this:

function schema_finance(): Schema
{
    $schema = new Schema();
    $schema->add_field('rel_slug', 'string');
    $schema->add_field('title', 'string');
    $schema->add_field('transaction_type', 'string', false, 'finance_transaction_type');
    $schema->add_field('due_date', 'string', true, 'finance_due_date');
    $schema->add_field('amount', 'float', false, 'finance_amount');
    $schema->add_field('payment_method', 'string', false, 'finance_payment_method');
    $schema->add_field('payment_reference', 'string', true, 'finance_payment_reference');
    $schema->add_field('category', 'string', true, 'finance_category');
    $schema->add_field('subcategory', 'string', true, 'finance_subcategory');

    return $schema;
}

function changeset_finance_create(array $body): Changeset
{
    $schema = schema_finance();
    $permitted = $schema->get_fields();
    $optional = ['due_date', 'payment_reference', 'category', 'subcategory'];
    $required = array_diff($permitted, $optional);

    $helper = new GlobalConstants_Helper();
    $transaction_types = array_keys($helper->getTransactionTypes());
    $payment_methods   = $helper->getPaymentMethods();

    $changeset = new Changeset($schema);
    $changeset
        ->cast($body, $permitted)
        ->validateRequired($required)
        ->validateLength('rel_slug', ['is' => 26])
        ->validateLength('title', ['max' => 50])
        ->validateNumber('amount', ['min' => 0.01, 'max' => 100000])
        ->validateInclusion('transaction_type', $transaction_types)
        ->validateInclusion('payment_method', $payment_methods)
        ->validateLength('payment_reference', ['max' => 30]);

    return $changeset;
}

That’s thanks to my continued exposure to Ecto, which makes the use of schemas and changesets seem like a very good way of doing things.

It doesn’t yet track the state of changes to avoid updating unchanged fields, but has already helped cut down massively on code repetition. Plus, I enjoyed bringing a bit of Elixir flavor into a PHP codebase.


For anyone interested, there’s a follow-up to this experiment that I posted to devtalk.to: “Vibe coding leaves me with a very sour taste”.