arcanemachine

A call to action: Use LLM agents to find vulnerabilities in your code before someone else does!

I was looking at Hackney on hex.pm, and I noticed in the Versions tab that there were a lot of vulnerabilities… and they are super recent!

Takeaways so far:

If you’re using Hackney and you haven’t updated in the last 3 days (!), you should probably update sooner rather than later.
The new hex.pm interface looks great!
hex.pm shows which versions of packages have known vulnerabilities. (Is this part of the redesign? I love it!)

OK, so that’s pretty crazy… How are there so many vulnerabilities so recently?! (If you’re like me, your first thought was probably “Mythos!!!”, which is close, but wrong.)

So I came here to search for info about Hackney, and… nothing! But there’s a post over on the Erlang Forums about it. At the very end, @benoitc notes:

Thanks to @PJUllrich, Ganbagana and tepel-chen for the reports, and to
maennchen for coordinating disclosure.

I recognized the first name immediately: Peter Ullrich. (I don’t know who the others are, but I’m thankful for their efforts as well!)

So I hopped over to his blog, and: jackpot!

Long story short, he used Opus 4.7 to find vulnerabilities in some packages, including Decimal (EDIT: I just realized I need to update that one too!). He doesn’t mention Hackney (the blog post is from May 12, 2026, but the security patch for Hackney is from May 25, for version >= 4.0.1). But I think it’s safe to assume that he applied the same process to Hackney as well. (Thank you, Peter!)

So where does that leave us? I’m glad you asked!

As I’m sure you know, we live in a world where you can basically talk to your computer, and it can write code to some degree or another!

Unfortunately, the bad guys know this too! And it’s only a matter of time before they make their way over to you. Yes, you, some guy in Nebraska!

So, to everyone out there, whether you are a library maintainer, a cog in some corporate machine, a coder with some free time on their hands, or some random idiot dumping yet more AI-related content to the Elixir Forums, I entreat you: Start using LLM agents to break the software you use before someone else beats you to the punch! Break your own stuff, break your dependencies… Break everything! (And be a good citizen and report it responsibly, please.)

We have a fairly nice, quiet ecosystem here. Let’s do our best to make sure we don’t get pantsed every other day like the poor folks in some of those other ecosystems!

It doesn’t have to break the bank… the $20 Claude plan doesn’t get you very far these days, but the $20 OpenAI/Codex plan can still get you pretty far (for now…). There are also open models such as GLM 5.1 and Kimi K2.6 which are surprisingly capable, and can be run on cheap subscriptions such as OpenCode Go if you’re cash-strapped. (It’s $5/month for the first month, then $10/month. GLM 5.1 is a little slow sometimes, but it provides unbeatable value for the money.) If you’ve never used a coding harness, OpenCode is a great open-source tool you can get started with.

If you’re even remotely curious, I encourage you to get out there and give it a shot!


#ai #vulnerability #pentesting


2026-06-07 17:05:53 UTC


maennchen

The EEF CVE Numbering Authority is currently handling a much higher volume of reports than we did only a few months ago:

CVE Activity - https://cna.erlef.org/

We also have a backlog of findings that is larger than our current capacity to triage, report, and coordinate. Because of that, we are prioritizing based on severity, exploitability, and how many users are likely affected.

I do recommend scanning your own projects. That is useful, and maintainers are in the best position to understand whether a finding is actually relevant in context. One good tool for this is alpha-omega-security/scrutineer on GitHub ("Security through scrutiny").

If you find an issue in a library you maintain or depend on, please reach out to the CNA so we can coordinate disclosure properly; the process is described on the “Maintainer Process” page of the Erlang Ecosystem Foundation CNA site.

What I would not recommend at this point is everyone scanning everything, uncoordinated, and then sending large numbers of raw findings to maintainers. There is a lot of work involved in proving that a finding is real, understanding impact, preparing a patch, coordinating with maintainers, and publishing a useful advisory. We do not want to overload maintainers with unverified reports or duplicate work.

If you want to help with broader scanning efforts, please reach out to us on the EEF Slack so we can coordinate that work and focus it where it helps most.

On the Hex side, there is also active work happening to make this easier for users:

Advisory integration is already visible on Hex.pm and will also be integrated into the CLI, including mix deps.get / hex.audit.
Dependency cooldown is being worked on to reduce the risk from freshly published compromised or vulnerable releases.
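The dependency cooldown idea can be pictured with a small sketch. This is not how Hex implements it (the feature is still being worked on, and its actual rules are not described in this thread); the function name `pick_release`, the 7-day default, and the release data are all hypothetical. The sketch simply picks the newest release that has been public for at least the cooldown period.

```python
from datetime import datetime, timedelta, timezone


def pick_release(releases, now, cooldown_days=7):
    """Return the most recently published version that is at least
    `cooldown_days` old, or None if every release is still too fresh.

    `releases` maps a version string to its publish time (hypothetical data,
    not a real Hex API response).
    """
    cutoff = now - timedelta(days=cooldown_days)
    eligible = [
        (published, version)
        for version, published in releases.items()
        if published <= cutoff
    ]
    if not eligible:
        return None
    # Tuples compare by publish time first, so max() yields the newest one.
    return max(eligible)[1]


# Made-up release history for an imaginary package.
example_releases = {
    "1.0.0": datetime(2026, 5, 1, tzinfo=timezone.utc),
    "1.0.1": datetime(2026, 5, 20, tzinfo=timezone.utc),
    "1.0.2": datetime(2026, 6, 5, tzinfo=timezone.utc),
}
example_now = datetime(2026, 6, 7, tzinfo=timezone.utc)
chosen = pick_release(example_releases, example_now)
```

The trade-off is the obvious one: a cooldown also delays legitimate security fixes, so a real implementation would presumably need some way to let advisory-driven updates through sooner.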

There’s a lot of background work that went into making the advisories actionable: starting the CNA, exporting data into OSV.dev, standardizing how we report metadata, etc.

All of this work takes a lot of time. So if your employer is not yet a sponsor of the EEF, I highly recommend that they become one to make this work sustainable: Erlang Ecosystem Foundation - Supporting the BEAM community

Post #3

benoitc

LLMs won’t always help. They may eventually find an issue, but not necessarily: it depends on the version being used, the available context, and the memory they have access to. Some of the “critical” issues that were reported were actually related to how requests were being sent, and I am still not convinced that all of them should have been classified as security issues when they were relatively small functional bugs.

What would be more helpful is for users to spend some time testing the software that is available on Git before making such reports. Functional bugs and other issues can often be identified that way before they become larger discussions.

This is one of the reasons we open source the software in the first place: so people can use it, test it, validate it, and contribute feedback. Otherwise, there is little difference from simply buying a piece of software and interacting with it as a closed product.

Post #6

dimitarvp

I personally am grateful for these initiatives. The industry has been collectively dragging its feet on security for literal decades, only reacting when there’s a fire.

The really bitter discovery here is that it had to be LLMs that gained the trust of the executives – it’s not as if savvy techies weren’t raising alarms literally tens of thousands of times in the past.

But maybe there’s something else at play here: executives are much more concerned with bad PR than technical excellence. And in the past they could silence security researchers – horror stories like these happen all the time even today. However, the big labs’ models, e.g. Anthropic’s Claude Mythos, are impossible to silence, so now it’s suddenly OK to be security-conscious.

The lesson for us programmers is: stay on top of things. I felt slightly better after upgrading decimal to 3.1 as I work in finance. You never know what somebody will send over the wire. The tighter your external-world-facing code is, the better.

That being said, I’ll not break my legs rushing to upgrade, e.g., hackney. The fixes for all the recent security findings will be brought in one by one, after a lot of testing, because sometimes a security fix subtly changes a piece of code (potentially buggy but still stable and well-working) that is critical for your app(s).

Post #7


GrammAcc

Making libraries more secure is always a good thing, but there’s also the issue of ChatGPT generating a Dockerfile that will expose your database to the WAN with postgres:postgres.

I don’t actually know if ChatGPT would do that. Probably not. But an engineer who doesn’t know that Docker binds to 0.0.0.0 by default when using port-mapping probably would. Point is that it’s very easy to create a security vulnerability in an application that has no CVEs in its supply chain. There seems to be this idea among product engineers that security is always someone else’s problem (the compiler engineers, the library maintainers, the package manager devs, the in-house security team, etc.), but this is simply false. Security is the responsibility of the engineer putting up the PR, whether they are adding a dep to the npm lockfile or implementing some new data serialization algorithm with pointer arithmetic in C.
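To make the port-mapping point concrete, here is a rough sketch of a check you could run over the port strings in a compose file or a `docker run -p` command. The helper name `binds_all_interfaces` is made up for illustration, and it only understands the common `[ip:]host_port:container_port[/protocol]` shapes; treat it as a starting point, not a complete parser.

```python
def binds_all_interfaces(port_spec: str) -> bool:
    """Return True if a Docker-style port mapping publishes the port on
    every host interface rather than on one specific address.

    Hypothetical helper: covers "5432", "5432:5432", "ip:5432:5432" and an
    optional "/tcp" or "/udp" suffix, nothing more exotic.
    """
    spec = port_spec.split("/", 1)[0]  # drop a trailing "/tcp" or "/udp"
    parts = spec.split(":")
    if len(parts) < 3:
        # No host IP was given, so Docker falls back to all interfaces.
        return True
    # Everything before the last two fields is the host IP. Re-joining with
    # ":" also restores bracketed IPv6 addresses such as "[::1]".
    host_ip = ":".join(parts[:-2])
    return host_ip in ("", "0.0.0.0", "::", "[::]")


risky = binds_all_interfaces("5432:5432")            # no host IP given
safer = binds_all_interfaces("127.0.0.1:5432:5432")  # loopback only
```

Writing the mapping as `127.0.0.1:5432:5432` keeps the database reachable from the host itself without publishing it on every interface.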

I don’t envy DevOps/Infra engineers these days. The only thing holding the internet together is their firewall/private DNS config, and even that is questionable since the hosting provider is probably also rushing out features for their product offering.

Post #8
