Change my mind - the optimal AI prompt is the code itself


The optimal AI prompt is the code itself.

2 Likes

Take it you’re not a fan then :lol:

3 Likes

Actually, I’m using it/them on a daily basis and they are driving me nuts more often than not. As long as they’re used as “smart” search engines (a hint book), almost everything is fine. Confront them with anything deeper than that and it becomes a waste of time.

The other day I confronted two of them because they had conflicting suggestions. I gave them both a snippet of Elixir code (about 30 LOC) to double-check whether I had done it correctly and to analyze it for possible race conditions.

At first I was convinced ChatGPT had given me a competent answer; the analysis sounded thorough, walking through different scenarios and all. Then, just in case, I gave the same snippet to the newest Grok 4.1, and bam: an equally competent-sounding analysis, but with a conflicting conclusion. So I took the suggestion of one and gave it to the other.
“The other LLM is completely wrong” was the answer, and then some, as if I had hit its ego. Then I did the opposite (gave the second one’s suggestion to the first one), with the same result, if not worse. In the end, I realized they had both made the same mistake: neither consulted the current version of the Elixir Task module source code (until I explicitly copy-pasted it for them at the end), yet in the meantime they blamed each other for not doing exactly that.

Anyway, the above is just a drop in the ocean of my dissatisfaction, and I truly cannot figure out how on Earth some people think they can use these tools to write fully operational software.

9 Likes

I also just use it as a better search, although I’d have to say a “different” search. I still haven’t quite figured out what my criteria are, but I find myself reaching for a chatbot first or a search engine first depending on what I’m looking for. I’ve experienced it both ways: sometimes I can’t for the life of me get the answer from a search engine and then a chatbot one-shots it, and sometimes the reverse, where I can’t get a chatbot to give me a coherent answer and then a search engine answers in the top result :person_shrugging:

4 Likes

I don’t really understand the appeal of AI for code gen. It can do some cool stuff, and I can see it being really useful for things like generating a bunch of hard-coded yaml or whatever, but I’m much faster at writing code than reviewing it, so it feels like a net negative in productivity to outsource the production of the code and take on a full-time reviewer role. Reviewing code is also much more mentally taxing, so it wears you down earlier in the day.

Most of the 20-30+ year veterans I talk to swear that AI is changing everything and always talk about how amazing it is, which I’m guessing is because they have 20+ years of experience in senior roles where they were mostly reviewing code and mentoring other people. I don’t have that level of experience though, and I imagine that if everyone starts with AI, they’ll never develop that experience because reviewing AI code is not the same as reviewing a human’s code. Human code review is a collaborative process that involves mentoring for more junior submitters and design discussion for peers. With AI, it’s just making sure it didn’t hard-code the API keys in the source files again. :upside_down_face:

10 Likes

Is it possible they have a stake in AI somehow? :thinking: :grin:

3 Likes

Lamport begrudgingly wrote “Paxos Made Simple” despite believing that the best way to present an algorithm like Paxos is through proofs (i.e. a TLA+ spec). As it turns out, not only is the paper widely mocked for not being simple, but the explanation in the paper contains a bug, depending on your interpretation of the algorithm as specified in “plain English”. Vindication!

Programmers do not understand AI because it’s not for them. These models are a revolutionary technology for people who do not understand how to program computers. They are useful for doing things that we cannot do with computer code.

I remember seeing someone (I think it was antirez?) make a point that all of the hype for MCP (the protocol) was a case of everyone focusing their attention on the most boring and unremarkable part of a fundamentally revolutionary technology.

There is something so comically absurd about taking a technology which is revolutionary specifically because it can do things we cannot do with computer code and then using those models to write the same computer code we were already writing, except badly. Talk about missing the forest for the trees.

But hey, at least we finally managed to put radio on the internet!

4 Likes

I’m not sure I follow your point.

“Writing code” is not something we used to do with computer code. At least not “writing ad-hoc, task-focused code”. We could generate CRUD boilerplate, generate API clients from specs, etc., but you would not have a program that could spit out an app that “tells you whether you’re in a national park when you take a picture” without that generator having most of the implementation ready somewhere, available to copy.

I think the opposite about MCP. It is great because it lets you write plain code for the things that can be done with code, so the AI does not have to do those things itself and can just ask the MCP server to provide them.

Of course the fact that it is difficult for programmers to understand my point was in fact my point :slight_smile:

The reason I love that XKCD so much is that it is a perfect, quintessential demonstration of what separates programmers from “regular people”. Programmers have trained themselves to view the world through a lens of rigidity that is not present in everyday life. We do this because that rigidity is a trademark of how computers function. If you have been writing code long enough you probably don’t even remember what it’s like to think normally. The first thing new programmers are taught is often “the computer will take you literally”; this is the beginning of a process of unlearning.

I suspect this is also why most open-source software is well-engineered but unusable for the average person. The overlap between good programmers and good interface designers is quite narrow, because the latter must retain the ability to think like a normal human.

Machine learning is another world: one where computers actually work the way non-programmers think they do. In the beginning the techniques were arcane and ineffective, but deep neural networks (nobody even says those words anymore) changed that, and pretraining on massive corpora of language changed it further.

This is exciting because it makes trivial what was so recently impossible, as that comic nicely demonstrates. But it also means that programmers are the least equipped to understand the value-add, for the same reasons that they were most equipped to understand that GPS is easy and recognizing a bird is hard, right up until that dichotomy was erased by progress.

Attempting to coerce this technology back into the rigid box from which we have just escaped is absurd.

With respect to MCP, I do not personally have any opinion except that I do not care about it at all.

5 Likes

I understood it as: why on earth would we write programs to identify birds when a chatbot is that program? Using an LLM to write code that… uses an LLM… isn’t very interesting at all. Am I off base in my understanding of what you’re saying here?

MCP is nice in theory, but considering there are still new products being made with SQL injection vulnerabilities, I feel like giving the user access to prompt a model that has write access to the server filesystem or production db is a disaster waiting to happen.

Except it already has happened. lol

2 Likes

Providing an MCP with delete/insert/update SQL capabilities, or uncontrolled select capabilities, is as dumb as providing the same capabilities from a REST API. I would not say this is a protocol problem; it is a careless-people problem :slight_smile:
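To make that concrete, here is a minimal sketch (Python, all names hypothetical) of the alternative: instead of handing the model raw SQL, the MCP-style tool exposes one fixed, parameterized, read-only query per use case, so the model can never issue a delete or an uncontrolled select.

```python
import sqlite3

# Hypothetical MCP-style tool: the model supplies only the bound parameter,
# never the SQL text itself. The query is fixed and read-only by construction.
def lookup_order_status(db_path: str, order_id: int):
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT status FROM orders WHERE id = ?",  # fixed query, bound param
            (order_id,),
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```

The careless version would interpolate model output into the query string; this version cannot, which is the whole point of scoping the capability rather than blaming the protocol.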

1 Like

I actually agree. That’s my general opinion on this subject too. But historically, “just do it right” hasn’t really worked for the industry. Hence how we still have new apps being developed with SQL injection vulnerabilities even without any vibe-coding. :slight_smile:

As for the MCP stuff, I don’t know the details since it’s outside my wheelhouse, but IIRC, the engineers at Supabase basically said it wasn’t possible to fully sandbox the AI, and there was always going to be some possibility of prompt injection. Not sure how true that is on a technical level though since I’ve never messed with MCP.

I think the issue might simply come down to the usefulness of the tool. If we were to sandbox an AI to the point where it’s actually completely secure, it wouldn’t have access to any of the things that make it useful, and thus it wouldn’t be any better than existing tools. But it would be more expensive. An MCP interface to GitHub, for example, probably wouldn’t be very useful if it doesn’t have push access to the repos I’m working on, and if it has push access, it can do a git push --force origin main at any time.

The simplest example for the thought experiment would be a todo-list AI. If you don’t give the AI write access to the todo-list, then you still have to go in and check off all the tasks manually, so it’s not that useful. Maybe it could still be beneficial for summaries or reminders, but I think most people would want an AI-powered todo-list to automatically update and check off tasks by just saying “I finished the grocery shopping for this week.” But if it can check off tasks, it can also just delete your whole to-do list.
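One common mitigation for that trade-off is to enforce permissions in the tool layer rather than trusting the model. A tiny sketch (Python, all names hypothetical): the model requests tools by name, but only an allowlist of narrowly scoped tools is ever dispatched, so checking off a task is possible while wiping the list is not.

```python
# Hypothetical tool layer for an AI todo-list. The model chooses a tool name
# and arguments, but only allowlisted, narrowly scoped tools actually run.
todos = {"groceries": False, "taxes": False}

def complete_task(name: str) -> bool:
    """Mark a single task done; the only mutation the model may perform."""
    if name in todos:
        todos[name] = True
        return True
    return False

def delete_all() -> None:
    """Exists in the codebase, but is deliberately never exposed."""
    todos.clear()

ALLOWED_TOOLS = {"complete_task": complete_task}

def dispatch(tool: str, **args):
    """Run a model-requested tool call, refusing anything off the allowlist."""
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"model may not call {tool!r}")
    return ALLOWED_TOOLS[tool](**args)
```

With this shape, `dispatch("complete_task", name="groceries")` checks the item off, while `dispatch("delete_all")` raises `PermissionError` no matter what the model was talked into. It doesn’t make the model deterministic, but it bounds the blast radius deterministically.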

My understanding is that this problem comes down to how deterministic we can make the AI. Basically, how can I make sure that an AI that has push access to my repo will never force push to main? Is that even possible with the way this technology works? I dunno cause I’m not an AI researcher, but it seems like we at least haven’t figured that out yet. :slight_smile:

P.S. Off topic, but this is the best git alias: alias git-fucked="git push --force origin main"

3 Likes

That sounds like you just want to write and not review much?

That then sounds like you code as a hobby and not [that much] commercially?

80-90% of commercial programming is reviewing code. Part of the skill-set necessary to transition to a senior is to build the brain pathways to review quicker and more efficiently.

I do understand why you would not want to do even more of that by having to review LLM code; that I sympathise with. Though in part of my commercial work I found myself without enough time budget to do proper deep research. In those cases I was happy to just dump an entire OSS library’s source code into Gemini and tell it: “I cannot quickly figure out how we are using this library wrongly. We need X and we get Y. What’s the way to do X?” – and it delivered. In those cases I reviewed as well, but it was something of a leap of faith, backed up by good tests.

Oh, how I would love to work in a place where the team is not always behind schedule (according to the obviously hugely competent executives). I have already forgotten what it is like to work in a slower-paced, more human environment.

3 Likes

The classic argument against LLMs is “if I have to check its output so much then I might as well write it myself” – which is 100% fair… but misses nuance, as most one-liners do.

A more junior dev definitely needs to build mental models and good habits by writing stuff themselves. Barring sci-fi tropes like RNA injections that plant new memories right into your brain, there are no shortcuts to building actual experience. Juniors and learners absolutely should use LLMs as just a better search engine, because the Web has been in shambles for a good 10 years and finding good material with a basic query has become a luxury. How would one learn if they cannot even find answers to their questions? LLMs fill that gap very nicely. And search engines like DuckDuckGo make this even better: you search right in your browser’s bar and you get results plus an AI answer. It has saved me good chunks of time many times this year.

For seniors / experienced devs, it’s not as simple. You might already know 99% of what you must write, and you know it’s going to take you 30 minutes. So you sigh and get on with it. Or you spend 10 minutes carefully explaining the requirements to the LLM: inputs, behaviour, edge cases, desired output, tests. Gemini has one-shot a bunch of such things for me, and yes, I verified them very thoroughly. Interesting fact: I also got a little educated here and there, so it was a nice personal journey in shedding some ego as well.

There are other cases like what I said above: sometimes you are pressed for time (e.g. a security problem that would barely even make sense to abuse) and everyone is panicking and saying “fix it within 2h!” – an LLM helped me greatly in a few such cases (not just security problems), and once again I emerged a bit more educated.

Finally, and this intersects with the previous paragraph: there are libraries that are just difficult to use unless you invest a lot of hours in them, something that many of us either cannot afford to do regularly (me) or outright refuse to do (others I’ve met, and I don’t bash them; family, hobbies, and recreation are super important!). In those cases, enlisting an LLM, assuming it actually gets the job done, can also be hugely beneficial.

You seem to have gotten a terrible first impression from a mere 30 lines of Elixir code, which is fair. I, however, transitioned from a strong advocate against LLMs to “use the right tool for the job, and sometimes that is an LLM”. You do you, obviously, but I and many others have derived true, demonstrable value from these agents.

4 Likes

I would like to know what this is like as well. :slight_smile: I’m at my first full time dev job (programmed as a hobby for about 8 years before this), and I work for a startup in the clinical research field. Due to the nature of the industry, deadlines can’t be extended (studies are approved to start and end at certain times by the government), and because it’s a regulated industry, we can’t cut any corners either. With the volume of feature work and R&D that we have to do to meet our client deliverables, our deadlines for individual tasks are often measured in hours or even minutes. And we also have to deal with being periodically audited to ensure regulatory compliance, so we can’t sweep anything under the rug and fix it later.

I also have reviewer rights on a few repos, so I do handle peer review for colleagues and seniors. But I’m also still very early in my career, so I’m still figuring out my process for how to context switch efficiently and give good useful reviews. Especially when under time pressure.

As I mentioned before, some of my lack of enthusiasm about AI is due to my level of experience. It seems like seniors can make better use of AI than I can. But my deadlines are intense enough that I can’t afford to spin my wheels prompting an AI if I as an individual can get to the finished product faster by coding it myself. And when I say finished product, I don’t mean code that does what the requirements doc says, I mean code that is secure, tested, documented, has extension points for the known future enhancements on the roadmap, and has some kind of overarching consistency in its patterns and design that can be followed by the rest of the team so that it doesn’t turn into spaghetti as soon as someone has to fix a bug. I’m not confident in my ability to deliver on those implicit requirements through an AI, but I am confident in my ability to deliver through my knowledge, skill, and speed at producing code. :slight_smile:

1 Like

Where did you find this figure? I’ve been in this for a lifetime and, apart from checking my own code time and again (and only the more complex parts, far from everything I write), I haven’t seen these stats hold anywhere. Even when I used to manage some very junior devs it wasn’t the case. Back then, if I saw a spaghetti mess I wouldn’t even bother reviewing more of it; there were other remedies in place.

…and then it fails to produce a desirable result, then you spend a “couple” more 10-minute turns without getting what you need, then you write it yourself, ending up happy that it only took three times longer than writing it completely on your own would have.

Yes, I may be stressing the instances of my frustrating experience far more than the positive ones, but again, I’m sticking to what I said previously: as long as I treat it as a hint book or an API guide, it mostly meets my expectations. Beyond that, it’s been pain and misery.

…and then there are libraries (such as interact.js, a blessing for when you really need it) where you spend hours trying to figure out how to use them properly (down to the last detail), only to realize the LLMs are patently wrong about how the API works. Then you do the good ol’ hard work of trial and error yourself anyway.

Anyway, my point is not that LLMs are not useful (far from it, especially for things one doesn’t deal with on a daily basis, where any instant help is more than welcome), but given the direction in which this is all going (and how it’s going), the chances of me unleashing “AI” agents on my code base are getting slimmer by the day.

2 Likes

I’m with you. Every time I tried to get help with a problem from an LLM, I wasted time chasing problems that didn’t actually exist and trying solutions that couldn’t have possibly worked. Eventually, I gave up and solved the problem myself.

I can get behind using LLMs as a “better” search engine. It’s really a worse search engine, since it makes up arbitrary stuff, but thanks to Google search being borderline unusable for years, I see the appeal.

2 Likes

Agreed. I completely stopped using Google search over a year ago. Even when I do type into Google search, I only look at the AI response in the top left corner. LLMs are the new search engines now, despite the “hallucinations” issue. I’m just waiting to see them spoiled too, as in: “Before I answer your Elixir-related question, click here to check out the pure brilliance of the latest C# features.”

I use Kagi, which I pay for, but I don’t want to pay for the AI stuff, so I use it alongside DeepSeek. As long as you don’t end your search with a ? in Kagi, it won’t give you an “AI” summary.

I too question this figure, but it is still pretty high most days! And again, I don’t care how much of a broken record I sound like: pairing solves this issue. When you pair, you don’t have to do code review, eliminating the worst part of the job. Agentic coding makes the entire job the worst part of the job, which is why I don’t understand why so many devs are on board with it. My only (very, and likely unfairly, pessimistic) guess is that everyone is so used to throwing up their hands and saying “Good enough… approved!” that agents haven’t really changed the process much.

3 Likes