I just don’t trust their code or logic. I make them TDD everything and read through the tests with a fine-toothed comb (or write them myself). Elixir 1.19 has mix help MyMod, and I just call the model out for not reading the docs.
Ultimately, if it’s giving bad results, tweak the prompt and try again. It’s not about it getting it right the first time, it’s about it getting it right faster than you could have written it.
Most of the time I only use them on things that are tedious and I don’t want to do. If it’s mission critical, I don’t let it do it, and may only use it as a rubber duck / pair reviewer when I don’t have another human available.
Also, f*** Grok. It tries to groom kids. That company has no safety policies.
I’ve been avoiding bringing up testing because somehow testing is controversial, but it makes absolutely no sense to me to have AI “do TDD for you.” The entire point of TDD is that you feel the pain of your API and of what it takes to set up a scenario to use it. That helps you understand how to simplify your interfaces. Unfortunately, TDD has always gotten a bad rap in general, and it has recently been made worse by social-media influencers like ThePrimeagen (who rags on it with no actual understanding of what it really is).
As a counterpoint, having “AI” auto-complete a single test for me has been one of the biggest benefits (and maybe this is what you are talking about). I’ve seen something as outdated as Copilot write almost exactly the test I was going to write. But I still do it one test at a time and adjust.
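To make the “one test at a time” point concrete, here is a minimal sketch (in Python only for illustration; the function name and behavior are invented, not from the thread) of the TDD loop being described: the single test exists first and dictates the API’s shape before any implementation does.

```python
# Hypothetical example: in TDD the test is written FIRST, which forces you
# to feel the API's shape -- one string in, one string out, no setup
# scenario, no config object -- before any implementation exists.

def test_slugify_lowercases_and_hyphenates():
    # This assertion existed before slugify did; making it pass drove the design.
    assert slugify("Hello Elixir World") == "hello-elixir-world"

# Minimal implementation, written only after the test above was red.
def slugify(title):
    return "-".join(title.lower().split())

test_slugify_lowercases_and_hyphenates()
```

The painful-setup signal works the same way in reverse: if the test had needed three mocks and a fixture just to call slugify, that pain would be telling you the interface is too entangled.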
To be perfectly honest, this is really getting to the crux of it for me: while they may not be backed by Elon Musk, I don’t feel any better about Anthropic, ClosedAI, or Google, and I have no desire to send them money or have my employer send them money on my behalf (or to use anything that will result in my seeing their ads). I’m aware there is a whole hypocritical rabbit hole here of where to draw the line (I still watch YouTube), but that is perhaps another topic?
Nice troll. Blatantly incorrect. But I’ll play along in hopes of providing some input for the audience.
AI cannot yet read your mind to guess accurately about anything you are thinking without some kind of input in addition to the code. What about a greenfield project? The code does not yet exist. Upon what, then, would the AI operate without any kind of added prompt instruction?
Of course, since you did not state any qualifiers, you could be thinking of a script that is sufficiently simple to be pasted as the prompt itself. In that case, the AI may just perform an analysis of it automatically: perhaps a linting pass, maybe a refactor, or a fix for an obvious bug. How have you experimented with this idea? Did you even try? What were the results? You could test it easily enough. Do a fresh git clone of a decent-sized code base (to avoid any prior AI artifacts interfering with the test). Then start Claude, Cursor, or Windsurf/Cascade on that fresh code base, tell the prompt to “proceed” or just “go”, and see what happens. It has all of the code. The key detail it does not have is why any of it matters to you. It exists. Yay! All done. lol. Please report back after you’ve proven your original conjecture. It may be fascinating to hear what new ideas the AI has come up with on your behalf.
You seem to be confusing data with an expression of intent. The code base is merely data upon which an AI can operate, inferring some context for the intent expressed by the prompt instructions. Unless there is a glaringly obvious flaw the AI may just happen to find, it cannot guess what needs to be changed. You have to provide the AI the intent in addition to the code. You may want behavior X and Y instead of Z. The AI cannot infer that with no other input. Similarly, perhaps you want a better layout of whitespace on the home page. The AI cannot infer that either with no other input. It is very easy for an AI to guess (hallucinate?) incorrectly given poorly expressed or incomplete instructions. Giving even fewer instructions (none) is not likely to improve the result.
Recently, I used AI (Windsurf, Claude Sonnet 4.5, GPT-5.1) to add an entirely new feature to a product. Prior to this, the feature only existed in my head and in a few conversations I have had with other humans. There is no way an AI could intuit this idea to create it, especially not in the way I envisioned it, at least not without input. Instead, it took several hours of conversations with the AI before it could even document exactly what I wanted created in the way in which I wanted it. I had it create detailed plans for 11 phases of implementation that were suitable for a junior dev. On Friday, I had it implement each phase in turn, so that phases 1 through 9 were completed and working by the end of the day. Phase 10 will be the documentation. Phase 11 will be some polish and final testing. The idea the AI could have come up with and implemented all of this entirely on its own using only the original code as the input is, quite frankly, laughable.
Oh, boy… It’s hard for me to believe it, but you’ve misinterpreted my message entirely. I understand why you missed it, though. You’re so invested in the “AI” tools that you can’t afford to perceive the message in a way that’s even less suitable to your belief system.
Let me break it down for you: the meaning of the message is that the code itself (the desired end product), written by a human, is the optimal alternative to the LLM-generated result of any prompt you can possibly come up with.
Do not overestimate your trolling skills. The subject was obvious. But the post is specifically about the “AI prompt” and not the output, so that makes at least one post in the thread that tries to address that; I liked it.
The subject was obviously not obvious enough if it’s not obvious to you two. Had it not been for the word itself in the code itself, you’d be on point, but you’re not.
And my post was not intended as trolling but as a challenge to prove me wrong. So far no one has.
Or let me put it for you this way: “The optimal AI prompt is no prompt at all”. Clear now?
Using LLMs is non-deterministic; they will sometimes give you drastically different recommendations if you just open two separate browser tabs and copy-paste, byte for byte, the same prompt.
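The two-tabs effect is not mysterious; it falls out of the sampling step. A minimal sketch (toy logits invented for illustration; real logits come from the model, but the draw works the same way) showing that greedy decoding is deterministic while temperature sampling varies with the RNG state:

```python
import math
import random

# Toy next-token distribution; the numbers are made up for illustration.
logits = {"refactor": 2.0, "rewrite": 1.6, "keep": 0.5}

def sample_token(logits, temperature, rng):
    if temperature == 0:                # greedy: always the argmax
        return max(logits, key=logits.get)
    # Softmax with temperature, then one random draw -- the source of
    # run-to-run variation even for a byte-identical prompt.
    scaled = {t: math.exp(v / temperature) for t, v in logits.items()}
    total = sum(scaled.values())
    r = rng.random() * total
    for token, weight in scaled.items():
        r -= weight
        if r <= 0:
            return token
    return token  # guard against float rounding

# Two "browser tabs": same prompt (same logits), different RNG state.
tab_a  = [sample_token(logits, 1.0, random.Random(1)) for _ in range(5)]
tab_b  = [sample_token(logits, 1.0, random.Random(2)) for _ in range(5)]
greedy = [sample_token(logits, 0,   random.Random(3)) for _ in range(5)]
```

With temperature 0 every run picks the same token; at temperature 1 the two “tabs” can diverge on the very first token, and each divergence compounds over a long completion.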
I see zero sense in claiming “so far no one did” under these conditions. It would not be a scientific experiment anyway.
As you were told by multiple people – if you don’t find value in LLMs then don’t use them. Others have derived direct value, some of which was also financial, not just intellectual curiosity and productivity.
Not sure what else you are looking for. Maybe a big, detailed session with many dozens of prompts where people demonstrate their value to you? If that’s the case, I’d go to Reddit and ask – people are happy to share their sessions.
It’s not black and white and I’ve said it myself from the beginning. I find it very valuable (for hints and code snippets), but am of a firm belief it’s never gonna be capable of writing software for us (it being an LLM), regardless of how much time and effort one invests into writing those prompts.
If time is money, then yes, absolutely, using it has contributed to my financial well-being as well, for I can do more in less time – but it’s still me who has to do it at the end of the day. But also: when you’re deep into a problem, you know how LLMs “get” things, and then you see how some people entrust them with agent-driven iterative corrections until the runtime error is no more (so everything’s suddenly all fine, right?) and believe that will somehow help them construct software… Sorry, that’s simply delusional.
Well it seems that you came here to argue with extremists and they don’t exist in this forum. Sorry to disappoint!
People quickly learned not to do that and started doing multi-agent workflows where you or an LLM act as project manager / coordinator: make a plan for a feature / bugfix, break it apart into smaller bits and pieces, write those out well, and leave nearly zero leeway for the LLM agents [that do each of those smaller tasks] to misinterpret the requirements.
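The coordinator pattern described above is, at its core, just structured data. A hedged sketch (the feature, task names, and fields are all hypothetical, invented for illustration) of a plan broken into small tasks, each with explicit acceptance criteria so the agent doing it has near-zero room to misread:

```python
# Hypothetical sketch of the "project manager" workflow: the plan is plain
# data, and each sub-task carries written-out instructions plus checkable
# acceptance criteria -- the "nearly zero leeway" part.
from dataclasses import dataclass, field

@dataclass
class SubTask:
    title: str
    instructions: str                                # written out well, no ambiguity
    acceptance: list = field(default_factory=list)   # checkable criteria
    done: bool = False

@dataclass
class FeaturePlan:
    goal: str
    tasks: list = field(default_factory=list)

    def next_task(self):
        # The coordinator hands out one small task at a time, in order.
        return next((t for t in self.tasks if not t.done), None)

plan = FeaturePlan(
    goal="Add CSV export to the reports page",
    tasks=[
        SubTask("Add export endpoint",
                "GET /reports/:id/export returns text/csv",
                ["returns 200", "Content-Type is text/csv"]),
        SubTask("Add download button",
                "Button on the report page links to the export endpoint",
                ["button visible", "link hits the export endpoint"]),
    ],
)
```

Whether the tasks then go to an LLM agent or a junior dev is almost beside the point; the value is in having forced yourself to write the breakdown this precisely.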
Personally I don’t do that. That is my red line, because my brain barely participates in any coding that way – though I’ll absolutely join @TimButterfield here, because just writing down good project requirements and sub-tasks is by itself very valuable, regardless of whether you are using LLMs to implement them.
And after the AI craze wears off – those literal hundreds of billions have to be repaid at least partially and most AI companies operate at a loss – I still want to be able to do problem solving by myself.
But I am not certain whether tearing down strawmen that do not exist on this forum is interesting as a topic of discussion.
I wouldn’t bet on it. The two guys up above have both accused me of trolling. Apart from it being a disgusting term IMO, in psychological terms it means they most likely got “triggered” (another disgusting word given the context, also IMO), which on the other hand implies they may have made an emotional investment into the hype, hence their accusation of me “trolling”.
You would do well to take a few deep breaths, because your tone was not quite productive in places either. Let’s not have this thread locked. There are interesting sub-plots here, and IMO discussing the pros and cons of the current breed of AI gives people perspective – that alone is extremely valuable.
I was not expressing bias or vested interest. I was responding to what you actually wrote, which may not be what was intended.
In your original message, you used the term ‘prompt’ while just now, you used the term ‘result’. To me, prompt is input while result is output. Those are opposite meanings. At this point, it is not clear to me which you actually meant.
It’s a figure of speech called adynaton (a hyperbolic figure of impossibility), meaning what I’ve already told you. I could’ve written it in a longer and more explicit way, like “No AI prompt that generates code can produce a result that is as optimal as the code written by humans”, but then it would no longer sound dramatic as a headline – just dull.
And yes, it did cross my mind that there might be some ambiguity to it, especially if taken at face value, but I opted for it anyway. Plus, I myself was in a dramatic mood at the moment, still fresh from yet another frustration with LLMs (hence the post).
Btw, another episode with LLMs happened just around the time of your reply. It seems they are particularly dumb when it comes to Elixir macros, but I just solved it myself (again), so I won’t complain (for now).
This might not be a useful observation, but programming languages are not great tools for expressing ideas; they are implementation tools for controlling a machine. They all have constraints and flaws, and bypassing those constraints requires either some esoteric knowledge (battle scars) or time. Hence why we use various PLs for various tasks.
I think, on the other hand, that prompting is better for expressing ideas; the key is to constrain what the LLM can do with said prompts. Last week I was firmly in the “AI & LLMs are garbage for generating software” camp, based on my Claude use from earlier this year, but since the release of Gemini 3 I have heavily revised my opinion, and I strongly believe that LLMs will eventually be great at “one-shotting” software generation in the coming months.
This just reminds me of a programmer I respect on Twitter saying that he starts all new projects with AI. That was like 3 months ago, and I thought such a proclamation was cringe/corny, but now I can see where this is headed.
This seems to be a pretty common approach, and it makes sense. AIs are good at producing something, so they are a great tool for getting over the initial inertia of trying to start a project. If LLMs were around back when I was in school I would have used them to write the first draft of every paper. I was always pretty good at revising my essays, but getting that first draft down was the hard part.
When it comes to software though, I personally find that revising is actually harder depending on what kind of foundation is in place. I don’t think there’s anything wrong with using an LLM to generate the initial project boilerplate as long as the programmer driving it is still putting thought into the structure of the code and the modules and all that.
All the little decisions early on add up to unmaintainable software if we’re not careful.
It kind of seems like the industry is hoping that LLMs will get good enough that they can just rewrite it once it gets bad, so this would become a non-issue. I think that’s probably not feasible without another major breakthrough in AI research though. The current approach to training LLMs just doesn’t seem like it would be sustainable with the amount of improvement they’d need to get to that point.