I’ve been using Claude Code extensively for backend development and data wrangling over the last six months, and it has worked very well.
I generally feel that LLMs work much better on the backend than on the frontend, and most of the time the resulting UX is not that great unless I supervise heavily.
I’m wondering if you have had great experiences developing beautiful frontends with Elixir? What are you using, and how are you instructing your LLM agents?
Hi Onni! Between frontend and backend, where would you consider yourself more experienced, and where do you hold stronger opinions?
The topic you bring up reminds me of the Gell-Mann Amnesia Effect, which is the idea that we see the flaws in things we’re intimately familiar with, but fail to identify issues in other areas and instead tend to take the information as truth.
My experience has been getting good results intertwined with hair-pulling-throw-everything-away moments, across a broad spectrum of subjects and tasks.
The way I see it, each person’s bar for what “success” means makes all the difference when we come online to talk about our experiences.
In particular, designing user interfaces and thinking through the intended experience is something I find harder to describe and to pin down with deterministic automated tests or conditions, especially when you don’t know what you’re looking for. It’s easy to become hostage to the vibes.
What LLMs have shown they can do with greater consistency is replicate existing patterns with adaptations. So, e.g., maintaining a vetted set of core UI components helps.
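As a concrete illustration of what such a vetted component might look like in a Phoenix project (this is a hypothetical sketch — the module name, variants, and Tailwind classes are all placeholders), a core button with a fixed set of variants gives the model a pattern to replicate rather than invent:

```elixir
# Hypothetical sketch of a vetted core component, assuming Phoenix.Component.
# Variant names and Tailwind classes are illustrative, not prescriptive.
defmodule MyAppWeb.CoreComponents do
  use Phoenix.Component

  attr :variant, :string, default: "primary", values: ["primary", "ghost"]
  attr :rest, :global
  slot :inner_block, required: true

  def button(assigns) do
    ~H"""
    <button
      class={[
        "rounded-lg px-4 py-2 text-sm font-semibold",
        @variant == "primary" && "bg-zinc-900 text-white hover:bg-zinc-700",
        @variant == "ghost" && "text-zinc-900 hover:bg-zinc-100"
      ]}
      {@rest}
    >
      <%= render_slot(@inner_block) %>
    </button>
    """
  end
end
```

The specific classes matter less than the fact that you can then instruct the agent to compose pages only from these components instead of hand-rolling markup each time.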
I have just tried it with vanilla LiveView by asking for “idiomatic Elixir/Phoenix code”. I feel that spacing, interactions, navigation, and buttons come out mediocre compared to when I ask Claude Code to write Svelte using https://www.shadcn-svelte.com/ and the svelte-autofixer MCP server. I have given Claude Code the Playwright MCP to check the end results.
My gut feeling is that the component library for shadcn-svelte is so great that it makes it easier to produce better results.
I will try Tidewave soon.
Anyway, I would really appreciate seeing examples from others and hearing what works for you on the frontend side.
For example, how are you using an LLM-assisted workflow? Are you satisfied with the results you get on the frontend?
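For anyone wanting to try a similar setup, MCP servers like Playwright can be registered at the project level. A minimal `.mcp.json` sketch, assuming Claude Code’s standard MCP config format (double-check the current docs for exact fields):

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

With that in place, the agent can drive a real browser to inspect the pages it just generated, which is how the visual checking described above works.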
My experience has been the opposite: very bad results on the BE, but fine for whipping up a basic UI prototype (we have not yet tried to let Claude loose on our very messy React FE).
We have just started experimenting with “AI-driven development” and so far the results have been abysmal. I am much more BE-focused, and the code we get from Claude there has been straight-up stupid out of the box at almost every level. Far, far below what I’ve seen from juniors in quality, if not quantity. Lots of boneheaded mistakes, like not being aware of default values, or failing to figure out on its own that there is a structure file that should tell it certain fields can’t be null. It will also construct some elaborate abstraction to add a missing piece of logic when a single new pattern-match clause would do the same thing (and various similar existing pieces are implemented that way). But by far the scariest is the high-level stuff, e.g. proposing to fix DB pool timeouts by arbitrarily imposing timeouts on random queries. That’s just what I’ve seen this week.
That said, I’m certain we are just not doing it right yet. What I seem to be hearing from everyone making a serious attempt at this is that a massive amount of configuration, context massaging, “prompt engineering”, etc. is required to get good results from Claude. We are iterating on all of that, but I am still very skeptical about the possibility of getting good code from these things yet. I guess I will know for sure when we inevitably bring in one of the coming wave of AI consultants and see whether they can actually deliver acceptable results. My joke is that having “robot skills” is going to be the new “people skills” resume buff.
On a positive note, we integrated Claude into our PR review flow and it has been absolutely invaluable, catching a ton of stuff that regularly gets missed because people do not review things nearly as closely as a robot is willing to. And the occasional whiff is easy to ignore, so there’s very little downside compared to the actual development process.
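For reference, this kind of PR review integration can be wired up with Anthropic’s official GitHub action (`anthropics/claude-code-action`). A rough sketch — treat the input names here as assumptions and verify against the action’s README:

```yaml
# Sketch of a PR-review workflow using anthropics/claude-code-action.
# Input names (anthropic_api_key, prompt) are assumptions; check the action's docs.
name: Claude PR Review
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "Review this PR for bugs, missed null handling, and needless abstraction."
```

The review prompt is where you encode the house rules, so the robot nags about the same things your human reviewers would.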
I’ve had very good results with Claude using the LVGL library to build frontends for small local displays on microcontrollers, and then going off and recreating the look pretty much perfectly in SVG for use on websites for reporting and remote control.
For controls that didn’t exist in the LVGL library, I described what we needed and which variables do what visually, then asked for a control in the same style to keep things consistent. It took a bit of visual fine-tuning, but it would be hard to tell they weren’t part of the original library.
I have used Claude with some dead views and LiveView pages, and functionally that has worked pretty well. From a design perspective, Claude doesn’t seem to have much design sense out of the box: things are not lined up, color choices are dubious, and UI functionality leaves a fair bit to be desired. The latter can be fixed by spelling out how things are expected to work with tooltips, mouseovers, focus, accessibility, and so on. Making or finding a skill for that would likely help a lot, as the issue seems to be missing awareness of what is expected.
For the design issue itself, I ran a test by asking Claude to make a skill with three distinct design styles; I believe they were consumer-facing homepage, business/industrial, and information-heavy content. For each style I referenced 8-12 websites chosen from award-winning web designs. Then I supplied some test content and asked Claude to make a web page in each style. The overall designs were much better, and clearly distinct both from each other and from the sites referenced earlier. It might not get you all the way there, but it sure helped.
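For those curious what such a skill looks like on disk: Claude Code skills are folders containing a SKILL.md whose frontmatter tells the model when to load them. A hypothetical sketch — the style names and bullet points below are placeholders, not the ones actually used in the experiment above:

```markdown
---
name: design-styles
description: Apply one of three vetted visual design styles (consumer homepage, business/industrial, information-heavy) when building web pages.
---

# Design styles

## Consumer-facing homepage
- Large hero, generous whitespace, one accent color.
- Reference sites: <curated URLs go here>

## Business / industrial
- Dense but aligned grids, muted palette, data-first.

## Information-heavy content
- Strong typographic hierarchy, narrow text measure, minimal chrome.
```

Keeping the reference-site list inside the skill means the style guidance travels with every session instead of having to be re-prompted.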
As for actual code (Elixir and other languages), I make skills for each language, and I prefer coverage over saving context. The C skill, for instance, is almost 32,000 lines, with a central hub skill and some 60 subskills (granted, a bit on the extreme side). I’ve found coding challenges like Exercism and Advent of Code very useful for finding issues and calibrating code quality: I basically feed in the challenge as given as the prompt and let it rip.
Where I’ve found Claude usually struggles is architecture decisions. It is also lazy, lying, lying about lying, and overly confident most of the time. It tends to get off track if left alone for too long, and has the memory of a goldfish, so some ongoing notes help. And every time there has been a compaction I wonder whether I’ll get the village-idiot Claude, the run-of-the-mill one, or a real clever one. In the first case I just compact it away before damage is done.
Just plain HTML, CSS & Tailwind, and JS. I’m able to iterate very quickly, and it looks better than anything I could make on my own. I wrap things in LiveView components where it makes sense.
I’m also using the frontend-design skill from Anthropic. It comes up with some very cool designs that aren’t the classic LLM vibe.