Engineering leads, what are you doing to stop the slop?

As more and more code is written by agents, driven by humans, what strategies are you employing to keep the slop at bay?

Managing a team of 2 engineers, I’m finding it quite difficult to review and keep the slop out at the rate features are being built. I can’t imagine what it’s like with a team of 20 engineers. I know the status quo will not work, period.

I’m hiring a third engineer soon, and want to make sure I do my best to mitigate this new reality.

4 Likes

Leaning more and more on linters.
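
For an Elixir codebase (an assumption here, though credo and dialyzer do come up later in this thread), “leaning on linters” can start with strict mode and a few of Credo’s opt-in checks in `.credo.exs`. A sketch, not a prescription — the check names below exist in stock Credo, but the selection and limits are illustrative:

```elixir
# .credo.exs -- tighten the linter beyond the defaults.
%{
  configs: [
    %{
      name: "default",
      strict: true,
      checks: [
        # Opt-in checks that tend to catch generated-code noise:
        {Credo.Check.Readability.Specs, []},
        {Credo.Check.Design.DuplicatedCode, []},
        {Credo.Check.Refactor.ABCSize, [max_size: 40]}
      ]
    }
  ]
}
```

Running `mix credo --strict` in CI then makes the gate non-negotiable regardless of who (or what) wrote the code.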

In the process of shifting time spent in review to planning, with prompts specifically being folded into the official documentation/review process.

Giving up? :person_shrugging: I fear we will soon all be one with the slop, part of its warm, loving, all-encompassing embrace…

Half joking, but my own code is worse now. I don’t see how it will be possible to keep up.

2 Likes

Keep in mind that it was always a slop fest in the Fortune 500.

Now it’s just easier than ever to write it. The only way our industry will learn from this is through failure, so keep the slop up until people start demanding not to use these tools. It’s kind of funny, since a rewrite might be becoming more affordable for organizations than ever.

5 Likes

The cost of producing correct and useful software has moved from generation to verification. We’re moving forward with the understanding that if your name is on the commit, you’d better be able to explain what it’s doing during a 2 a.m. outage.

With that in mind, deep code review becomes more important, and the onus is on the author to explain the changes, preferably in a live session when possible.

A team of humans still needs to understand what’s going on. LLMs haven’t changed that.

2 Likes

Yeah, I totally agree. But how do we do this? How do we make sure that reviews are deeper? My experience tells me that reviews have always been a best-effort thing that rarely catches bugs, and reviewers usually don’t understand what problem the request solves.

1 Like

We are preferring in-person presentations (or a video call) where the author explains the changes and responds to real-time questions. It’s more of a “defense” than a review, to be honest.

I don’t think this model can apply directly to an open source community, though, due to individual schedules and time zones, etc.

1 Like

We’re still trying to figure this out. A few things we’re playing with:

  1. Plan and implement end to end in one shot in a separate repo first – basically build a full working prototype from a plan to see a possible solution, debate approach/architecture, look at performance, and confirm with stakeholders. Sometimes we do two or three separate complete implementations to compare.
  2. Refine the plan based on that feedback (this becomes the plan document you’ll use for building) – the approach is important, and the doc leading to that approach remains important.
  3. Still require work to be delivered in steps that are reviewable by humans – the ability to deliver what would have been an entire epic in the time of a single story doesn’t mean it’s verifiable, and it doesn’t reduce the general integration risk of landing many changes at once. So we need to figure out a new grain size, but it’s probably still smaller than what you can feasibly build in one go.
  4. We were always heavy on credo + custom Credo checks, dialyzer, and test coverage – we’re now using AI to enforce standards even further, especially via Credo rules. We’re also playing with additional bot reviews to catch more and more issues automatically.
  5. Reviews focus on test quality and whether the tests tell a story. That was always somewhat true, but AI makes test writing cheap, and the tests are not always good.
  6. Raise the standard on docs and guides being generated alongside the code.
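
To make point 4’s “custom Credo checks” concrete, here is a minimal custom check that mirrors the shape of Credo’s documented custom-check example (the module name and the `IO.inspect` target are illustrative, not from the post):

```elixir
defmodule MyApp.Checks.NoStrayInspect do
  @moduledoc "Flags IO.inspect calls left behind by debugging sessions."
  use Credo.Check, base_priority: :high, category: :warning

  @impl true
  def run(%SourceFile{} = source_file, params) do
    issue_meta = IssueMeta.for(source_file, params)

    # Walk the AST and collect an issue for every IO.inspect call.
    Credo.Code.prewalk(source_file, &traverse(&1, &2, issue_meta))
  end

  defp traverse(
         {{:., _, [{:__aliases__, _, [:IO]}, :inspect]}, meta, _args} = ast,
         issues,
         issue_meta
       ) do
    {ast, [issue_for(issue_meta, meta[:line]) | issues]}
  end

  defp traverse(ast, issues, _issue_meta), do: {ast, issues}

  defp issue_for(issue_meta, line_no) do
    format_issue(issue_meta,
      message: "There should be no calls to IO.inspect/2 in committed code.",
      line_no: line_no
    )
  end
end
```

Enable it in `.credo.exs` under `checks` like any stock check, e.g. `{MyApp.Checks.NoStrayInspect, []}` — each house rule the team writes down this way is one less thing a human reviewer has to police by hand.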

One thing I’m doing that we haven’t extended to the team is trying harder and harder to make my commits reviewable and tell a story as well. The amount of code per PR is going up no matter what I do, so I care far more than before (we squash-merge) that the PR is made up of commits that move the reviewer through the code narratively. I do a fair amount of back-and-forth and prep with the bots to accomplish this, driving them to build in that way.

7 Likes

Obviously we are all trying to navigate this frontier at the same time. I am skeptical we can stem the tide of LLM primary authorship, especially after experimenting with that approach in personal projects. It opens so many new doors! Yes, there are significant downsides compared to the software craft we are all accustomed to, but to me it feels like AI is the next higher level “language”:

  1. C abstracted the CPU out of Assembly (low → mid level)
  2. Java abstracted the memory management out of C (mid → high level)
  3. Claude abstracts the syntax out of high-level languages (high → meta level)

I think we need to face the fact that this is another revolution of the nature of our craft. (There are still many practical details to work out in the meantime!)

1 Like

This analogy is frequently drawn, but I more and more feel like it significantly oversimplifies the change. I think the progression is more captured along the lines of hand tools → power tools → factory automation. A woodworker who uses a table saw is still working the wood. Very few of the people working in a furniture factory know anything at all about wood, and may never even hold it in their hands.

1 Like

Full agreement with this. I no longer do some super annoying manual operations and I am very happy with that decision.

I feel video review should become the norm, especially for remote work. It doesn’t have to be an attack, but yes, the person needs to justify the code/feature; it’s just easier to do at a higher level of communication.

I agree my analogy doesn’t fully capture the nuances, and I can get behind your “tool use” alternative. Either way, I think our transition to AI is going to have to be deeper than trying to shoehorn agents into our existing workflows.

Not at all trying to downplay the stress/grief of others in this thread trying to navigate this challenging environment!

I only just recently hit the acceptance stage for agentic coding, so I don’t have a lot of experience (and I’m not a manager). One thing I have not been letting it do is commit for me. This probably ruins some workflows where one person has a bunch of agents working on several features at once, though I haven’t reached the level of acceptance where I think that is even remotely a good idea for production apps :sweat_smile: In any event, manually committing forces me to work in a more careful way, reviewing each step of the output along the way (the commit message itself can still be generated, of course).

2 Likes

I could not agree that this analogy is even somewhat accurate.

C is 14 years younger than LISP, and C has not abstracted away the CPU by any means. LISP did.

It was not Java that abstracted away memory management but the virtual machine per se. Erlang had a VM a decade before Java was ever created. LISP had a garbage collector almost 40 years before Java.

Abstracting syntax means operating at the meta-AST level, not with another ad hoc, informally-specified, bug-ridden, slow implementation of half of COBOL (which each and every natural human language is).

Allowing code assistants to abstract away the syntax is a critical mistake that leads to unsupportable, sloppy software monsters, born to be dead within a year. Code assistants can produce a valuable result if and only if the developer knows exactly, in detail, how the code is to be written: what language to use, what paradigm is most suitable for the case, etc.

3 Likes

I don’t have an answer, and I’m increasingly worried that the solution might lie in the realm of hiring. Teams will have to hire people with very similar views on agentic coding and LLM usage. Otherwise they risk playing a repeated prisoner’s dilemma in which people who put more human effort into the work consistently lose and burn out.

The good news is that we’ll never have to argue over tabs vs. spaces anymore, or stuff like that. None of that is important compared to your LLM usage pattern.

3 Likes

Huh? As a person who has cared about typography since the 20th century, the argument over em-dashes (—) vs. double dashes (--) vs. hyphens (-) in each and every documentation piece/blog post is literally my nightmare nowadays.

I humbly beg your forbearance of my poor choice of exact syntax. I nevertheless stand by my general claim and hypothesis.

I don’t know what you’re talking about. The only conversation I know of around em-dashes is not about em-dashes but about LLM-generated writing.

To keep my response tied to the OP, my prediction is that we are going to lose this battle, and quickly. I think that in the next 6–12 months, many, many dev teams will abandon their pride in the “old” craft and embrace the speed at which new features can be shipped. I’m not saying I endorse this shift.

This conversation makes me wonder if we can solve the problem with new dev tooling. What does linting look like in the agentic era?

That’s exactly what a TAB user would say :grin: (I have to explicitly note that this is a joke.)


If you read the topic name carefully, you will see that this is a discussion about the practical details we are all trying to work out. Your whole post misses the topic.