Dmk

Autonomous AI Dev Workflows

Is anyone here using any AI agent frameworks/loop programs/scripts?

I’ve been using Claude Code/Gemini/Codex pretty heavily, all manually, i.e. I create a new Git worktree, open a new terminal, start claude, and go into plan mode to explain what I want. This works well, but of course it is not automated in any way.

What I want to do is have a CLI program/script that continuously pulls from a backlog of issues and spins up agents for different stages - ultimately creating a PR for review.
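To make the shape of that loop concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Issue` type, the `plan`/`implement`/`review` stage functions, and `open_pr` are stand-ins invented for illustration, not the API of any real tool. A real version would poll an issue tracker rather than drain an in-memory queue, and each stage would launch an actual agent process instead of returning a string.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List


@dataclass
class Issue:
    """A backlog item (hypothetical shape, for illustration only)."""
    number: int
    title: str
    notes: List[str] = field(default_factory=list)  # one entry per completed stage


# A stage is any callable that takes an issue and returns a short note.
Stage = Callable[[Issue], str]


def plan(issue: Issue) -> str:
    # Stand-in for an agent run in a planning mode.
    return f"plan for #{issue.number}"


def implement(issue: Issue) -> str:
    # Stand-in for an agent working in its own worktree/branch.
    return f"patch on branch issue-{issue.number}"


def review(issue: Issue) -> str:
    # Stand-in for a second agent reviewing the patch.
    return f"self-review of #{issue.number}"


STAGES: List[Stage] = [plan, implement, review]


def open_pr(issue: Issue) -> Dict[str, object]:
    # Stand-in for creating a pull request; here it only builds a record.
    return {
        "issue": issue.number,
        "title": f"Fix #{issue.number}: {issue.title}",
        "body": "\n".join(issue.notes),
    }


def run_backlog(backlog: Deque[Issue], stages: List[Stage] = STAGES) -> List[Dict[str, object]]:
    """Take issues off the backlog in order and run every stage on each one."""
    prs = []
    while backlog:
        issue = backlog.popleft()
        for stage in stages:
            issue.notes.append(stage(issue))
        prs.append(open_pr(issue))  # the PR is where a human reviewer steps in
    return prs
```

Calling `run_backlog(deque([Issue(1, "Fix login")]))` returns one PR record whose body lists the three stage notes in order. Keeping the stages as a plain list of callables is only one possible design; it makes it easy to add, drop, or reorder stages without touching the loop.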

There are multiple tools that do this. Some seem very complex, and it is hard to understand what they are doing or how to use them; others seem pretty close to what I want:

Has anyone tried any of these or others? What has your experience been?

30 comments

#coding-agents

2026-02-17 07:17:06 UTC

Most Liked

dimitarvp

That’s a strange form of FOMO. Who is putting pressure on you to deliver 24/7, even while you sleep?

Post #5

dimitarvp

Of course I would, what crazy lunatic would not?

It’s the “do it well” thing that’s still under active debate. It’s very far from a given. It’s far from being proven as well. It’s very, very much in the air still.

I disagree; the worst case is that it balloons your line count by 15x and you are left holding the bag and burning through even more tokens (and through your wallet) trying to fix the mess.

In my experience during 2025, that risk is under-represented and downplayed by a lot of folk. I will not go as far as to claim it’s a conspiracy by the LLM vendors. Maybe people caught in the sunk-cost fallacy did not want to look bad, so they either avoided the topic or downplayed the negative outcomes.

In any case, fully autonomous agentic coding is still high-risk. That’s how it looks from where I am standing. It’s not a win-win at all; careful curation is still very much needed. And that part is still extremely difficult to automate or outsource to other agents (though that latter part might already be changing).

Post #7

Cheezy

My answer to all three of your questions is yes, absolutely and without question. For me, Claude builds very high-quality code (verified by static code analysis) that is well tested. Many others have had similar success.

For the prd - are you trying to create a document that is for humans to read or are you trying to create a good document for Claude? They are not necessarily the same thing. Human targeted documents have the potential to leave out a lot of implementation details and therefore leave the agent to make a lot of decisions. I wrote about this here → What is a task? | Cheezy's Blog

Another challenge you might have is speed. Claude will implement a change much faster than a human can. Therefore, to keep Claude busy, you will need to create a lot more requirements, and fast.

Finally, the prd.md file is simply a document that is added to context (if I understand how you are using it). Think of it like a suggestion. There is no enforcement from that perspective. You will be prompting it repeatedly to have it follow the rules you have created precisely. Making this a Skill will remove it from context and will cause Claude to follow it more closely. That might be a good step for you.

My process is to use the superpowers brainstorming skill (superpowers/skills/brainstorming/SKILL.md at main · obra/superpowers · GitHub) to create a document that describes the next feature the team plans to implement. I then ask Claude to break this document down into goals and tasks using Skills that are part of Stride. When this is finished, I end up with a list of very detailed tasks that the agents can consume.

Hope this helps.

Post #21

Last Post!

Lucassifoni

I did not fully understand from this link whether this is a joke or not, but in that kind of prospective discourse I feel like the frontier between a joke and a real product is very blurry:

Edit: joined their Discord to have a look around; definitely not a joke. There are conversations like: “have you considered making buildings a metaphor for subfolders?”. I love it as an experiment in UX/DX.

The delay between my “predictions” and things happening in that space proves to be very compressible.

Post #31