>Seems like everyone over there is vibing and no one is rationalizing the whole.
Claude Code creator literally brags about running 10 agents in parallel 24/7.
It doesn't just seem like it; they've confirmed it, as if it's the most positive thing ever.
It's software engineering crack. Starting a project feels amazing, features are shipping, a complex feature in an afternoon - ezpz. But AI lacks permanence: for every feature you start over from scratch, except there is more codebase now and the context window is still the same size. So there is drift, the codebase randomizes, edge cases proliferate, and implementation velocity slows down.
Full disclosure - I am a heavy codex user and I review and understand every line of code. I manually fight the spurious tests it tries to add by pointing out that a similar one already exists and that we can get the same coverage with +1 LOC instead of +50. It's exhausting, but personal productivity is still way up.
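To make the +1 LOC point concrete (the function and test names below are made up, not from any real codebase): when a behavior is already covered by a parametrized test, the agent's fifty-line duplicate can usually be replaced by adding one case to the existing table.

    import pytest

    def normalize_path(p: str) -> str:
        # Hypothetical function under test, only here to make the example runnable.
        return p.rstrip("/") or "/"

    @pytest.mark.parametrize("raw, expected", [
        ("/tmp/", "/tmp"),
        ("/", "/"),
        ("", "/"),  # the +1 LOC: a new edge case appended to the existing test
    ])
    def test_normalize_path(raw, expected):
        assert normalize_path(raw) == expected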
I think the future is bright because training / fine-tuning taste, dialing down agentic frameworks, introducing adversarial agents, and increasing model context windows all seem attainable and stackable.
I usually have multiple agents up working on a codebase. But it's typically 1 agent building out features and 1 or 2 agents code reviewing, finding code smells, bad architecture, duplicated code, stale/dead code, etc.
I'm definitely faster, but there's a lot of LLM overhead to get things done right. I think if you're just using a single agent/session you're missing out on some of the speed gains.
I think a lot of the gain I get from using an LLM comes from being able to have multiple agent sessions working on different projects at the same time.
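As a sketch of one way to keep those parallel sessions from stepping on each other: one git worktree and branch per agent role. The "codex" command and the prompts below are placeholders for whatever agent CLI you use, not any tool's documented interface.

    # One worktree + branch per agent role so parallel edits never collide.
    import subprocess

    ROLES = {
        "feature": "Implement the next item from TODO.md",
        "review": "Review recent commits for code smells, duplication, and dead code",
    }

    for role, prompt in ROLES.items():
        worktree = f"../wt-{role}"
        subprocess.run(
            ["git", "worktree", "add", worktree, "-b", f"agent/{role}"],
            check=True,
        )
        # Launch one agent session per worktree; "codex exec" is an assumption,
        # swap in whatever non-interactive invocation your agent supports.
        subprocess.Popen(["codex", "exec", prompt], cwd=worktree)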
I think that the current test suite is far too small. For the Claude Code codebase, a sensible next step would be to generate thousands of tests. Without that kind of coverage, regressions are likely, and the existing checks and review process do not appear sufficient to reliably prevent them.
My request is that an entirely LLM-written feature should only be eligible for merge once all of those generated tests pass, so we have objective evidence that the change preserves existing behavior.
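As a rough sketch of what that gate could look like mechanically (the tests/generated path and the wiring are assumptions, not anything the Claude Code team has described): run the generated suite as its own step and refuse to report success unless every test passes.

    # Hypothetical merge gate: run the generated regression suite and fail the
    # build unless every test passes. The suite path is a placeholder.
    import subprocess
    import sys

    result = subprocess.run(["pytest", "tests/generated", "-q"])
    if result.returncode != 0:
        print("Generated regression tests failed; blocking merge.", file=sys.stderr)
        sys.exit(1)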
I know at least one of the companies behind a coding agent we've all heard of has called in human experts to clean up the vibe-coded IaC mess it created over the last year.