Exactly this. Just this week an engineer who seems to purely vibe everything submitted a +700ish LoC fix for what seemed like a pretty simple issue. Moreover it was a perf issue, which in my experience is not usually best fixed by adding more stuff.
Today, I merged my fix, net -381 LoC.
I'm using them too of course, they read and type and hunt for bugs and test faster than I can. But I'm using them as my tool, not being a tool using them.
I've noticed and felt this trend, but I haven't ever seen it put so well or really connected the dots with Figma & pretty rectangles.
I remember discussions over the relative ease of use of gray box wireframes... and that led to better products.
Now I've got designers vibing monstrosities that would have fit right in in the Flash era, I guess in order to draw even nicer rectangles now that execs can wave at AI and get a design.
Not OP but I've been thinking about this a lot (like everyone ha) and I think my answer is, yes?
I hope there's a "good enough" point but I don't think we're there yet. Like for me hardware got good enough several years ago. But while opus 4.7 is really good compared to everything else, it's not so good that I would use it at a discount over whatever is available in a few months. The improvement in quality, speed, and daily frustration is worth it to me... Spoken as someone whose employer is footing the bill, so take that with a grain of salt.
I want to run my own local models, but I don't think that's feasible without lots of frustration until a few generations of frontier models are so good that they're almost indistinguishable for common tasks. Kind of like how MacBook pros have been for a while.
While I can imagine that I'd want to use Opus 4.8 over 4.6 for a fair number of things (at least if they can avoid further speed regressions), I also have noticed that certain types of failures seem to be systemic. Bigger context has been helpful for bootstrapping, but still doesn't fix problems of getting stuck on the wrong things - you can toss more things in the blender, but you don't necessarily know which way it'll slice them up in advance, or which things from them it'll latch onto. And output still seems to get into "blindered" states where important details get dropped - even though it'll agree very quickly when you point that out. As long as we're in that sort of "spit something out in local targeted manner, and then do a revision loop until tests are green" style of execution, bigger models haven't shown me the ability to really avoid finding non-optimal / subtly-broken outputs for complex problems.
Using Cursor to hop between models, I've found Opus to be generally better at really tricky debugging than GPT 5.5 or earlier models, but not reliably better at execution because of these things. I'm not sure Composer 2.5 is quite there yet for the execution side, but it's getting pretty close to those other ones, such that I'm definitely still in a "debug and plan with slow, execute with faster ones" operating model for working on hard shit.
Why should I need to talk to Opus 4.7 when my day-to-day task is programming in Java and Python? I don't need my model to know about biology or chemistry. If I did need those capabilities (say, as a software engineer in the chemical industry), I would talk to Opus 4.7 for planning and then fan out the work to cheaper coding models. I think we will soon start to see specialized, highly effective, English-only programming models. I don't need my coding model to know about literature, art, philosophy, ethics, etc.
If there were a coding model as good as opus that didn't know multiple languages, biology, etc, I would happily use it. But I'm not aware of one - are you?
It actually seems somewhat difficult to train such a model since "all the text on the Internet" is easier to provide in bulk than a highly curated set.
Well, language detection isn't all that hard in the scheme of things (especially now), but maybe training only on English makes models less effective programmers. It would be interesting to see that as an experiment.
I would think that the surrounding chemical "knowledge" could be useful in the context of programming in that industry. Have you ever found it to draw links and conclusions between what you're doing in computer science and the chemistry it's in the middle of?
I would use Opus 4.7 for the planning stage where chemical knowledge is required, then delegate to a smaller English-Programming-Only-Opus to do the actual coding.
But you and I and everyone else here knows about MIT and can probably list a few major breakthroughs that came out of MIT... what's the name of your major public research university?