This has been true since Claude Sonnet 3.5, so for over a year now. I was early on the LLM train, building RAG tools and prototypes at the company I was working for at the time, but pre-Claude 3.5 all the models were a complete waste of time for coding, except that the inline autocomplete models saved you some typing.
Claude 3.5 was actually the point where it could generate simple stuff. Progress has kind of tapered off since, though. Claude is still the best, but Sonnet 4.5 is disappointing in that it doesn't fundamentally bring me more than 3.5 did; it's just a bit better at execution. I still can't delegate higher-level problems to it.
Top tier models are sometimes surprisingly good but they take forever.
Not really - 3.5 was the first model I could actually use to vibe through CRUD without it wasting more time than it saved. I actually used it to deliver an MVP on a side gig I was working on. GPT-4 was nowhere near as useful at the time, and Sonnet 3 was also considerably worse.
And from reading through the forums and talking to co-workers this was a common experience.
Up until using Claude 4.5 I had very poor experiences with C/C++. Sure, bash and Python worked okay, albeit with the occasional hallucination. ChatGPT-5 did okay with C/C++ and fairly well with Python (again having issues with omitting code during iterations, requiring me to yell at it a lot). Claude 4.5 just works, and it's crazy good.