How does one keep up with all this change? I wish we could fast-forward like 2-3 years to see if an actual winner has landed by then. I feel like at that point there will be THE tool, with no one thinking twice about using anything else.
One keeps up with it by keeping up with it. Folks keep up with the latest social media gossip, the news, TV shows, or whatever interests them. You just stay on it.
Over the weekend I got to running Kimi K2, the last 2 days I have been driving Ernie4.5-300B, and I just finished downloading the latest Qwen3-235b this morning and started using it this evening. Tonight I'll start downloading this 480B; it might take 2-3 days with my crappy internet, and then I'll get to it.
Ergo my point about work and personal obligations (family, especially small kids). 2-4 hours per day for a solitary hobby is a surefire way to a divorce and estranged kids.
I'm married, have kids, have an elderly parent at end of life that I'm caring for, and so on and so forth. How do I do it? Balance. Right now, the kids are packing their bags to go to camp, so I have about 10 mins. I just replied to my prompt from last night and will head out to drop them off; when I come in, I'll have a reply and enter my next prompt before I sign in for work. When the kids come in from school, they stay in my office and do their workbooks or watch TV while I sink in some work. You don't have to stay there for 4 straight hours. I get on the computer for 5 minutes, do a few prompts and step out, then from that time till I get back on, I keep thinking about whatever problem I'm trying to solve.
Yeah, second this. I find model updates mildly interesting, but besides Grok 4 I haven’t even tried a new model all year.
It's a bit like the media cycle. The more jacked in you are, the more behind you feel. I’m less certain there will be winners than that there will be losers, but for sure the time investment in staying up to date on these things will not pay dividends to the average HN reader.
I'm using Claude Code and making stuff. I'm keeping an eye on these new tools and staying aware of them, but I wait for the dust to settle and see whether people switch or are still excited after the hype dies down. X / Hacker News are good for keeping plugged in.
The underlying models are apparently profitable. Inference costs are in an exponential fall that would make Gordon Moore faint. OpenRouter shows Anthropic, AWS, and Google hosting Claude at the same rates, so apparently nobody is price dumping.
That said, code+git+agent is the only acceptable way for technical staff to interact with AI. Tools with a sparkles button can go to hell.
We don't actually need a winner, we need 2-3-4 big, mature commercial contenders for the state of the art stuff, and 2-3-4 big, mature Open Source/open weights models that can be run on decent consumer hardware at near real-time speeds, and we're all set.
Sure, there will probably be a long tail, but the average programmer probably won't care much about those, just like they don't care about Erlang, D, MoonScript, etc.
Things will be moving faster in 2-3 years most likely. (The recursive self-improvement flywheel is only just starting to pick up momentum, and we’ll have much more LLM inference compute available.)
Figuring out how to stay sane while staying abreast of developments will be a key skill to cultivate.
I’m pretty skeptical there will be a single model with a defensible moat TBH. Like cloud compute, there is both economy of scale and room for multiple vendors (not least because bigcos want multiple competing bids).
I'm actually waiting for something different - a "good enough" level for programming LLMs:
1. Where they can be used as autocompletion in an IDE at speeds comparable with Intellisense
2. And where they're good enough to generate most code reliably, while using a local LLM
3. While running on hardware costing in total max 2000€
4. And definitely with just a few "standard" pre-configured Open Source/open weights LLMs where I don't have to become an LLM engineer to figure out the million knobs
I have no clue how Intellisense works behind the scenes, yet I use it every day. Same story here.
“Good enough” will be like programming languages; an evolving frontier with many choices. New developments will make your previous “good enough” look inadequate.
Given how much better the bleeding edge models are now than 6 months ago, as long as any model is getting smarter I don’t see stagnation as a possibility. If Gemini starts being better at coding than Claude, you’re gonna switch over if your livelihood is dependent on it.
It depends on the level of 'keeping up'. I follow the news, but it's impossible to dip your toe in every new model. Some stick around, but the majority pass through.
> Mass adoption is rarely a quality indicator. I wouldn't want to pay for the mainstream VHS model(s) when I could use Betamax (perhaps even cheaper).
Oh, but it is.
Imagine you were there, back in those days. A few years after VHS won, you couldn't find your favorite movies on Betamax. There was a lot more hardware available for VHS, and it was cheaper.
Mass adoption largely wins out over almost everything.
Cases in point from software: Visual Basic, PHP, JavaScript, Python (though Python is slightly more technically sound than the other ones), early MySQL, MongoDB, early Windows, early Android.
Why do you believe so? The leaderboard is highly unstable right now and there are no signs of that subsiding. I would expect the same situation 2-3 years forward, just possibly with somewhat different players.