Nothing excites and excuses a journalist with opinions more than a volatile stock. It doesn't matter what your opinion is; there's always a price change that proves it. Even better: any rapidly changing stock has to be volatile, so if something is in the news it's probably supporting your position! When you see these claims, ask yourself: How many times has the stock moved like this over the last year? How does its current valuation compare to its long-run history? Is the stock in a period of rapid change where higher volatility is expected? Those questions won't tell you where the stock is headed, but they should at least remind you not to trust horoscopes.
For the people who judge these innovations by how well they can read their minds and how well they stand up to what people feel they were promised by Star Trek, the last 24 months have been a disappointment.
For those who are becoming far more productive than they previously were, learning new skills, and building the next generation of companies... that same time has been life changing.
I don’t spend too much time worrying about wall street and I also don’t deny the bottom could fall out, but I have no doubts about the ability of these technologies to dramatically improve lives and be foundational for future improvements.
> For those who are becoming far more productive than they previously were, learning new skills, and building the next generation of companies... that same time has been life changing.
As someone who is very much in this camp, I feel that life changing is a bit of an exaggeration for my own experience.
GitHub Copilot is a really good autocomplete. Stable Diffusion and company were a fun toy for a while. Perplexity.ai is pretty good at doing what Google could do before the enormous wave of AI junk made search completely intractable. But otherwise... life goes on. I got a few better tools out of it, but the chat models have been reliably useless for my professional work and I therefore don't trust them much for learning something new (see Gell-Mann amnesia [0]).
And I'm not someone with my head in the sand—I fine-tuned GPT-2 back before anything LLM-based had hit the tech mainstream, much less the general public. I was one of the first to sign up for the Copilot Beta, and I run Ollama locally with a bunch of different models that I've tried out on several side projects. I have access to both Anthropic and OpenAI models at work and regularly break them out to try again... it just never actually helps.
People have all these amazing stories of all the productivity they gain by using these models, but at their best I always come away feeling like I spent a few minutes coaching a junior developer through a problem without actually having taught anyone anything. I'm left wondering if I'm somehow doing it wrong or if all the stories of amazing results are coming from juniors who don't realize the model isn't very much better at this than they are.
I had to scold and mentor juniors who tried to push their crappy ChatGPT code in our codebase. They believe it gives them amazing powers because they don't have the experience to know whether a solution is good, bad, or completely buggy. Most of the time it's inefficient and full of bugs.
I do believe you learn from your own mistakes, but that requires thinking for yourself, not generating random answers.
> I'm left wondering if I'm somehow doing it wrong or if all the stories of amazing results are coming from juniors who don't realize the model isn't very much better at this than they are.
And here I am, being impressed that SOTA AI passes the Turing programming test for a human junior programmer (using your own interpretation of the skill levels involved). Let's at least appreciate that 10 years ago this level of computer intellect was science fiction. And while I'm very much a "junior developer" (not my day job, though I do have a BSCS), I can still use AI productively for my personal backlog of things I'd like to accomplish, and it's been pretty helpful for me notwithstanding its limitations.
> Let's at least appreciate that 10 years ago this level of computer intellect was science fiction.
I definitely appreciate that the tech has come a long way! All I'm saying is it hasn't changed my life, and OP's characterization of the AI skeptics as essentially uninterested in learning or having sci-fi expectations is unfair. Some of us may simply be too experienced to get much out of it yet (if ever).
My experience is very similar. In my own narrow domain ChatGPT, Claude, Copilot et al. have been a toy, as in, mostly useless answers to my very specific questions.
I have honestly never gotten GPT to work like a junior developer as it can't seem to get details right so the actual coding usually ends up being done by me.
What it is great for is being a general teacher (sometimes wrong) and something to bounce ideas off of. It is enormously useful, just not quite in the way people think it should be. I am not the "manager" with the LLM being a direct report. At least I have never gotten it to work like that.
I think it's strictly inferior to a junior dev because it has no domain-specific knowledge, and even training it on your org's repo doesn't seem to really help (ironically, a junior is doing that at my org because upper management wants us to use GenAI, so maybe he fudged something).
I've heard that too and it's strictly not true for me. I regularly try it but it's wrong more than half the time.
Again, some of this may just come down to experience: by the time I'm turning to ChatGPT or Google it's because I'm in deep water and the docs have failed me. A search engine can turn up the one obscure github issue where someone had the same problem and someone else hints in a direction that ends up having the solution, but that won't have been encoded in the weights with enough strength to be spat out by ChatGPT.
Life changing seems fair; I'm making changes to my life assuming that programming jobs are about to get cutthroat competitive and then disappear. In ChatGPT we've got something that is about as good as a junior developer, much cheaper to employ for a month, and its competence is improving. It may not learn anything in a single session, but at a higher level the rate of improvement of these models seems to be much faster than a human can manage. A human couldn't possibly learn as much as ChatGPT has in the time from the founding of OpenAI to now, and hardware improvement doesn't look like it'll cap out any time soon.
We've got all these people in a state of delusion that somehow the job of a developer isn't glorified autocomplete of what someone else prompts them with. We can't be sure how fast it will happen, but the writing is on the wall that AI will obsolete software development roles as they currently exist.
If ChatGPT becomes as good as a junior, it may replace this position and we will be stuck with a decreasing number of seniors. But right now it's bad, makes a lot of mistakes and hallucinates APIs which is unacceptable.
> the job of a developer [is] glorified autocomplete of what someone else prompts them with
Clients don't know what they want, what they need, or what machines are capable of. Managers can't properly translate that to the developers either. My job is to understand all this and propose solutions, or say no if it's impossible. Will AI do that in the future? Do you remember when COBOL was supposed to replace programmers? Same thing here. I have yet to see a fully generated solution, from the specs to the deployment, that takes responsibility for its mistakes. And not yet another ReactJS frontend that anyone can build in a few days, but something real that is useful and that you can sell as an actual product.
We currently have a large number of business guys (look it up on LinkedIn) who learn "prompting" because they believe it will make them gods in the future even if they don't know what a programming language is. And a lot of scammers who sell their prompts to those people. It feels like cryptocurrencies all over again.
Last but not least, who will create the new APIs, technologies, and languages that we use? Will we be stuck with C++2x forever? ChatGPT will need a few additional revolutions to do this and to be independent of the open source that already exists.
Yeah, I feel like people have been focusing on getting more and faster compute, on both the software and the hardware side. But the hard question of "what compute should be done" remains the important one.
It might be more accurate to forget about the various current job categories. There is nothing that requires a technology to eliminate an existing job category (as if a job category were an immutable constant of the universe). Instead, it is more likely to accelerate some types of activities and completely automate others.
> We've got all these people in a state of delusion that somehow the job of a developer isn't glorified autocomplete of what someone else prompts them with.
You could make the same oversimplification about any other knowledge or managerial job, up to and including C-suite executives.
If this technology is coming and taking everyone's jobs, we're going to need incredibly strong social support structures like Universal basic income and radical redistribution of wealth from AI-owners to fund it. Do you have any ideas for tackling that?
It's the dotcom boom all over again. Right now the Webvans and Pets.coms of the world are getting insane valuations just by using a .ai TLD. When the hubbub dies and the froth washes away, the Googles, Amazons, and Facebooks that know what they're doing will quietly emerge. Too many hyper-funded startups are little more than a thin wrapper over foundational LLMs, with dubious usefulness.
Is that going to happen? Or are the limits of LLM technology being reached?
There are basic techniques for hallucination detection being developed.[1] But they're for "closed domain question answering": roughly, questions whose answers a general search engine could find.
Techniques for open-ended question answering seem to mostly involve asking the same question multiple times, perhaps in different ways, and comparing the results.[2] This sort of works, but fails when the question legitimately has multiple equally good answers.
Most of these are black-box approaches - they're not going inside the LLM.
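That black-box self-consistency idea can be sketched in a few lines. This is only an illustration: the `ask` callable standing in for a real model call is hypothetical, and the answer normalization and the consensus threshold are my own assumptions, not anything from [2].

```python
from collections import Counter

def consensus_answer(ask, question, k=5, threshold=0.6):
    """Ask the same question k times; accept the most common answer
    only if it wins at least `threshold` of the samples, otherwise
    treat the question as a likely hallucination case."""
    answers = [ask(question).strip().lower() for _ in range(k)]
    best, count = Counter(answers).most_common(1)[0]
    if count / k >= threshold:
        return best   # confident consensus
    return None       # no consensus: flag instead of answering

# Stand-in for a real model call: canned answers with one outlier.
_samples = iter(["Paris", "Paris", "paris", "Lyon", "Paris"])
def fake_ask(question):
    return next(_samples)

print(consensus_answer(fake_ask, "Capital of France?"))  # paris
```

As the comment upthread notes, this fails when a question legitimately has multiple equally good answers: the vote splits and the check reports "no consensus" even though the model was fine.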
(I'm not saying that better AI is impossible. Merely that this particular approach may be hitting its limit.)
That we haven't yet seen the successor is why I'm suspicious we're leveling out.
OpenAI teased their new model for months and willingly let the media assume it was GPT-5, then released it as 4o because it was faster and cheaper but no better. They intentionally burned a lot of the hype for GPT-5 on a different, less exciting product—that's not a move you make if you have a clear path towards the next generation.
> That we haven't yet seen the successor is why I'm suspicious we're leveling out.
Not clear if we're leveling out on what's possible, but the price-to-performance ratio may be worsening as models grow. There's an economic limit: if the main way to fix hallucinations is to run the same query many times and look for consensus, that's going to be expensive.
How you describe it is irrelevant, the results are what's relevant and LLMs have made it possible to significantly improve efficiency in a really broad range of tasks. And not just in BS tasks like blog spam copy, SEO sites, etc.
I'm really curious how much valuations, especially for big companies, have been based on things like "anyone can create a website with no technical knowledge", "programming will just be natural language in a few years", and "you are now the manager for an infinite number of junior developers".
I think wall street is a bit smarter than that, but I'm not totally sure. I'm thinking that there are expectations of efficiency gains across a really broad range of tasks, i.e. programming, support, copy-editing, graphic design, etc, which will lead to significant increases in economic value.
But it's impossible to know what the broad market is thinking, so IDK
I can't go back to using it. I spent too much time reviewing the code it spat out, and when it absolutely wouldn't do what I needed it to, it was an absolute slog since it was someone else's code.
There seem to be two different camps here: Copilot-users that use it largely for autocomplete and Cursor/Sonnet-users that use it for generating larger blocks of code. Personally I'm in the former camp; the efficiency gains are substantial and hallucinations are easy to control. Larger blocks are fine if one wants to sit down and review the whole thing, and also as a writer's-block-breaker.
It's the folks generating whole codebases with no ability to review or debug that are in deep Monkey's Paw territory.
Larger blocks are also perfect for scripts and tools for personal use, in this case you can just check if it works or not, Claude 3.5 Sonnet usually does not disappoint.
I tried copilot like 8 months ago for supercharged auto complete only, but I wasn't a fan because it was too high latency. I usually liked what it would write, but the second or so it took would really break my flow and had the overall effect of slowing me down. Makes me wonder if a local LLM could be superior here.
It seems to have gotten much faster recently so it might be worth another look. I also found it much slower while working on remote folders (with VS Code at least).
That's good to know, the first time I tried it I was working on remote folders (just the way things are setup at work). Maybe I'll have to try it on my own computer and see.
Who cares? Solve problem, write test case, move on.
It's a huge efficiency boon for those of us who realize our jobs are just mechanisms to make money. It sucks if you're into artisanal coding or whatever.
I try LLMs once a month with variations of the same question on a specific library that I use. It hallucinates all the time. I can't use those tools as long as they don't give me the answers I need.
There's a quote I read somewhere once, whose original source I've since struggled to find: "if something is 100% correct, it can't be called intelligence."
AI shot to fame with generative images and text generation, and found many use cases. For me personally, I have to read a lot less documentation now to code stuff.
Apparently the emergence of new use cases is pretty thin. I see many domain-specific opportunities where AI can help. It will take some time to develop software where AI can be purchased and activated for niche company-internal use cases. Until that happens, AI (neural network training etc.) remains a hard-core software development problem, not usable by most companies.
The piece itself mostly discusses Nvidia, which is an interesting company from an "AI hype" perspective in that they can only physically produce so many chips, and they are riding a very thin line where their margins are very high and FLOPS (or the equivalent) are in many ways already a commodity, so everyone wants to be set up to compete with them at the first misstep.
- ChatGPT: The OG, most-used AI chat. GPT-4/4o/multi-modal is already the prosumer LLM use case - excellent as an office/tech/language/creative assistant. Apparently OpenAI's revenue is already in the billions, and there's no denying people are using ChatGPT. There are also countless open-source libraries requiring `OPENAI_API_KEY` and many companies allocating budgets for spending there.
- LLM summarization: Amazon is already using LLMs to summarize reviews at the top of their product pages, and search engines are using LLMs to summarize results at the top. GitHub has an LLM support bot that answers questions you used to wait days for a human reply to - these are popping up everywhere. They will only improve at saving people time and surfacing the exact information you're looking for. It's changing the way people access information and accomplish basic self-serve customer tasks.
- Customer service: The text-to-speech AI is getting very very good, to the point you can't tell it's not a human anymore. This alone is hugely disruptive as customer service touches so many verticals. It's already happening and will obviously only continue - especially when you talk to a service agent on the phone, increasingly it will be an LLM text-to-speech agent.
- llama.cpp/Ollama/HF & cloud services like Replicate: Infra for companies to run their own models instead of using OpenAI due to cost/limitations. The obvious benefit for companies is you pay only for the compute you use, not for a consumer product like ChatGPT that abstracts away the usage - OpenAI gets very expensive very quickly if you're using it on behalf of an application with many users. Also, for niche use cases it might not be possible to accomplish what you want as an OpenAI customer; you might need more control over the models.
- Midjourney/Stable Diffusion type models: Totally separate from LLMs and language, there are diffusion models on image generation sites like LeonardoAI for creating concept art, illustrations, photography, and other visuals. There is a lot of obvious value here for creative work - it could displace a lot of conventional services and change the way we think about and pay for applied art. It's controversial in the way mp3s were, which itself will spawn some innovation in the legal space and in the way we interact with creative content.
- Self-driving cars: People are already taking the rides. AI modeling will continue to play a huge role in automating personal transit, and also for fleets. There will be self-delivering food, mail, and freight.
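The cost point above about running your own models (paying for compute rather than per token) can be made concrete with back-of-the-envelope arithmetic. Every number here (prices, user counts, token volumes) is made up purely for illustration, not a real quote from any provider:

```python
def api_monthly_cost(users, requests_per_user, tokens_per_request,
                     usd_per_1k_tokens):
    """Total monthly bill for a hosted API priced per 1k tokens."""
    total_tokens = users * requests_per_user * tokens_per_request
    return total_tokens / 1000 * usd_per_1k_tokens

# Illustrative figures: 10k users, 30 requests each, ~1k tokens per
# request, at a hypothetical $0.01 per 1k tokens.
api_cost = api_monthly_cost(10_000, 30, 1_000, 0.01)
gpu_cost = 600.0  # hypothetical flat monthly rent for a GPU server

print(api_cost)             # 3000.0
print(api_cost > gpu_cost)  # True
```

The shape of the argument is what matters, not the numbers: the per-token bill scales linearly with users, while a self-hosted box is a flat cost (until you need more boxes), so past some traffic level self-hosting wins.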
Particularly promising areas considering other modalities and fine-tunes:
- Music generation
- Film (imagine prompt-2-film)
- Extremely accurate/good legal advice where there are no flaws in your case
- Medical diagnoses/analyses that are highly specific to your body and full of info and visuals particular to your condition
- Negotiation AI for buying/selling, signing contracts for jobs or legal settlements, and general payments/invoicing
- General automation ("Call the DMV and ask about my license", "Call the auto shop, don't let them upsell any services")
- All the tech needed to integrate this stuff in the best ways