The first link is about consciousness. The second link argues that language is not thought. The third link argues that intentions as "discrete mental states" may not be found in the brain.
This does not necessarily disprove the existence of a world model, and these papers do not deal with the concept directly. As with how LLMs work, the world model (and how the brain thinks) may be implicit within the neural net of the brain rather than an explicit philosophical/psychological construct.
Unlike with neural nets and LLMs, neuroscientists and the like have no way of taking a human brain out and hooking it up to a computer to run experiments on what our neurons are doing. This software is the next best thing we have at the moment for determining what neurons can do and how they work.
Perhaps there is a dialectical synthesis to be made between your position, which I interpret as something like "there do not exist discrete Cartesian states within the brain," and how neural nets learn concepts implicitly through statistics.
The first link is about how philosophy and psychology are used to describe brain-cognitive behavior research, which has limited explanatory capability compared to a hypothetical interpretation that uses the field's own vocabulary instead of terms borrowed from other disciplines.
The second link is about an AI that detects consciousness in coma patients.
The third link is about how coma is associated with a low-complexity and high-predictability passive cortical state. Kickstarting the brain to a high-complexity and low-predictability state of cortical dynamics is a sign of recovery back to consciousness.
I think video, agentic, and multimodal models have led to this point, but actually making a world model may prove to be long and difficult.
I feel LeCun is correct that LLMs as they are now have limitations that call for an architectural overhaul. LLMs currently have a problem with context rot, and this would hamper an effective world model if the world disintegrates and becomes incoherent and hallucinated over time.
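To make that failure mode concrete, here is a toy sketch of one way long-running context degrades: a rolling window that silently evicts early facts once a token budget is exceeded. Real "context rot" in LLMs is broader than truncation (attention quality degrades over long contexts), and the budget, tokenizer stand-in, and turns below are all made up for illustration; the effect on a would-be world model is the same, though: earlier state quietly stops informing later predictions.

```python
# Illustrative only: a toy "rolling context" that drops the oldest turns once a
# token budget is exceeded. Early facts silently fall out of the window, so any
# world state reconstructed from the remaining context is incomplete.
from collections import deque

MAX_TOKENS = 50  # tiny budget so the effect is visible

def count_tokens(text: str) -> int:
    # crude stand-in for a real tokenizer
    return len(text.split())

context = deque()

def add_turn(turn: str) -> None:
    context.append(turn)
    # evict the oldest turns until we fit the budget again
    while sum(count_tokens(t) for t in context) > MAX_TOKENS:
        dropped = context.popleft()
        print(f"dropped from context: {dropped!r}")

add_turn("The robot is holding a red cube in room A.")
for i in range(20):
    add_turn(f"Filler observation number {i} about something unrelated.")

# The original fact about the red cube has been evicted by now.
print("red cube still in context:", any("red cube" in t for t in context))
```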
It's doubtful whether investors would be in for the long haul, which may explain Sam Altman's behavior in seeking government support. The other approaches described in this article may be more investor-friendly, as there is a more immediate return from creating a 3D asset or a virtual simulation.
The idea of this style of research and engineering in general is to create an approximation of observations of nature.
To follow completely in your footsteps would require recreating the totality of the physical evolution that led to intelligence, in a fraction of the time. That isn't feasible for an investor expecting returns within a reasonable timeline.
Getting to this point meant leaping from text generation to image generation to video generation, and now these three approaches are experimenting with what the next step should be. They did not come from an isolated vacuum; they are the result of iterative progress.
The general idea now is to take the video model and give it 3D spatial capabilities to better model the implicitly symbolic and virtual worlds it depicts and to reason about what would happen next. Fei-Fei Li wants to produce a 3D scene asset. DeepMind wants to simulate what can happen in that 3D scene. Yann LeCun wants to expand upon the symbolic reasoning by adding another layer of intelligence.
Traditional AI researchers balk at the lack of inherent agentic purpose and goals, but LLMs evolved separately, from pattern matching and statistical analysis of the digital output of human labor.
A number of people in the LLM field have accepted that recreating the animal brain is not the point. Instead, they work on a unique digital intelligence, as if it were select fragments of the human brain existing in a digital world, informed by neuroscience research.
I think LLMs may not reason like a human with skin in the game, but humans are rather flawed at making sense of the nihilistic stage of history we find ourselves in. It is difficult for at least half of humanity, on an IQ level, to reason about what we have created. I think there is a case for a separate digital intelligence to analyze and make sense of the digital world, which only seems to merge further with reality. Maybe this is a transhumanist singularity not in the technological sense, but in terms of human idealism in creating values for ourselves.
These are not approximations, they are arbitrary simulations that have nothing to do with observation.
They’re irrelevant in terms of any idea of intelligence as intel is built upon topologies.
Intelligence is tied to development where functional outcomes are by genetic tinkering with environments for flexibility. These gaslighted trivial models are function aimed, so they have zero ability to even mimic what intel is.
Face it this is a massive washout. It has no ambition. It lacks even a weak definition of the models name “spatial intelligence” and it lacks one because there is no such thing.
Fundamentally world models do not exist and are oxymoronic.
Neural nets and LLMs were created based on neuroscience research. Ultimately, they are approximations of how parts of the human brain work.
The real concern with having no biomechanical skin in the game is the lack of sensory input that could ground it within our reality. All input into LLMs is based on the digital output of human labor, which is ultimately a set of symbolic representations filtered through our brains and their ideas of reality. However, this may not be too different from how our real human brains work.
There has long been a philosophical dilemma over how real consciousness can be if it is imagined by our brains, since our brains provide a convincing hallucination of what seems like real sensory input or even free will. That is to say, at a philosophical level humans live inside their brains, interpreting a fragment of reality based upon how the brain interprets sensory input.
Now the LLM as a brain cuts out the entire step of agentic sensory input, and it exists wholly as the result of our ideas.
They have no functional or processual relationship to brains, there are scores of papers making light of this. There are no valid parallels between AI and brains.
There were never approximations merely false models.
The field is trapped in bad definitions and decisions
The keyword of that study is consciousness, which I'd consider a separate goal from "intelligence". LLM proponents are aware that their architecture lacks many parts of what constitutes a complete brain, and there are other AI researchers who disagree that LLMs will lead to either AGI or consciousness. I largely consider these tangential to the topic. A neural net simulation of a virtual reality does not need consciousness; it only has to model the consequences of agentic actions.
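For illustration, here is a minimal sketch of what "modeling the consequences of agentic actions" means mechanically. The dimensions, network shape, and absence of a training loop are placeholder assumptions for the sake of the sketch, not any particular lab's architecture.

```python
# Minimal sketch (PyTorch, illustrative dimensions): a world model here is just
# a learned transition function f(state, action) -> next_state, rolled forward
# to predict the consequences of a plan. No consciousness is involved anywhere.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 16, 4  # arbitrary sizes for the sketch

class TransitionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64),
            nn.ReLU(),
            nn.Linear(64, STATE_DIM),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

model = TransitionModel()
state = torch.zeros(1, STATE_DIM)

# Roll out an imagined trajectory for a candidate plan of actions.
plan = [torch.randn(1, ACTION_DIM) for _ in range(5)]
with torch.no_grad():
    for action in plan:
        state = model(state, action)  # predicted consequence of the action
print(state.shape)  # torch.Size([1, 16])
```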
It’s not a keyword, it’s the seat of intelligence. What coders don’t grasp is nothing g related to symbols metaphors words language manifests as consciousness and or intel. Your field is a wash.
“We refute (based on empirical evidence) claims that humans use linguistic representations to think.”
Ev Fedorenko Language Lab MIT
When I look up that quote, it leads back to Hacker News comments. It is also a strange way to make a citation. You make blanket statements that are easily argued against, and now you respond with this nonsense. I accuse you of being an LLM bot.
Take great offense at being called a bot, especially considering any glance can spot my numerous typos. And the weakness of your search capability: that quote is from a discussion of Ev’s following the pub of this paper
That exact quote does not appear in the paper. You cannot attack me for your lack of due diligence.
This paper does little to dismiss LLMs. LLMs can use a medium other than text, and that would not take away from their underlying mathematical models based on neuroscience. LLMs understand language representations only implicitly, through statistical analysis, and that may instead show a commonality with how the human brain thinks, as described in this paper.
I will not apologize for how you keep pushing an agenda despite how poorly supported it is. I have tried to be intellectually honest about the state of the industry and its flaws. I would implore you to instead do the research about LLMs so you can better refine your critique of them.
Your intellectual insecurity doesn’t mean I offer due diligence for existing information, nor does it give you any protocol to shift apologies especially since we evaluate software for special effects in high budget streaming. And none of our research indicates LLMs RL or frontier approaches will work in spatially specific ways. It’s a wash, we can see it.
You appear to be using random words and phrases to intentionally obfuscate the lack of substance in your responses.
There is a baseline expectation of how quotes and citations are supposed to work within Western intellectual circles. The fact that you do not know these conventions and refuse to accept them means either you are not familiar with Western academia, or you are an intellectually dishonest Internet troll, or you are an LLM bot.
Spatial reasoning and world models are a research topic because elements of them were found in video and agentic models, and investors want to further refine either of them.
I do not have the time to read through this entire Google doc, but from what I have skimmed, I can see that the most substantial critiques are from academia being honest of the current state of AI and its limitations. That is fine.
However, the opening paragraphs aren't impressive. Language is arbitrary, yes, but it must also be intelligible to other humans. It is like a canvas for pattern matching and creating all sorts of inductive reasoning. There isn't much explanation of how pattern-matching math would be inherently incapable of pattern matching written language. This reads like a basic understanding of postmodernist philosophy presented as if it were proof that math fails when applied to a socially constructed reality. However, philosophy and the other social sciences do not surrender and give up as if their fields were fundamentally flawed. They make do and continue matching patterns to make observations about social reality.
The burden is ultimately on you to prove that the limitations of current AI/LLMs cannot be overcome, or that there is something that makes world models or spatial reasoning impossible. Simply having a mountain of text to read is not an argument. There has to be some summary or point that can serve as the thrust of your position. As they say, brevity is the soul of wit.
You’ve just explained how neither images nor words cohere thought in LKNs Gaussian frontier or otherwise- they are wholly arbitrary. They reference nothing in and of themselves. And investors have been sold a bubble in every model so far. Enjoy the ride!
Humans have created semantic connections between images, words, and thoughts, and LLMs learn from the implicit semantic meanings behind the words used in text. Humans, after all, have to communicate thoughts to each other. If you were correct, then communication would not exist and we would still be apes.
For example, "LKNs Gaussian frontier" is another random phrase you have pulled out as if you are an LLM hallucinating something.
The bubble is in whether the investors will get their return on time. This is orthogonal to the underlying technology. Investor interests are not in progressing technology but to get a profit. Hope you enjoy the ride too because this is going to affect all of us.
Are you for real dude? Lkms is a typo of LLMs. Semantic is only in task demands, and they’re variable. They extend limitlessly from action or spatial syntax, they don’t exist in words or images. You’ve been sold junk tech, read any Gary Marcus or Rodney Brooks and yes am enjoying the ride, we are the next stage - analog entertainment.
You cannot be serious if you expect people to mind-read through your typos and make sense of them. Is this supposed to be a performative art piece to demonstrate your point? If you actually care about expressing your point to another person, then you should pay some attention to how you present your responses so the other person can understand them.
Is "task demand" what the LLM would expect to do in order to respond to the user prompt? It seems incredulous that semantics would only exist here. As I have mentioned before, semantics is already embedded in the input and output for the LLM to implicitly discover and model and reason with.
https://arxiv.org/html/2507.05448v1
This paper is an interesting overview of semantics in LLMs. Here's an interesting quote, "Whether these LLMs exhibit semantic capabilities, is explored through the classical semantic theory which goes back to Frege and Russell. We show that the answer depends on how meaning is defined by Frege and Russell (and which interpretation one follows). If meaning is solely based on reference, that is, some referential capability, LLM-generated representations are meaningless, because the text-based LLMs representation do not directly refer to the world unless the reference is somehow indirectly induced. Also the notion of reference hinges on the notion of truth; ultimately it is the truth that determines the reference. If meaning however is associated with another kind of meaning such as Frege’s sense in addition to reference, it can be argued that LLM representations can carry that kind of semantics."
As for reference-based meaning reliant on truth, this was mentioned earlier in the paper, "An alternative to addressing the limitations of existing text-based models is the development of multimodal models, i.e., DL models that integrate various modalities such as text, vision, and potentially other modalities via sensory data. Large multimodal models (LMMs) could then ground their linguistic and semantic representations in non-textual representations like corresponding images or sensor data, akin to what has been termed sensorimotor grounding (Harnad, 1990). Note however, that such models would still not have direct access to the world but a mediated access: an access that is fundamentally task- driven and representational moreover. Also, as we will argue, the issue is rather that we need to ground sentences, rather than specific representations, because it is sentences that may refer to truth. But attaining the truth is not an easy task; ultimately, future LMMs will face the same difficulties as we do in determining truth."
In other words, this is the approach that Fei-Fei Li and the other multimodal efforts are using to create the world model.
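For context, here is a rough sketch of the contrastive image-text alignment idea that such grounding proposals build on (in the spirit of CLIP-style training). The encoders, dimensions, and data are toy placeholders, not any of the specific systems mentioned above.

```python
# Toy sketch of CLIP-style contrastive alignment: separate image and text
# encoders are trained so that matching (image, caption) pairs score higher
# than mismatched ones -- one concrete form of "grounding" text in non-text data.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMBED_DIM = 32

image_encoder = nn.Linear(3 * 8 * 8, EMBED_DIM)   # stand-in for a vision model
text_encoder = nn.Linear(100, EMBED_DIM)          # stand-in for a text model

images = torch.randn(4, 3 * 8 * 8)   # a batch of 4 (flattened) toy images
texts = torch.randn(4, 100)          # their 4 matching toy caption vectors

img_emb = F.normalize(image_encoder(images), dim=-1)
txt_emb = F.normalize(text_encoder(texts), dim=-1)

# Similarity matrix: entry (i, j) scores image i against caption j.
logits = img_emb @ txt_emb.t() / 0.07   # 0.07 is a typical temperature

# Contrastive loss: the correct pairing is the diagonal, in both directions.
targets = torch.arange(4)
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
print("contrastive loss:", loss.item())
```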
There has been a consistent desire to unify the desktop environments, but the fragmentation has largely been because of differing use cases and philosophical perspectives.
This divide is being driven further by the issue of X11 vs. Wayland, and now by the drama of decentralized libertarianism vs. centralized corporatism. The latter manifests itself as a culture war over codes of conduct or woke software. Now it is coalescing into a political line between Hyprland & X11 on one side and GNOME & Wayland on the other. (Hyprland uses Wayland, but it and X11 share a similar political affinity through loud and divisive proponents.)
The woke have an affinity with centralized corporatism and want to unify freedesktop collaboration under it and ensure that there is identitarian representation, so contributors don't have to worry about petty discrimination and office politics.
The opposition have an affinity with decentralized libertarianism, and they reject identitarian politics as they want their personal freedom to do what they want even if it is not politically correct, believing that this is the best way for their ideas to flourish into better software.
I personally believe there are merits to either side, and a fine balance has to be struck. We can try to get the best parts (the best software) without the bad parts (discrimination and office politics).
I'm a bit mixed on things as well... I like some of the technical solutions that are coming from the more woke organizations, but I really don't like the abuses under their Code of Conduct or the war against the non-woke, such as activists inside GNOME calling to reject funding from Framework because it also funds Omarchy and Hyprland.
Identitarian politics can either be explicitly enforced or implicitly allowed. The woke chose the former. The opposition can do the latter, but they also believe in checking identitarian interests in favor of personal freedom.
There are concerns over this escalating into fascism, but the logical extremification of ideas only muddies the waters and makes communication difficult. A pragmatic and balanced solution gets moved out of reach, and as a result, the corporate watchmen can push the milder opposition to the extreme fringes. Who then watches the watchmen?
The case against LLMs thinking could be that "backpropagation is a leaky abstraction." Whether an LLM is thinking depends on how well the mathematical model is defined. Ultimately, there appears to be a limit to the mathematical model that caps the LLM's capacity to think. It is "thinking" at some level, but is that level significant enough to be integrated into human society according to the hype?
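One standard example of that leak (the one Karpathy himself uses, if I recall his post correctly) is a saturated sigmoid: the forward pass looks perfectly fine while the gradient silently dies, so the layer stops learning. A minimal sketch with toy values:

```python
# One concrete sense in which "backpropagation is a leaky abstraction": a
# saturated sigmoid kills the gradient even though the forward pass looks fine.
import torch

x = torch.tensor([0.1, 5.0, 20.0], requires_grad=True)
y = torch.sigmoid(x)
y.sum().backward()

# The gradient through sigmoid is y * (1 - y): roughly 0.25 when unsaturated,
# essentially zero once the activation saturates near 1.0.
for xi, yi, gi in zip(x.tolist(), y.tolist(), x.grad.tolist()):
    print(f"x={xi:5.1f}  sigmoid={yi:.6f}  grad={gi:.6f}")
```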
Andrej Karpathy, in his interview with Dwarkesh Patel, was blunt about the current limitations of LLMs and said there would need to be further architectural developments. LLMs lack the capacity to dream and to distill experience and learned knowledge back into their neurons. Thinking in LLMs at best exists as a "ghost" only in the moment, for as long as the temporary context remains coherent.
If we had a society of actual engineers and hackers, sure.
However, LLMs for the vast majority of people are simple chatbot oracles. The people paying exorbitant sums in investments are aiming for essentially the Apple of AI, where it magically just works and creates a new market that redefines the paradigm.
LLMs are yet another cycle where ideas influence material reality and vice versa. Magic is seen to be worth more than the wizard and his tools behind the curtain. The market is hoping that the masses don't find out about the wizard and his tools, so the illusion can continue to live and provide the basis for dreams.
Open WebUI also lets you tweak those parameters, BTW.
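For reference, and assuming "those parameters" means the usual sampling knobs, the same settings can also be set programmatically against an OpenAI-compatible chat endpoint. This is only a hedged sketch: the URL, API key, and model name below are placeholders to substitute with whatever your own deployment exposes.

```python
# Illustrative only: setting sampling knobs (temperature, top_p, max_tokens)
# through an OpenAI-compatible chat endpoint. URL, key, and model are placeholders.
import requests

payload = {
    "model": "local-model",          # placeholder model name
    "messages": [{"role": "user", "content": "Summarize backpropagation."}],
    "temperature": 0.7,              # randomness of sampling
    "top_p": 0.9,                    # nucleus sampling cutoff
    "max_tokens": 256,               # cap on response length
}

resp = requests.post(
    "http://localhost:3000/api/chat/completions",   # placeholder endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=60,
)
print(resp.json())
```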
After reading this, it could be said that instead of doing work, we have abstracted it away. The capitalism of Marx dealt with the real work of productive factories. Neoliberal capitalism, however, outsourced work, and knowledge workers did work about work. The expertise remained with the workers whose jobs were exported, making them overqualified. Instead of capitalism creating the conditions for socialism, capitalism creatively destroys the working class.
Human capital, prescribed as a solution, ceases to matter. The logical conclusion is a decreasing population and falling birth rates. Perhaps basic income could provide relief for those affected. I doubt it would be successful in the long run, as capitalism adapts to maintain the exploitative framework of "work". Instead of the intent of individuals directing the flow of the economy, it is wrested back by the central business and economic planners. What happens next would be speculation.