It is. I've been waiting for the brilliant take, the one that makes me say, "wow, this really gets to the essence of the whole thing." I haven't found it yet. It's not the blurry jpeg one, it's not the "old-school computational linguistics/GOFAI had it right" ones, it's not the one with various Mario analogies. By having at its heart, "I don't know," this one perhaps comes closest.
By all means, please write the brilliant take yourself, and I will happily upvote and reference it.
I, for one, would be more interested in a take on the opposite end of the problem: what does it all mean for our understanding of what thinking/talking/being human is? Because what kind of baffles me with LLMs is how much they can do, from how little they are made of, compared to our (or at least my) general expectations for AGI.
I've been thinking about the same thing, and have to admit that I, most of the time, AM a stochastic parrot. Being a parrot saves a lot of energy, that's probably the reason.
But it seems our brain has other mechanisms that get turned on occasionally (or run in the background) that go much beyond that.
In many ways yes, but more interesting to this subject in particular: no. We are actually underestimating the human involvement in LLM behavior by personifying the LLM itself.
An LLM can only "exhibit" the behavior that humans encode into text. We encode explicit behavior, including symbols and grammar, and we encode implicit behavior, including narrative and logic.
An LLM blindly models the patterns of text, but doing that exposes the power of the patterns themselves.
The only problem is that an LLM can't determine which patterns are useful and which patterns aren't.
Agreed; as with a sibling comment, this is how I've been thinking about it. We overestimate what we are, the corollary being that we underestimate what other creatures are.
My (non-brilliant) take is that LLMs are basically faster, cheaper versions of Mechanical Turk (Amazon's, not the 18th century automaton).
Like Mechanical Turk, you need to "program" by giving English instructions and the results can be (depending on the instructions) non-deterministic.
But Mechanical Turk did not revolutionize computing. Can a faster and cheaper version do so? Sometimes incremental improvements turn into paradigm shifts. But sometimes not. I guess we'll see (which is another lukewarm take--sorry).
Not a full take, but I have some relevant experience that’s formulating into an idea.
1) Running an agency, it was quick and easy to share ideas and give direction and have that executed to a fairly accurate degree over a decent period of time.
2) Working with GPT-3 & GPT-4, it’s also quick and easy to share ideas, but I’m becoming more aware of how surface level my communication is. When I get back great results, it’s typically because I’m requesting busy work. When I’m requesting something novel, it quickly becomes clear what I forgot to define.
So, the idea that’s formulating around LLMs is that the process of transforming an idea into instructions will become much more desirable. And that we’re moving from selecting for people who know how to do the work, to selecting for people who know how to request the work. And that those are two very different skill sets.
Sure, but as long as the human experience remains unique and beyond the grasp of AI, then there will be human creativity and ingenuity seeking to improve that experience.
But yes, there are plenty of demonstrations of LLMs using themselves to accomplish tasks or even recruiting humans to accomplish a task.
Here goes: LLMs don't behave, they exhibit behavior.
Whose behavior? Whoever wrote the training corpus and the prompt. So far, that definitely means one or more humans.
The problem is that nearly everything you have read about LLMs has personified them. The character of an LLM as a person is invented, then conclusions about LLMs are drawn from that character.
Nearly everything that is interesting about LLMs is not actually an LLM feature: it's a language feature. Language is an incredibly powerful tool, and an incredibly versatile data format.
What is the data? Human behavior.
When a human writes text, they don't just spit out characters at random: humans explicitly choose the characters they write, and implicitly choose the characters they don't write.
In the most literal sense, text contains the entropy of a human making a string of choices about which character to write next. That's a 1-dimensional ordered list, just like the string of characters is. Text gets a lot more interesting when we introduce language...
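To make that "string of choices" concrete, here's a toy sketch (my own illustration, not something from the thread): measuring the empirical entropy, in bits per character, of the character choices in a string. A string with no choice at all carries zero entropy; a string of equally likely characters carries the most.

```python
from collections import Counter
from math import log2

def char_entropy(text: str) -> float:
    """Empirical Shannon entropy (bits per character) of a string,
    treating each character as one draw from the writer's choices."""
    counts = Counter(text)
    total = len(text)
    return sum((n / total) * -log2(n / total) for n in counts.values())

print(char_entropy("aaaa"))  # 0.0 -- no choice, no entropy
print(char_entropy("abcd"))  # 2.0 -- four equally likely characters, 2 bits each
```

Real text sits in between: the patterns of language constrain which character comes next, which is exactly the structure an LLM ends up modeling.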
The reason a human chooses one character over another is not 1-dimensional. There are patterns of entropy driving that decision, and we explicitly know about some of them.
We have defined words, punctuation, grammar, idioms, etc. that we structure text with. We call these patterns language. These are the explicit patterns that allow us to hack the writing process: instead of encoding a 1-dimensional list of decisions, we encode a recursive many-dimensional structure of symbols and grammar.
Now for the interesting part: when humans write language into text, our intentions become explicit patterns, but they aren't the only patterns that end up there.
Every arbitrary decision about writing style gets implicitly encoded as a pattern of negative space. The reasons why we write one concept instead of another get implicitly encoded, too. It's a bit lossy, but most of these patterns end up in the text.
---
So what does an LLM do? It implicitly models patterns, and presents them.
That's it. That's the only behavior. A "continuation" is made by modeling the prompt, and showing what's "next" in the model.
An LLM explicitly knows nothing at all; but implicitly, it knows everything. It models every pattern that exists in the text, but it doesn't have a clue what or why it is modeling. A pattern is a pattern.
The whole system ends up generating valid language, because most of the implicit patterns an LLM models "align to" the explicit patterns (language) that humans encoded into the text.
The whole system ends up "exhibiting behavior", because it also modeled the implicit patterns of human behavior (narrative) encoded into the text. Patterns of narrative determine which part of the model a prompt explores, and which pattern is considered "next".
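A tiny sketch of that "model the patterns, present what's next" loop (my own toy illustration, nothing like a real transformer): a character-bigram model that blindly counts which character follows which, then "continues" a prompt by repeatedly emitting the most common successor. It has no idea what any pattern means; it only models patterns and presents them.

```python
from collections import defaultdict, Counter

def train(corpus: str):
    """Count, for each character, which characters follow it."""
    model = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        model[a][b] += 1
    return model

def continue_text(model, prompt: str, n: int) -> str:
    """'Continuation': repeatedly present the most common next character."""
    out = prompt
    for _ in range(n):
        followers = model.get(out[-1])
        if not followers:
            break  # no pattern observed for this character
        out += followers.most_common(1)[0][0]
    return out

model = train("the cat sat on the mat. the cat sat on the mat.")
print(continue_text(model, "th", 8))
```

The continuation looks like language only because the corpus encoded language; swap in a different corpus and the exact same mechanism "exhibits" different behavior.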
---
Next time you read about the "features" and the "limitations" of an LLM (including GPT-4), know that you are really reading about the "features" of language itself, and the "features" of the narratives that were written in the training corpus and prompts.
The behavior that an LLM exhibits has much less to do with how well its model "aligns with" language grammar, and much more to do with how well the text itself (and the narrative it contains) will behave.