Extracted system prompts are usually very, very accurate.
It's a slightly noisy process, and there may be minor changes to wording and formatting. Worst case, sections may be omitted intermittently. But system prompts that are extracted by AI-whispering shamans are usually very consistent - and a very good match for what those companies reveal officially.
In a few cases, the extracted prompts were compared to what the companies revealed themselves later, and it was basically a 1:1 match.
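That kind of match is easy to quantify yourself. A quick sketch using Python's difflib - the file names here are made up, just stand-ins for an extracted prompt and the officially published one:

```python
import difflib

# Hypothetical file names: one holds the community-extracted prompt,
# the other the version the company later published itself.
extracted = open("extracted_prompt.txt").read().split()
official = open("official_prompt.txt").read().split()

# Word-level similarity ratio; near 1.0 for the "basically 1:1" cases above.
ratio = difflib.SequenceMatcher(None, extracted, official).ratio()
print(f"similarity: {ratio:.1%}")
```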
If this "soul document" is a part of the system prompt, then I would expect the same level of accuracy.
If it's learned - embedded in the model weights? Much less accurate. It could probably still be recovered mostly intact, with decent reliability, but only with statistical methods - sampling many attempted recitations and aggregating them - and at least a few hundred dollars' worth of AI compute.
It's very unclear to me how it could have been recovered if it wasn't part of the system prompt - and especially how Claude would know it's called the "soul doc", if that was just an internal nickname.
I mean, obviously we know how it happened - the text was shown to the model multiple times during late post-training or SFT. That's the only way it could have memorized it. But I don't see the point in having it memorize such a document.
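A minimal sketch of what that statistical recovery might look like - the sampling function is a hypothetical stand-in for whatever API you'd actually call, and real recovery would need proper sequence alignment on top of this:

```python
from collections import Counter

def sample_recitation(prompt: str) -> list[str]:
    """Hypothetical stand-in: query the model at temperature > 0 and
    return its attempted recitation of the document, split into words."""
    raise NotImplementedError("replace with a real API call")

def reconstruct(prompt: str, n_samples: int = 200) -> str:
    """Aggregate many noisy recitations by per-position majority vote,
    so that random omissions and paraphrases get washed out. Assumes the
    samples are roughly aligned; aligning them properly is where most of
    the compute budget would actually go."""
    samples = [sample_recitation(prompt) for _ in range(n_samples)]
    length = max(len(s) for s in samples)
    consensus = []
    for i in range(length):
        words_at_i = [s[i] for s in samples if i < len(s)]
        consensus.append(Counter(words_at_i).most_common(1)[0][0])
    return " ".join(consensus)
```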
There are a few weirder training methods that involve wiring explicit bits of knowledge into the model.
I imagine that if you push them hard enough with the exact same text, you can get full word-for-word memorization. This may be intentional, or a side effect of trying to wire other knowledge into the model while this document also happens to be loaded into the context.
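As a toy illustration of the mechanism - not a claim about Anthropic's actual pipeline; the model name and file path are placeholders - repeatedly fine-tuning on one fixed document drives its loss toward zero, at which point the model reproduces it verbatim from a short prefix:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any small causal LM shows the effect
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

document = open("soul_doc.txt").read()  # hypothetical file name
inputs = tok(document, return_tensors="pt", truncation=True, max_length=1024)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(50):  # the exact same text, over and over
    out = model(**inputs, labels=inputs["input_ids"])
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# After enough repetition, greedy decoding from the document's opening
# words tends to continue it word for word.
model.eval()
prefix = tok(document[:100], return_tensors="pt")
generated = model.generate(**prefix, max_new_tokens=200, do_sample=False)
print(tok.decode(generated[0]))
```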