Let's be clear: your entire post is pure, unadulterated FUD. You first claim, based on cherry-picked benchmarks, that Mythos is only "barely competitive" with existing models; then suggest they must be training to the test; then call it "odd" that they are withholding the release, despite detailed and forthcoming explanations from Anthropic about why they are doing so; then wrap it up with the completely unsubstantiated claim that they must be bleeding subscribers and that this must exist only to stop that bleed.
Y'all know they're teaching to the test. I'll wait until someone devises a novel test that isn't contained in the training datasets. Sure, the models are still powerful.
I read the entire performance degradation report in the OP, and Boris's response, and it seems that the overwhelming majority of the report's findings can indeed be explained by the `showThinkingSummaries` option recently being switched off by default.
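For anyone who wants to rule this out in their own setup: a minimal sketch of re-enabling the option. The exact file location and shape are assumptions on my part (I'm guessing a boolean in a user-level settings JSON); check your own tool's docs before copying this.

```json
{
  "showThinkingSummaries": true
}
```

If your results change noticeably with just this toggle, that would corroborate the explanation in Boris's response.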
>> Also Claude owes its popularity mostly to the excellent model running behind the scenes.
It's a bit of both. Claude Code was the tool that made Anthropic's developer mindshare explode. Yes, the models are good, but before CC they were mostly just available via multiplexers like Cursor and Copilot, via the relatively expensive API.
Yeah I think the 1M context is the issue. Because I use Opus 4.6 through Cursor at the previous 200k limit and it has been totally fine. But if I switch to the 1M version it degrades noticeably.
> Yeah I think the 1M context is the issue. Because I use Opus 4.6 through Cursor at the previous 200k limit and it has been totally fine. But if I switch to the 1M version it degrades noticeably.
I thought it was already well-known that context above 200k - 300k results in degradation.
One of my comments from this past week said exactly that: there is no point in claiming that a 1M context will improve things, because all the evidence we have seen shows that results degrade past 300k of context.
I have a similar workflow, but I disagree that Codex/GPT-5.4 reviews are very useful. For example, in a lot of cases they suggest over-engineering by handling edge cases that won't realistically happen.
>> AI assisted coding makes you dumber full stop. It's obvious as soon as you try it for the first time. Need a regex? No need to engage your brain. AI will do that for you.
Regex is the worst possible example you could have given. Seriously, how many people do you know who painstakingly hand-craft their own regexes as opposed to using one of the million tools out there that can work backwards from example inputs and outputs to generate a regex that satisfies the conditions?