What caused the switch was that we build AI solutions for sometimes price-conscious customers, so I was already familiar with the pattern of "use a superior model to set a standard, then fine-tune a cheaper one to do the same work".
So I brought that into my own workflows (kind of): I use Opus 4.6 for detailed planning and one 'exemplar' execution (with 'over-documentation' of the choices); after that, I use Opus 4.6 only for planning and "throw a load of MiniMax M2.5s at the problem".
They tend to do 90% of the job well; I sometimes do a final pass with Opus 4.6 to mop up any remaining issues. This saves me a lot of tokens/money.
This pattern wasn't possible with Claude Code, thus my move to Open Code.
In my experience, this correlates more with soft skills and with "one man band" founder/maker companies that tend to sell training products, or (if such people exist in a company environment at all) they invariably work in DevRel and aren't pushing code.
The whole point is to reinforce the track record of someone applying to said founding-engineer role: you can look up what they have presented and see how well they answer questions from the audience. Those are soft skills applicable to founding engineer / CTO / senior roles, and they go beyond AI-generated CVs or cover letters.
You can find this all the time, at tech talks and conferences large and small, and 99% of the time the person presenting already covers most of the requirements, which makes the selection process easier, not harder.
One part I did miss in my post was requiring at least 2 out of 3 of them, so I added that in. But I'd rather optimize for hiring candidates who are builders, know what they are talking about and what to build (even with AI), and can easily answer deep technical questions (because they have the experience and have done it), than for those who study for the interview, need constant hand-holding, and are over-reliant on AI.
Remember, this is for recruiting founding engineers, and the bar has to be high, way above the noise.
"But there is an obvious solution: mandate the operating systems (iOS and Android) to share device users' ages when they download apps from the app stores – data the operating systems get as part of the hardware acquisition already. This would be a simple one-step way for parents to control all the different apps that their kids use (in the US, the average teen uses forty different apps per month) and would remedy the fractured app-by-app approach we have today. We should make a societal judgement about whether to set these age limits for smartphones or social media use at thirteen, fourteen, fifteen or sixteen, then write it into law." (from How to Save the Internet by Nick Clegg)
Everything else aside, this naive belief system is right up there with "10 Myths every programmer should know about X" where X is email addresses, legal names, postal addresses, dates, timestamps, etc.
Or perhaps they are envisioning a "hardware acquisition" process where the purchaser is forced to take some oath and sign an attestation about all future users of the device...
I think it’s more about setting a norm and precedent that “Age verification is not our responsibility; the App Store layer does that and it’s an established truth now”.
Which itself conveniently serves as a defence in lawsuits when a teenager takes their own life over harmful content, etc.
voxic11 is right that the AI Act creates a legal obligation that provides a lawful basis for processing under GDPR Article 6(1)(c).
To add to that, Article 17(3)(b) specifically carves out an exemption to the right to erasure where retention is necessary to comply with a legal obligation.
(So the defence works at both levels: you have a lawful basis to retain, and erasure requests don't override it during the mandatory retention period.)
That said, GDPR data minimisation (Article 5(1)(c)) still constrains what you log.
The library addresses this at write-time today: the PII config lets you SHA-256-hash inputs/outputs before they hit the log and apply regex redaction patterns, so personal data need never enter the chain in the first place.
This enables the pattern of “Hash by default, only log raw where necessary for Article 12”.
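To make the write-time idea concrete, here's a minimal sketch of "redact, then hash before logging". The function and pattern names are hypothetical illustrations, not the library's actual config API:

```python
import hashlib
import re

# Hypothetical redaction pattern; a real config would carry several of these.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(text: str) -> str:
    """Replace e-mail addresses with a placeholder before anything is logged."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)

def hash_content(text: str) -> str:
    """SHA-256 the redacted content, so raw PII never reaches the log entry."""
    return hashlib.sha256(redact(text).encode("utf-8")).hexdigest()

# A log entry stores only the digest, never the raw prompt.
entry = {
    "decision_id": "d-001",
    "input_hash": hash_content("Contact alice@example.com about the loan"),
}
```

Redaction runs before hashing so that two prompts differing only in the redacted PII produce the same digest, which is sometimes what you want for dedup, and sometimes not; a real config would make that order explicit.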
For cases where raw content must be logged (e.g., full decision reconstruction for a regulator), we're planning a dual-layer storage approach: the hash chain would cover a structural envelope (timestamps, decision ID, model ID, parameters, latency, hash pointers), while the actual PII-bearing content (input prompts, output text) would live in a separate, referenced object.
Erasure would then mean deleting the content object, and the chain would stay intact because it never hashed the raw content directly.
The regulator would also therefore see a complete, tamper-evident chain of system activity.
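A toy sketch of that dual-layer idea (class and field names are my own invention, not the planned implementation): the chain hashes only envelopes, each of which carries a hash pointer to a separately stored content object, so deleting the content leaves the chain verifiable.

```python
import hashlib
import json

def h(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()

class DualLayerLog:
    """Hash chain over metadata envelopes; PII content stored separately
    and referenced by hash, so erasure leaves the chain intact."""

    def __init__(self):
        self.chain = []        # tamper-evident envelopes
        self.content = {}      # content_hash -> raw PII-bearing payload
        self.prev = "0" * 64   # genesis pointer

    def append(self, meta: dict, raw: str) -> None:
        c_hash = h(raw.encode("utf-8"))
        self.content[c_hash] = raw
        envelope = {"meta": meta, "content_ref": c_hash, "prev": self.prev}
        self.prev = h(json.dumps(envelope, sort_keys=True).encode("utf-8"))
        envelope["entry_hash"] = self.prev
        self.chain.append(envelope)

    def erase(self, c_hash: str) -> None:
        # GDPR erasure: delete only the content object, never the envelope.
        self.content.pop(c_hash, None)

    def verify(self) -> bool:
        prev = "0" * 64
        for e in self.chain:
            body = {k: e[k] for k in ("meta", "content_ref", "prev")}
            if e["prev"] != prev:
                return False
            if e["entry_hash"] != h(json.dumps(body, sort_keys=True).encode("utf-8")):
                return False
            prev = e["entry_hash"]
        return True
```

After erasing a content object, `verify()` still passes because the chain only ever hashed the envelope (which contains the content's digest, not the content itself); the gap is visible as a dangling `content_ref`, which is exactly the auditable "something was erased here" property you want.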
Thanks both for the replies.
Can't you make it simpler: encrypt the data, store the encryption key separately, and move the raw data to cold storage. If the user wants to erase, delete the encryption key, avoiding a massive recompute from cold storage. Do you think this is a better approach? It's not efficient, but at large scale (petabytes) it could work.
Developers make mistakes, though: if they miss encrypting something due to a bug in the code and later want to fix it, the hash chaining will be a problem.
IMO what you’re describing is essentially crypto-shredding.
It would definitely work (and when dealing with petabyte levels of data the simplicity of only having to delete the key is convenient).
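For readers unfamiliar with the term, a toy crypto-shredding sketch (all names hypothetical; the XOR keystream is for illustration only, not production crypto — use AES-GCM or similar in practice):

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data against a SHA-256-derived keystream.
    Illustration only; NOT a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

class ShreddableStore:
    """Crypto-shredding: one key per data subject, kept apart from the data.
    Deleting the key renders that subject's cold-storage records unreadable."""

    def __init__(self):
        self.keys = {}   # subject_id -> key (lives in a separate key store)
        self.cold = {}   # record_id -> (subject_id, ciphertext)

    def put(self, record_id, subject_id, plaintext):
        key = self.keys.setdefault(subject_id, secrets.token_bytes(32))
        self.cold[record_id] = (subject_id, keystream_xor(key, plaintext))

    def get(self, record_id):
        subject_id, ct = self.cold[record_id]
        key = self.keys.get(subject_id)
        return None if key is None else keystream_xor(key, ct)

    def shred(self, subject_id):
        # Erasure request: delete only the key; ciphertext stays put.
        self.keys.pop(subject_id, None)
```

The appeal at petabyte scale is that `shred` is O(1) regardless of how much ciphertext sits in cold storage.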
We’re leaning toward the dual-layer separation I described, though (metadata separate from content), mainly because crypto-shredding makes every read (including regulatory reconstruction) depend on a key store.
In my view that’s a significant dependency for an audit log whose whole purpose is reliable reconstructability, whereas dual-layer lets the chain stand on its own.
Your point about developer mistakes is fair. It applies to dual-layer too, as your example shows, but I’d say crypto-shredding isn’t immune to mistakes either: deleting the key only works if the key and plaintext never accidentally leaked elsewhere (in logs, backups, etc.).
The engineering-focused findings have been covered extensively (fake tool injection, Undercover Mode, KAIROS, etc).
This piece focuses on what these findings mean if you're using Claude Code to build AI systems subject to the EU AI Act.
TL;DR / spoiler:
Claude Code isn't a high-risk AI system in and of itself.
The EU AI Act regulates your deployed system and your process, not your tool vendor's internal engineering practices.