It is a fundamentally difficult problem to solve. On top of that, the costs grow exponentially with core count, and processor scaling compounds them.
If you look back at so many of Intel's architectural extensions, it is hard NOT to draw the conclusion that Intel has a POOR understanding of where problems should be solved. They constantly try to use hardware to solve problems that should be solved in software. This is why they are now an architectural generation behind AMD and why they are most likely going to lose the CPU market to Apple imitators.
Unless AMD can maintain its success despite the talent bleed it will start to face, the days of x86 are numbered.
Why is Intel wrong for trying to solve software problems in hardware, while ARM/CHERI are celebrated for being ahead in trying to solve memory safety in hardware using pointer upper-bit tagging and now CHERI's provenance metadata (performing fine-grained bounds checks on every single pointer dereference, rather than just following instructions without performing extra work)?
I think it's less a criticism of trying to solve problems in hardware than of the _kinds_ of problems they're focusing on.
While CHERI is, from a pure theory standpoint, something that is perfectly avoidable with proper programs (i.e. memory unsafety _is_ efficiently avoidable in software), we ended up needing it because we made the wrong choice in software too long ago to turn back (nobody is rewriting the Linux kernel anytime soon). In this way, CHERI is a good optimization because it does something we cannot _practically_ solve in software. ARM-PA plays a similar role, in that hardware CFI can be made irrelevant by a) not having memory safety issues or b) software CFI, but neither has really worked out in practice, and it's cheap and efficient in hardware, so it's a worthwhile tradeoff.
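To make the per-dereference check concrete, here is a toy software model of a bounds-carrying pointer. It is only a sketch: `Capability` and `checked_load` are made-up names, and real CHERI capabilities also carry permissions and a hardware validity tag that this leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical toy model of a CHERI-style capability: a pointer that carries
// its own bounds. Permissions and the validity tag are omitted.
struct Capability {
    const std::uint8_t* base;  // start of the region this pointer may touch
    std::size_t length;        // size of that region in bytes
};

// The bounds test CHERI performs in hardware on each dereference, written
// out in software. Returns false where the hardware would raise a fault;
// on success the loaded byte is stored in `out`.
bool checked_load(const Capability& cap, std::size_t offset, std::uint8_t& out) {
    if (offset >= cap.length) {
        return false;  // out of bounds: refuse the access
    }
    out = cap.base[offset];
    return true;
}
```

Doing this on every access in software costs a compare and branch each time and requires every pointer in the program to carry its bounds, which is the part that is impractical to retrofit into existing C code.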
Stuff like Intel TSX and ARM TME is sort of at the other end. Transactional memory is _super_ cool and it's been a common thread throughout architecture papers for the past twenty years. The thing is, we've never had transactional memory in commodity hardware (and nobody buried their heads in the sand about not having it like we did with memory safety), so all our software found decent workarounds eventually. TSX/TME do what they say; the issue is just that they're not quite good enough compared to existing software techniques, so the actual added value (cache noise and the resulting spurious aborts included) makes them a worse deal. Add the cost of updating software and the likely strongly polynomial (?) hardware cost of transaction support as core count grows (this is why ARM's Exclusive Monitor performs SO badly on systems with 64+ cores and why they added new atomics just to avoid the monitor), and it just doesn't work anymore.
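For a rough picture of the monitor-vs-atomics distinction, here are the two shapes an atomic increment can take, sketched in C++. The function names are made up for illustration. How each one is lowered depends on the compiler and target flags; on AArch64, a compiler targeting ARMv8.0 typically emits a load-exclusive/store-exclusive retry loop for both, while enabling the ARMv8.1 atomics lets `fetch_add` become a single instruction.

```cpp
#include <atomic>
#include <cassert>

// Retry-loop increment: the same shape as a load-exclusive/store-exclusive
// pair. Read the value, compute the new one, try to publish it; if another
// core wrote in between, the attempt fails and we go around again.
long increment_retry_loop(std::atomic<long>& counter) {
    long seen = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(seen, seen + 1,
                                          std::memory_order_relaxed)) {
        // On failure, compare_exchange_weak refreshed `seen` for the retry.
    }
    return seen;  // value before the increment
}

// Single-operation increment: the whole read-modify-write is one request,
// so there is no window in which another core can force a retry.
long increment_single_op(std::atomic<long>& counter) {
    return counter.fetch_add(1, std::memory_order_relaxed);
}
```

Under heavy contention the first form can spend most of its time retrying, which is the scaling problem described above; the second hands the whole update to the memory system in one go.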