Hacker News

If you're going to rewrite LLVM, you should avoid just trying to 'do it again but less bloated', because you'll end up where LLVM is now once you've added enough features and optimisations to be competitive.

Rewriting LLVM gives you the opportunity to rethink some of its main problems. Two big ones, I think, are TableGen and peephole optimisations.

The backend code for LLVM is awful, and TableGen only partially addresses the problem. Most LLVM code for defining instruction opcodes amounts to multiple huge switch statements that stuff every opcode into them; it's disgusting. This code is begging for a more elegant solution; I think a functional approach would solve a lot of the problems.

The peephole optimisation in the InstCombine pass is a huge collection of handwritten rules that has accumulated over time. You probably don't want to try to redo this yourself, but it will also be a big barrier to achieving competitive optimisation. You could try to solve the problem by using a superoptimisation approach from the beginning. Look into the Souper paper, which automatically generates peepholes for LLVM: (https://github.com/google/souper, https://arxiv.org/pdf/1711.04422.pdf).

Lastly as I hate C++ I have to throw in an obligatory suggestion to rewrite using Rust :p



> The backend code for LLVM is awful, and TableGen only partially addresses the problem. Most LLVM code for defining instruction opcodes amounts to multiple huge switch statements that stuff every opcode into them; it's disgusting. This code is begging for a more elegant solution; I think a functional approach would solve a lot of the problems.

So one of the main problems you run into is that your elegant solution only works about 60-80% of the time. The rest of the time, you fall back on near-unmaintainable, horribly inelegant kludges that have to exist because, gee, real architectures are full of inelegant kludges in the first place.

Recently, I've been working on a decompiler, and I started out going for a nice, elegant solution that tries as hard as possible to avoid the nasty pile of switch statements. And this is easy mode--I'm not supporting any ugly ISA extensions, I'm only targeting ancient, simple hardware! And still I ran into the limitations of the elegant solution and had to introduce ugly kludges to make it work.

The saving grace is that I plan to rip out all of this manual work with a fully automatically-generated solution. Except that's only feasible in a decompiler, since the design of that solution starts by completely ignoring compatibility with assembly (ISAs turn out to be simpler if you think of them as "what do these bytes do" rather than "what does this instruction do")... and I'm worried that it's going to end up with inelegant kludges because the problem space more or less mandates it.

> You could try to solve the problem by using a superoptimisation approach from the beginning. Look into the Souper paper, which automatically generates peepholes for LLVM:

One of the problems that Souper ran into is that LLVM IR is too abstract for superoptimization to be viable. Rather than the promise of an automatic peephole optimizer, it's instead morphed more into "here's some suggestions for possible peepholes". You need a really accurate cost model for superoptimization to work well, and since LLVM IR gets shoved through instruction selection and instruction scheduling, the link between LLVM instructions and actual instructions is just too tenuous to build the kind of cost model a superoptimizer needs (even if LLVM does have a very good cost model for the actual machine instructions!).


> So one of the main problems you run into is that your elegant solution only works about 60-80% of the time. The rest of the time, you fall back on near-unmaintainable, horribly inelegant kludges that have to exist

This is generally true, though small compiler backends have the luxury of simply refusing to support such use cases. Take QBE and Cranelift, for example: the former lacks x87 support [1], and the latter doesn't support varargs [2], which means neither of them supports the full x86-64 ABI for C99.

[1] https://github.com/michaelforney/cproc?tab=readme-ov-file#wh...

[2] https://github.com/bytecodealliance/wasmtime/issues/1030


I think you are generally correct, but the two examples you gave "triggered" me ;-)

What damage would there be if gcc or LLVM decided to stop supporting x87? It's not much different from dropping an ISA like IA-64. You can still use the older compilers if you need to.

Similarly, what is varargs used for? Pretty much only for C and its unfortunate printf and scanf stdlib calls. If a backend decides not to support C, all this headache goes away. The problem is, of course, that the first thing every new backend designer does is write a C frontend.


> What damage would there be if gcc or LLVM decided to stop supporting x87?

For starters, you'd break every program using long double on x86.

And as far as "complexities of the x86 ISA" goes, x87 isn't really that high on the list. I mean, MMX is definitely more complex (and LLVM recently ripped out support for that). But even more complex than either of those would be anything touching AVX, AVX-512, or now AVX-10 stuff, and all the fun you get trying to build your systems to handle encoding the VEX or EVEX prefixes.


"Everything should be as simple as it can be but not simpler!" —Roger Sessions, loosely after Albert Einstein



