Again, somebody who has come to the realization that something is seriously wrong with ultra-complex languages in the SDK (C++ and similar).
In other words, since this alternative LLVM is coded in plain and simple C, it is shielded against those who still do not see that computer languages with ultra-complex syntax are not the right way to go if we want sane software.
You also have QBE, which together with cproc will give you ~70% of the speed of the latest gcc (in my benchmarks on AMD Zen 2 x86_64).
C itself is an ultra-complex language. I do not understand the mindset of the "C is simple" crowd. Is it nostalgia or romanticism for the past? If we want to devise a truly simple language, we need to start by realizing that C is just the JavaScript of its day: hacked together in a weekend by someone who wished they were using a different language, and then accidentally catapulted into the future by platform effects.
The sleight of hand here is that by leaving so many things undefined, unspecified, and implementation-defined, C gets to foist complexity off on the implementations and then act as though it's absolved of blame when things go off the rails. The fact that what felt like half of all traffic on Usenet and IRC in the 90s was comprised of people language-lawyering over what is and is not valid C disqualifies it from being considered a simple language. It's as though someone designed a language where the spec is the single sentence "the program does what the user intends it to do" and then held this up as a pinnacle of simplicity. C has an entire Wikipedia article about how needlessly difficult it is to parse: https://en.m.wikipedia.org/wiki/Lexer_hack . C has an entire website for helping people interpret its type gibberish: https://cdecl.org/ . C's string handling is famously broken. C's formatting machinery is Turing complete! Coercions out the wazoo. Switch fallthrough having exactly the wrong default. The fact that Duff's Device works at all. Delegating all abstraction to an infamously error-prone textual macro language. You could spend an entire career learning this language and still find exciting new ways to blow your leg off. If C is our bar for simplicity, it explains a lot about the state of our profession.
I write C every day and wrote a C compiler. I know all the issues it has very well, but it is still a relatively simple language. It is also not that difficult to parse. The lexer hack is interesting, because you can almost get away with using a context-free lexer, but you need this one hack. Still, this is not really a problem. People failing to read C declarations is also not the same thing as complexity, although I would agree that those are weird. Null-terminated strings have safety issues, but that is also not the same thing as complexity.
I would argue that the real problem of C is something else: it does not provide enough functionality out of the box. So instead of using some safe abstraction, people open-code their string manipulation or buffer management. This is cumbersome and error prone, leading to complexity in the solution and safety issues which could easily be avoided.
I prefer a simple computer language with many real-life alternative compilers; in the long run the code will be hardened where appropriate, because hardening is not cheap and happens over time.
And we must not forget that 100% "safe" high-level code should never be trusted to be compiled into 100% safe machine code.
And you are right, C syntax is already too complex: integer promotion should go away, along with implicit casts; one loop statement is sufficient; it should only have had sized primitive types; etc. We would just need a few new keywords for modern hardware architecture programming (atomics, barriers, endianness).
What aspects do you consider "ultra-complex"? I agree that it has a very strange syntax, with many of its features unknown to most people; but besides that, it's about as easy as Pascal, isn't it?
When referring to Pascal, I mean something like Turbo, VAX, or Apple Pascal, i.e. the versions used at the height of its popularity. Original Pascal has far fewer degrees of freedom. And I have no reason to assume that Turbo or Apple Pascal have fewer possibilities for memory corruption, or are better specified.
It starts by having proper string and array types with bounds checking instead of pointers, followed by memory allocation with types instead of sizeof math, fewer scenarios for implicit conversions, and reference parameters reducing the use cases where an invalid pointer might be used instead.
UB has nothing to do with complexity. In any case, of the roughly 87 instances of UB in the core language, we eliminated 15 at the last meeting and already have concrete proposals for 10 more. C2Y will likely not have any trivial UB, and hopefully also optional safety modes that eliminate the others.
It certainly does, as shown by a recent talk at BlueHat 2024 on Windows kernel refactorings: not everyone is knowledgeable about ISO C minutiae and how optimisers take advantage of them, and many still think they know better than the analysers.
Maybe you can explain this better. People not knowing about footguns is also not the same thing as complexity; it is just having footguns and people not knowing about them.
Just the integer promotion rules are more complex than many languages' entire parser. And undefined behaviour makes the language practically impossible to learn, because there's simply no way to confirm the answer to any question you have about the language (other than "submit a DR and wait a few years for the answer to be published" - even the people who wrote the standard are confidently wrong about what it says, all the time). You can form a hypothesis about how the language works, write a program to test this hypothesis, observe that the result is what you thought it would be - and you've learnt nothing, because your program was almost certainly undefined behaviour under a strict enough reading of the standard.
I wasn't aware that it is that critical. I have been doing C projects of all sizes and on different platforms and with different toolchains for forty years, including many where the same code runs on different platforms and is built with different toolchains, and I have never come across an undefined behavior for which there was no practical work-around in reasonable time. I have also never seen a language specification that answers all questions, not even Pascal or Ada. I agree that implicit conversions are an unfortunate feature of C, but I think the same about all languages where you can easily assign floating point to integer variables (or vice versa), for example. Cross-toolchain and cross-platform experiments are a constant activity with all the programming languages I use.
> I have never come across an undefined behavior for which there was no practical work-around in reasonable time.
How would you know? You don't generally find out until a newer compiler release breaks your code.
> I have also never seen a language specification that answers all questions, not even Pascal or Ada.
Maybe, but I haven't seen "upgrade your compiler, get a new security bug" be defended so aggressively in other languages. It is probably more cultural than legalistic - obviously "the implementation is the spec" has its problems, but most languages commit to not breaking behaviour that most code relies on, even if that behaviour isn't actually written in the spec, which means that in practice the language (the social artifact) is possible to learn in a way that C isn't.
> I agree that implicit conversions are an unfortunate feature of C, but I think the same about all languages where you can easily assign floating point to integer variables (or vice versa), for example.
So don't use those languages either then?
> Cross-toolchain and cross-platform experiments are a constant activity with all the programming languages I use.
Sounds pretty unpleasant, I practically never need to do that.
It is great to see some new C tooling emerge. I will likely make my own C FE public some time, but it currently uses a toy backend which needs to be replaced...
What's kind of amazing is those people who say "I wrote a small C compiler" while advocating for ultra-complex-syntax computer languages: they know they could _NOT_ have said "I wrote a small compiler for an ultra-complex-syntax computer language"...
What are the specific things you do not like about C23 or the C2y road map (whatever that is - it is more of a random walk)? I have my own list of course, but overall I still have some hope that C2y does not turn out to be a total disaster.
I've spent too little time with the recent standards/drafts to give a specific answer. But I have a general attitude: C89 with extensions, or C99, were just about perfect for almost any purpose. Newer standards may well correct minor inadequacies or integrate things that used to be implemented by proven libraries directly into the language, but the price for these relatively minor improvements is high: people who write supposedly reusable code against the newer standards effectively force all older projects to switch to the newer standard, and the costs of this are rarely justifiable. And then there is C11, which made mandatory parts of the C99 standard optional, thus breaking backwards compatibility.