
Is there an explanation for why C is slower than C++?




The code looks 100% identical except for the namespace prefixes. It must be something particular to the GitHub setup, because on mine (gcc 15.2.1 / clang 20.1.8 / Ryzen 5600X) the run times are indistinguishably close. Interestingly, with default flags plus -O3, clang is 30% slower; with the flags from the script (-s -static -flto $MARCH_FLAG -mtune=native -fomit-frame-pointer -fno-signed-zeros -fno-trapping-math -fassociative-math), clang is a bit faster.

A nitpick: benchmarking C/C++ with $MARCH_FLAG -mtune=native and math magic is kind of unfair to Zig/Julia (Nim seems to support those). Unless you are running Gentoo, those flags are unlikely to be used for real applications.


The actual assembly generated for the hot loop is identical in both C and C++ on Clang, as you'd expect. It's also identical at the IR level.

It's probably down to the measurement noise of benchmarking on GitHub Actions.

I suspect this is it. Any benchmark that takes less than a second to run should have its iteration count increased until it takes at least a second, and preferably 5+ seconds, to run. Otherwise CPU scheduling, network processing, etc. perturb everything.
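A minimal sketch of that idea, in Python for brevity rather than in the benchmark's own languages: keep doubling the iteration count until one timed batch lasts at least the target duration. The names `calibrate_iterations`, `workload`, and `timer` are hypothetical and not taken from the benchmark repository; the 1-second default follows the rule of thumb above.

```python
import time


def calibrate_iterations(workload, min_seconds=1.0, start=1, timer=time.perf_counter):
    """Double the iteration count until one timed batch lasts >= min_seconds.

    `workload` is a hypothetical zero-argument callable standing in for the
    benchmark body. `timer` is injectable so the logic can be checked with a
    fake clock. Returns (iterations, elapsed) for the first long-enough batch.
    """
    iterations = start
    while True:
        t0 = timer()
        for _ in range(iterations):
            workload()
        elapsed = timer() - t0
        if elapsed >= min_seconds:
            return iterations, elapsed
        iterations *= 2  # too short to trust: double the batch and retry


if __name__ == "__main__":
    # Toy workload, an assumption for illustration only.
    n, secs = calibrate_iterations(lambda: sum(range(1000)))
    print(f"{n} iterations took {secs:.3f} s")
```

Doubling keeps the calibration cheap: the total time spent in the too-short batches is less than the time of the final, accepted batch.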

What if instead we measured with …

BenchExec "uses the cgroups feature of the Linux kernel to correctly handle groups of processes and uses Linux user namespaces to create a container that restricts interference of [each program] with the benchmarking host."

https://github.com/sosy-lab/benchexec


Certainly better, but you’re always going to be better off maximizing the runtime to a level where it just swamps any of the other effects. Then do multiple runs and take an average.
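Roughly, "multiple runs and take an average" could look like the sketch below (Python, with hypothetical names `repeat_runs` and `summarize`). Reporting the standard deviation and the minimum alongside the mean is my own addition, not something the comment asks for; it just makes the noise visible.

```python
import statistics
import time


def repeat_runs(workload, runs=5, timer=time.perf_counter):
    """Time `workload` (a hypothetical zero-argument callable) `runs` times."""
    samples = []
    for _ in range(runs):
        t0 = timer()
        workload()
        samples.append(timer() - t0)
    return samples


def summarize(samples):
    """Return the mean, sample standard deviation, and minimum of the timings."""
    return {
        "mean": statistics.mean(samples),
        # stdev needs at least two samples; report 0.0 for a single run
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "min": min(samples),
    }


if __name__ == "__main__":
    # Toy workload, an assumption for illustration only.
    timings = repeat_runs(lambda: sum(range(1_000_000)), runs=5)
    print(summarize(timings))
```

If the standard deviation is a large fraction of the mean, a small gap between two languages is probably not meaningful.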

Probably LLVM runs different sets of optimization passes for C and C++. You need to look at the IR or the assembly to know exactly what happens.

It doesn’t as far as I know.

(I have spent a good amount of time hacking the llvm pass pipeline for my personal project so if there was a significant difference I probably would have seen it by now)


You are correct, that was an uneducated guess on my part.

I just glanced at the IR, which differed in some attributes (nounwind vs mustprogress norecurse), but the resulting assembly is 100% identical for every optimization level.



