
lpribis · 2026-04-08T12:16:51 1775650611

I was curious what information I could glean from these for some popular repos. Caveat: I'm primarily a low-level embedded developer, so I don't interface with large open source projects at the source level very often (other than occasionally the Linux kernel). I chose some projects at random that I use.

*Mainline linux*

Most changed files: pretty much what I expected for 1 and 2... the "cutting edge" of Linux development over other OSes -- bpf and containers. The bpf verifier and AMD GPU driver might get a boost in this list due to sheer LoCs in those files (26K and 14K respectively). An intel equivalent of amdgpu_dm is #21 in the list (drivers/gpu/drm/i915/display/intel_display.c) and nvidia is nowhere to be seen (presumably due to out-of-tree modules/blobs?).

    186 kernel/bpf/verifier.c
    174 fs/namespace.c
    162 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
    161 kernel/sched/ext.c
    159 fs/f2fs/f2fs.h
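
For reference, a churn list like the one above can be produced by counting how often each path shows up in `git log --name-only` output. A minimal Python sketch, assuming `git` is on the PATH; the names `count_changes` and `most_changed` and the one-year window are illustrative and not necessarily what the post's command does:

```python
import subprocess
from collections import Counter


def count_changes(log_text):
    """Count how many times each path appears in `git log --name-only` output."""
    paths = (line.strip() for line in log_text.splitlines())
    return Counter(p for p in paths if p)


def most_changed(repo=".", since="1 year ago", top=5):
    """Run git in `repo` and return the `top` most frequently changed files."""
    # An empty --format= suppresses commit headers, leaving only file paths.
    out = subprocess.run(
        ["git", "-C", repo, "log", f"--since={since}", "--name-only", "--format="],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_changes(out).most_common(top)
```

Calling `most_changed("path/to/linux")` (path hypothetical) returns `(path, count)` pairs; big files will naturally rank higher, which is the LoC caveat mentioned above.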

Bus factor: obviously none. The top 4

    10399 Christoph Hellwig -> I only know his name because of drama last year regarding rust bindings to DMA subsystem
     8481 Mauro Carvalho Chehab -> I also know his name from the classic "Mauro, shut the fuck up!" Linus rant
     8413 Takashi Iwai -> Listed as maintainer for sound subsystem, I think he manages ALSA
     8072 Al Viro -> His name is all over a bunch of filesystem code

Buggy files: Intel comes out on top of GPU drivers this time (twice). Along with KVM for x86(64), the main allocator, and BTRFS.

    1477 drivers/gpu/drm/i915/intel_display.c
    1406 MAINTAINERS
    1390 sound/pci/hda/patch_realtek.c
    1102 drivers/gpu/drm/i915/i915_drv.h
     943 arch/x86/kvm/x86.c
     928 mm/page_alloc.c
     871 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
     862 drivers/gpu/drm/i915/i915_reg.h
     840 fs/btrfs/inode.c

*GCC*

Most changed files: IR autovectorization code, riscv heuristics tables, and C++ template handling (pt.c is "parameterized types").

    152 gcc/tree-vect-stmts.cc
    145 gcc/config/riscv/riscv.cc
    131 gcc/tree-vect-loop.cc
    116 gcc/cp/pt.cc

Buggy files: DWARF debuginfo generation, x86 heuristics tables, RS6000(?!) heuristic tables. I had to look up RS6000, it's an IBM instruction set from the 90s lol. cp-tree.h is an interesting file, it seems to be the main C(++) AST data structures.

   1017 gcc/dwarf2out.c
    885 gcc/config/i386/i386.c
    796 gcc/cp/cp-tree.h
    740 gcc/config/rs6000/rs6000.c
    720 gcc/cp/pt.c

*xfwm4*

Most changed files: the list is dominated by *.po localizations. I filtered these out. Even after this, I discovered there is very little active development in the last few years. If I extend to 4 years ago, I get:

    1. src/client.c - Realizing this project is too "small" to glean much from this. client.c is just the core X client management code. Makes sense.
    2. src/placement.c - Other core window management code.

This has not told me much other than where most of the functionality of this project lies.

Bus factor: Pretty huge. Not really an issue in this case due to lack of development I guess.

    3298  Olivier Fourdan
     530  Anonymous
     319  Xfce Bot
     121  Jasper Huijsmans
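
To put a rough number on "pretty huge", the listing above can be parsed and the top author's share computed. A small Python sketch, assuming the listing is in `git shortlog -sn` format; the helper names `parse_shortlog` and `top_share` are made up for illustration, and the share only covers the four contributors shown, not the full history:

```python
def parse_shortlog(text):
    """Parse `git shortlog -sn`-style lines into (commits, author) pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 1)  # split the count from the author name
        if len(parts) == 2 and parts[0].isdigit():
            rows.append((int(parts[0]), parts[1].strip()))
    return rows


def top_share(rows):
    """Fraction of the listed commits that belong to the top author."""
    total = sum(count for count, _ in rows)
    return max(count for count, _ in rows) / total if total else 0.0


listing = """
    3298  Olivier Fourdan
     530  Anonymous
     319  Xfce Bot
     121  Jasper Huijsmans
"""
rows = parse_shortlog(listing)
print(f"{rows[0][1]}: {top_share(rows):.0%} of the listed commits")
```

On these four rows the top author accounts for roughly 77% of the listed commits.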

Files with bug commits: Very similar distribution to most changed files. Not enough datapoints in this one to draw any big conclusions.

I think these massive open projects (excl xfwm) generally have pretty consistent code quality across the heavily trodden areas because of the amount of manpower available to refactor the pain points. I've yet to see an example of "god help you if you have to change that file" in e.g. linux, but I have of course seen that situation many times in large proprietary codebases.

grepsedawk · 2026-04-08T13:06:21 1775653581

Big projects tend to self-correct. These commands hit differently on private codebases with 3-10 contributors, where high-churn usually means one person patching the same thing repeatedly.

croemer · 2026-04-08T16:09:03 1775664543

This is better than the post itself. Showing real output from a real repo.

lpribis · 2026-04-03T11:38:08 1775216288

I don't think any realistic system will bootstrap C from shell. What is the shell implemented in if not C?

gaigalas · 2026-04-03T12:35:54 1775219754

This shell.c can be compiled from c89cc.sh itself:

https://gist.github.com/alganet/1513d7b6abef5c1a53a324d897c3...

Ouroboros self-hosting. They can self-host one another.

The idea is to make shell.c compile from an even simpler C compiler, such as M2-Planet:

https://github.com/oriansj/M2-Planet

Let me remind you that current stage0 bootstraps tinyc from mes, which is an interpreted lisp. It's not that different from the shell architecturally.

The current stage0 also features kaem as one of the first dependencies. kaem is, in fact, a simpler version of the bourne shell.

It's always a tower. You'll never get a one single clean dependency pass in bootstrapping.

zhongwei2049 · 2026-04-03T13:18:00 1775222280

The shell.c ouroboros is really cool. Being able to bootstrap trust through an entirely different language family (shell → C → shell) adds genuine value to the trusting-trust problem beyond just technical novelty.

cestith · 2026-04-03T14:24:07 1775226247

Even if your shell was compiled in C, it doesn’t mean it wasn’t cross-compiled from another platform.

lpribis · 2026-03-11T23:39:28 1773272368

He implied replacing nano was the first step, before using it for more complex (software development) tasks. First use it just for quick one-off edits of /etc/blah.conf then graduate to using it for longer editing sessions.

lpribis · 2026-02-26T08:43:47 1772095427

Rather than being about fast/simple/cheap, I think using SSN as a key was more about the fact that SSN is the only common identifier that almost all US citizens have.

Dylan16807 · 2026-02-26T09:30:13 1772098213

I think you're using the word "key" differently than OP. You're talking about identifiers, and they're talking about security.

SSNs were a good potential identifier, until the people that needed security cheaped out and started using SSNs as a bad implementation of security. Now they're bad at both purposes!

breakingcups · 2026-02-26T09:21:14 1772097674

Yes, designing and implementing a new common identifier almost all US citizens have would have been less cheap and fast.

lpribis · 2026-02-06T22:03:58 1770415438

Sure, but splitting "atomic" operations across a reboot is an interesting design choice. Surely upon reboot you would re-try the first `mv a b` before doing other things.

lpribis · 2026-01-21T10:48:09 1768992489

This is local to the device though. Nothing to do with the WAN. Would still work even on the "serverless" ipv6 network.

lpribis · 2026-01-15T11:13:34 1768475614

The Z80 instruction set lives on via the eZ80, Z180 and others, which are binary compatible with the original Z80 instruction set. Unfortunately Zilog stopped making the 40-pin DIP package a couple of years ago, so yeah, this specific board will be hard to source. You can still find them on the gray market, mostly ones that have been desoldered from existing boards.

Even if you made a version of this board with the footprint changed to the QFP eZ80, it probably wouldn't work because the eZ80 has different memory mapping and clocking.

0xTJ · 2026-01-15T12:16:45 1768479405

The Z180 has however had its PLCC packages discontinued. Personally, I find SMD CPUs to not be appealing for these sorts of projects, even if the Z180 is a great chip.

lpribis · 2026-01-09T01:40:08 1767922808

Really, the best we can do with the NPU is a less battery intensive blurred background? R&D money well spent I guess...

lpribis · 2025-12-29T15:36:29 1767022589

They can't realistically do this in Germany because the tracks are so much busier than in the US. There would more than likely be a train coming the other direction within the next few minutes, and they cannot guarantee all the people have time to vacate the track area.

db48x · 2025-12-30T09:21:29 1767086489

Right, but Germany has stations every few miles. Here in America the next station might be hours and hundreds of miles away. Better to stop ¼ mile away instead; people will hardly know the difference. The point is that if for any reason they cannot reach the station then they’ll always stop at the nearest safe place instead. The crew always have an alternate stop.

For really long construction work they’ll actually build an entirely separate train station, like they did in Denver Colorado a few years back. They knew that the construction of the new station downtown would take a few years, so they built a really cheap platform a few miles away on a siding and moved all the arrivals and departures there for the duration.

lpribis · 2025-12-24T08:55:41 1766566541

TCO is not just for parse trees or ASTs, but in imperative languages without TCO this is the only place you are "forced" to use recursion. You can transform any loop in your program to recursion if you prefer, which is what the author does.
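
A small sketch of that loop-to-recursion transformation, written in Python purely for illustration (the article's language and the function names `sum_loop` and `sum_rec` are assumptions). The loop state becomes parameters and the recursive call sits in tail position:

```python
def sum_loop(values):
    """Ordinary imperative loop."""
    total = 0
    for v in values:
        total += v
    return total


def sum_rec(values, i=0, acc=0):
    """Same computation with the loop state (i, acc) passed as arguments.

    The recursive call is the last thing the function does, so a compiler
    with TCO could reuse the current stack frame for it.
    """
    if i == len(values):
        return acc
    return sum_rec(values, i + 1, acc + values[i])
```

CPython does not perform TCO, so `sum_rec` hits the recursion limit on long inputs while `sum_loop` does not; with TCO the two would use the same constant stack space.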