> When debugging a vexing problem one has little to lose by using an LLM — but perhaps also little to gain.
This probably doesn't give them enough credit. If you can feed an LLM a list of crash dumps, it can do a remarkable job producing both analyses and fixes. And I don't mean just for super obvious crashes. I was most impressed with a deadlock where numerous engineers had tried and failed to understand exactly how to fix it.
After the latest production issue, I have a feeling that opus-4.5 and gpt-5.1-codex-max are perhaps better than me at debugging. Indeed, my role was relegated to combing through the logs, finding the abnormal or suspicious ones, and feeding those to the models.