CircleCI: Our First Postmortem (circleci.com)
14 points by dlowe on May 21, 2013 | 7 comments


Happens to the best of them... Thanks for the detailed writeup, guys. Love the service.


Thank you!


Any reason why you chose to pipe the compressed content into a `tar xzf` process inside the container instead of extracting it outside and overlaying the extracted content onto the container via overlayfs or something similar?


Piping it in allows the build driver to be agnostic about the physical location of the container.
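
To illustrate the point, here is a minimal sketch of that kind of piping. It is not CircleCI's actual build driver: `stream_archive` and its `run_prefix` parameter are hypothetical names, and running the command over something like `ssh` is only an assumption about how a remote container might be reached.

```python
import subprocess


def stream_archive(archive_path, dest_dir, run_prefix=()):
    """Pipe a gzipped tarball into a `tar xzf -` process.

    run_prefix is a hypothetical hook: () runs tar locally, while a prefix
    such as ("ssh", "build-host") would start the same command elsewhere.
    """
    cmd = [*run_prefix, "tar", "xzf", "-", "-C", dest_dir]
    with open(archive_path, "rb") as archive:
        # The archive bytes travel over stdin, so the caller never needs
        # direct filesystem access to wherever dest_dir physically lives.
        subprocess.run(cmd, stdin=archive, check=True)
```

The caller only needs some way to start a process in the target and write to its stdin; the extraction code stays the same whether the destination is local or on another machine.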


Troubleshooting can be a bitch.

Could you add a tl;dr though?


I'm not sure if I can do any better than "troubleshooting can be hard", frankly. The actual details are all tangled together in a way that resists summary.


Just tell customers that there were queue backlogs caused by slow git clones, made worse by server failures due to kernel panics and LVM snapshot problems. These were resolved, but MTU configuration changes made during troubleshooting caused further outages. Later on, an unrelated bug in the scheduler caused another outage.

However, all these issues are now resolved and your service is far more robust because of it.



