One thing I haven't seen being discussed is the BEAM internals becoming a little long in the tooth. We still have static reductions before the scheduler switches to another task, the priority system in scheduling is a bit dodgy, flipping vmargs is kinda complex, lock counting and crash dump tooling kinda suck, etc.
BEAM is great, although it's definitely missing something like pprof for Go or Java Flight Recorder.
Yeah, I don't know why this falsehood continues to persist. WhatsApp and Ericsson engineers continue to work together to evolve Erlang, alongside a bunch of other people across the industry.
Huh, kinda funny, I feel the exact same way. Few people IRL know about my website, and maybe 2-3 people a year write in about something, but weirdly it feels like the idea of a personal, non-commoditized internet space has become so rare it's seen as odd.
I would be curious why kube cron jobs didn't seem to fit the bill. My favorite part of these posts is when they have a section hinting that they explored other options and picked specific tradeoffs.
Yeah, the article raises more questions than it answers.
> When designing this new, more reliable service, we decided to leverage many existing services to decrease the amount we had to build
This might explain building from scratch. Maybe the existing solutions had dependencies they didn't want to maintain and they opted for using the existing internal systems. It feels like that influenced all the rest.
I definitely don't come anywhere near Slack's scale but I've managed systems where over 3,000 cron jobs ran per day, half of which came from a cron job running every minute which usually finished in a few seconds. Some of these jobs run for X minutes too.
It's nice because there are properties you can configure for each cron job around retries and whether it should be uniquely run or not. Maybe certain cron jobs should be retried if they fail; for others maybe it's ok for the work to be picked up on the next interval.
Overall it's been super stable for the almost 2 years I've been using them. Only a handful of jobs failed over this period of time and they weren't caused by Kubernetes: the HTTP endpoint being hit from the cron job failed to respond and the cron job failure threshold was reached.
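For anyone who hasn't used them, here is a rough sketch of what those knobs look like. It builds a `batch/v1` CronJob manifest as a Python dict and dumps it as JSON, which `kubectl apply -f` accepts. `concurrencyPolicy`, `backoffLimit` and `restartPolicy` are real CronJob/Job fields; the `cron_job` helper, the job name and the `busybox` image are made up for illustration and are not the setup described above.

```python
import json


def cron_job(name, schedule, command, *, retries, allow_overlap):
    """Build a batch/v1 CronJob manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,
            # "Forbid" skips a new run while the previous one is still active
            # (the "uniquely run" case); "Allow" lets runs overlap.
            "concurrencyPolicy": "Allow" if allow_overlap else "Forbid",
            "jobTemplate": {
                "spec": {
                    # How many times a failed pod is retried before the Job
                    # is marked failed. 0 means "wait for the next interval".
                    "backoffLimit": retries,
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                # Placeholder image, not from the thread.
                                {"name": name, "image": "busybox", "command": command}
                            ],
                        }
                    },
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = cron_job(
        "every-minute-ping", "* * * * *", ["true"], retries=0, allow_overlap=False
    )
    print(json.dumps(manifest, indent=2))
```

Setting `retries=0` with `allow_overlap=False` gives the "just pick it up next interval" behavior; bumping `retries` gives the "retry if it fails" behavior.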
It's a good reminder that important jobs run on a schedule should be resilient to failure (saving progress, idempotent, etc.).
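A minimal sketch of what "saving progress" plus "idempotent" can look like, assuming the job works through a list of string item IDs and has somewhere durable to keep a checkpoint file (in a pod that would need to be a persistent volume or an external store). `run_job`, the checkpoint format and the `process` callback are all hypothetical names for illustration.

```python
import json
import os


def run_job(items, checkpoint_path, process):
    """Process each item once, recording progress so a rerun resumes.

    Items already listed in the checkpoint file are skipped, which makes
    rerunning the whole job after a failure safe.
    """
    done = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = set(json.load(f))

    for item in items:
        if item in done:
            continue  # finished in an earlier run
        process(item)  # may raise; progress so far is already saved
        done.add(item)
        # Write to a temp file and rename so a crash mid-write cannot
        # leave a corrupt checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(sorted(done), f)
        os.replace(tmp_path, checkpoint_path)
    return done
```

If `process` blows up on one item, the next scheduled run skips everything already recorded and starts again at the item that failed. Note that `process` itself still has to tolerate being called twice for the same item, since a crash can land between the work and the checkpoint write.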
Do you capture all of your job code in a single image and reference execution paths on container startup per job? Or, are you building an image per job?
The jobs all run curl commands against a specific API endpoint with a specific bearer token. Those tokens are loaded as environment variables through SealedSecrets.
They all use the public curl image where I override the command in the Kubernetes cron job definition. The job container itself starts almost instantly since there's no app to boot.
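Roughly, the container part of such a job could look like the sketch below, written as a Python dict that slots into a CronJob's `containers` list. Assumptions on my side: that "the public curl image" means `curlimages/curl` from Docker Hub, that the token arrives in an env var I've called `API_TOKEN`, and that a plain GET is enough. The URL, Secret name and key are placeholders. A SealedSecret is decrypted by its controller into a regular Secret, so the container references it with an ordinary `secretKeyRef`.

```python
def curl_container(url, secret_name, secret_key):
    """Container spec for a cron job that just calls an HTTP endpoint."""
    return {
        "name": "curl",
        "image": "curlimages/curl",  # assumed image, see note above
        "env": [
            {
                "name": "API_TOKEN",  # hypothetical variable name
                "valueFrom": {
                    "secretKeyRef": {"name": secret_name, "key": secret_key}
                },
            }
        ],
        # Overrides the image's default command. --fail makes curl exit
        # non-zero on HTTP errors, so the Job is recorded as failed.
        # The URL is passed as $1 instead of being pasted into the script.
        "command": [
            "sh",
            "-c",
            'curl --fail --silent --show-error '
            '-H "Authorization: Bearer $API_TOKEN" "$1"',
            "sh",
            url,
        ],
    }
```

Since there's no app in the image, the only startup cost is pulling a small image and scheduling the pod.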
If I had the case you're describing, I would use the main app's image and run a specific command. In this case I'm assuming that if there's no API endpoint, it would be some callable script that lives in your app's code / image.
Spinning up a new Kubernetes pod for every single job run is a very expensive and wasteful operation, starting at least in the order of seconds (usually more) vs just milliseconds for a new process in an already hot environment.
Sure, but if you need that thing to run every hour for a few seconds, then seconds aren’t really the limiting factor. I don’t doubt that the resource management side of k8s would make it dicey at a certain volume of these things running, though, especially if they eat a lot of compute.
When I was doing my master's in anthropology at the same time as my SWE job, I ran into this issue of having too many notes and not enough organization. Random other tools never quite did it, generic search didn't work, Notion et al. kind of sucked, so I ended up rolling my own thing on top of Org-roam. It's all plaintext org files, easily accessible.
When I do scholarly work, it's great because all of my stuff is in one place. It's also useful for tracking the breadcrumbs I leave for my real tech job.
Why is this news? Iraq does this, Jordan does this, it's common throughout the region to block internet. It's not effective, and usually there are carveouts (office internet still works, usually it's only cellular and public internet, etc). When I was living in Iraq this was just something you dealt with, like a rolling blackout once in a while.