
What's the point of writing concurrent code if it's not faster?


In contrast to jdlshore's point, concurrency can make programs much easier to reason about, when done well. This is a benefit of both Go and Erlang, though they use different approaches.

Concurrency can help you separate out logic that is often commingled in non-concurrent code but doesn't need to be. As a real-world example, I used to work on safety-critical systems for aircraft. The linear, non-concurrent version included a main loop that basically executed a couple dozen functions. Each function might have dependencies on the others, and since their order in the loop was fixed, information was passed between them through shared memory across multiple passes of the main loop.
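
Roughly, the shape was something like this (a minimal Go sketch with made-up task names; the real system wasn't Go and the functions were far more involved):

    package main

    // Shared state that every task reads and writes. A value produced by a
    // later task is only visible to an earlier task on the *next* pass.
    type shared struct {
        sensorValid bool
        altitude    float64
        command     float64
    }

    func readSensors(s *shared) { s.altitude, s.sensorValid = 0, true } // stub for hardware input

    func computeCommand(s *shared) {
        if s.sensorValid {
            s.command = s.altitude * 0.1 // placeholder control law
        }
    }

    func driveActuators(s *shared) { _ = s.command } // stub for hardware output

    func main() {
        var s shared
        for {
            // Fixed order: any dependency pointing "backwards" is satisfied
            // only by data left over from the previous pass.
            readSensors(&s)
            computeCommand(&s)
            driveActuators(&s)
        }
    }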

A similar project had about a dozen processes, each running concurrently. There was no speed improvement, but the communication between activities was handled via channels (equivalent in theory to Go's channels; less like Erlang's mailboxes, since the channels could be shared). We knew it was correct because each process was a simple state machine, cleanly separated from all the other state machines.

The second system's code was much simpler: unlike the non-concurrent version, there was no juggling (in our code) of the system's state. If a channel had data to be acted on, the process continued; otherwise it waited. Very simple. And it turns out that many systems can be modeled in a similar fashion (IME). Of course, we had a very straightforward communication mechanism (again, essentially the same as Go channels, except it was a library written, as I recall, in Ada by whoever made the host OS).
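
The concurrent shape, sketched in Go with hypothetical names (ours was an Ada channel library, but the idea is the same):

    package main

    func readAltitude() float64 { return 0 } // stub for hardware input
    func drive(cmd float64)     {}           // stub for hardware output

    // Each task is its own goroutine: a trivial state machine that blocks
    // until its input channel has data, acts, and passes the result along.
    func sensor(out chan<- float64) {
        for {
            out <- readAltitude()
        }
    }

    func controller(in <-chan float64, out chan<- float64) {
        for alt := range in {
            out <- alt * 0.1 // placeholder control law
        }
    }

    func actuator(in <-chan float64) {
        for cmd := range in {
            drive(cmd)
        }
    }

    func main() {
        alt := make(chan float64)
        cmd := make(chan float64)
        go sensor(alt)
        go controller(alt, cmd)
        actuator(cmd) // last stage runs on the main goroutine
    }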


Signals are not dependent on concurrency. And you don't need multiple processes to implement a state machine.

I mean, think about it. What's the difference between sending message A and then message B versus putting messages A and B into a queue and letting some async process pop from it? Less complexity and guaranteed message delivery come for free in single-threaded code.
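
Concretely, something like this (Go for illustration; the handler names are made up):

    package main

    func handleA()          {} // stubs for whatever the messages trigger
    func handleB()          {}
    func dispatch(m string) {}

    // Single-threaded: ordering and "delivery" are free, a send is just a call.
    func direct() {
        handleA() // runs to completion...
        handleB() // ...before this starts; nothing can be lost in between
    }

    // Queued: now there's a channel, a consumer goroutine, and a new question:
    // what happens to queued messages if nobody drains them before exit?
    func queued() {
        msgs := make(chan string, 2)
        go func() {
            for m := range msgs {
                dispatch(m)
            }
        }()
        msgs <- "A"
        msgs <- "B"
    }

    func main() {
        direct()
        queued() // may return before the consumer has processed anything
    }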

Am I wrong? What am I missing?


I don't think you're wrong, but in Jtsummers' specific case, I think multi-processing probably would be simpler. You don't have to implement the event loop, there's no risk of tromping on other processes' data, and if a process gets into an invalid state, you can just die without impacting others.

You'd need a good watchdog and error handling, but presumably some of that came for "free" in their environment.

Although if you take out the "free" OS support, watchdog, etc., I agree that there's likely a place between "shared memory spaghetti" and "multi-processing" that's simpler than both.


Exactly this. I had started my own reply and refreshed and saw yours, thanks.

The other benefit of the concurrent design (versus the single-threaded version) was that it was actually much simpler. This was critical for our field because that system is still flying, now 12 years later, and will probably be flying for another 30-50 years. The single-threaded system was unnecessarily complex. Much of the complexity came from having to include code to handle all the state juggling between the separate tasks, since each had some dependency on each other (not a fully connected graph, but not entirely disconnected either). The concurrent design made it trivial to write something very close to the most naive version possible, where waiting was something that only happened when external input was needed. So the coordination between each task just fell out naturally.

You still have to worry about locking the system up, but in our case, because each process was sufficiently reduced to its essentials, this was easy to evaluate and reason about.
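
For anyone unfamiliar, the lock-up hazard is the classic circular wait; a contrived Go illustration:

    package main

    // Two tasks, each blocked waiting on the other's output. Go's runtime
    // detects this and panics: "all goroutines are asleep - deadlock!"
    func main() {
        a, b := make(chan int), make(chan int)
        go func() { a <- <-b }() // needs a value from b before it can feed a
        b <- <-a                 // needs a value from a before it can feed b
    }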


"some async process" is a concurrency mechanism, is it not?


It is. The single-threaded example comes before the "versus". The async example comes after. I should have been more clear.


Ah, I indeed misread that. Then my answer is: single-threaded code sometimes has to implement things an async environment would handle for you.

E.g., when handling many inputs and outputs, I can write my own loop around epoll etc. and write the logic to keep track of per-target queues of data to send. Or I can use a runtime that provides that for me and lets me mostly pretend things are running on their own.
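
For instance, here's the "runtime provides it" side in Go, assuming a toy TCP echo service: each connection gets a goroutine that reads and writes as if it were alone, and the runtime does the epoll/kqueue multiplexing and scheduling:

    package main

    import (
        "io"
        "net"
    )

    func main() {
        ln, err := net.Listen("tcp", ":8080")
        if err != nil {
            panic(err)
        }
        for {
            conn, err := ln.Accept()
            if err != nil {
                continue
            }
            go func(c net.Conn) {
                defer c.Close()
                // Looks like blocking I/O, but the runtime parks the
                // goroutine and multiplexes the socket under the hood.
                io.Copy(c, c)
            }(conn)
        }
    }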


Concurrency is notoriously difficult to reason about. Concurrency bugs are also a f__king nightmare to debug.

Given how slow I/O operations are, and how much modern code depends on the network, we typically need some concurrency in our code. So for me, almost always, the question isn't, "which concurrency choice is fastest?" but rather, "which concurrency choice is fast enough while leading to code with the least bugs?"


If you are I/O bound, concurrency has a use case; I don't argue against it. I'm pointing out that it's pointless to write concurrent code if you don't expect a performance benefit from it.

It's like multi-threading 2+2.




