> Surprisingly, there are many more stories and publications about bugs caused by weak isolation levels than cases where stronger isolation levels caused impractically low performance.
I expected the article to substantiate the claim that serializable brings a large performance hit as in my experience it isn't so. The article basically makes the same point.
With serializable, you need to be a little careful not to have hot rows. Avoid them by sharding commonly written values. Another way to improve performance is to use true time for ordering non read-then-write transactions. It's a little finicky if the database doesn't provide such guarantees out of the box. Take Google's Spanner as an example. It offers the serializable isolation level and it's pretty performant (as long as you account for hot spots).
> With serializable, you need to be a little careful not to have hot rows. Avoid them by sharding commonly written values
Unfortunately, serializable isolation requires detecting or preventing read-write conflicts (i.e. one transaction writing a row that a concurrent transaction has read). This is the performance impact of serializability: you need to be very careful what you read, because if you read too many rows you prevent any concurrent transactions from updating those same rows. Read-only transactions are OK (because MVCC), and read-only tables are OK (because there's no read-write conflict if a table is mostly read only), but tables that are both written and heavily read are where you get performance problems.
With snapshot isolation (e.g. Oracle's serializable, Postgres repeatable read), only write-write conflicts matter. There it doesn't matter what a transaction reads, and reads never need to block (or abort) writers. So what you say is true for snapshot, but not for serializable.
Interestingly, serializable's lack of need to detect write-write conflicts means that (in some implementations) it can be faster than snapshot for blind writes (i.e. anything that's not a read-modify-write under the covers).
I agree with what you're saying. I think we're saying the same thing. I see that in the text you're quoting I only proposed a way to handle hotly written rows, and didn't address read rows being changed. This is also a problem.
My broader point is that with serializable you need to be aware of these bottlenecks in the database and you need to create a data model and access patterns such that permance hits are avoided as much as possible.
It’s somewhat rare for a transaction to read many rows and then write. Typically a transaction reads a single row, applies business logic, writes a single row. Then you have analytic workloads that do no writes and read entire tables.
The vtuber rabbit hole is very deep and interesting but this comment mentioning Saba seems to be very random, not related to the discussion at all. Left a bitter taste even on someone who knows a bit about vtubing. Not the best way to show off the community on hackernews
I expected the article to substantiate the claim that serializable brings a large performance hit as in my experience it isn't so. The article basically makes the same point.
With serializable, you need to be a little careful not to have hot rows. Avoid them by sharding commonly written values. Another way to improve performance is to use true time for ordering non read-then-write transactions. It's a little finicky if the database doesn't provide such guarantees out of the box. Take Google's Spanner as an example. It offers the serializable isolation level and it's pretty performant (as long as you account for hot spots).
reply