"Amazon RDS for PostgreSQL is an Amazon Web Services (AWS) service which provides managed instances of the PostgreSQL database. We show that Amazon RDS for PostgreSQL multi-AZ clusters violate Snapshot Isolation, the strongest consistency model supported across all endpoints. Healthy clusters occasionally allow..."
Direct, to the point, unembellished, and analogous to how other STEM disciplines share findings. There was a time I liked reading cleverly written blog posts that use memes to explain things, but now I long for the plain and simple.
Multi-AZ instances are a long-standing RDS feature where the primary DB is synchronously replicated to a secondary DB in another AZ. If the primary fails, RDS fails over to the secondary.
Multi-AZ clusters have two secondaries, and transactions are synchronously replicated to at least one of them. This makes them more robust than Multi-AZ instances when a secondary fails or is degraded, and it also allows read-only access to the secondaries.
Multi-AZ clusters no doubt have more "magic" under the hood, as it's not a vanilla Postgres feature as far as I'm aware. I imagine that is why it's failing the Jepsen test.
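For reference, the closest building block in vanilla Postgres is quorum-based synchronous replication. A minimal sketch of "each commit must reach at least one of two standbys", assuming made-up host and standby names (AWS's actual implementation isn't public):

    # Sketch only: vanilla-Postgres quorum replication resembling
    # "each commit must reach at least one of two standbys".
    # The DSN and standby names are made up.
    import psycopg2

    conn = psycopg2.connect("host=primary.example.internal dbname=postgres user=postgres")
    conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
    with conn.cursor() as cur:
        # Wait for at least one of the two named standbys to confirm each commit.
        cur.execute(
            "ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (standby_a, standby_b)'"
        )
        cur.execute("SELECT pg_reload_conf()")  # apply without a restart
    conn.close()

Whatever RDS actually does on top of this (or instead of it) is presumably where the extra behavior comes from.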
Software developers nowadays barely know about transactions, and definitely not about the different transaction models (in my experience). I have even encountered "senior developers" (who are really so-called "CRUD developers") who are clueless about database transactions. In reality, transactions and transaction models matter a lot for performance and error-free code, at least once you have real traffic volumes and your software solves something non-trivial.
For example: after a lot of analysis, I switched a large project from SQL Server's default Read Committed to Read Committed Snapshot Isolation, and the users could not be happier: a lot of locking contention disappeared. No software engineer on that project had any clue about transaction models or locks before I taught them some basics, even though they had used transactions extensively in that project.
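For anyone curious, the switch itself is a one-liner of T-SQL; the analysis was all about making sure nothing relied on readers blocking writers. A rough sketch, with a made-up connection string and database name:

    # Sketch only: enabling Read Committed Snapshot Isolation on a
    # SQL Server database. Connection string and database name are made up.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=db.example.internal;DATABASE=master;"
        "UID=admin;PWD=change-me;TrustServerCertificate=yes",
        autocommit=True,  # ALTER DATABASE cannot run inside a user transaction
    )
    # After this, plain Read Committed readers see the last committed row
    # version instead of blocking on writers' locks.
    conn.cursor().execute(
        "ALTER DATABASE AppDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE"
    )
    conn.close()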
Not what an RDBMS stakeholder wants to wake up to on the best of days. I'd imagine there were a couple of emails expressing concern internally.
hats off to aphyr as usual.
Given the way PostgreSQL does snapshotting, I don't believe this implies such a read could return a nonsense value, e.g. one where only a portion of the bytes in a multi-byte column type had been updated.
It seems like a race condition that becomes eventually consistent. Or did anyone read this as saying the later transaction(s) of a "long fork" might never complete under normal circumstances?
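For what it's worth, my mental model of a long fork, seen from the client side, is roughly the following (endpoints, table, and credentials are made up, and the two writes would be concurrent in the real test):

    # Sketch only: what a "long fork" looks like to clients.
    # Two independent writes; two readers that each see one write
    # but not the other, i.e. their snapshots disagree on the order.
    import psycopg2

    def observed_ids(dsn):
        conn = psycopg2.connect(dsn)
        try:
            with conn, conn.cursor() as cur:  # one read-only transaction
                cur.execute("SELECT id FROM events")
                return {row[0] for row in cur.fetchall()}
        finally:
            conn.close()

    writer_dsn = "host=my-cluster.example dbname=app user=app"
    for new_id in (1, 2):  # two separate transactions (concurrent in practice)
        conn = psycopg2.connect(writer_dsn)
        try:
            with conn, conn.cursor() as cur:
                cur.execute("INSERT INTO events (id) VALUES (%s)", (new_id,))
        finally:
            conn.close()

    a = observed_ids("host=reader-1.example dbname=app user=app")
    b = observed_ids("host=reader-2.example dbname=app user=app")

    # Lagging snapshots are fine; snapshots that disagree on order are not.
    if (1 in a and 2 not in a) and (2 in b and 1 not in b):
        print("long fork: readers observed the two writes in incompatible orders")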
Am I correct in understanding that AWS is either doing something with the cluster configuration or has added patches that introduce this behavior?
I was worried I had made the wrong move upgrading major versions, but it looks like this is not that. It's not a regression, just a longstanding bug (or, depending on your view, a feature request).
We've worked around it by not touching the hot stove, but it's kind of worrying that there are consistency issues with it.
I understand the mission of the Jepsen project, but presenting results in this format is misleading and will only sow confusion.
Transaction isolation involves a ton of tradeoffs, and the tradeoffs chosen here may be fine for most use cases. The issues can easily be avoided by doing any critical transactional work against the primary read-write node only, which is how transactional work would typically be done against a Postgres cluster of this sort anyway.
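Concretely, that routing can be as simple as keeping two connection strings and never letting read-modify-write logic touch the reader endpoint. A sketch, with made-up endpoint names and schema:

    # Sketch only: route transactional work to the writer endpoint and use
    # the reader endpoint only for reads that can tolerate the anomaly.
    # Endpoint names and schema are made up.
    import psycopg2

    WRITER_DSN = "host=my-cluster.cluster-abc.us-east-1.rds.amazonaws.com dbname=app user=app"
    READER_DSN = "host=my-cluster.cluster-ro-abc.us-east-1.rds.amazonaws.com dbname=app user=app"

    def transfer(src, dst, amount):
        # Critical read-modify-write: always against the writer endpoint.
        conn = psycopg2.connect(WRITER_DSN)
        try:
            with conn, conn.cursor() as cur:
                cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s", (amount, src))
                cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s", (amount, dst))
        finally:
            conn.close()

    def dashboard_totals():
        # Reporting query that can live with stale or forked snapshots.
        conn = psycopg2.connect(READER_DSN)
        try:
            with conn, conn.cursor() as cur:
                cur.execute("SELECT count(*), coalesce(sum(balance), 0) FROM accounts")
                return cur.fetchone()
        finally:
            conn.close()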