"Amazon RDS for PostgreSQL is an Amazon Web Services (AWS) service which provides managed instances of the PostgreSQL database. We show that Amazon RDS for PostgreSQL multi-AZ clusters violate Snapshot Isolation, the strongest consistency model supported across all endpoints. Healthy clusters occasionally allow..."
Direct, to the point, unembellished, and analogous to how other STEM disciplines share findings. There was a time I liked reading cleverly written blog posts that use memes to explain things, but now I long for the plain and simple.
Multi-AZ instances are a long-standing RDS feature where the primary DB is synchronously replicated to a secondary DB in another AZ. If the primary fails, RDS fails over to the secondary.
Multi-AZ clusters have two secondaries, and transactions are synchronously replicated to at least one of them. This makes them more robust than Multi-AZ instances when a secondary fails or is degraded, and it also allows read-only access to the secondaries.
Multi-AZ clusters no doubt have more "magic" under the hood, as it's not a vanilla Postgres feature as far as I'm aware. I imagine that is why it's failing the Jepsen test.
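For reference, the closest building block in vanilla Postgres is quorum-based synchronous replication. A minimal sketch of "each commit must reach at least one of two standbys", assuming made-up host and standby names (AWS's actual implementation isn't public):

    # Sketch only: vanilla-Postgres quorum replication resembling
    # "each commit must reach at least one of two standbys".
    # The DSN and standby names are made up.
    import psycopg2

    conn = psycopg2.connect("host=primary.example.internal dbname=postgres user=postgres")
    conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
    with conn.cursor() as cur:
        # Wait for at least one of the two named standbys to confirm each commit.
        cur.execute(
            "ALTER SYSTEM SET synchronous_standby_names = 'ANY 1 (standby_a, standby_b)'"
        )
        cur.execute("SELECT pg_reload_conf()")  # apply without a restart
    conn.close()

Whatever RDS actually does on top of this (or instead of it) is presumably where the extra behavior comes from.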
Software developers nowadays barely know about transactions, and definitely not about the different transaction models (in my experience). I have even encountered "senior developers" (who are really so-called "CRUD developers") who are clueless about database transactions. In reality, transactions and transaction models matter a lot for performance and error-free code, at least once you have real traffic volumes and your software solves something non-trivial.
For example: after a lot of analysis, I switched a large project from SQL Server's default Read Committed to Read Committed Snapshot Isolation, and the users could not be happier: a lot of locking contention disappeared. No software engineer on that project had any clue about transaction models or locks before I taught them some basics, even though they had used transactions extensively in that project.
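For anyone curious, the switch itself is a one-liner of T-SQL; the analysis was all about making sure nothing relied on readers blocking writers. A rough sketch, with a made-up connection string and database name:

    # Sketch only: enabling Read Committed Snapshot Isolation on a
    # SQL Server database. Connection string and database name are made up.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=db.example.internal;DATABASE=master;"
        "UID=admin;PWD=change-me;TrustServerCertificate=yes",
        autocommit=True,  # ALTER DATABASE cannot run inside a user transaction
    )
    # After this, plain Read Committed readers see the last committed row
    # version instead of blocking on writers' locks.
    conn.cursor().execute(
        "ALTER DATABASE AppDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE"
    )
    conn.close()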
Not what an RDBMS stakeholder wants to wake up to on the best of days. I'd imagine there were a couple of emails expressing concern internally.
hats off to aphyr as usual.
Given the way PostgreSQL does snapshotting, I don't believe this implies such a read could return a nonsense value, e.g. one where only a portion of the bytes in a multi-byte column type had been updated.
It seems like a race condition that becomes eventually consistent. Or did anyone read this as saying the later transaction(s) of a "long fork" might never complete under normal circumstances?
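For what it's worth, my mental model of a long fork, seen from the client side, is roughly the following (endpoints, table, and credentials are made up, and the two writes would be concurrent in the real test):

    # Sketch only: what a "long fork" looks like to clients.
    # Two independent writes; two readers that each see one write
    # but not the other, i.e. their snapshots disagree on the order.
    import psycopg2

    def observed_ids(dsn):
        conn = psycopg2.connect(dsn)
        try:
            with conn, conn.cursor() as cur:  # one read-only transaction
                cur.execute("SELECT id FROM events")
                return {row[0] for row in cur.fetchall()}
        finally:
            conn.close()

    writer_dsn = "host=my-cluster.example dbname=app user=app"
    for new_id in (1, 2):  # two separate transactions (concurrent in practice)
        conn = psycopg2.connect(writer_dsn)
        try:
            with conn, conn.cursor() as cur:
                cur.execute("INSERT INTO events (id) VALUES (%s)", (new_id,))
        finally:
            conn.close()

    a = observed_ids("host=reader-1.example dbname=app user=app")
    b = observed_ids("host=reader-2.example dbname=app user=app")

    # Lagging snapshots are fine; snapshots that disagree on order are not.
    if (1 in a and 2 not in a) and (2 in b and 1 not in b):
        print("long fork: readers observed the two writes in incompatible orders")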
Am I correct in understanding that AWS is either doing something with the cluster configuration or has added patches that introduce this behavior?
I was worried I had made the wrong move upgrading major versions, but it looks like this is not that. It's not a regression, just a longstanding bug (or, depending on your view, a feature request).
We've worked around it by not touching the hot stove, but it's kind of worrying that there are consistency issues with it.
I understand the mission of the Jepsen project, but presenting results in this format is misleading and will only sow confusion.
Transaction isolation involves a ton of tradeoffs, and the tradeoffs chosen here may be fine for most use cases. The issues can easily be avoided by doing any critical transactional work against the primary read-write node only, which is how transactional work would typically be done against a Postgres cluster of this sort anyway.
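Concretely, that routing can be as simple as keeping two connection strings and never letting read-modify-write logic touch the reader endpoint. A sketch, with made-up endpoint names and schema:

    # Sketch only: route transactional work to the writer endpoint and use
    # the reader endpoint only for reads that can tolerate the anomaly.
    # Endpoint names and schema are made up.
    import psycopg2

    WRITER_DSN = "host=my-cluster.cluster-abc.us-east-1.rds.amazonaws.com dbname=app user=app"
    READER_DSN = "host=my-cluster.cluster-ro-abc.us-east-1.rds.amazonaws.com dbname=app user=app"

    def transfer(src, dst, amount):
        # Critical read-modify-write: always against the writer endpoint.
        conn = psycopg2.connect(WRITER_DSN)
        try:
            with conn, conn.cursor() as cur:
                cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s", (amount, src))
                cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s", (amount, dst))
        finally:
            conn.close()

    def dashboard_totals():
        # Reporting query that can live with stale or forked snapshots.
        conn = psycopg2.connect(READER_DSN)
        try:
            with conn, conn.cursor() as cur:
                cur.execute("SELECT count(*), coalesce(sum(balance), 0) FROM accounts")
                return cur.fetchone()
        finally:
            conn.close()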