Distributed Transactions at Scale in Amazon DynamoDB
This paper was presented at the USENIX Annual Technical Conference (ATC) 2023. For those not familiar with DynamoDB’s architecture, previous readings on the DynamoDB ATC 2022 paper are recommended. The 2022 paper lacked a focus on transactions, which this paper addresses. It highlights DynamoDB’s scalability and AWS’s commitment to sharing information widely.
Overview
A key feature of DynamoDB is its predictability at any scale, as discussed in Marc Brooker’s post. In adding transactions, two main constraints were crucial: preserving high performance for single-key operations and utilizing in-place updates without multi-version concurrency control (MVCC) to maintain storage layer integrity. Despite the common belief that transactions hinder scalability, the team found creative solutions, like using one-shot transactions to meet most customer needs.
One-Shot Transactions
One-shot transactions are single requests, unlike traditional transactions involving multiple interactive steps. The DynamoDB paper deviates from the Sinfonia two-phase locking approach, opting for optimistic concurrency control (OCC). Specifically, it employs the Timestamp Ordering (TSO) algorithm, assigning a timestamp to each transaction to maintain order. Transactions are aborted if they attempt to access future data, ensuring serializability.
Architecture and API
The transaction architecture includes a coordinator for OCC TSO protocol execution, with non-transaction operations bypassing it. The API allows for single-shot “read-only” transactions through TransactGetItems, which can’t mix with TransactPutItems. However, CheckItem can be mixed with TransactPutItems for flexible transaction coding.
Write Transaction Execution
During the prepare phase, the transaction coordinator communicates with primary storage nodes. Transactions are accepted under specific conditions related to timestamps and storage states, and, if accepted, are committed through a second phase. Leveraging Paxos for storage node recovery and maintaining a ledger for transaction coordination addresses failure concerns.
Timestamp Effects
While timestamps aid transaction performance, correctness doesn’t rely on them. However, accurate clocks lead to more successful transactions. Risks of future timestamping are mitigated through safeguards and AWS’s time synchronization service.
Read-Only Transaction Execution
Read-only transactions avoid maintaining read timestamps to reduce costs. A two-phase, writeless protocol is employed, where transactions are validated through read consistency checks of log sequence numbers (LSN).
Experiments
Experiments used a uniform key distribution and 900-byte items, scaling operations from 100k to 1M per second. Results showed predictability, with minimal latency variance in high-load conditions. The evaluations also compared cancellation rates for varied workloads.
Conclusion
DynamoDB continues to provide predictable, high-performance operations at scale. Notably, it handled 126 million requests per second during a recent peak with low latency. The transactions were modeled with TLA+ to ensure fault tolerance. While the conference presentation is pending release, Doug Terry’s keynote on DynamoDB transactions offers detailed insights.