Part-3: Optimising DynamoDB Single-Table Model for Large-Scale Analytics Data
We want to evolve the system from handling our current volume growth to supporting 5x the data volume, along with an increase in the number of user-triggered jobs. As a first step, to address the scalability, performance, and cost challenges of our initial DynamoDB implementation, we introduced optimizations at both the application and architecture layers.
Code-Level Optimizations
Event Aggregation
Our initial design stored each generated event as an individual DynamoDB item. While straightforward, this approach resulted in millions of write operations for a single analytics job, driving up both network overhead and DynamoDB write costs.
To optimize this, we leveraged DynamoDB's maximum item size of 400 KB by aggregating multiple related events into a single item before persisting them. This significantly reduced the number of write operations, network round trips, and Write Capacity Unit (WCU) consumption.
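To make the WCU saving concrete, here is a back-of-the-envelope sketch. It relies on the fact that a standard DynamoDB write consumes 1 WCU per 1 KB of item size, rounded up per item, so many small items each pay for a full kilobyte. The event size (200 bytes), the event count, and the 390 KB packing budget are illustrative assumptions, not figures from our workload, and the calculation ignores key and attribute-name overhead as well as any index writes.

```python
import math


def wcus_for_item(size_bytes: int) -> int:
    """Standard (non-transactional) write: 1 WCU per 1 KB, rounded up."""
    return max(1, math.ceil(size_bytes / 1024))


def wcus_individual(event_count: int, event_bytes: int) -> int:
    """Cost of writing every event as its own item."""
    return event_count * wcus_for_item(event_bytes)


def wcus_aggregated(event_count: int, event_bytes: int,
                    item_budget_bytes: int = 390 * 1024) -> int:
    """Cost of packing events into items of at most item_budget_bytes.

    The default budget is an assumed value kept below the 400 KB item limit.
    """
    events_per_item = max(1, item_budget_bytes // event_bytes)
    full_items, leftover = divmod(event_count, events_per_item)
    total = full_items * wcus_for_item(events_per_item * event_bytes)
    if leftover:
        total += wcus_for_item(leftover * event_bytes)
    return total


if __name__ == "__main__":
    # Hypothetical job: one million events of 200 bytes each.
    print(wcus_individual(1_000_000, 200))
    print(wcus_aggregated(1_000_000, 200))
```

Under these assumptions the aggregated layout needs roughly a fifth of the WCUs, and the gap widens as individual events get smaller relative to 1 KB.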
An additional benefit was improved read efficiency. Since related events were co-located within the same item, downstream applications could retrieve larger logical datasets with fewer database requests.
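The aggregation step can be sketched as a greedy packer that groups serialized events into payloads sized to fit under the item limit. This is a minimal illustration rather than our production code: the key names (PK, SK), the JOB#/CHUNK# key format, the JSON encoding, and the 16 KB headroom reserved for keys and metadata are all assumptions made for the example.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List

# DynamoDB's item limit is 400 KB, attribute names included.
MAX_ITEM_BYTES = 400 * 1024
# Assumed safety margin for key attributes and other metadata.
HEADROOM_BYTES = 16 * 1024
PAYLOAD_BUDGET = MAX_ITEM_BYTES - HEADROOM_BYTES


def pack_events(events: Iterable[Dict[str, Any]],
                budget: int = PAYLOAD_BUDGET) -> Iterator[str]:
    """Greedily group events into JSON array strings of at most `budget` bytes."""
    batch: List[str] = []
    size = 2  # the enclosing "[" and "]"
    for event in events:
        encoded = json.dumps(event, separators=(",", ":"))
        event_bytes = len(encoded.encode("utf-8"))
        if event_bytes + 2 > budget:
            raise ValueError("a single event exceeds the payload budget")
        cost = event_bytes + (1 if batch else 0)  # +1 for the separating comma
        if batch and size + cost > budget:
            yield "[" + ",".join(batch) + "]"
            batch, size, cost = [], 2, event_bytes
        batch.append(encoded)
        size += cost
    if batch:
        yield "[" + ",".join(batch) + "]"


def build_items(job_id: str, events: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Wrap each packed payload in an item under hypothetical PK/SK names."""
    return [
        {"PK": f"JOB#{job_id}", "SK": f"CHUNK#{index:06d}", "events": payload}
        for index, payload in enumerate(pack_events(events))
    ]
```

Because all chunks of a job share one partition key in this sketch, a reader could fetch the whole job with a single Query and then decode each chunk's payload, which is the read-side benefit described above.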
Data Compression
We further optimized storage by compressing event payloads before persisting them to DynamoDB.
Although decompression introduces a small overhead during reads, the benefits far outweighed the cost. Compression reduced storage consumption, lowered write throughput requirements, and decreased overall DynamoDB operating costs. In addition, the smaller payload sizes reduced network transfer volumes, resulting in an observed performance improvement of approximately 15%.
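A compression round trip might look like the sketch below. The choice of zlib, the compression level, and JSON as the serialization format are assumptions for illustration; this article does not prescribe a specific codec. The compressed bytes would be stored in a binary attribute and decompressed by the reader.

```python
import json
import zlib
from typing import Any, Dict, List


def compress_events(events: List[Dict[str, Any]], level: int = 6) -> bytes:
    """Serialize events to compact JSON and compress the result with zlib."""
    raw = json.dumps(events, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, level)


def decompress_events(blob: bytes) -> List[Dict[str, Any]]:
    """Reverse compress_events: inflate the bytes and parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample data; real ratios depend on the payload.
    sample = [{"id": i, "type": "click", "value": "x" * 80} for i in range(1000)]
    raw_size = len(json.dumps(sample, separators=(",", ":")).encode("utf-8"))
    blob = compress_events(sample)
    print(f"raw={raw_size} bytes, compressed={len(blob)} bytes")
    assert decompress_events(blob) == sample
```

Compressing after aggregation tends to work better than compressing events one by one, since repeated field names and values across events in the same item give the compressor more redundancy to exploit.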
