
Throughput and IOPS: how much is enough?

Storage is often one of the least considered areas when launching an application in the cloud. When setting up an AWS EBS gp3 volume, you probably accept the default settings of 125MB/s throughput and 3,000 IOPS, right? You're in for a surprise, so keep reading.

The concepts of throughput and IOPS (Input/Output Operations Per Second) might seem intimidating, but understanding them is critical even for individuals with basic development skills. This article aims to clarify throughput and IOPS, explaining their significance and how they can influence your decision-making concerning data storage and processing requirements.

What is Throughput?

Throughput refers to the volume of data that can be processed or transferred within a specific time frame. It denotes the rate at which data can be read from or written to a storage device, network, or system.

Imagine a pipeline through which water flows. Throughput would be how much water can pass through the pipeline in a certain amount of time. Similarly, in computing, throughput indicates the volume of data that can move through a system within a specific period.

What are IOPS?

IOPS, or Input/Output Operations Per Second, is a metric measuring the number of read or write operations a storage device can perform in one second. This metric is particularly relevant when assessing the performance of storage systems such as hard disk drives (HDDs), solid-state drives (SSDs), or storage area networks (SANs).

Think of IOPS as the number of cars that can pass through a toll booth in one second. It measures the speed at which the storage device can handle requests to retrieve or store data. Higher IOPS generally imply better performance and faster data access.

The Relationship between Throughput and IOPS

Although related, throughput and IOPS are not interchangeable terms. While throughput focuses on the volume of data transferred, IOPS zeroes in on the number of individual read/write operations. However, these two metrics are interconnected.

Random Read and Writes

Consider a storage device with high IOPS capability; it can handle many small read/write operations per second. If each operation involves only a small amount of data, the overall throughput might still be low. Conversely, a storage device with lower IOPS capability might handle larger chunks of data per operation, resulting in higher throughput.
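To make the trade-off concrete, here is a small sketch (the device numbers are illustrative, not from any real hardware) computing throughput as IOPS × block size:

```python
def throughput_mb_s(iops: int, block_size_bytes: int) -> float:
    """Throughput in MB/s, given IOPS and the block size per operation."""
    return iops * block_size_bytes / 1_000_000

# Many small operations: high IOPS, modest throughput.
small_blocks = throughput_mb_s(iops=20_000, block_size_bytes=4_096)

# Fewer large operations: low IOPS, much higher throughput.
large_blocks = throughput_mb_s(iops=500, block_size_bytes=1_048_576)

print(f"20k IOPS @ 4 KiB -> {small_blocks:.0f} MB/s")   # ~82 MB/s
print(f"500 IOPS @ 1 MiB -> {large_blocks:.0f} MB/s")   # ~524 MB/s
```

Despite performing forty times fewer operations per second, the large-block device moves over six times as much data.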

The Amount of Data

When benchmarking and performance testing using tools like fio, different block sizes are used to simulate various workloads and evaluate storage system capabilities.

Commonly used block sizes include 4k and 64k, typically associated with file access and database workloads. These smaller block sizes measure the performance of random read/write operations, where the data is not read or written in a predictable order. Small blocks provide better granularity for random I/O, allowing efficient retrieval of specific data within larger files.

In contrast, larger blocks allow the storage system to read or write data in larger chunks, reducing the overhead associated with accessing individual smaller blocks. Large blocks are used in scenarios involving data warehousing, backups, and big data processing, where sequential access is logical. This pattern involves accessing data in a sequential order without frequent jumps between different locations.

You can convert IOPS to throughput (or vice versa) with simple arithmetic, provided you know the block size: throughput = IOPS × block size.
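A minimal helper for the conversion, assuming decimal megabytes (1 MB = 10^6 bytes), might look like this:

```python
def iops_to_mb_s(iops: float, block_size_bytes: int) -> float:
    """Throughput implied by an IOPS figure at a given block size."""
    return iops * block_size_bytes / 1e6

def mb_s_to_iops(mb_s: float, block_size_bytes: int) -> float:
    """IOPS required to reach a target throughput at a given block size."""
    return mb_s * 1e6 / block_size_bytes

# gp3 defaults at a 4 KiB block size:
print(iops_to_mb_s(3000, 4096))   # ~12.3 MB/s -> IOPS is the bottleneck
print(mb_s_to_iops(125, 4096))    # ~30,500 IOPS needed to saturate 125 MB/s
```

Note how, at a 4k block size, the gp3 default of 3,000 IOPS caps you well below the 125MB/s throughput you're paying for.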

EBS vs EFS

Let's return to AWS. By running the following command inside a directory on the disk you want to test, you can get an idea of the performance the storage actually delivers.
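As an illustration (the parameters below are an assumption, not the exact command used), a fio job file for random 4k I/O could look like this; run it with `fio randrw.fio` inside the target directory:

```ini
; randrw.fio - a hypothetical fio job for measuring random 4k I/O
[global]
ioengine=libaio
; bypass the page cache so the device itself is measured
direct=1
size=1G
runtime=30
time_based

[randrw-4k]
; mixed random reads and writes with small blocks,
; the common pattern for file and database access
rw=randrw
bs=4k
iodepth=32
```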

Here are the results of the default gp3 settings (125MB/s throughput / 3k IOPS) that people typically use without modification:

File systems are usually configured with 4k block sizes nowadays. That means your application is limited to a maximum throughput of roughly 12MB/s for file access. For perspective, that's about the speed of a Fast Ethernet connection from 1995.

How about AWS EFS? If you've ever had to deploy a multi-AZ cluster and needed the same directory mounted on multiple nodes, you've likely used it:

I'm sure you're wondering, "EFS is faster than EBS? Impossible, I could never run a database on EFS." You're correct. EFS is faster if you only consider IOPS and throughput, but there's another important concept: latency.

Latency

Latency refers to the time a storage system takes to respond to a read or write request. High latency can impact the speed at which data is accessed or transferred, even with optimal block sizes and read/write patterns. When evaluating storage system performance and making decisions, it's essential to consider both latency and other pertinent factors to ensure accurate assessments and meet specific workload requirements.

You can use tools like ioping or fio to test latency. In the two tests above, you can see that EBS had an average latency of 573us, while EFS had 2ms. That means EBS responded, on average, roughly 3.5 times faster. In the worst measurement, EBS took 848us, while EFS took 4.4ms, a staggering five-fold difference.

CPU time spent waiting on I/O

For a web application server running a database, 500us-1ms is typically the acceptable latency range, one that a regular SSD should deliver effortlessly with current technology. A hard disk drive (HDD), being a mechanical device, usually has a latency between 10 and 20ms, which is why HDDs are no longer recommended for production workloads other than data warehousing.
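Latency also puts a hard ceiling on IOPS for serial workloads: at queue depth 1, each operation must finish before the next one starts, so IOPS can never exceed 1 / latency. A quick sketch, using the latency figures above:

```python
def qd1_iops_ceiling(latency_seconds: float) -> float:
    """Maximum possible IOPS at queue depth 1 (one operation at a time)."""
    return 1 / latency_seconds

print(qd1_iops_ceiling(0.0005))  # SSD at 500us -> 2000 IOPS
print(qd1_iops_ceiling(0.015))   # HDD at 15ms  -> ~67 IOPS
```

This is one reason a database, which often issues small dependent reads one after another, suffers so much on high-latency storage regardless of its advertised IOPS.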

How Much is Enough?

Determining the appropriate levels of throughput and IOPS largely depends on your specific requirements. Various factors, such as your workload's nature, application demands, and user expectations, should be considered.

As we've seen before, EBS gp3 with its default settings (3k IOPS) will limit your app to 12MB/s throughput for file access. While this might be sufficient for handling a web application with minimal traffic without significant delays, it's far from ideal if your application relies heavily on file access or if you start dealing with more traffic.

ChatGPT's recommendation for a web application with minimal traffic is:

IOPS: An upper limit around 1,000 IOPS. This provides a buffer for potential traffic spikes or increased workload while still keeping resource usage within reasonable limits.

Throughput: An upper limit typically in the range of 100 Mbps to 1 Gbps (gigabit per second). This range allows for efficient data transfer and ensures that your application can handle increased traffic or data-intensive operations if required.

1k IOPS equates to just 4MB/s of throughput at a 4k block size, so the suggestion above is far from accurate. That's why you should double-check what AI models tell you.
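You can sanity-check the recommendation with the same formula (a 4 KiB block size is assumed here):

```python
block_size = 4096

# Throughput implied by the suggested 1,000 IOPS cap:
iops_limit_mb_s = 1000 * block_size / 1e6   # ~4.1 MB/s

# The suggested throughput floor of 100 Mbps, converted to MB/s:
suggested_floor_mb_s = 100 / 8              # 12.5 MB/s

print(iops_limit_mb_s)       # ~4.1 - you'd hit the IOPS cap long
print(suggested_floor_mb_s)  # 12.5 - before reaching this floor
```

The two numbers contradict each other: with 1k IOPS, the application could never use even the low end of the recommended throughput range.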

One way to gauge how much IOPS and throughput your application might need is by measuring what a general-purpose cloud instance typically offers, such as AWS Lightsail, DigitalOcean, and Linode (Akamai):

The results are quite enlightening:

  • AWS Lightsail: 3k IOPS, 128MB/s throughput, 1ms latency. Pretty much EBS gp3 default settings, which are "adequate" for a small application but far from ideal as 3k IOPS for a 4k block size is a limiting factor (12MB/s).
  • DigitalOcean: 33k IOPS, 1.8GB/s throughput, 716us latency. Offers ten times what Lightsail does and should be more than enough for most applications.
  • Linode/Akamai: 124k IOPS, 16.3GB/s throughput, 325us latency. These figures are so high that it's hard to know whether to be amazed or concerned: it could mean that Linode isn't rate limiting the storage of their servers, potentially allowing a neighbor instance to exhaust the host's resources, or they might have state-of-the-art storage hardware and incredibly generous rate limit defaults.
Dangerously fast like Linode storage

Conclusion

This article merely scratches the surface of what storage technology and concepts mean in the real world. However, it should provide you with a better understanding of how to configure the storage for your next application.

Remember, storage is just one factor to consider when trying to maximize your application performance. Other factors like CPU frequency (for single-threaded applications), network throughput, and even memory speed can also impact your application's response times. However, well-written code will almost always outperform any poorly written software in terms of performance, security, and more.
