Amazon RDS vs DIY MySQL on EC2 Benchmark

As I was researching online whether Amazon RDS was a viable option, I had a hard time finding reliable benchmarks. The authors of this good book on EC2 mention that it is a bit faster, but without further clarification. The best benchmark I could find was this one. It uses the sysbench tool to test an EC2 instance against RDS, which is exactly what I need. It provides the tools for benchmarking and points out the difference between running 1 and 10 threads. However, for me this benchmark was missing some vital information, so I decided to run my own benchmark using sysbench in a very similar way, with the following adjustments:

  • I’ve used a much bigger dataset: 50 million rows, which creates a 12GB database that will surely not fit in the 1.7GB of memory.
  • I’ve made explicit the parameters that were left unspecified, such as instance disk vs EBS and the MySQL configuration.
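For readers who want to set up a similar run, here is a minimal sketch of how the sysbench invocations could be assembled. The exact commands used for this post are not given, so everything below is an assumption: the host name, user, password and database name are placeholders, and the option names are those of the legacy sysbench 0.4.x OLTP test (newer sysbench releases use different option names).

```python
import shlex


def sysbench_oltp_command(host, user, password, threads,
                          table_size=50_000_000, phase="run"):
    """Build an argument list for a legacy (0.4.x) sysbench OLTP invocation.

    All connection details are hypothetical placeholders; the option set is
    an assumption, not the command actually used for the benchmark.
    """
    return [
        "sysbench",
        "--test=oltp",
        "--mysql-table-engine=innodb",
        f"--mysql-host={host}",
        f"--mysql-user={user}",
        f"--mysql-password={password}",
        "--mysql-db=sbtest",                # assumed database name
        f"--oltp-table-size={table_size}",  # 50 million rows, as in the post
        f"--num-threads={threads}",
        phase,                              # "prepare" loads data, "run" benchmarks
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be reviewed first.
    for phase, threads in (("prepare", 1), ("run", 1), ("run", 50)):
        cmd = sysbench_oltp_command("db.example.com", "bench", "secret",
                                    threads, phase=phase)
        print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to execute anywhere; pointing the host at either the EC2 instance or the RDS endpoint would be the only change between setups.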

I’ve used the following setups:

  • A small EC2 instance in the US East region, with Debian squeeze and a standard MySQL install. The database is set up on a separate EBS volume. (named Mysql on EBS (standard) )
  • The same instance with MySQL tuned to more reasonable values: key_buffer=512M, query_cache=128MB (named Mysql on EBS (optimized) )
  • A small RDS instance, set up in the same region

Single Client Thread

First, I repeated the single-thread experiment. In this case the instance is not fully utilized. The results are shown below:

System                    Transactions/sec  Read/write ops/sec  Other ops/sec  Min (ms)  Avg (ms)  Max (ms)  95th perc. (ms)
Mysql on EBS (standard)   18                334                 35             4.4       56.9      1186.5    149.1
Mysql on EBS (optimized)  52                991                 104            0.0       19.2      728.6     84.4
RDS                       23.2              440.6               46.4           11.1      43.1      691.4     90.0

In this experiment the difference between a standard MySQL install and the optimized one is huge: the optimized install handles almost three times as many transactions per second. RDS comes in close to a standard MySQL install, which seems reasonable.
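To put numbers on that comparison, the following sketch computes each setup's single-thread throughput relative to the standard install. The figures are copied from the single-thread results; the helper name `relative_to` is just for illustration.

```python
# Transactions per second from the single-thread results.
single_thread_tps = {
    "Mysql on EBS (standard)": 18.0,
    "Mysql on EBS (optimized)": 52.0,
    "RDS": 23.2,
}


def relative_to(baseline, results):
    """Return each system's throughput as a multiple of the baseline system."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


if __name__ == "__main__":
    ratios = relative_to("Mysql on EBS (standard)", single_thread_tps)
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x")
```

With these inputs the optimized install comes out at roughly 2.9 times the standard one, and RDS at roughly 1.3 times.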

50 Threads

Now, in real development we don’t care about the difference between fast and faster. If your website is growing, what matters much more is that performance does not deteriorate when things get tougher. Therefore I tried to stretch the database much further by using 50 client threads. This is much closer to the real world, with multiple Apache processes constantly hitting the database, especially when you have multiple front-end servers connecting to a single database instance. Again, the results are shown below:

System                    Transactions/sec  Read/write ops/sec  Other ops/sec  Min (ms)  Avg (ms)  Max (ms)  95th perc. (ms)
Mysql on EBS (standard)   38                724                 76             30.2      1310.7    4662.8    2179.0
Mysql on EBS (optimized)  46                871                 92             27.55     1089.4    3031.43   1853.76
RDS                       111               2110                222            13.47     450.0     1557.4    807.3

First, the difference between a standard install and the optimized version has been greatly reduced. The most notable result is that RDS performs so much better, with more than twice the transaction throughput of either MySQL setup. This confirms the results of the original benchmark, but now under conditions that matter to me. Maybe even more important than the difference in query throughput is that RDS does a much better job of keeping request times within reasonable bounds: 95% of requests return within 807ms, compared to 1854ms for the optimized MySQL on the EC2 instance.
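One way to see how each setup copes with the extra load is to compare its 50-thread numbers against its own single-thread numbers. The sketch below does that for transactions per second and for the 95th percentile response time, using the values from the two result tables; the function name `scaling` is only illustrative.

```python
# threads -> (transactions/sec, 95th percentile in ms), from the two result tables.
results = {
    "Mysql on EBS (standard)": {1: (18.0, 149.1), 50: (38.0, 2179.0)},
    "Mysql on EBS (optimized)": {1: (52.0, 84.4), 50: (46.0, 1853.76)},
    "RDS": {1: (23.2, 90.0), 50: (111.0, 807.3)},
}


def scaling(results):
    """For each system, return (throughput factor, p95 latency factor)
    when going from 1 to 50 client threads."""
    factors = {}
    for name, by_threads in results.items():
        tps_1, p95_1 = by_threads[1]
        tps_50, p95_50 = by_threads[50]
        factors[name] = (tps_50 / tps_1, p95_50 / p95_1)
    return factors


if __name__ == "__main__":
    for name, (tps_factor, p95_factor) in scaling(results).items():
        print(f"{name}: throughput x{tps_factor:.2f}, 95th perc. x{p95_factor:.1f}")
```

Read this way, RDS gains the most throughput from the extra threads (close to five times its single-thread rate) while its 95th percentile grows the least, and the optimized install actually ends up slightly below its own single-thread throughput.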

My conclusion is that although RDS may not perform as well as what you can achieve yourself under ideal conditions, as soon as you move to realistic loads RDS can be pushed much further. Of course this should also be possible with DIY optimizations, since RDS is after all running MySQL. But I’m sure that would take a significant amount of time, and it would not outweigh the other benefits of RDS: easier backup and much less management.

November 3rd, 2011: Further benchmarking has shown me that it is actually quite easy to bring the throughput of your own instance running MySQL much closer to RDS by increasing innodb_buffer_pool_size. My lack of experience with InnoDB clearly biased the benchmark above. I do, however, still notice the difference in response times: RDS is much more stable.
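The value used for innodb_buffer_pool_size is not stated here, so the following is only a sketch of how one might pick a starting point. It assumes a share of total RAM for the buffer pool (60% by default, purely an illustrative choice for a small instance that also has to run the OS) and prints the corresponding my.cnf line; the right value depends on what else runs on the machine.

```python
def suggest_buffer_pool_mb(total_ram_mb, percent=60):
    """Suggest an innodb_buffer_pool_size in MB as a share of total RAM.

    The 60% default is an assumption for illustration, not a measured optimum.
    """
    if not 0 < percent < 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_ram_mb * percent // 100


if __name__ == "__main__":
    # A small instance with 1.7GB of memory is taken as about 1740MB here.
    size_mb = suggest_buffer_pool_mb(1740)
    print(f"innodb_buffer_pool_size = {size_mb}M")
```

Since the 12GB dataset cannot fit in memory on this instance regardless, any such setting only controls how much of the working set stays cached.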

Notes:

  1. I’ve also benchmarked thread counts in between, but there was no interesting pattern. Results at 4 threads and up are largely similar to the 50-thread one for RDS, while for MySQL the times gradually get worse as the number of threads grows.
  2. I’ve also done an experiment running MySQL on the instance disk instead of EBS, but it wasn’t better and it removes all benefits of using EBS, so those results are not included.
  3. For more reliable results this should probably be repeated at different points in time with multiple instances.


7 Responses to Amazon RDS vs DIY MySQL on EC2 Benchmark

  1. Mo says:

    Thank you very much for these benchmarks! You’ve helped me make an educated decision on whether I should go with RDS or MySQL on EC2 (for my current setup).

    • iDev247 says:

      Which one did you end up going with? How did it go?

      • Michiel says:

        I can’t speak for Mo, but for Observu we’ve selected DIY MySQL in combination with Galera for replication. The main reason for passing on RDS is that there is no easy way to get your data out of there without downtime (mysqldump isn’t that much fun for big databases). Another is that failover in the Multi-AZ setup is reported to take a few minutes (due to DNS caching, etc.). Using Galera, we can make multiple servers known to our app and do failover in the application layer if needed.

  2. Ditto says:

    Hi,

    Can you post your 50 thread script/command please?

    Thanks

  3. If you are using only InnoDB, then – to my understanding – the settings key_buffer and query_cache should not have any effect, other than controlling how much memory is unnecessarily wasted. These settings are for MyISAM.

    • Michiel says:

      You are mostly right, hence the addendum. As you can see, there is some effect in the first benchmark. As far as I know now, query_cache is beneficial for InnoDB, but only in a situation with a low number of threads, and the InnoDB buffer pool is a much better use of available memory than the query cache.

  4. Pingback: AWS benchmark of MySQL 5.5 RDS vs EC2 | Laurence Gellert's Blog