While researching online whether Amazon RDS was a viable option, I had a hard time finding reliable benchmarks. The authors of this good book on EC2 mention that it is a bit faster, but without further clarification. The best benchmark I could find was this one. It uses the sysbench tool to compare an EC2 instance with RDS, which is exactly what I need. It provides the tools for benchmarking and points out the difference between running 1 and 10 threads. However, for me this benchmark was missing some vital information, so I decided to run my own benchmark using sysbench in a very similar way, with the following adjustments:
- I’ve used a much bigger dataset: 50 million rows, which creates a 12GB database that will certainly not fit in the 1.7GB of memory.
- I state the parameters that the original benchmark left unspecified, such as instance disk vs. EBS and the MySQL configuration.
I’ve used the following setups:
- A small EC2 instance in the US East region, running Debian squeeze with a standard MySQL install. The database is set up on a separate EBS volume (named Mysql on EBS (standard)).
- The same instance with MySQL tuned to more reasonable values: key_buffer=512M, query_cache=128MB (named Mysql on EBS (optimized))
- A small RDS instance, set up in the same region
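The exact sysbench invocation is not given here, so the following is only a sketch of how such a run could be scripted. It assumes the sysbench 0.4-style OLTP options that were current at the time (`--test=oltp`, `--oltp-table-size`, `--num-threads`); newer sysbench releases use a different syntax. The host name, credentials, database name and the 300-second run time are all hypothetical placeholders.

```python
import shlex

# Hypothetical connection settings: replace with your own endpoint
# (EC2 instance or RDS endpoint) and credentials.
DB = {
    "host": "db.example.internal",
    "user": "sbtest",
    "password": "secret",
    "db": "sbtest",
}

TABLE_SIZE = 50_000_000  # 50 million rows, as in the adjusted benchmark


def sysbench_cmd(action, threads=1, max_time=300):
    """Build a sysbench 0.4-style OLTP command line as an argv list.

    action is "prepare" (load the table) or "run" (execute the test).
    max_time (seconds) is an assumed value, not taken from the post.
    """
    cmd = [
        "sysbench",
        "--test=oltp",
        "--db-driver=mysql",
        f"--mysql-host={DB['host']}",
        f"--mysql-user={DB['user']}",
        f"--mysql-password={DB['password']}",
        f"--mysql-db={DB['db']}",
        f"--oltp-table-size={TABLE_SIZE}",
    ]
    if action == "run":
        cmd.append(f"--num-threads={threads}")
        cmd.append("--max-requests=0")  # no request cap, stop on time only
        cmd.append(f"--max-time={max_time}")
    cmd.append(action)
    return cmd


# Print the commands instead of executing them, so they can be reviewed first.
print(shlex.join(sysbench_cmd("prepare")))
for n in (1, 50):
    print(shlex.join(sysbench_cmd("run", threads=n)))
```

The same commands would be pointed at each setup in turn by changing only the host, which keeps the dataset size and thread counts identical across the EC2 and RDS runs.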
Single Client Thread
First, I repeated the single thread experiment. In this case the instance is not fully utilized. The results are shown below:
|Mysql on EBS (standard)||18||334||35||4.4||56.9||1186.5||149.1|
|Mysql on EBS (optimized)||52||991||104||0.0||19.2||728.6||84.4|
In this experiment the difference between a standard MySQL install and the optimized one is huge. RDS appears to perform comparably to a standard MySQL install, which seems reasonable.
Now, in real development we don’t care about the difference between fast and faster. If your website is growing, what matters much more is that performance does not deteriorate when things get tougher. Therefore I tried to stretch the database much further by using 50 client threads. This is much closer to the real world, with multiple Apache processes constantly hitting the database, especially when you have multiple front-end servers connecting to a single database instance. Again the results are shown below:
|Mysql on EBS (standard)||38||724||76||30.2||1310.7||4662.8||2179.0|
|Mysql on EBS (optimized)||46||871||92||27.55||1089.4||3031.43||1853.76|
First, the difference between a standard install and the optimized version has been greatly reduced. The most notable result is that RDS performs so much better. This confirms the results of the original benchmark, but now under conditions that matter to me. Maybe even more important than the difference in query throughput is that RDS does a much better job of keeping request times within reasonable bounds: 95% of requests return within 807ms, compared to 1854ms for MySQL on the EC2 instance.
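To put numbers on that comparison, here is a small sketch that computes the ratios from the figures above. It assumes the last column of both tables is the 95th-percentile response time in milliseconds; the text confirms this for the 50-thread table (1853.76 rounds to the quoted 1854ms), and for the single-thread table it is an assumption based on the identical layout.

```python
# 95th-percentile response times in ms, copied from the last column of the
# two tables (assumed to be the 95% column in both).
P95_MS = {
    "1 thread": {"standard": 149.1, "optimized": 84.4},
    "50 threads": {"standard": 2179.0, "optimized": 1853.76},
}

# RDS value at 50 threads, as quoted in the text.
RDS_P95_50_THREADS_MS = 807.0


def slowdown(slower_ms, faster_ms):
    """How many times longer the slower setup takes than the faster one."""
    return slower_ms / faster_ms


for load, row in P95_MS.items():
    factor = slowdown(row["standard"], row["optimized"])
    print(f"{load}: standard is {factor:.2f}x slower than optimized")

rds_factor = slowdown(P95_MS["50 threads"]["optimized"], RDS_P95_50_THREADS_MS)
print(f"50 threads: optimized MySQL is {rds_factor:.2f}x slower than RDS")
```

Under these assumptions, the gain from tuning shrinks from well over 1.5x on a single thread to under 1.2x at 50 threads, while the gap to RDS at 50 threads is more than 2x.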
My conclusion is that although RDS may not perform as well as a setup you tune yourself under ideal conditions, it can be pushed much further as soon as you move to realistic loads. Of course, the same should be possible with DIY optimizations, since RDS is after all running MySQL, but I’m sure that would take a significant amount of time, and it would not outweigh the other benefits of RDS: easier backups and much less management.
November 3rd, 2011: Further benchmarking has shown me that it is actually quite easy to bring the throughput of your own instance running MySQL much closer to RDS by increasing innodb_buffer_pool_size. My lack of experience with InnoDB clearly biased the benchmark above. I do, however, still notice the difference in response times: RDS is much more stable.
Notes:
1: I’ve also benchmarked thread counts in between, but there was no interesting pattern. Results for 4 threads and up are largely similar to the 50-thread results for RDS, while for MySQL the times gradually get worse as the number of threads grows.
2: I’ve also run an experiment with MySQL on the instance disk instead of EBS, but it wasn’t better and it removes all the benefits of using EBS, so those results are not included.
3: For more reliable results, this should probably be repeated at different points in time with multiple instances.