As Observu is all about improving uptime and removing bottlenecks, we strongly believe that we can’t make do with an ad-hoc infrastructure ourselves. Because the moment you need Observu most is often during an emergency, we care deeply about being able to recover from outages quickly.
We’ve selected Amazon for hosting because it is flexible, is available in multiple parts of the world and has excellent network quality. However, earlier this year it was shown multiple times that no datacenter has 100% uptime and that when failure occurs, it is big. Therefore we are currently reworking our architecture so that we can recover quickly from such events. The actual details warrant a separate post.
Our obsession with reliability also touched another area of development: our initial implementation of SMS notifications proved unreliable. Therefore we switched to Nexmo as our partner. It provides us with actual delivery confirmations, allowing us to monitor delivery.
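As an illustration of what monitoring delivery can mean in practice, here is a minimal Python sketch. It is not the actual Observu implementation and it does not call Nexmo's API: the class `DeliveryTracker`, its method names and the default five-minute timeout are all hypothetical, chosen only to show the idea of matching each sent message against its confirmation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DeliveryTracker:
    """Hypothetical helper: matches sent messages to delivery confirmations."""

    # Assumed threshold; a real value would depend on the provider and carrier.
    timeout_seconds: float = 300.0
    # message id -> time (in seconds) the message was handed to the provider
    pending: Dict[str, float] = field(default_factory=dict)
    # message id -> seconds between sending and confirmed delivery
    delivered: Dict[str, float] = field(default_factory=dict)

    def record_sent(self, message_id: str, sent_at: float) -> None:
        """Remember that a message was sent and is awaiting confirmation."""
        self.pending[message_id] = sent_at

    def record_confirmation(self, message_id: str, confirmed_at: float) -> Optional[float]:
        """Store a confirmation and return the delivery latency in seconds.

        Returns None when the id is unknown or was already confirmed.
        """
        sent_at = self.pending.pop(message_id, None)
        if sent_at is None:
            return None
        latency = confirmed_at - sent_at
        self.delivered[message_id] = latency
        return latency

    def overdue(self, now: float) -> List[str]:
        """Return ids of messages still unconfirmed after the timeout."""
        return sorted(
            message_id
            for message_id, sent_at in self.pending.items()
            if now - sent_at > self.timeout_seconds
        )
```

In a setup like this, `record_sent` would be called when a notification goes out, `record_confirmation` when the provider reports delivery, and `overdue` on a schedule to raise an alert for messages that never arrived.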
Our Growing Development Stack
I personally always like to know what people use to create their products, so here is a list of almost everything we use:
- Ubuntu on Amazon EC2
- MySQL
- Redis
- PHP
- Perl
- jQuery
- RaphaelJS
- boto
- Fabric
- chef-solo
For third-party services, we rely on:
- Amazon AWS
- Nexmo
- Tropo
- SendGrid
- GitHub
- UserVoice
These services allow us to focus on the things that really matter: gaining insight into all parts of your deployment and staying on top of events as they occur. To improve that insight, we are currently working with our first customers to implement monitoring as part of their stack. A great example is the need to monitor log files centrally as soon as multiple servers handle your front-end.