Ok, I can’t tell you how to do that, but here is how Instagram did it. I am not going to talk about how they understood their customers and how they created something customers loved. That is all marketese – I can’t tell you how to replicate it.
What I can tell you is that they have only three engineers supporting all their webops, taking care of billions of photos, terabytes of data, and millions of users. The numbers are mind-boggling. If you threw these numbers at the CIO of any enterprise, you would get back a budget for 100 people and a 3-year plan to implement a program to manage the data.
Here are a few simple things they did right:
- They only focused on essentials – they did not focus on keeping anything in-house that did not belong there. Yes, they are entirely cloud based. They are heavy users of Amazon EC2.
- They used open source extensively and hacked it when needed.
- They use Ubuntu 11.04 on EC2
- Django for app server (stateless web – means horizontal scaling).
- A stripped-down web server: the usual choice for Python is Apache + mod_wsgi, but they needed a web server with low CPU usage, so they used 'Green Unicorn' (Gunicorn, a Python WSGI HTTP server).
- PostgreSQL for database (sharded cluster with 12 replicas in different zones)
- Amazon S3 for photo storage
- Amazon CloudFront for CDN
- Redis as in-memory storage for feeds
- Memcache for caching to support the web service (not sure why they did not use Redis here as well; most likely their software already worked with memcache).
- Apache Solr for searching (with JSON interface)
- Twisted for pushing billions of notifications
- Good focus on DevOps
- They used nginx for load balancing (see my earlier proposal).
- They used Amazon Elastic Load Balancer (though they could do without it).
- Munin for monitoring
- Outsourced services for monitoring and incident notifications (Pingdom for monitoring and PagerDuty for incidents)
- Sentry for App server reporting in real-time
The picture is a rough approximation (most of the information is taken from the wonderful site: http://instagram-engineering.tumblr.com)
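To make the "stateless web plus sharded database" idea concrete, here is a minimal sketch of how a request handler could decide which database holds a user's data. This is not Instagram's actual scheme: the shard count, the database names, and the function names are all hypothetical, chosen only for illustration. The point is that the routing is a pure function of the user id, so any app server can compute it without shared state, which is what lets the web tier scale horizontally.

```python
# Hypothetical values for illustration; not taken from Instagram's setup.
NUM_LOGICAL_SHARDS = 16
PHYSICAL_DATABASES = ["db-a", "db-b", "db-c", "db-d"]


def logical_shard(user_id: int) -> int:
    """Map a user id to a logical shard with a stable modulo."""
    return user_id % NUM_LOGICAL_SHARDS


def physical_database(shard: int) -> str:
    """Map a logical shard onto one of the physical databases."""
    return PHYSICAL_DATABASES[shard % len(PHYSICAL_DATABASES)]


def route(user_id: int) -> str:
    """Return the database name that should serve this user.

    The function keeps no state, so every app server gives the same answer.
    """
    return physical_database(logical_shard(user_id))
```

Using more logical shards than physical databases is a common design choice: a logical shard can later be moved to a new machine without changing how user ids map to shards.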
What lessons are there for us poor enterprise developers, who are stuck using Java and forced to use in-house resources that are neither flexible nor scalable? Unfortunately, we will have to wait until the IT people pry their cold, dead fingers off the inflexible IT.
Nevertheless, here is what an architect could do:
- Architect the systems in such a way that some of the resources (data, especially) lie outside the enterprise.
- Use open standards like REST and JSON to quickly pull together different systems
- Focus on DevOps from the beginning. Assume that your application needs to be maintained.
- Keep a consistent set of tools (most of the tools used at Instagram are popular in the Python community)
- Most importantly, focus on getting the job done!
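As a small illustration of the REST and JSON point above, here is a sketch of gluing two systems together once both speak JSON. The two payloads, their field names, and the function name are assumptions made up for this example. In practice the payloads would typically arrive from HTTP GET calls to each system's REST endpoint, while here they are passed in as strings to keep the sketch self-contained.

```python
import json


def merge_user_view(profile_json: str, orders_json: str) -> str:
    """Combine JSON from two hypothetical systems into one JSON document.

    profile_json: e.g. '{"id": 7, "name": "Ada"}' from a profile service.
    orders_json:  e.g. '{"items": [1, 2, 3]}' from an order service.
    """
    profile = json.loads(profile_json)
    orders = json.loads(orders_json)
    view = {
        "id": profile["id"],
        "name": profile["name"],
        "order_count": len(orders["items"]),
    }
    # sort_keys gives a stable output that is easy to diff and cache.
    return json.dumps(view, sort_keys=True)
```

Neither system needs to know about the other; the only contract between them is the shape of the JSON, which is why this style makes it quick to pull different systems together.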