Project Titanicarus: Part 1 – Building a better web infrastructure

The generous boys at Simtronic have just given me a bunch of new server capacity to stick my personal web infrastructure on, so I thought I’d have a go at building something really scalable, fault tolerant, easy to maintain and of course wildly over spec’d for what I need :-)

My current web infrastructure is a series of virtual machines scattered all over the place (AWS, customer and friends’ networks, etc.). The goal is to build myself a series of self-healing “islands” that can operate independently if required, or together when everything is operating OK. I hope that this will eventually become an infrastructure blueprint for other ventures I get myself tangled up in.

The name – yup, it’s a mouthful, but it means something. Titanicarus was inspired by the recent Clickfrenzy debacle, in which a local web hosting provider failed to properly scale their infrastructure for the hammering of a lifetime. The name is a combination of Titanic, the unsinkable ship, and Icarus, the man who flew too close to the sun, melting his wings and falling to his death.

Web infrastructure needs to be stable and able to adapt quickly. I’m trying to build this infrastructure so it can scale up and down quickly, reacting to whatever icebergs might come our way while keeping costs reasonable so we don’t melt our wings.

So here goes, I’ll be publishing posts with all the gory details over the next few weeks. If you see big holes, have questions or want to poke fun, please feel free to comment or drop me an email.

Project Overview

Service Deliverables

  • DNS Hosting
  • Email Hosting – POP, IMAP, SMTP & Spam Filtering
  • Web Hosting – HTTP, HTTPS, PHP
  • Database Hosting – MySQL
  • Centralised Authentication & Key Management
  • Common home directories on all servers

Solution Requirements

  • Must be fault tolerant
  • Must scale quickly and simply
  • Security must be centrally managed
  • Service failover must be transparent to users and not require admin intervention
  • Loss of an entire island must not stop service
  • Islands must be able to load balance between each other
  • Islands may deliver one or all of the services above as local infrastructure allows
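The island-loss requirement above can be checked mechanically once you know which island delivers which service. Here is a minimal sketch of that check. The island names and service labels are made up for illustration, and the real layout may end up looking quite different.

```python
from typing import Dict, Set

# Hypothetical layout: island names and service labels are placeholders,
# not the final design.
ISLANDS: Dict[str, Set[str]] = {
    "island-a": {"dns", "email", "web", "database"},
    "island-b": {"dns", "web", "database"},
    "island-c": {"dns", "email", "web"},
}


def services_lost_without(islands: Dict[str, Set[str]], failed: str) -> Set[str]:
    """Return the services that no surviving island can deliver."""
    all_services = set().union(*islands.values())
    surviving = set().union(
        *(services for name, services in islands.items() if name != failed)
    )
    return all_services - surviving


def single_island_failures(islands: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each island to the services that would stop if it were lost."""
    return {name: services_lost_without(islands, name) for name in islands}
```

If every value returned by single_island_failures is an empty set, the layout meets the "loss of an entire island must not stop service" requirement. Any non-empty set names the services that still depend on a single island.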

Global Load Balancer Design (Tying the islands together)

[Diagram: Global Load Balancing]

As you can see above, I plan to have more than one Island. It’s likely I’ll have three by the time the project is done. Each Island will be able to do the job I require on its own, but that’s not half as fun (or fast) as having three. Having three means maintenance can be carried out on one while still leaving redundancy available. It also means a serious disaster can take out two sites (if they are in the same city, for example) and the third site will carry on as if nothing is wrong.

I’ve chosen Amazon’s DNS platform to run part of the load balancing solution and to host the root domains for my hosting solution. I’ve been using Amazon’s DNS for a while; it’s very reliable and has some cool features, including health checks, which I’m going to use to make the platform as reliable as possible.
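To make the health check idea concrete, here is a rough sketch of the decision a health-checked DNS setup makes for each query: only hand out addresses for islands that are passing their checks. This is plain Python, not Amazon's API. The addresses come from the 192.0.2.0/24 documentation range, the is_healthy callback stands in for whatever probe is really used, and serving every record when all checks fail is just a choice made for this sketch.

```python
from typing import Callable, Dict, List

# Hypothetical island addresses (192.0.2.0/24 is reserved for documentation).
ISLAND_ADDRESSES: Dict[str, str] = {
    "island-a": "192.0.2.10",
    "island-b": "192.0.2.20",
    "island-c": "192.0.2.30",
}


def answer_for(
    addresses: Dict[str, str],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Return the addresses to serve, skipping islands that fail their check."""
    ordered = sorted(addresses.items())
    healthy = [ip for name, ip in ordered if is_healthy(name)]
    # Assumption for this sketch: if every check fails, serve all records
    # rather than an empty answer, since the checks themselves may be broken.
    return healthy if healthy else [ip for _, ip in ordered]
```

With one island marked down, for example answer_for(ISLAND_ADDRESSES, lambda name: name != "island-b"), only the other two addresses are returned, which is the "carry on as if nothing is wrong" behaviour I’m after.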

Each site has a pair of load balancers and a firewall. The pair of load balancers makes for easier maintenance, meaning I can easily turn one off during the middle of the day and service will continue with minimal disruption.

Finally, a very simple firewall will be installed to run VPNs between sites and to provide NATed internet access for the servers.

Single Island Design

[Diagram: Single Island Design]

The backend of each site is going to be pretty simple: a pair of mirrored filers, a pair of web servers and a pair of database servers. The configuration will be such that any one box being shut down or broken will not stop the operation of the other boxes... I hope :)

Next week we kick things off with “Building Servers”.
