Project Titanicarus: I hate DRBD! Time to try CEPH.


If you’ve been following Project Titanicarus, you’ll know that I’ve had a reasonably serious love/hate relationship with clustering filesystems. I’ve been using DRBD and OCFS2 in an active/active configuration for the last 6 months or so. The experience has been ok, but I’m only saying that because of the other horrendous options I’ve tried.

Long & short of it, DRBD and OCFS2 suck. They are painful: I’ve had to write scripts to help them auto-recover, and even after covering about 10 different failure scenarios I still have downtime on a weekly basis when things don’t go well.

CEPH is commonly used alongside OpenStack rather than being part of it; it provides a scalable, distributed, multi-node striped storage system that can be mounted as a block-level device (RBD) or using CephFS, CEPH’s own clustering filesystem.
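To make the two access modes concrete, here’s a sketch of what each looks like from a client. This assumes a running cluster; the pool, image, and monitor names are examples of mine, not from this post, and the actual commands are commented out since they need a live cluster and root:

```shell
# Sketch: the two ways to consume CEPH storage (assumes a running cluster;
# pool/image/monitor names below are examples, not real infrastructure)
POOL=rbd
IMAGE=webdata
MON="10.0.0.1:6789"

# Option 1: map an RBD image as a local block device (appears as /dev/rbd0)
# sudo rbd map ${POOL}/${IMAGE}
# sudo mkfs.ext4 /dev/rbd0 && sudo mount /dev/rbd0 /mnt/rbd

# Option 2: mount CephFS directly from a monitor node
# sudo mount -t ceph ${MON}:/ /mnt/ceph -o name=admin,secretfile=/etc/ceph/admin.secret

echo "would map ${POOL}/${IMAGE} or mount CephFS from ${MON}"
```

Option 1 gives you an ordinary block device you can format however you like; option 2 gives you a shared filesystem multiple nodes can mount at once, which is the part that replaces OCFS2.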

Project Titanicarus: Adding SPDY to NGINX and PageSpeed

Web Servers

Since I posted the original post on building NGINX servers with Google PageSpeed there have been a bunch of additional shiny things to add to your NGINX build, the biggest and shiniest being SPDY. I’m building a new development environment at the moment, so I’ve taken the opportunity to update this post with instructions for including SPDY. I’ve written a script that installs most of the major stuff; it’s pretty well commented and explains what it’s doing throughout. To use it, save it to a file and run it with sh at the command line.

Web Server Install Script

# Update apt
apt-get update 
# Install VMware Tools
apt-get -y install open-vm-tools
# Install NTP 
apt-get -y install ntp 
# OS upgrades
apt-get -y dist-upgrade
# Raise ulimits
echo "* soft nofile 9000" >> /etc/security/limits.conf  
echo "* hard nofile 65000" >> /etc/security/limits.conf  
echo "session required pam_limits.so" >> /etc/pam.d/common-session 
# Make /tmp nice & fast
echo "tmpfs /tmp tmpfs defaults,noexec,nosuid 0 0" >> /etc/fstab 
mount -a 
# Install packages required for NGINX install 
apt-get -y install build-essential zlib1g-dev libpcre3 libpcre3-dev unzip libssl-dev libxslt1-dev libgd2-xpm-dev libgeoip-dev libperl-dev postfix unzip 
# Prepare ngx_pagespeed - versions below are examples, check the project pages for the latest
NPS_VERSION=1.6.29.5
cd /usr/src/
wget https://github.com/pagespeed/ngx_pagespeed/archive/release-${NPS_VERSION}-beta.zip
unzip release-${NPS_VERSION}-beta.zip
cd ngx_pagespeed-release-${NPS_VERSION}-beta/
wget https://dl.google.com/dl/page-speed/psol/${NPS_VERSION}.tar.gz
tar -xzvf ${NPS_VERSION}.tar.gz  # extracts to psol/
# Download NGINX source - check for the latest version
NGINX_VERSION=1.4.4
cd /usr/src/
wget http://nginx.org/download/nginx-${NGINX_VERSION}.tar.gz
tar -xvzf nginx-${NGINX_VERSION}.tar.gz
cd nginx-${NGINX_VERSION}
# Configure NGINX for local environment 
./configure --add-module=/usr/src/ngx_pagespeed-release-${NPS_VERSION}-beta --sbin-path=/usr/sbin --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --pid-path=/var/run/nginx.pid --lock-path=/var/lock/nginx.lock --http-log-path=/var/log/nginx/access.log --http-client-body-temp-path=/var/lib/nginx/body --http-proxy-temp-path=/var/lib/nginx/proxy --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --with-http_dav_module --with-http_gzip_static_module --with-http_realip_module --with-http_ssl_module --with-ipv6 --with-http_spdy_module 

# Compile NGINX 
make 
# Install NGINX 
make install 

mkdir /var/lib/nginx/ 
mkdir /var/lib/nginx/body 
# Make tmpfs mount for pagespeed data 
echo "tmpfs /var/cache/pagespeed tmpfs size=256m,mode=0775,uid=www-data,gid=www-data 0 0" >> /etc/fstab 
mkdir /var/cache/pagespeed 
chown www-data:www-data /var/cache/pagespeed 
mount /var/cache/pagespeed 

# Create init script 
cd /etc/init.d/ 
wget -Onginx 
chmod +x nginx 
update-rc.d nginx defaults 

# Install php-fpm 
apt-get install -y php5 php5-xmlrpc php5-mysql php5-mcrypt php5-intl php5-gd php5-dev php5-curl php5-common php5-cli php5-cgi php-pear php5-mysql php-apc php5-fpm php5-imap php5-memcache php5-memcached libssh2-php php5-tidy php5-json 

# Install varnish cache & memcached 
apt-get -y install varnish memcached 

# Make varnish faster 
echo "tmpfs /var/lib/varnish tmpfs size=256m,mode=0775,uid=root,gid=root 0 0" >> /etc/fstab 

# Copy default configs to BTSYNC share (Skip if this is not the first server) 
cd /data/config/ 
tar -zxvf appserverconfig.tgz 

# Create symlinks for configs 
cd /etc 
rm -rf nginx
rm -rf php5 
rm -rf varnish
ln -s /data/config/etc/nginx/ 
ln -s /data/config/etc/varnish/ 
ln -s /data/config/etc/php5/ 
cd /etc/default 
rm -rf varnish 
rm -rf memcached 
ln -s /data/config/etc/default/varnish 
ln -s /data/config/etc/default/memcached 
cd /etc/ 
rm -rf memcached.conf 
ln -s /data/config/etc/memcached.conf

Once the script has finished, reboot the server and it should come back up running a default install of everything.

Kernel Tuning

These are a few kernel adjustments I’ve found to be useful on web servers; add them to /etc/sysctl.conf. Your mileage may vary.

net.ipv4.ip_local_port_range = 2000 65000
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_max_syn_backlog = 3240000
net.core.somaxconn = 3240000
net.ipv4.tcp_max_tw_buckets = 1440000
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_congestion_control = cubic

To apply the changes:

sysctl -p /etc/sysctl.conf

That’s it!

You’ve now got a working NGINX, PHP-FPM, Google PageSpeed & SPDY web server. You’ll probably want to create a couple of virtual hosts; a template for a virtual host with PageSpeed enabled can be found here. To create a new virtual host, modify the template file for your site, add it to /data/config/etc/nginx/sites-available/, symlink to it from /data/config/etc/nginx/sites-enabled/, and restart NGINX.
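The vhost workflow above can be sketched like this. I’m using a scratch directory and a stub template so it can be tried safely; in production, substitute /data/config/etc/nginx and your real template file:

```shell
# Sketch of the vhost workflow; NGINX_ETC and the template content are stand-ins
NGINX_ETC=$(mktemp -d)          # stands in for /data/config/etc/nginx
mkdir -p "$NGINX_ETC/sites-available" "$NGINX_ETC/sites-enabled"

SITE="example.com"              # hypothetical site name
CONF="$NGINX_ETC/sites-available/$SITE"

# copy the template and substitute the site name (template here is a stub)
printf 'server {\n  listen 80;\n  server_name SITENAME;\n  pagespeed on;\n}\n' > "$CONF"
sed -i "s/SITENAME/$SITE/" "$CONF"

# enable the site and (in production) restart nginx
ln -s "$CONF" "$NGINX_ETC/sites-enabled/$SITE"
# service nginx restart

grep server_name "$NGINX_ETC/sites-enabled/$SITE"
```

Keeping sites-available as the source of truth and sites-enabled as symlinks means disabling a site is just deleting a link, not a config file.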

How to handle a tech security incident



One of my favourite apps had a security incident this week. Buffer is a social media management tool that lets you schedule posts into a “buffer” that publishes on a predetermined schedule, so you aren’t bombarding people with all your content at once. It’s a great app and I’ve been using it for over a year.

One of the things that differentiates the Buffer experience from competing products is that their management team are AMAZING communicators. They haven’t lost touch with who they are as things have grown, and this week’s experience is no exception. Buffer’s handling of what could have been a fatal incident for a startup is an awesome example of why they are going to be very, very successful in the future.

Ubuntu crashes due to high IO load using CFQ scheduler

So I’ve just discovered an interesting issue with an Ubuntu 12.04 server crashing under high IO load.

It appears that the default IO scheduler (CFQ) can cause a complete system lockup when it’s getting flogged.

When this is combined with OCFS2 it can lead to OCFS2 rebooting the system due to fencing. The fix appears to be switching to the deadline scheduler: I’ve just changed two of my boxes over, and I’ll report back on how it works.
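For reference, this is how the switch looks; the device name sda is an example, and the runtime change is lost on reboot, so persistence goes through the kernel command line:

```shell
# Check and change the I/O scheduler at runtime (device name sda is an example)
DISK=sda
SCHED=deadline
if [ -w "/sys/block/$DISK/queue/scheduler" ]; then
    cat "/sys/block/$DISK/queue/scheduler"     # lists options, active one in []
    echo "$SCHED" > "/sys/block/$DISK/queue/scheduler" \
        || echo "kernel rejected $SCHED (newer kernels use mq-deadline)"
else
    echo "no writable scheduler file for $DISK (not root, or device absent)"
fi
# To make it survive reboots, boot with elevator=deadline on the kernel
# command line (GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then update-grub).
```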

Project Titanicarus: Part 9 – Building the Email Servers

You've got mail

Email servers.. the bane of every sysadmin’s existence. The second something goes wrong with an email server, you’re guaranteed to get 100 phone calls and people dropping by your office to say “My emails aren’t working”. This is one part of your hosting infrastructure you want to get right.

I’ve decided to build my infrastructure on Postfix & Dovecot with a MySQL user database. My previous email setup was built using this howto. One of the major issues I ran into was Courier’s inability to handle large mailboxes, so I’ve decided to use a similar setup with Dovecot in place of Courier. There are a couple of other major differences:

  1. This is going to be a highly distributed configuration (ie multiple servers in multiple datacentres)
  2. This is going to sit behind load balancers (brings interesting spam filtering and security issues)
  3. This is going to use a clustered MySQL backend

So the goal of today’s blog post is to deliver:

  • Multi-server & multi-datacentre replicated mail stores
  • Fault tolerance (pull a server out at any time of the day and mail keeps flowing)
  • POP3 & IMAP user access
  • Authenticated SMTP Submission


Project Titanicarus: Part 8 – Building the FTP Servers

FTP Server

I personally hate & never use FTP, but some people prefer/need it for their development tools to work. Today we’re going to install ProFTPd on our servers using MySQL-based virtual users. The following instructions are adapted from this really good howto; if I’ve missed something, you may want to check the original version, which I’ve recreated here in case the other one goes away.
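The starting point is just the ProFTPd packages with the MySQL module. Package names below are what Ubuntu 12.04-era repositories shipped; verify with apt-cache search proftpd on your release:

```shell
# Sketch: install ProFTPd with the MySQL module for virtual users
# (package names from Ubuntu 12.04-era repos; check apt-cache search proftpd)
FTP_PACKAGES="proftpd-basic proftpd-mod-mysql"
apt-get -y install $FTP_PACKAGES || echo "apt-get needs root on Ubuntu/Debian"
# Virtual users then live in MySQL tables and are wired up via
# /etc/proftpd/sql.conf rather than /etc/passwd.
```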

Project Titanicarus: Part 7 – Building the Web Servers

Web Servers

I am building two app boxes per site; they will host mail, web and DNS for all the applications I’m hosting. If I were building a larger implementation I’d separate those tasks out, but the scale doesn’t justify it just yet.

This week we’re going to take one of the app servers we built previously and install the web server components. I am using NGINX compiled from source as I want to include a plugin called Google PageSpeed that helps make things very quick.


Project Titanicarus: Part 6 – Building the MySQL Cluster


Before I’d dealt with the filers, I had written this week’s tasks up as the most difficult part of the project.

I have a bunch of experience working with standard MySQL servers using replication, but I’ve never played with MySQL Cluster before. Learning how to make it work was made difficult by a lack of packages in the Ubuntu repositories. I also struggled to find documentation simple enough to follow without having to fill in blanks left by those documenting their learnings.

I’ve decided to write up the process I used to build a two-node MySQL Cluster; hopefully I can fill in the gaps for others trying to make this kind of project happen for themselves. I’m building one cluster per island on a pair of servers. Inter-island replication is something I’m going to have to experiment with, as the MySQL Cluster docco seems to say it gets cranky when asked to replicate over the internet.
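For a feel of the shape of the thing, this is roughly what a minimal config.ini looks like for a two-data-node cluster; the hostnames are placeholders, not my actual servers, and the full walkthrough is in the post itself:

```ini
; Minimal config.ini sketch for a two-data-node MySQL Cluster
; (hostnames are placeholders)
[ndbd default]
NoOfReplicas=2          ; each table fragment is stored on both data nodes

[ndb_mgmd]
HostName=10.0.0.10      ; management node

[ndbd]
HostName=10.0.0.11      ; data node 1

[ndbd]
HostName=10.0.0.12      ; data node 2

[mysqld]
[mysqld]                ; two API slots, one per SQL node
```

NoOfReplicas=2 is what gives you the fault tolerance: either data node can die and the cluster keeps serving queries.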

