What Kind of Server Setup and Hosting do I Need for the best Drupal Performance?

Kelly Bell's picture
on Wed, 11/24/2010 - 3:38am

We've talked about this vaguely a couple times but I thought I would briefly summarize "how things scale" in Drupal-land (true for every database-driven site stack though, pretty much), just to give you an idea of how it works:

Vertical Scaling
Vertical scaling is the fastest and easiest option from a sysadmin standpoint, since it basically means "move to a bigger machine and/or add more RAM and/or faster processors." For a general guide, see the figures below. Note that our situation is desperate in terms of needing so much space/RAM because we *do* still have a big slow query or a process with a memory leak somewhere, and the server was swapping memory to disk something awful until we moved to a bigger one. This was the fastest, easiest solution at our disposal (and usually is - but what to do when that's not enough? See Horizontal Scaling.)

Horizontal Scaling
The purpose of horizontal scaling is to split up the pieces that normally go on one box into different boxes, thereby reducing the workload on the system. In this scenario, lots of smaller boxes (but with perhaps lots of RAM and multiple fast processors) take the place of one, massive box. Consider these pieces of a typical overall setup:

  1. live site
  2. staging site
  3. development environment(s) (including issue tracking and testing applications)
  4. database(s)
  5. Search indexing
  6. files storage
  7. diagnostics and/or benchmarking/uptime monitoring systems
  8. load-balancer / failover and/or reverse-proxy cache
  9. subdomains (and THEIR accompanying live, stage and dev instances)
  10. deployment / management applications
  11. documentation
  12. site backups / archives

As you can see, this is a looong list - and previously, ALL THIS WAS ON ONE BOX, for CMI - one of the reasons for crappy performance. In fact, this is almost ALWAYS the reason for crappy performance, and is the first place one should look after eliminating coding and server configuration problems (slow queries, zombie processes, etc.). For a personal blog site or a brochure site, a single box can actually be fine, but for a site like CMI, it will cause problems sooner rather than later. To address these issues, one makes different scaling "passes", or "stages of separation", depending on the number of uncached page serves, the number of simultaneous users on the site in question, and whether any traffic "spikes" are expected or regularly recurring. In other words, there are roughly three levels of horizontal scaling, and you can either use an educated guess as to where you think the site belongs, or you can start with the first level and see how the site performs, then move on to subsequent levels until the desired target benchmarks are reached.

Pass One:
Two server setup - database(s) on one server, and everything else on the other. No load balancing or failover. This is the default Mosso Cloud Sites setup.

Pass Two:
Three server setup

  1. database(s)
  2. live (prod) site
  3. development sites. In this scenario, all the rest of the items in the above list share the development box
  4. VARIATION: add a second live/prod server, with load-balancing and failover. The two "live" servers are duplicates, and user requests are balanced evenly between them. If one server goes down or otherwise becomes unavailable, the second box takes over, providing failover, which can push uptime toward the 5-6 "nines" range. For mission-critical sites, it is absolutely necessary that this feature be provided. I would argue that the CMI site, when in "pre-event" mode (registration, taking applications), is exactly that - mission-critical, and should have these features as a result. THE LOAD BALANCER REQUIRES ITS OWN SERVER, so this variation ultimately requires 5 servers total (3 of which are small and 2 of which are large and robust).
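
The uptime figure in the variation above can be sanity-checked with a little arithmetic. This is only a rough sketch: it assumes the two live servers fail independently of each other, and it ignores the load balancer and the database, which are still single points of failure in this layout. The 99% per-server figure is a made-up number for illustration, not a measurement.

```python
import math


def parallel_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant servers.

    Assumes failures are independent: the service is down only
    when every copy is down at the same time.
    """
    return 1.0 - (1.0 - single) ** copies


def nines(availability: float) -> float:
    """Express an availability as a count of nines (0.999 -> about 3)."""
    return -math.log10(1.0 - availability)


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per 365-day year, in minutes."""
    return (1.0 - availability) * 365 * 24 * 60


if __name__ == "__main__":
    per_server = 0.99  # hypothetical availability of one live server
    pair = parallel_availability(per_server, 2)
    print(f"pair availability: {pair:.6f}")
    print(f"nines: {nines(pair):.2f}")
    print(f"downtime: {downtime_minutes_per_year(pair):.1f} min/year")
```

Under these assumptions, two servers at 99% each come out at about four nines (roughly 53 minutes of downtime a year), so getting to five or six nines takes better individual servers, more copies, or both.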

Pass Three
5+ servers, as below

  1. database(s) (can also run multiple copies of the database server with load-balancing and failover, as in #2)
  2. n+1 "live" servers with load balancing and failover across at least 2 but possibly many more boxes
  3. development and deployment management apps
  4. Search index
  5. Monitoring, Issue tracking, testing, diagnostics, wiki, utilities, misc apps
  6. load balancer (with failover, backup/archive and caching apps) and caching controllers
  7. file server plus backups/archive - media (video, audio) plus other content filetypes are served from a separate server; also Documentation can be served from here
  8. subdomains
  9. CDN - one may also use a CDN (content delivery network) or hosted media service such as Akamai, Kaltura or Amazon S3 to serve content from a separate provider whose servers are optimized specifically for fast delivery. One can also create one's own CDN on a separate server using the same hosting provider as for the rest of the servers
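
To make item 2 a bit more concrete, here is a minimal sketch of what "load balancing with failover" means in terms of logic: requests rotate across the live servers, and any server marked as down is skipped. This is an illustration only, not how any particular load balancer product is implemented; the class name and the `web1`/`web2` server names are hypothetical, and a real setup would also need health checks to decide when a box is down.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the backend to try next

    def mark_down(self, backend):
        """Take a backend out of rotation (e.g. after a failed health check)."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Put a known backend back into rotation."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, or raise if none are left."""
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["web1", "web2"])  # hypothetical live servers
    print([lb.pick() for _ in range(4)])       # alternates between the two
    lb.mark_down("web1")
    print([lb.pick() for _ in range(2)])       # only web2 while web1 is down
```

Adding a third or fourth live server is just a longer list, which is the "n+1" idea: capacity grows by adding boxes rather than by buying a bigger one.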

As you can see, one can get a lot of performance improvements by running all these processes in parallel instead of weighing down a single box or three with all this activity. Not to mention that without this isolation, one risks having to take down something mission-critical in order to fix something relatively minor. Passes 2 and 3 are organized to avoid this situation specifically, making sure that mission-critical pieces are running with as little multitasking as possible, with redundancy for their functions, and separation from non-critical pieces.

================================ ON HOSTING ==============================

Managed Hosting Showstoppers

Keep in mind that there are several "dealbreakers" that can make an easier-to-use ("managed") hosting option like Mosso's Cloud Sites ($100+/month - CMI has a Cloud Sites account, but it's currently not being used for the live site) either impossible to use or unable to meet certain key requirements. In order of importance:

  1. SSH Access - there is NO SSH access (no Shell) in Cloud Sites (or in most Managed Hosting environments)
  2. PHP library changes - certain key libraries and extensions, like ffmpeg, uploadprogress and GD, cannot be installed or changed
  3. increased PHP memory allocation - even if you can customize your memory allocation, often the maximum is too low to be optimum for Drupal
  4. .htaccess or PHP configuration (php.ini) customization - often the "Managed Hosting" solutions allow limited or NO customization of the environment. And, unless the servers are already pre-configured to be perfect for Drupal, this inability makes this option a de facto non-starter, because Drupal REQUIRES certain customizations in order to function at all.
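
For item 3, a quick way to judge a host is to compare the maximum `memory_limit` it allows against what your site needs. The sketch below parses PHP's shorthand size notation (`K`, `M`, `G`, with `-1` meaning no limit) and does the comparison. The function names are hypothetical, and the minimum you pass in is your own estimate; the numbers in the example are placeholders, not official Drupal requirements.

```python
def parse_php_size(value: str) -> int:
    """Convert a PHP shorthand size such as '128M' to bytes.

    Returns -1 for '-1', which PHP treats as "no limit" for memory_limit.
    """
    text = value.strip()
    if text == "-1":
        return -1
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)  # plain number of bytes


def meets_minimum(limit: str, minimum: str) -> bool:
    """True if the host's memory_limit is unlimited or at least `minimum`."""
    limit_bytes = parse_php_size(limit)
    return limit_bytes == -1 or limit_bytes >= parse_php_size(minimum)


if __name__ == "__main__":
    host_maximum = "64M"    # placeholder: the most a managed host allows
    site_needs = "128M"     # placeholder: your own estimate for the site
    print(meets_minimum(host_maximum, site_needs))
```

If the check fails at the host's maximum setting, no amount of configuration on your side will fix it, which is what makes this a dealbreaker rather than a tuning problem.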

Mosso's Cloud Servers solution is a "VPS" or "Virtual Private Server" solution, and is what we're currently using for the live site. We can stand up as many servers as we need (pretty much), in minutes, and configure them the way we want to. Getting a new server site-worthy can take a matter of minutes or hours, depending on the amount of customization involved. None of the issues above are obstacles when using Cloud Servers, BUT all sysadmin duties are my responsibility, and I can only use Mosso for advice up to the limits of their good will (unless the problem resides with the server itself - in which case they provide excellent support). This option is cheaper and more flexible, and can be scaled just as well as or better than Cloud Sites - it just requires the expertise to do so.