What's this all about?
For those of you who are not very familiar with how servers work but are still interested I thought I'd lay out the inner workings, at least the very basics.
Marty and his servers
I got my first server back in 1999. It was a Cobalt RAQ server with a 450 MHz CPU, 512 MB RAM and a 40 GB HD running Red Hat Linux, not bad at the time I must tell you.
The server was actually quite capable of running a couple of medium sites at reasonable speeds in those days. But of course, websites were much simpler back then: very few actually used an SQL database, most relied on news scripts that stored their data in files, and they had nowhere near as many visitors as we do here.
Anyway, the server we have here on Juventuz has an Intel quad-core CPU (4 x 2.8 GHz), 24 GB RAM and 2 x 1500 GB hard drives in RAID (mirrored). It's running Debian Linux. Set up correctly (I'll get to that later), it's very capable of coping with many more visitors than we see here.
What is a server?
A website server for a modern website has 3 main functions. It has of course a lot of other functions but I won't go into detail there. The main functions for a modern website are:
1) A webserver (for example Apache)
2) A scripting language (for example PHP)
3) A database server (for example MySQL)
Like I said before, today's websites are often advanced. Almost every modern website, from blogs to news sites, relies on an SQL database (often MySQL but there are others) where the dynamic content is stored. You set up your site to connect to the database where it can read and write its data. Here on the forum for example we use vBulletin, which uses PHP functions that connect to an SQL database.
What happens when you visit the forums?
When you view the main page of the forums the server does its thingymabob (won't go into detail about vhost structure, Apache modules etc, there's too much to explain) and you see the content: Apache (our type of webserver), with help from PHP, reads the file forum.php, makes a connection to MySQL (our type of database server), fetches the data and sends the finished page to your browser.
Now, say you start a thread. Once you've filled out the title, content and other options and pressed "Submit New Thread", the forum first validates the data and then updates the database through SQL queries. Not only must it update the threads and posts tables, it also needs to write your session to the database along with some info about yourself (what page you last visited, how many posts you've made after this new thread etc).
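To make that a bit more concrete, here is a rough sketch of the kind of work one "Submit New Thread" click triggers. It's written in Python with SQLite just so it runs anywhere. The table and column names are made up for illustration and are much simpler than the real vBulletin schema, and `submit_new_thread` is a hypothetical helper, not vBulletin code.

```python
import sqlite3

# Hypothetical, heavily simplified schema (NOT the real vBulletin tables).
SCHEMA = """
CREATE TABLE thread  (threadid INTEGER PRIMARY KEY, title TEXT, userid INTEGER);
CREATE TABLE post    (postid INTEGER PRIMARY KEY, threadid INTEGER, userid INTEGER, body TEXT);
CREATE TABLE user    (userid INTEGER PRIMARY KEY, posts INTEGER);
CREATE TABLE session (userid INTEGER PRIMARY KEY, location TEXT);
"""

def submit_new_thread(db, userid, title, body):
    """Validate the form data, then run the writes a new thread needs."""
    title, body = title.strip(), body.strip()
    if not title or not body:
        raise ValueError("title and content are required")
    with db:  # one transaction: either every write succeeds or none do
        cur = db.execute(
            "INSERT INTO thread (title, userid) VALUES (?, ?)", (title, userid))
        threadid = cur.lastrowid
        db.execute(
            "INSERT INTO post (threadid, userid, body) VALUES (?, ?, ?)",
            (threadid, userid, body))
        # Bump the user's post count and record where they are now.
        db.execute("UPDATE user SET posts = posts + 1 WHERE userid = ?", (userid,))
        db.execute(
            "INSERT OR REPLACE INTO session (userid, location) VALUES (?, ?)",
            (userid, f"thread {threadid}"))
    return threadid

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO user (userid, posts) VALUES (1, 41)")
db.commit()

thread_id = submit_new_thread(db, 1, "Server upgrade", "How the new setup works")
post_count = db.execute("SELECT posts FROM user WHERE userid = 1").fetchone()[0]
session_location = db.execute(
    "SELECT location FROM session WHERE userid = 1").fetchone()[0]
```

Even in this toy version a single click costs four write queries, and that's before the forum has read anything back to render the next page.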
Basically, any time you visit any part of the forum it has to perform a certain number of queries depending on which function of the forum you're using. For vBulletin, a good guess is an average of 15-20 SQL queries for every person, per page. So if you have 300 users online, that's 4,500-6,000 queries if they all loaded a page at once. Granted, they don't reload the page at the same time, but still. Anyway, the database server needs to put up with a shitload of work.
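If you want to play with the numbers yourself, here's the math as a few lines of Python. The 15-20 queries per page and the 300 users are the figures from above. The "one page view per user per minute" rate is purely my assumption for the example, not something measured on this forum.

```python
USERS_ONLINE = 300
QUERIES_PER_PAGE_LOW, QUERIES_PER_PAGE_HIGH = 15, 20  # rough vBulletin guess

def queries_per_second(users, queries_per_page, page_views_per_user_per_minute):
    """Average query rate the database server sees."""
    return users * queries_per_page * page_views_per_user_per_minute / 60

# Worst case: every user loads a page at the same moment.
burst_low = USERS_ONLINE * QUERIES_PER_PAGE_LOW     # 4500 queries
burst_high = USERS_ONLINE * QUERIES_PER_PAGE_HIGH   # 6000 queries

# Assumed steady state: each user loads one page per minute.
steady_low = queries_per_second(USERS_ONLINE, QUERIES_PER_PAGE_LOW, 1)    # 75.0
steady_high = queries_per_second(USERS_ONLINE, QUERIES_PER_PAGE_HIGH, 1)  # 100.0

print(f"burst: {burst_low}-{burst_high} queries")
print(f"steady: {steady_low:.0f}-{steady_high:.0f} queries per second")
```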
How do we speed this up?
Ah, now we've come to the good part.
There are many different ways of speeding up a website. There are ways to speed it up for the end user (you) and for the backend (the server). The most efficient methods are proxying, caching and compressing files. You can cache both files and datastore items.
For this I have installed the following on the server:
1) Varnish Cache (reverse proxy)
2) libapache2-mod-rpaf (an Apache module which lets Apache see your real IP, as passed along by the proxy, instead of the proxy's own IP)
3) XCache (PHP opcode cache)
4) Sphinx Search (will get to that later)
Varnish Cache
Varnish Cache is a reverse proxy HTTP cache that you put in front of your webserver (Apache). You tell it to listen on the default port where Apache normally would listen (port 80) and then tell Apache to listen on another port. Varnish checks if the data is already cached on the server and if so it serves it to you directly without getting that bloated Apache involved = very fast delivery! If the data is not cached, Varnish makes a sad face and forwards your request to Apache for processing (libapache2-mod-rpaf makes sure Apache still sees your IP and not the proxy's). At the same time, if possible (it depends on the rules you've given it, and some data can't be cached at all), it caches the page/data in memory for a set amount of time. So if you view an image, it'll be cached for the next person to see = speedy delivery.
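The hit/miss logic is easier to see in code. Below is a toy version in Python. This is not how Varnish is implemented or configured (Varnish uses its own config language for the rules). `TinyProxyCache`, `slow_backend` and the URLs are invented for the example, with the backend function standing in for Apache.

```python
import time

class TinyProxyCache:
    """Toy reverse-proxy cache: serve from memory on a hit, ask the backend on a miss."""

    def __init__(self, backend, ttl_seconds, clock=time.monotonic):
        self.backend = backend      # function standing in for Apache
        self.ttl = ttl_seconds      # how long a cached copy stays valid
        self.clock = clock
        self.store = {}             # url -> (expires_at, body)
        self.hits = 0
        self.misses = 0

    def get(self, url, cacheable=True):
        now = self.clock()
        entry = self.store.get(url)
        if cacheable and entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]         # hit: the backend is never touched
        self.misses += 1
        body = self.backend(url)    # miss: forward the request to the backend
        if cacheable:
            self.store[url] = (now + self.ttl, body)
        return body

backend_calls = []

def slow_backend(url):
    backend_calls.append(url)
    return f"content of {url}"

cache = TinyProxyCache(slow_backend, ttl_seconds=60)
cache.get("/images/logo.png")                # miss, goes to the backend
cache.get("/images/logo.png")                # hit, served from memory
cache.get("/usercp.php", cacheable=False)    # personal page: never cached
cache.get("/usercp.php", cacheable=False)

print(cache.hits, cache.misses, len(backend_calls))
```

The `cacheable` flag plays the role of the rules mentioned above: static stuff like images can be shared between visitors, while pages that depend on who is logged in have to go to the backend every time.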
XCache
XCache is an open-source opcode cacher, which means that it accelerates the performance of PHP on servers. It removes the compilation time of PHP scripts by caching their compiled state in memory (RAM) and running the compiled version straight from RAM. This can speed up page generation by up to 5 times, as it also optimizes many other aspects of PHP scripts, and it reduces server load.
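The same idea can be sketched in Python, which also compiles source files to bytecode before running them. This is only an analogy to show the principle, not how XCache works internally. `run_script` and the cache layout are made up for the example.

```python
import os
import tempfile

_code_cache = {}    # path -> (mtime_ns, compiled code object)
compile_count = 0   # how many times we actually had to compile

def run_script(path):
    """Run a script, recompiling only if the file changed since last time."""
    global compile_count
    mtime = os.stat(path).st_mtime_ns
    cached = _code_cache.get(path)
    if cached is None or cached[0] != mtime:
        with open(path, encoding="utf-8") as f:
            source = f.read()
        code = compile(source, path, "exec")   # the expensive step we want to skip
        compile_count += 1
        _code_cache[path] = (mtime, code)
    else:
        code = cached[1]                        # reuse the compiled version from RAM
    namespace = {}
    exec(code, namespace)
    return namespace

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "page.py")
    with open(script, "w", encoding="utf-8") as f:
        f.write("result = sum(range(10))\n")
    first = run_script(script)["result"]    # compiles, then runs
    second = run_script(script)["result"]   # runs straight from the cache

print(first, second, compile_count)
```

The script still runs on every request, so the output is always fresh. Only the compile step is skipped, which is why an opcode cache is safe to use on dynamic pages.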
Sphinx Search
I've been through this briefly in the other thread. What it does is replace the built-in vBulletin search function, which stores all its data in the database. That's bad when you have a large forum, as it takes ages to load a search result (new posts for example). Instead of letting you bastards abuse the MySQL server, Sphinx takes care of searches with its own index and serves you the results. I'm still having some issues with this though, as the Sphinx daemon seems to crash about once a day, so I'll have to get that sorted. As with Varnish, this is awesome for scalability (for example Craigslist uses it for over 50 million searches per day).
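For the curious, the reason a dedicated search engine is quick is that it keeps an index mapping words to the posts that contain them, so a search is a lookup instead of a crawl through the posts table. Here's a bare-bones Python sketch of that idea. The real Sphinx index is far more sophisticated, and the function names and sample posts are invented for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(posts):
    """Map every word to the set of post IDs that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in tokenize(text):
            index[word].add(post_id)
    return index

def search(index, query):
    """Return the IDs of posts containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)

posts = {
    1: "Varnish sits in front of Apache",
    2: "Apache talks to MySQL through PHP",
    3: "Sphinx keeps its own search index",
}
index = build_index(posts)

apache_hits = search(index, "apache")      # [1, 2]
both_words = search(index, "Apache PHP")   # [2]
no_hits = search(index, "nginx")           # []
print(apache_hits, both_words, no_hits)
```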
Conclusion
This setup allows for scalability you could never achieve with just Apache + MySQL (+ Cache). For instance, Varnish can speed up delivery of cached content by anywhere from 300 to 1000 times depending on various factors. Huge sites you see today like Facebook, Twitter, Digg etc. all use this kind of system because of scalability. Sure, they've got hundreds of servers but the principle is the same. Then you add XCache, which caches compiled PHP scripts on the fly, and you have one very capable server.
I'll show this with a little diagram I've snagged off the internet. I've taken a slightly different route but the outline is the same. Here they're calling it vBBoost but it's the same thing = proxy cache.
Benchmarks
Here you need to look at the rps (requests per second) reported by ApacheBench, Time Taken (sec) and CPU Load.
Compare the values for Apache + PHP + XCache (what we used to have) with Apache + Varnish + PHP + XCache (what we have now).
* Note: I didn't run these benchmarks myself, but our results would be similar.

