Improving mod_perl Sites' Performance: Part 8
Tweaking Apache Configuration Continued
by Stas Bekman
March 04, 2003
In this article we continue talking about how to optimize your site for performance without touching code, buying new hardware or casting spells. A few simple httpd.conf configuration changes can improve performance tremendously.
Choosing MinSpareServers, MaxSpareServers and StartServers
With mod_perl enabled, it might take as much as 20 seconds from the
time you start the server until it is ready to serve incoming
requests. This delay depends on the OS, the number of preloaded
modules and the process load of the machine. It's best to set
StartServers and MinSpareServers to high numbers, so that if you
get a high load just after the server has been restarted, the fresh
servers will be ready to serve requests immediately. With mod_perl,
it's usually a good idea to raise all three variables higher than normal.
In order to maximize the benefits of mod_perl, you don't want to kill
servers when they are idle; rather, you want them to stay up and
available to handle new requests immediately. I think an ideal
configuration is to set MinSpareServers and MaxSpareServers to
similar values, maybe even the same. Having the MaxSpareServers
close to MaxClients will completely use all of your resources (if
MaxClients has been chosen to take the full advantage of the
resources), but it'll make sure that at any given moment your system
will be capable of responding to requests with the maximum speed
(assuming that the number of concurrent requests is not higher than
MaxClients).
Let's try some numbers. For a heavily loaded Web site on a dedicated machine, I would think of something like this (note that 400MB is just an example):
RAM available to the Web server: 400MB
Maximum child process size: 10MB
MaxClients: 400/10 = 40 (larger with memory sharing)
StartServers: 20
MinSpareServers: 20
MaxSpareServers: 35
However, if I want to use the server for many other tasks, but make it capable of handling a high load, I'd try:
RAM available to the Web server: 400MB
Maximum child process size: 10MB
MaxClients: 400/10 = 40
StartServers: 5
MinSpareServers: 5
MaxSpareServers: 10
These numbers are taken off the top of my head, and shouldn't be used as a rule, but rather as examples to show you some possible scenarios. Use this information with caution.
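The MaxClients arithmetic in the two scenarios above can be sketched in a few lines of Perl. This is only an illustration: the function name max_clients and the 4MB shared-memory figure are made up for the example, and the sharing adjustment simply counts the shared pages once instead of once per child.

```perl
use strict;

# Rough MaxClients estimate. All sizes are in MB.
#   $ram_mb    - RAM you are willing to give to the Web server
#   $child_mb  - largest size a single child is allowed to reach
#   $shared_mb - part of each child that is shared (optional, default 0)
sub max_clients {
    my ($ram_mb, $child_mb, $shared_mb) = @_;
    $shared_mb = 0 unless defined $shared_mb;
    die "child size must be larger than the shared size\n"
        if $child_mb <= $shared_mb;

    # Shared memory is paid for once; only the unshared part of each
    # child is multiplied by the number of children.
    return int( ($ram_mb - $shared_mb) / ($child_mb - $shared_mb) );
}

printf "MaxClients %d\n", max_clients(400, 10);     # prints "MaxClients 40"
printf "MaxClients %d\n", max_clients(400, 10, 4);  # prints "MaxClients 66"
```

With no sharing this reproduces the 400/10 = 40 figure. The second call shows why the note says "larger with memory sharing". Measure the real child and shared sizes on your own system before relying on either number.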
Summary of Benchmarking to Tune All 5 Parameters
OK, we've run various benchmarks -- let's summarize the conclusions:
- MaxRequestsPerChild
If your scripts are clean and don't leak memory, then set this variable to a number as large as possible (10000?). If you use Apache::SizeLimit, then you can set this parameter to 0 (treated as infinity). You will want this parameter to be smaller if your code becomes gradually more unshared over the process' life. As well as this, Apache::GTopLimit can help, with its shared memory limitation feature.
- StartServers
If you keep a small number of servers active most of the time, then keep this number low. Keep it low especially if MaxSpareServers is also low, because if there is no load, Apache will kill its children before they have been utilized at all. If your service is heavily loaded, then make this number close to MaxClients, and keep MaxSpareServers equal to MaxClients.
- MinSpareServers
If your server performs other work besides Web serving, then make this low so the memory of unused children will be freed when the load is light. If your server's load varies (you get loads in bursts) and you want fast response for all clients at any time, then you will want to make it high, so that new children will be respawned in advance and are waiting to handle bursts of requests.
- MaxSpareServers
The logic is the same as for MinSpareServers: low if you need the machine for other tasks, high if it's a dedicated Web host and you want a minimal delay between the request and the response.
- MaxClients
Not too low, so you don't get into a situation where clients are waiting for the server to start serving them (they might wait, but not for very long). However, do not set it too high. With a high MaxClients, if you get a high load, then the server will try to serve all requests immediately. Your CPU will have a hard time keeping up, and if the child size * number of running children is larger than the total available RAM, then your server will start swapping. This will slow down everything, which in turn will make things even slower, until eventually your machine will die. It's important that you take pains to ensure that swapping does not normally happen. Swap space is an emergency pool, not a resource to be used routinely. If you are low on memory and you badly need it, then buy it. Memory is cheap.
But based on the test I conducted above, even if you have plenty of memory like I have (1GB), increasing MaxClients sometimes will give you no improvement in performance. The more clients are running, the more CPU time will be required and the fewer CPU time slices each process will receive. The response latency (the time to respond to a request) will grow, so you won't see the expected improvement. The best approach is to find the minimum requirement for your kind of service and the maximum capability of your machine. Then start at the minimum and test as I did, successively raising this parameter until you find the region on the curve of the graph of latency and/or throughput against MaxClients where the improvement starts to diminish. Stop there and use it. When you make the measurements on a production server you will have the ability to tune them more precisely, since you will see the real numbers.
Don't forget that if you add more scripts, or even just modify the existing ones, then the processes will grow in size as you compile in more code. When you do this, your parameters probably will need to be recalculated.
KeepAlive
If your mod_perl server's httpd.conf includes the following directives:
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
you have a real performance penalty, since after completing the
processing for each request, the process will wait for
KeepAliveTimeout seconds before closing the connection and will
therefore not be serving other requests during this time. With this
configuration, you will need many more concurrent processes on a server
with high traffic.
If you use some server status reporting tools, then you will see the
process in K status when it's in KeepAlive status.
The chances are that you don't want this feature enabled. Set it Off with:
KeepAlive Off
The other two directives don't matter if KeepAlive is Off.
You might want to consider enabling this option if the client's
browser needs to request more than one object from your server for a
single HTML page. If this is the situation, then by setting
KeepAlive On you will save
the HTTP connection overhead for all requests but the first one for each
page.
For example: If you have a page with 10 ad banners, which is not
uncommon today, then your server will work more effectively if a single
process serves them all during a single connection. However, your
client will see a slightly slower response, since banners will be
brought one at a time and not concurrently as is the case if each
IMG tag opens a separate connection.
Since keepalive connections will not incur the additional three-way TCP handshake, turning it on will be kinder to the network.
SSL connections benefit the most from KeepAlive in cases
where you haven't configured the server to cache session ids.
You have probably followed the usual advice to send all the requests for
static objects to a plain Apache server. Since most pages include
more than one unique static image, you should keep the default
KeepAlive setting of the non-mod_perl server, i.e. keep it On.
It will probably be a good idea also to reduce the timeout a little.
One option would be for the proxy/accelerator to keep the connection open to the client but make individual connections to the server, read the response, buffer it for sending to the client and close the server connection. Obviously, you would make new connections to the server as required by the client's requests.
Also, you should know that KeepAlive requests only work with
responses that contain a Content-Length header. To send this header
do:
$r->header_out('Content-Length', $length);
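To put that one-liner in context, here is a minimal sketch that assumes the whole response body has been built in memory before anything is sent. The helper name send_with_length is hypothetical; header_out(), send_http_header() and print() are the mod_perl 1.x request object methods.

```perl
use strict;

# Hypothetical helper: send a complete body with an explicit
# Content-Length header. $r is the mod_perl 1.x request object.
sub send_with_length {
    my ($r, $content_type, $body) = @_;

    # length() matches the number of bytes on the wire only if $body
    # holds plain bytes (no wide characters).
    my $length = length $body;

    $r->header_out('Content-Length', $length);  # must be set before the headers go out
    $r->send_http_header($content_type);
    $r->print($body);

    return $length;
}
```

Inside a handler or a registry script you might call it as send_with_length($r, 'text/plain', $output). The trade-off is that the output has to be buffered first, since the length is not known until the body is complete.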
PerlSetupEnv Off
PerlSetupEnv Off is another optimization you might consider. This
directive requires mod_perl 1.25 or later.
mod_perl fiddles with the environment to make it appear as if the
script were being called under the CGI protocol. For example, the
$ENV{QUERY_STRING} environment variable is initialized with the
contents of Apache::args(), and the value returned by
Apache::server_hostname() is put into $ENV{SERVER_NAME}.
But %ENV population is expensive. Those who have moved to the Perl
Apache API no longer need this extra %ENV population, and can gain by
turning it Off. Scripts using the CGI.pm module require
PerlSetupEnv On because that module relies on a properly populated
CGI environment table.
By default, PerlSetupEnv is On.
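As a rough illustration of what "moving to the Perl Apache API" means here, the sketch below reads the query string and server name from the request object instead of from %ENV, so it does not depend on PerlSetupEnv. The helper name request_info is hypothetical, and the sketch assumes the mod_perl 1.x API, where the server name is reached through the server object.

```perl
use strict;

# Hypothetical helper: fetch request details through the mod_perl 1.x
# API rather than through %ENV.
sub request_info {
    my $r = shift;                            # the Apache request object

    my $query = $r->args;                     # scalar context: the raw query string
    my $host  = $r->server->server_hostname;  # what would be $ENV{SERVER_NAME}

    return ($query, $host);
}
```

In a registry script you could call it as my ($query, $host) = request_info(Apache->request);.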
Note that you can still set environment variables when PerlSetupEnv is turned Off. For example, when you use the following configuration:
PerlSetupEnv Off
PerlModule Apache::RegistryNG
<Location /perl>
PerlSetEnv TEST hi
SetHandler perl-script
PerlHandler Apache::RegistryNG
Options +ExecCGI
</Location>
and issue a request (for example http://localhost/perl/setupenvoff.pl) for this script:
setupenvoff.pl
--------------
use Data::Dumper;
my $r = Apache->request();
$r->send_http_header('text/plain');
print Dumper(\%ENV);
you should see something like this:
$VAR1 = {
'GATEWAY_INTERFACE' => 'CGI-Perl/1.1',
'MOD_PERL' => 'mod_perl/1.25',
'PATH' => '/usr/lib/perl5/5.00503:... snipped ...',
'TEST' => 'hi'
};
Notice that we have got the value of the environment variable TEST.


