Improving mod_perl Sites' Performance: Part 7
by Stas Bekman
Now we will benchmark the same script without MySQL, so that the code is limited only by Perl. The script (http://www.example.com/perl/access/access.cgi) is the same one, but it just returns the HTML form without making any SQL queries.
MinSpareServers 8
MaxSpareServers 16
StartServers 10
MaxClients 50
MaxRequestsPerChild 5000
NR       NC    RPS     comment
------------------------------------------------
10       10    26.95   # not a reliable figure
100      10    30.88
1000     10    29.31
1000     50    28.01
1000     100   29.74
10000    200   24.92
100000   400   24.95
Conclusions: This time the script we executed was pure Perl (not
limited by I/O or MySQL), so the server serves the
requests much faster. The number of requests per second
is almost the same for any load, but drops when the number of
concurrent clients goes beyond MaxClients. At 25 RPS, a
load of 400 concurrent clients will be served in
16 seconds. To be more realistic, assuming a maximum of 100
concurrent clients and 30 requests per second, the last client will be
served in about 3.3 seconds. Pretty good for a highly loaded server.
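The wait-time estimates above are simply the number of concurrent clients divided by the sustained request rate. Here is a minimal Python sketch of that arithmetic, assuming each client issues one request and the server keeps a constant rate. The function name is ours, chosen for illustration; it is not part of any benchmarking tool.

```python
def worst_case_wait(concurrent_clients, requests_per_second):
    """Seconds until the last of `concurrent_clients` simultaneous
    requests is answered, assuming a constant service rate."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return concurrent_clients / requests_per_second


if __name__ == "__main__":
    # Figures taken from the benchmark discussion above.
    print(round(worst_case_wait(400, 25), 1))  # 16.0
    print(round(worst_case_wait(100, 30), 1))  # 3.3
```

This is an upper bound for the unluckiest client; on average a client waits about half as long.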
Now we will use the server to its full capacity, by keeping all
MaxClients clients alive all the time and having a big
MaxRequestsPerChild, so that no child will be killed during the
benchmarking.
MinSpareServers 50
MaxSpareServers 50
StartServers 50
MaxClients 50
MaxRequestsPerChild 5000
NR       NC    RPS     comment
------------------------------------------------
100      10    32.05
1000     10    33.14
1000     50    33.17
1000     100   31.72
10000    200   31.60
Conclusion: In this scenario, there is no overhead involving the parent server loading new children, all the servers are available, and the only bottleneck is contention for the CPU.
Now we will change MaxClients and watch the results: Let's reduce
MaxClients to 10.
MinSpareServers 8
MaxSpareServers 10
StartServers 10
MaxClients 10
MaxRequestsPerChild 5000
NR       NC    RPS     comment
------------------------------------------------
10       10    23.87   # not a reliable figure
100      10    32.64
1000     10    32.82
1000     50    30.43
1000     100   25.68
1000     500   26.95
2000     500   32.53
Conclusions: Very little difference! Ten servers were able to
serve with almost the same throughput as 50. Why? My guess
is because of CPU throttling. It seems that 10 servers were serving
requests five times faster than when we worked with 50 servers. With
50 servers, each child received its CPU time slice five times less
frequently. So having a big value for MaxClients doesn't mean
that the performance will be better. You have just seen the numbers!
Now we will start to drastically reduce MaxRequestsPerChild:
MinSpareServers 8
MaxSpareServers 16
StartServers 10
MaxClients 50
NR       NC    MRPC   RPS     comment
------------------------------------------------
100      10    10     5.77
100      10    5      3.32
1000     50    20     8.92
1000     50    10     5.47
1000     50    5      2.83
1000     100   10     6.51
Conclusions: When we drastically reduce MaxRequestsPerChild, the
performance starts to approach that of plain mod_cgi.
Here are the numbers of this run with mod_cgi, for comparison:
MinSpareServers 8
MaxSpareServers 16
StartServers 10
MaxClients 50
NR       NC    RPS     comment
------------------------------------------------
100      10    1.12
1000     50    1.14
1000     100   1.13
Conclusion: mod_cgi is much slower. :) In the first test, when NR/NC was 100/10, mod_cgi was capable of 1.12 requests per second. In the same circumstances, mod_perl was capable of 32 requests per second, nearly 30 times faster! In the first test, each client waited about 100 seconds to be served. In the second and third tests, they waited 1,000 seconds!
Choosing MaxClients
The MaxClients directive sets the limit on the number of
simultaneous requests that can be supported. No more than this number
of child server processes will be created. To configure more than 256
clients, you must edit the HARD_SERVER_LIMIT entry in httpd.h
and recompile. In our case, we want this variable to be as small as
possible, so we can limit the resources used by the
server children. Since we can restrict each child's process size with
Apache::SizeLimit or Apache::GTopLimit, the calculation of
MaxClients is pretty straightforward:
             Total RAM Dedicated to the Webserver
MaxClients = ------------------------------------
                   MAX child's process size
So if I have 400MB left for the Web server to run with, then I can set
MaxClients to 40 if I know that each child is limited to 10MB
of memory (e.g. with Apache::SizeLimit).
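The same calculation can be written as a short Python sketch. It assumes no memory is shared between children and that every child may grow to its size limit; the function names are illustrative only.

```python
def max_clients(total_ram_mb, max_child_size_mb):
    """Largest number of children that fit into the RAM dedicated to
    the web server when no memory is shared between them."""
    if max_child_size_mb <= 0:
        raise ValueError("max_child_size_mb must be positive")
    return int(total_ram_mb // max_child_size_mb)


def ram_needed_mb(clients, max_child_size_mb):
    """The inverse question: RAM required to run `clients` children."""
    return clients * max_child_size_mb


if __name__ == "__main__":
    print(max_clients(400, 10))    # 40
    print(ram_needed_mb(50, 10))   # 500
```

The integer division rounds down on purpose: a fractional child would push the machine into swap.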
You may be wondering what will happen to your server if there are
more concurrent users than MaxClients at any time. This situation
is signaled by the following warning message in the error_log:
[Sun Jan 24 12:05:32 1999] [error] server reached MaxClients setting,
consider raising the MaxClients setting
There is no problem -- any connection attempts over the MaxClients
limit will normally be queued, up to a number based on the
ListenBacklog directive. When a child process is freed at the end
of a different request, the connection will be served.
The message is logged as an error because clients are being put in the queue rather than getting served immediately, even though they do not get an error response. The condition can be allowed to persist to balance available system resources against response time, but sooner or later you will need to get more RAM so you can start more child processes. The best approach is to avoid reaching this condition at all; if you reach it often, you should start to worry about it.
It's important to understand how much real memory a child occupies.
Your children can share memory between them when the OS supports that.
You must take action to allow the sharing to happen. We
discussed this in one of the previous articles, whose main topic was
shared memory. If you do this, then chances are that your MaxClients
can be even higher. But it is not so simple to calculate
the absolute number. If you come up with a solution, then please let us
know! If the shared memory were the same size throughout the
child's life, then we could derive a much better formula:
             Total_RAM + Shared_RAM_per_Child * (MaxClients - 1)
MaxClients = ---------------------------------------------------
                              Max_Process_Size

which is:

                Total_RAM - Shared_RAM_per_Child
MaxClients = ---------------------------------------
             Max_Process_Size - Shared_RAM_per_Child
Let's roll some calculations:

Total_RAM            = 500MB
Max_Process_Size     = 10MB
Shared_RAM_per_Child = 4MB

             500 - 4
MaxClients = ------- = 82
             10 - 4

With no sharing in place:

             500
MaxClients = --- = 50
             10
With sharing in place you can have 64 percent more servers without buying more RAM.
Suppose you improve sharing further and the shared size stays at that level, let's say:

Total_RAM            = 500MB
Max_Process_Size     = 10MB
Shared_RAM_per_Child = 8MB

             500 - 8
MaxClients = ------- = 246
             10 - 8
392 percent more servers! Now you can feel the importance of having as much shared memory as possible.
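The worked examples above can be reproduced with a few lines of Python. This is only a sketch of the formula, under the same assumption the article makes: the shared portion keeps the same size throughout the child's life. The function names are ours.

```python
def max_clients_shared(total_ram_mb, max_process_mb, shared_mb=0):
    """MaxClients = (Total_RAM - Shared) / (Max_Process_Size - Shared),
    rounded down. With shared_mb=0 this is the plain RAM / size formula."""
    unshared_mb = max_process_mb - shared_mb
    if unshared_mb <= 0:
        raise ValueError("shared_mb must be smaller than max_process_mb")
    return int((total_ram_mb - shared_mb) // unshared_mb)


def percent_gain(with_sharing, without_sharing):
    """How many percent more servers sharing buys, rounded down."""
    return (with_sharing - without_sharing) * 100 // without_sharing


if __name__ == "__main__":
    baseline = max_clients_shared(500, 10)        # 50
    some = max_clients_shared(500, 10, 4)         # 82
    more = max_clients_shared(500, 10, 8)         # 246
    print(baseline, some, more)
    print(percent_gain(some, baseline))           # 64
    print(percent_gain(more, baseline))           # 392
```

Note how the gain is driven by the unshared part of each process: halving it from 4MB to 2MB matters far more than the total RAM does.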
Choosing MaxRequestsPerChild
The MaxRequestsPerChild directive sets the limit on the number of
requests that an individual child server process will handle. After
MaxRequestsPerChild requests, the child process will die. If
MaxRequestsPerChild is 0, then the process will live forever.
Setting MaxRequestsPerChild to a non-zero limit solves some memory
leakage problems caused by sloppy programming practices, whereby a
child process consumes more memory after each request.
If left unbounded, then after a certain number of requests the children will use up all the available memory and leave the server to die from memory starvation. Note that sometimes standard system libraries leak memory too, especially on OSes with bad memory management (e.g. Solaris 2.5 on x86 arch).
If this is the case for you, then you can set MaxRequestsPerChild to a small
number. This will allow the system to reclaim the memory that a
greedy child process consumed, when it exits after
MaxRequestsPerChild requests.
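To see how the restart interval bounds memory use, here is a rough Python model. Everything in it is hypothetical: it assumes each child starts at a fixed size and leaks a constant amount per request, which real leaks rarely do so neatly. The numbers are made up for illustration and are not measurements.

```python
def peak_child_size_mb(base_mb, leak_kb_per_request, max_requests_per_child):
    """Size of a child just before it exits, under a constant-leak model."""
    return base_mb + leak_kb_per_request * max_requests_per_child / 1024


def peak_total_mb(clients, base_mb, leak_kb_per_request, max_requests_per_child):
    """Worst case: every child is about to reach its request limit."""
    return clients * peak_child_size_mb(
        base_mb, leak_kb_per_request, max_requests_per_child
    )


if __name__ == "__main__":
    # Hypothetical: 50 children, 6MB each at startup, leaking 2KB per request.
    print(peak_total_mb(50, 6, 2, 5120))  # 800.0
    print(peak_total_mb(50, 6, 2, 1024))  # 400.0
```

In this toy model, cutting the limit from 5120 to 1024 requests halves the worst-case footprint, at the price of restarting children five times as often.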
But beware -- if you set this number too low, you will lose some of
the speed bonus you get from mod_perl. Consider using
Apache::PerlRun if this is the case.
Another approach is to use the Apache::SizeLimit or the
Apache::GTopLimit modules. By using either of these modules you
should be able to stop using MaxRequestsPerChild,
although for some developers, using both in combination does the
job. In addition, the latter module allows you to kill any servers
whose shared memory size drops below a specified limit.
References
- The mod_perl site's URL: http://perl.apache.org/
- Apache::GTopLimit: http://search.cpan.org/search?dist=Apache-GTopLimit

