Listen Print

Improving mod_perl Sites' Performance: Part 7

Tweaking Apache Configuration

by Stas Bekman
February 05, 2003

Correct configuration of the MinSpareServers, MaxSpareServers, StartServers, MaxClients, and MaxRequestsPerChild parameters is very important. There are no defaults. If they are too low, then you will underutilize the system's capabilities. If they are too high, then chances are that the server will bring the machine to its knees.

All the above parameters should be specified on the basis of the resources you have. With a plain Apache server, it's no big deal if you run many servers since the processes are about 1Mb and don't eat a lot of your RAM. Generally, the numbers are even smaller with memory sharing. The situation is different with mod_perl. I have seen mod_perl processes of 20Mb and more. Now, if you have MaxClients set to 50, then 50x20Mb = 1Gb. Maybe you don't have 1Gb of RAM - so how do you tune the parameters? Generally, by trying different combinations and benchmarking the server. Again, mod_perl processes can be made much smaller when memory is shared.

Related Reading

Practical mod_perl

Practical mod_perl
By Stas Bekman, Eric Cholet

Table of Contents
Index
Sample Chapter

Read Online--Safari Search this book on Safari:
 

Code Fragments only

Before you start this task, you should be armed with the proper weapon. You need the crashme utility, which will load your server with the mod_perl scripts you possess. You need it to have the ability to emulate a multiuser environment and to emulate the behavior of multiple clients calling the mod_perl scripts on your server simultaneously. While there are commercial solutions, you can get away with free ones that do the same job. You can use the ApacheBench utility that comes with the Apache distribution, the crashme script which uses LWP::Parallel::UserAgent, httperf or http_load all discussed in one of the previous articles.

It is important to make sure that you run the load generator (the client which generates the test requests) on a system that is more powerful than the system being tested. After all, we are trying to simulate Internet users, where many users are trying to reach your service at once. Since the number of concurrent users can be quite large, your testing machine must be very powerful and capable of generating a heavy load. Of course, you should not run the clients and the server on the same machine. If you do, then your test results would be invalid. Clients will eat CPU and memory that should be dedicated to the server, and vice versa.

Configuration Tuning with ApacheBench

I'm going to use the ApacheBench (ab) utility to tune our server's configuration. We will simulate 10 users concurrently requesting a very light script at http://www.example.com/perl/access/access.cgi. Each simulated user makes 10 requests.


  % ./ab -n 100 -c 10 http://www.example.com/perl/access/access.cgi

The results are:


  Document Path:          /perl/access/access.cgi
  Document Length:        16 bytes
  
  Concurrency Level:      10
  Time taken for tests:   1.683 seconds
  Complete requests:      100
  Failed requests:        0
  Total transferred:      16100 bytes
  HTML transferred:       1600 bytes
  Requests per second:    59.42
  Transfer rate:          9.57 kb/s received
  
  Connnection Times (ms)
                min   avg   max
  Connect:        0    29   101
  Processing:    77   124  1259
  Total:         77   153  1360

The only numbers we really care about are:


  Complete requests:      100
  Failed requests:        0
  Requests per second:    59.42

Let's raise the request load to 100 x 10 (10 users, each making 100 requests):


  % ./ab -n 1000 -c 10  http://www.example.com/perl/access/access.cgi
  Concurrency Level:      10
  Complete requests:      1000
  Failed requests:        0
  Requests per second:    139.76

As expected, nothing changes -- we have the same 10 concurrent users. Now let's raise the number of concurrent users to 50:


  % ./ab -n 1000 -c 50  http://www.example.com/perl/access/access.cgi
  Complete requests:      1000
  Failed requests:        0
  Requests per second:    133.01

We see that the server is capable of serving 50 concurrent users at 133 requests per second! Let's find the upper limit. Using -n 10000 -c 1000 failed to get results (Broken Pipe?). Using -n 10000 -c 500 resulted in 94.82 requests per second. The server's performance went down with the high load.

The above tests were performed with the following configuration:


  MinSpareServers 8
  MaxSpareServers 6
  StartServers 10
  MaxClients 50
  MaxRequestsPerChild 1500

Now let's kill each child after it serves a single request. We will use the following configuration:


  MinSpareServers 8
  MaxSpareServers 6
  StartServers 10
  MaxClients 100
  MaxRequestsPerChild 1

Simulate 50 users each generating a total of 20 requests:


  % ./ab -n 1000 -c 50  http://www.example.com/perl/access/access.cgi

The benchmark timed out with the above configuration. I watched the output of ps as I ran it, the parent process just wasn't capable of respawning the killed children at that rate. When I raised the MaxRequestsPerChild to 10, I got 8.34 requests per second. Very bad - 18 times slower! You can't benchmark the importance of the MinSpareServers, MaxSpareServers and StartServers with this type of test.

Now let's reset MaxRequestsPerChild to 1500, but reduce MaxClients to 10 and run the same test:


  MinSpareServers 8
  MaxSpareServers 6
  StartServers 10
  MaxClients 10
  MaxRequestsPerChild 1500

I got 27.12 requests per second, which is better but still four to five times slower. (I got 133 with MaxClients set to 50.)

Summary: I have tested a few combinations of the server configuration variables (MinSpareServers, MaxSpareServers, StartServers, MaxClients and MaxRequestsPerChild). The results I got are as follows:

MinSpareServers, MaxSpareServers and StartServers are only important for user response times. Sometimes users will have to wait a bit.

The important parameters are MaxClients and MaxRequestsPerChild. MaxClients should be not too big, so it will not abuse your machine's memory resources, and not too small, for if it is, your users will be forced to wait for the children to become free to serve them. MaxRequestsPerChild should be as large as possible, to get the full benefit of mod_perl, but watch your server at the beginning to make sure your scripts are not leaking memory, thereby causing your server (and your service) to die very fast.

Also, it is important to understand that we didn't test the response times in the tests above, but the ability of the server to respond under a heavy load of requests. If the test script was heavier, then the numbers would be different but the conclusions similar.

The benchmarks were run with:

  • HW: RS6000, 1Gb RAM
  • SW: AIX 4.1.5 . mod_perl 1.16, apache 1.3.3
  • Machine running only mysql, httpd docs and mod_perl servers.
  • Machine was _completely_ unloaded during the benchmarking.

After each server restart when I changed the server's configuration, I made sure that the scripts were preloaded by fetching a script at least once for every child.

It is important to notice that none of the requests timed out, even if it was kept in the server's queue for more than a minute! That is the way ab works, which is OK for testing purposes but will be unacceptable in the real world - users will not wait for more than five to 10 seconds for a request to complete, and the client (i.e. the browser) will time out in a few minutes.

Now let's take a look at some real code whose execution time is more than a few milliseconds. We will do some real testing and collect the data into tables for easier viewing.

I will use the following abbreviations:


  NR    = Total Number of Request
  NC    = Concurrency
  MC    = MaxClients
  MRPC  = MaxRequestsPerChild
  RPS   = Requests per second

Running a mod_perl script with lots of mysql queries (the script under test is mysqld limited) (http://www.example.com/perl/access/access.cgi?do_sub=query_form), with the configuration:


  MinSpareServers        8
  MaxSpareServers       16
  StartServers          10
  MaxClients            50
  MaxRequestsPerChild 5000

gives us:


     NR   NC    RPS     comment
  ------------------------------------------------
     10   10    3.33    # not a reliable figure
    100   10    3.94    
   1000   10    4.62    
   1000   50    4.09

Conclusions: Here I wanted to show that when the application is slow (not due to perl loading, code compilation and execution, but limited by some external operation) it almost does not matter what load we place on the server. The RPS (Requests per second) is almost the same. Given that all the requests have been served, you have the ability to queue the clients, but be aware that anything that goes into the queue means a waiting client and a client (browser) that might time out!

Pages: 1, 2

Next Pagearrow





Contact Us | Advertise with Us | Privacy Policy | Press Center | Jobs | Submissions Guidelines

Copyright © 2000-2008 O’Reilly Media, Inc. All Rights Reserved. | (707) 827-7000 / (800) 998-9938
All trademarks and registered trademarks appearing on the O'Reilly Network are the property of their respective owners.

For problems or assistance with this site, email