Sign In/My Account | View Cart  
advertisement


Listen Print Discuss

Building a 3D Engine in Perl, Part 4
by Geoff Broadwell | Pages: 1, 2, 3, 4, 5, 6, 7, 8

dprofpp is Your (Obtuse) Friend

With benchmark mode enabled, the engine runs under dprofpp. The first step is to collect the profile data:

dprofpp -Q -p step065

-p step065 tells dprofpp to profile the program named step065, and -Q tells it to quit after collecting the data. dprofpp ran step065, collected the profile data, and stored it in a specially formatted text file named tmon.out in the current directory.

To turn the profile data into human-readable output, I used dprofpp without any arguments. It crunched the collected data for a while and finally produced this:

$ dprofpp
Exporter::Heavy::heavy_export_to_level has 4 unstacked calls in outer
Exporter::export_to_level has -4 unstacked calls in outer
Exporter::export has -12 unstacked calls in outer
Exporter::Heavy::heavy_export has 12 unstacked calls in outer
Total Elapsed Time = 4.838377 Seconds
  User+System Time = 1.498377 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 88.1   1.320  1.320      1   1.3200 1.3200  SDL::SetVideoMode
 38.1   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face
 16.0   0.240  0.341      8   0.0300 0.0426  SDL::OpenGL::BEGIN
 13.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex
 11.3   0.170  0.170      1   0.1700 0.1700  DynaLoader::dl_load_file
 9.34   0.140  0.020     12   0.0116 0.0017  Exporter::export
 6.67   0.100  0.100   1001   0.0001 0.0001  SDL::in
 4.00   0.060  0.060      1   0.0600 0.0600  SDL::Init
 3.34   0.050  0.847      8   0.0062 0.1059  main::BEGIN
 2.00   0.030  0.040      5   0.0060 0.0080  SDL::Event::BEGIN
 1.80   0.027  0.801     49   0.0005 0.0163  main::draw_cube
 1.47   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End
 1.33   0.020  0.020      1   0.0200 0.0200  warnings::BEGIN
 1.33   0.020  0.020     16   0.0012 0.0012  Exporter::as_heavy
 1.33   0.020  0.209      5   0.0040 0.0418  SDL::BEGIN

There are several problems with this output. The numbers are clearly silly (88 percent of its time spent in SDL::SetVideoMode?), the statistics for the various BEGIN blocks are inconsequential to the task and in the way, and the error messages at the top are rather disconcerting. To fix these issues, dprofpp has the -g option, which tells dprofpp to only display statistics for a particular routine and its descendants:

$ dprofpp -g main::main_loop
Total Elapsed Time = 4.952042 Seconds
  User+System Time = 0.812051 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 70.3   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face
 24.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex
 3.32   0.027  0.801     49   0.0005 0.0163  main::draw_cube
 2.71   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End
 1.23   0.010  0.010     49   0.0002 0.0002  SDL::OpenGL::Rotate
 1.11   0.009  0.009      7   0.0013 0.0013  main::prep_frame
 1.11   0.009  0.009     70   0.0001 0.0001  SDL::OpenGL::Color
 0.25   0.002  0.002   2947   0.0000 0.0000  SDL::OpenGL::Begin
 0.00       - -0.000      1        -      -  main::action_quit
 0.00       - -0.000      2        -      -  SDL::EventType
 0.00       - -0.000      2        -      -  SDL::Event::type
 0.00       - -0.000      7        -      -  SDL::GetTicks
 0.00       - -0.000      7        -      -  SDL::OpenGL::Clear
 0.00       - -0.000      7        -      -  SDL::OpenGL::GL_NORMALIZE
 0.00       - -0.000      7        -      -  SDL::OpenGL::GL_SPOT_EXPONENT

You may have noticed that I specified main::main_loop instead of just main_loop. dprofpp always uses fully qualified names and will give empty results if you use main_loop without the main:: package qualifier.

In this exclusive times view, the percentages in the first column and the row order depend only on the runtime of each routine, without respect to its children. Using just this view, I might have tried to optimize draw_quad_face somehow, as it appears to be the most expensive routine by a large margin. That's not the best approach, however, as an inclusive view (-I) shows:

$ dprofpp -I -g main::main_loop
Total Elapsed Time = 4.952042 Seconds
  User+System Time = 0.812051 Seconds
Inclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 100.       -  0.814      7        - 0.1163  main::do_frame
 99.9       -  0.812      1        - 0.8121  main::main_loop
 99.7       -  0.810      7        - 0.1158  main::draw_view
 99.2       -  0.806      7        - 0.1151  main::draw_frame
 98.6   0.027  0.801     49   0.0005 0.0163  main::draw_cube
 95.3   0.571  0.774    294   0.0019 0.0026  main::draw_quad_face
 24.0   0.195  0.195  64722   0.0000 0.0000  SDL::OpenGL::Vertex
 2.71   0.022  0.022   2947   0.0000 0.0000  SDL::OpenGL::End
 1.23   0.010  0.010     49   0.0002 0.0002  SDL::OpenGL::Rotate
 1.11   0.009  0.009     70   0.0001 0.0001  SDL::OpenGL::Color
 1.11   0.009  0.009      7   0.0013 0.0013  main::prep_frame
 0.25   0.002  0.002   2947   0.0000 0.0000  SDL::OpenGL::Begin
 0.00       - -0.000      1        -      -  main::action_quit
 0.00       - -0.000      2        -      -  SDL::EventType
 0.00       - -0.000      2        -      -  SDL::Event::type

In this view, draw_quad_face looks even worse, because the first column now includes the time taken by all of the OpenGL calls inside of it, including tens of thousands of glVertex calls. It seems that I should do something to speed it up, but at this point it's not entirely clear how to simplify it or reduce the number of OpenGL calls it makes (other than reducing the subdivision level of each face, which would reduce rendering quality).

Actually, there's a better option. The real problem is that draw_cube dominates the execution time, and draw_quad_face dominates that. How about not calling draw_cube (and therefore draw_quad_face) at all during normal rendering? It seems extremely wasteful to have to tell OpenGL how to render a cube face dozens of times each frame. If only there were a way to tell OpenGL to remember the cube definition once, and just refer to that definition each time the engine needs to draw it.

Pages: 1, 2, 3, 4, 5, 6, 7, 8

Next Pagearrow