Sign In/My Account | View Cart  
advertisement


Listen Print

A Refactoring Example
by Michael Schwern | Pages: 1, 2, 3, 4, 5

Getting Better All the Time

The new names are better, but they're not great. This is ok. Naming is something people often get hung up on. One can spend hours wracking their brains thinking of the perfect name for a variable or a function. If you can think of a better one than what's there right now, use it. The beauty of Refactoring is you an always improve upon it later.

This is an important lesson of Refactoring. Voltare said, "the best is the enemy of the good". We often get so wound up trying to make code great that we fail to improve it at all. In refactoring, it's not so important to make your code great in one leap, just a little better all the time (it's a little known fact John Lennon was into Refactoring.) These small improvements will build up into a clean piece of code, with less bugs, more surely than a large-scale code cleanup would.

Pull Code Out of the Loop

Now it's a simple cut and paste to pull the four prepare() statements out of the loop.

my $totalvisit_sth = $dbh->prepare('SELECT count(*) FROM lognew WHERE 
                                    pageid=? and data>=?');

my $totalvisittoday_sth = $dbh->prepare('SELECT count(*) FROM lognew WHERE
                                         (pageid=? and data=?)');

my $totalyvisit_sth = $dbh->prepare('SELECT count(*) FROM lognew WHERE
                                     (pageid=? and data=?)');

my $totalmvisit_sth = $dbh->prepare('SELECT count(*) FROM lognew WHERE
                                     (pageid=? and data<=? and data>=?)');

open (INPUT, "< $filepageid") || &file_open_error("$filepageid");

while ($riga=<INPUT>){
    $nump++;
    chop($riga);
    $pagina[$nump] = $riga;

    $totalvisit_sth->execute($pagina[$nump], $startdate);
    $totalvisit[$nump] = $totalvisit_sth->fetchrow_array();

    $totalvisittoday_sth->execute($pagina[$nump], $dataoggi);
    $totalvisittoday[$nump] = $totalvisittoday_sth->fetchrow_array();

    $totalyvisit_sth->execute($pagina[$nump], $dataieri);
    $totalyvisit[$nump] = $totalyvisit_sth->fetchrow_array();

    $totalmvisit_sth->execute($pagina[$nump], $fine30gg, $inizio30gg);
    $totalmvisit[$nump] = $totalmvisit_sth->fetchrow_array();
}

close (INPUT);

Already the code is looking better. With the SQL separated, the inner workings of the loop are much less daunting.

Test. Log. Check in.

A Place to Stop

Remember our goal, to make this code run faster. By pulling the prepare() statements outside the loop we've likely achieved this goal. Additionally, it still does exactly what it did before even though we still don't fully understand what that is. If this were a real project, you'd do some benchmarking to see if the code is fast enough and move on to another task.

Since this is an example, I'll continue with more refactorings with the goal of clarifying the code further and figuring out what it does.

Keep in mind that after every refactoring the code still does exactly what it did before. This means we can stop choose to stop after any refactoring. If a more pressing task suddenly pops up we can pause our refactoring work and attend to that feeling confident we didn't leave any broken code lying around.

Reformat SQL for Better Readability

In order to make sense of the code, we have to make sense of the SQL. The simplest way to better understand the SQL is to put it into a clearer format.

The three major parts of an SQL SELECT statement are:

  • The rows (ie. SELECT count(*))
  • The table (ie. FROM lognew)
  • The predicate (ie. WHERE pageid = ...)

I've chosen a new format that highlights these parts.

I've also removed some unnecessary parenthesis because they just serve to clutter things up rather than disambiguate an expression.

I've also decided to change the quoting style from single quotes to a here-doc. It would have also been okay to use q{}.

my $totalvisit_sth = $dbh->prepare(<<'SQL');
SELECT count(*)
FROM   lognew 
WHERE  pageid =  ? AND 
       data   >= ?
SQL

my $totalvisittoday_sth = $dbh->prepare(<<'SQL');
SELECT count(*) 
FROM   lognew 
WHERE  pageid = ? AND 
       data   = ?
SQL

my $totalyvisit_sth = $dbh->prepare(<<'SQL');
SELECT count(*) 
FROM   lognew 
WHERE  pageid = ? AND 
       data   = ?
SQL

my $totalmvisit_sth = $dbh->prepare(<<'SQL');
SELECT count(*) 
FROM   lognew 
WHERE  pageid =  ? AND
       data   <= ? AND 
       data   >= ?
SQL

open (INPUT, "< $filepageid") || &file_open_error("$filepageid");

while ($riga=<INPUT>){
    $nump++;
    chop($riga);
    $pagina[$nump] = $riga;

    $totalvisit_sth->execute($pagina[$nump], $startdate);
    $totalvisit[$nump] = $totalvisit_sth->fetchrow_array();

    $totalvisittoday_sth->execute($pagina[$nump], $dataoggi);
    $totalvisittoday[$nump] = $totalvisittoday_sth->fetchrow_array();

    $totalyvisit_sth->execute($pagina[$nump], $dataieri);
    $totalyvisit[$nump] = $totalyvisit_sth->fetchrow_array();

    $totalmvisit_sth->execute($pagina[$nump], $fine30gg, $inizio30gg);
    $totalmvisit[$nump] = $totalmvisit_sth->fetchrow_array();
}

close (INPUT);

Test. Log. Check in.

Remove Redundancy

With the SQL in a more readable format, some commonalities become clear.

  • All the statements are doing a count(*).
  • They're all using the lognew table
  • They're all looking for a certain pageid.

In fact, $totalvisittoday_sth and $totalyvisit_sth are exactly the same! Let's eliminate one of them, doesn't matter which, we're going to rename them in a moment anyway. $totalyvisit_sth goes away, making sure to change all references to it to $totalvisittoday_sth.

my $totalvisit_sth = $dbh->prepare(<<'SQL');
SELECT count(*)
FROM   lognew 
WHERE  pageid =  ? AND 
       data   >= ?
SQL

my $totalvisittoday_sth = $dbh->prepare(<<'SQL');
SELECT count(*) 
FROM   lognew 
WHERE  pageid = ? AND 
       data   = ?
SQL

my $totalmvisit_sth = $dbh->prepare(<<'SQL');
SELECT count(*) 
FROM   lognew 
WHERE  pageid =  ? AND
       data   <= ? AND 
       data   >= ?
SQL

open (INPUT, "< $filepageid") || &file_open_error("$filepageid");

while ($riga=<INPUT>){
    $nump++;
    chop($riga);
    $pagina[$nump] = $riga;

    $totalvisit_sth->execute($pagina[$nump], $startdate);
    $totalvisit[$nump] = $totalvisit_sth->fetchrow_array();

    $totalvisittoday_sth->execute($pagina[$nump], $dataoggi);
    $totalvisittoday[$nump] = $totalvisittoday_sth->fetchrow_array();

    $totalvisittoday_sth->execute($pagina[$nump], $dataieri);
    $totalyvisit[$nump] = $totalvisittoday_sth->fetchrow_array();

    $totalmvisit_sth->execute($pagina[$nump], $fine30gg, $inizio30gg);
    $totalmvisit[$nump] = $totalmvisit_sth->fetchrow_array();
}

close (INPUT);

Test. Log. Check in.

Pages: 1, 2, 3, 4, 5

Next Pagearrow