Hi folks,

I'm planning to move from Class::DBI to Rose::DB::Object because of its nice structure and the claim that RDBO is faster than CDBI and DBIC.

However, on my machine (Linux 2.6.9-89, Perl 5.8.9), RDBO's compile time is much slower than CDBI's:

$ time perl -MClass::DBI -e0
real    0m0.233s
user    0m0.208s
sys     0m0.024s

$ time perl -MRose::DB::Object -e0
real    0m1.178s
user    0m1.097s
sys     0m0.078s

That's a big difference...

Has anyone experienced similar behaviour?

Cheers.

+1  A: 

This looks almost as dramatic over here:

time perl -MClass::DBI -e0
real       0m0.084s
user       0m0.080s
sys        0m0.004s

time perl -MRose::DB::Object -e0
real       0m0.391s
user       0m0.356s
sys        0m0.036s

I'm afraid part of the difference can simply be explained by the number of dependencies in each module:

perl -MClass::DBI -le 'print scalar keys %INC'
46

perl -MRose::DB::Object -le 'print scalar keys %INC'
95
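To see where the time goes for a particular module, something along these lines may help. It is only a rough sketch using core modules: the `load_cost` helper is a name made up for this example, and the numbers it reports will depend on disk cache state and on what has already been loaded earlier in the same process.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# load_cost (hypothetical helper): load one module at run time and report
# how long that took and how many new files it added to %INC.
# Returns an empty list if the module cannot be loaded.
sub load_cost {
    my ($module) = @_;

    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm

    my %before = %INC;
    my $start  = [gettimeofday];
    my $ok     = eval { require $file; 1 };
    my $secs   = tv_interval($start);

    return unless $ok;

    my @new = grep { !exists $before{$_} } keys %INC;
    return ($secs, scalar @new);
}

# Usage: perl load_cost.pl Class::DBI
#        perl load_cost.pl Rose::DB::Object
for my $module (@ARGV) {
    my ($secs, $count) = load_cost($module);
    if (!defined $secs) {
        warn "$module: could not be loaded\n";
        next;
    }
    printf "%-25s %.3fs  %d new files in %%INC\n", $module, $secs, $count;
}
```

Run it once per module rather than passing both on the same command line, since modules loaded for the first one are not counted again for the second.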

Of course, you should ask yourself how much compilation time really matters for your particular problem, and which code base would be easier for you to maintain.

innaM
+5  A: 

Rose::DB::Object simply contains (or references from other modules) much more code than Class::DBI. On the bright side, it also has many more features and is much faster at runtime than Class::DBI. If compile time is a concern for you, then your best bet is to load as little code as possible (or get faster disks).
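One generic way to load as little code as possible is to defer loading with a run-time `require` inside the code path that needs the module, instead of a compile-time `use` at the top of the script. The sketch below is not specific to RDBO: the command names and the `run_command` helper are invented for illustration, and List::Util merely stands in for a heavy module such as one of your own object classes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each sub-command loads what it needs only when it is actually run,
# so a script invoked as "ping" never pays for the modules "sum" uses.
my %command = (
    # Cheap path: no extra modules at all.
    ping => sub { return 'pong' },

    # "Expensive" path: List::Util is a stand-in for a heavy module.
    # In a real script this might be: require My::DB::Product; (hypothetical)
    sum => sub {
        require List::Util;
        return List::Util::sum(@_);
    },
);

# Look up and run a sub-command; dies on an unknown name.
sub run_command {
    my ($name, @args) = @_;
    my $handler = $command{$name}
        or die "Unknown command: $name\n";
    return $handler->(@args);
}

# Usage: perl script.pl sum 1 2 3
print run_command(@ARGV ? @ARGV : 'ping'), "\n";
```

This only pays off when different invocations really do need different subsets of the code; if every run touches everything, the total load time is the same.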

Another option is to set auto_load_related_classes to false in your Metadata objects. To do this early enough and globally will probably require you to make a Metadata subclass and then set that as the meta_class in your common Rose::DB::Object base class.

Turning auto_load_related_classes off means that you'd have to manually load related classes that you actually want to use in your script. That's a bit of a pain, but it lets you control how many classes get loaded. (If you have heavily interrelated classes, loading a single one can end up pulling all the other ones in.)

You could, perhaps, have an environment variable to control the behavior. Example metadata class:

package My::DB::Object::Metadata;

use base 'Rose::DB::Object::Metadata';

# New class method to handle default
sub default_auto_load_related_classes
{
  return $ENV{'RDBO_AUTO_LOAD_RELATED_CLASSES'} ? 1 : 0;
}

# Override existing object method, honoring new class-defined default
sub auto_load_related_classes
{
  my($self) = shift;

  return $self->SUPER::auto_load_related_classes(@_)  if(@_);

  if(defined(my $value = $self->SUPER::auto_load_related_classes))
  {
    return $value;
  }

  # Initialize to default
  return $self->SUPER::auto_load_related_classes(ref($self)->default_auto_load_related_classes);
}

1;

And here's how it's tied to your common object base class:

package My::DB::Object;

use base 'Rose::DB::Object';

use My::DB::Object::Metadata;

sub meta_class { 'My::DB::Object::Metadata' }

1;

Then set RDBO_AUTO_LOAD_RELATED_CLASSES to true when you're running in a persistent environment, and leave it false (and don't forget to explicitly load related classes) for command-line scripts.

Again, this will only help if you're currently loading more classes than you strictly need in a particular script due to the default true value of the auto_load_related_classes Metadata attribute.

John Siracusa
Sorry for the long delay in responding... I had to finish the implementation first, and now that it is done and all tests pass, I'm returning to performance optimization. However, following your suggestion above does not improve the compile time, and the number of modules loaded in %INC stays the same. The documentation for "auto_load_related_classes" says that related classes will be automatically loaded when this class is initialized; I assume this happens when I call "sub init_db { My::DB->new() }" via My::DB::Object?
est
No, it means that if you load class A and it has a relationship to class B, class B will automatically be loaded if auto_load_related_classes() is true. If you're already manually loading both classes A and B, then obviously this setting won't have any effect.
John Siracusa
Ahh, that starts to make sense now. I guess going to a faster disk will be the way to go for my case here...
est
+3  A: 

If compile time is an issue, there are ways to lessen the impact. One is PPerl, which turns a normal Perl script into a daemon that is compiled only once. The only change you need to make (after installing it, of course) is to the shebang line:

#!/usr/bin/pperl

Another option is to write a client/server program in which the bulk of the work is done by a server that loads the expensive modules once, while a thin script just interacts with the server over sockets or pipes.
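A minimal sketch of that client/server idea, using only core modules and a Unix domain socket, might look like the following. Everything named here is an assumption made for the example: the socket path, the `sync-orders` request, and the `handle_request`, `run_server`, and `send_request` helpers. In practice the server and client would be two separate scripts; they are combined here only so the demo runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::UNIX;

# Stand-in for the real work; the server would call into the
# expensive, already-loaded modules here.
sub handle_request {
    my ($request) = @_;
    return "done: $request";
}

# Server: load the heavy modules once (not shown), then answer
# one-line requests until killed. Requests are handled one at a time.
sub run_server {
    my ($path) = @_;
    unlink $path;    # remove a stale socket file, if any
    my $listener = IO::Socket::UNIX->new(
        Type   => SOCK_STREAM(),
        Local  => $path,
        Listen => 5,
    ) or die "Cannot listen on $path: $!";

    while (my $conn = $listener->accept) {
        my $request = <$conn>;
        next unless defined $request;    # client hung up early
        chomp $request;
        print {$conn} handle_request($request), "\n";
        close $conn;
    }
}

# Client: the thin script that cron would run. It loads almost nothing.
sub send_request {
    my ($path, $request) = @_;
    my $sock = IO::Socket::UNIX->new(
        Type => SOCK_STREAM(),
        Peer => $path,
    ) or die "Cannot connect to $path: $!";

    print {$sock} "$request\n";
    my $reply = <$sock>;
    die "No reply from server\n" unless defined $reply;
    chomp $reply;
    return $reply;
}

# Demo: fork a server, send one request, then shut the server down.
unless (caller) {
    my $path = "/tmp/demo-server-$$.sock";    # hypothetical path
    my $pid  = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) { run_server($path); exit 0 }

    my $reply;
    for (1 .. 50) {    # retry until the server is listening
        $reply = eval { send_request($path, 'sync-orders') };
        last if defined $reply;
        select(undef, undef, undef, 0.1);
    }
    print defined $reply ? "$reply\n" : "server did not answer\n";

    kill 'TERM', $pid;
    waitpid $pid, 0;
    unlink $path;
}
```

Because this server handles one connection at a time, simultaneous clients simply queue up; if jobs must run in parallel, the server could fork a child per connection after `accept`.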

You should also look at App::Persistent and this article, both of which were written by Jonathan Rockway (aka jrockway).

Chas. Owens
A: 

@manni and @john: thanks for the explanation about the modules referenced by RDBO; that explains why its compile time is slower than CDBI's.

The application is not running in a persistent environment. In fact, it's invoked by several simultaneous cron jobs that run at 2-minute, 5-minute, and x-minute intervals, so yes, compile time is crucial here...

Jonathan Rockway's App::Persistent seems interesting, but its (current) limitation of allowing only one application to run at a time makes it unsuitable for my purpose. It also has an issue where the server process keeps running after the client is killed...

est