I'm trying to parse a large XML file. I read it using XML::SAX (with the Expat parser, not the pure-Perl implementation) and put all the second-level and lower nodes into my "Node" class:

package Node;
use Moose;

has "name" => (
    isa    => "Str",
    reader => 'getName',
);

has "text" => (
    is  => "rw",
    isa => "Str",
);

has "attrs" => (
    is  => "rw",
    isa => "HashRef[Str]",
);

has "subNodes" => (
    is      => "rw",
    isa     => "ArrayRef[Node]",
    default => sub { [] },
);

# Return the single sub-node with the given name, or undef if there
# are zero or multiple matches.
sub subNode
{
    my ($self, $name) = @_;
    my $subNodeRef = $self->subNodes;
    my @matchingSubnodes = grep { $_->getName eq $name } @$subNodeRef;

    return $matchingSubnodes[0] if scalar(@matchingSubnodes) == 1;
    return undef;
}

1;

In the "end_element" sub, I check if this is a node I care about, and if it is, I do some further processing.

This all worked fine on my test files, but the day before yesterday I threw it at my real file, all 13 million lines of it, and it's taking forever. It's been running for over 36 hours. How do I tell if it's Moose or XML::SAX that's the bottleneck? Is Moose always this slow, or am I using it wrong?

Update: Profiling a 20,000-line subset of the data shows that Moose is the bottleneck, specifically Class::MOP::Class::compute_all_applicable_attributes (13.9%) and other Class::MOP and Moose internals.

+20  A: 

While Moose does quite a lot of work at startup, which can make it appear a little slow, the code it generates, especially things like attribute accessors, is generally quite a bit faster than what the average Perl programmer would write by hand. So given that your process runs for quite a long time, I doubt any overhead induced by Moose will be relevant.

However, from the code you've shown, I can't really tell what your bottleneck is, even though I firmly believe it isn't Moose. I also want to point out that calling __PACKAGE__->meta->make_immutable, to state that your class is now "finalised", allows Moose to do some further optimisations, but I still doubt that's what's causing you trouble.
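For example, it goes at the bottom of the class, just before the final 1;:

__PACKAGE__->meta->make_immutable;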

How about taking a smaller sample of your data, so your program finishes in a reasonable time, and looking at it in a profiler such as Devel::NYTProf? That will tell you exactly where the time in your program is spent, so you can optimise specifically those parts for the greatest possible gain.
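Running it under NYTProf is just (script and sample file names are placeholders):

perl -d:NYTProf yourscript.pl sample.xml
nytprofhtml    # writes an HTML report to ./nytprof/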

One possibility is that the type constraints you're using slow things down quite a bit. Validating instance attributes this thoroughly on every single write access (and on class instantiation) isn't something most programmers usually do. You could try simpler constraints, such as ArrayRef instead of ArrayRef[Node], if you're confident enough in the validity of your data. That way only the type of the attribute value itself is checked, not the type of every element in that array reference.
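For example, the subNodes declaration from your class would become:

has "subNodes" => (
    is      => "rw",
    isa     => "ArrayRef",   # was ArrayRef[Node]; skips the per-element Node check
    default => sub { [] },
);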

But still, profile your code. Don't guess.

rafl
+1 For profiling suggestion
T.E.D.
Is NYTProf that much better than DProf?
Paul Tomblin
No, much more than that.
rafl
The `__PACKAGE__->meta->make_immutable` thing made my subset go from 11 seconds down to 6 seconds. Now the profiler says it's spending 26% of its time in `Moose::Meta::TypeConstraint::ArrayRef[Node]` so I'm going to try your suggestion to loosen the constraint on that next.
Paul Tomblin
Loosening the constraints didn't have any appreciable difference - my 20,000 line test file went from 6.4 seconds to 6.3 seconds.
Paul Tomblin
@Paul: what does the time breakdown look like now? It should be mostly in XML processing, not in Moose innards.
Ether
`dprofpp -I -u` shows 64% in XML::SAX::Expat::_handle_start, of which 51% is in EADHandler::start_element (where I "new" the Node class), and 26% is Moose::Meta::TypeConstraint::ArrayRef[Node].
Paul Tomblin
@Paul: how does the percentage change if you switch to using a type constraint of just `ArrayRef`?
Ether
@Ether, almost exactly the same run time, but now it's 43% in `XML::SAX::Expat::_handle_start`, 26.5% in `EADHandler::start_element` and only 16.8% in `Node::new`.
Paul Tomblin
+2  A: 

I have successfully written large XML processing apps using XML::Twig; a 745 MB file takes less than an hour to run on a reasonably sized box.
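A minimal sketch of that approach (the element name and file name are placeholders):

use XML::Twig;

my $twig = XML::Twig->new(
    twig_handlers => {
        # fires once for each completed <record> element
        record => sub {
            my ($twig, $elt) = @_;
            # ...process $elt here...
            $twig->purge;    # release everything parsed so far to keep memory flat
        },
    },
);
$twig->parsefile('big.xml');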

But as other users have already mentioned, you need to profile your code to figure out exactly what's causing the issue.

cfaulkingham
+2  A: 

I highly suspect that your speed problem is not in Moose so much as in memory allocation and disk swapping. Even without ->meta->make_immutable, based on your times for the 20K subset, your script should finish in about two hours ((11 * (13_000_000 / 20_000)) / 60 == ~119 min). With ->meta->make_immutable, that drops to approximately 65 min.

Try running your big script again and watch what your memory and swap are doing; I suspect you're giving your disk an awful thrashing.
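For instance, watching swap activity while it runs (any similar monitoring tool works):

vmstat 5    # watch the si/so columns for swap-in/swap-out activity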

Stevan Little
Munin says I was hardly swapping at all during the first 36 hour run. See http://xcski.com/munin/xcski.com/allhats2.xcski.com-cpu.html, http://xcski.com/munin/xcski.com/allhats2.xcski.com-memory.html and http://xcski.com/munin/xcski.com/allhats2.xcski.com-swap.html from the evening of 10Oct to the mid morning of 12Oct. You can see more swap in use during the second run (starting about 10am on 12Oct) after I did make_immutable, but that may be related to other things I was doing at the same time.
Paul Tomblin
Upvoted for being Stevan Little.
jrockway