views:

338

answers:

5

There's a lot of buzz about MooseX::Method::Signatures and even before that, modules such as Params::Validate that are designed to type check every argument to methods or functions. I'm considering using the former for all my future Perl code, both personal and at my place of work. But I'm not sure if it's worth the effort.

I'm thinking of all the Perl code I've seen (and written) before that performs no such checking. I very rarely see a module do this:

my ($a, $b) = @_;
defined $a or croak '$a must be defined!';
!ref $a or croak '$a must be a scalar!";
...
@_ == 2 or croak "Too many arguments!";

Perhaps because it's simply too much work without some kind of helper module, but perhaps because in practice we don't send excess arguments to functions, and we don't send arrayrefs to methods that expect scalars - or if we do, we have use warnings; and we quickly hear about it - a duck typing approach.

So is Perl type checking worth the performance hit, or are its strengths predominantly shown in compiled, strongly typed languages such as C or Java?

I'm interested in answers from anyone who has experience writing Perl that uses these modules and has seen benefits (or not) from their use; if your company/project has any policies relating to type checking; and any problems with type checking and performance.

UPDATE: I read an interesting article on the subject recently, called Strong Testing vs. Strong Typing. Ignoring the slight Python bias, it essentially states that type checking can be suffocating in some instances, and even if your program passes the type checks, it's no guarantee of correctness - proper tests are the only way to be sure.

+10  A: 

If it's important for you to check that an argument is exactly what you need, it's worth it. Performance only matters when you already have correct functioning. It doesn't matter how fast you can get a wrong answer or a core dump. :)

Now, that sounds like a stupid thing to say, but consider some cases where it isn't. Do I really care what's in @_ here?

sub looks_like_a_number { $_[0] !~ /\D/ }
sub is_a_dog            { eval { $_[0]->DOES( 'Dog' ) } }

In those two examples, if the argument isn't what you expect, you are still going to get the right answer because the invalid arguments won't pass the tests. Some people see that as ugly, and I can see their point, but I also think the alternative is ugly. Who wins?

However, there are going to be times that you need guard conditions because your situation isn't so simple. The next thing you have to pass your data to might expect them to be within certain ranges or of certain types and don't fail elegantly.

When I think about guard conditions, I think through what could happen if the inputs are bad and how much I care about the failure. I have to judge that by the demands of each situation. I know that sucks as an answer, but I tend to like it better than a bondage-and-discipline approach where you have to go through all the mess even when it doesn't matter.

I dread Params::Validate because its code is often longer than my subroutine. The Moose stuff is very attractive, but you have to realize that it's a way for you to declare what you want and you still get what you could build by hand (you just don't have to see it or do it). The biggest thing I hate about Perl is the lack of optional method signatures, and that's one of the most attractive features in Perl 6 as well as Moose.

brian d foy
+5  A: 

I basically concur with brian. How much you need to worry about your method's inputs depends heavily on how much you are concerned that a) someone will input bad data, and b) bad data will corrupt the purpose of the method. I would also add that there is a difference between external and internal methods. You need to be more diligent about public methods because you're making a promise to consumers of your class; conversely you can be less diligent about internal methods as you have greater (theoretical) control over the code that accesses it, and have only yourself to blame if things go wrong.

MooseX::Method::Signatures is an elegant solution to adding a simple declarative way to explain the parameters of a method. Method::Signatures::Simple and Params::Validate are nice but lack one of the features I find most appealing about Moose: the Type system. I have used MooseX::Declare and by extension MooseX::Method::Signatures for several projects and I find that the bar to writing the extra checks is so minimal it's almost seductive.

perigrin
+3  A: 

The counterargument I've seen presented to this is that checking parameters on every single function call is redundant and a waste of CPU time. This argument's supporters favor a model in which all incoming data is rigorously checked when it first enters the system, but internal methods have no parameter checks because they should only be called by code which will pass them data which has already passed the checks at the system's border, so it is assumed to still be valid.

In theory, I really like the sound of that, but I can also see how easily it can fall like a house of cards if someone uses the system (or the system needs to grow to allow use) in a way that was unforeseen when the initial validation border is established. All it takes is one external call to an internal function and all bets are off.

In practice, I'm using Moose at the moment and Moose doesn't really give you the option to bypass validation at the attribute level, plus MooseX::Declare handles and validates method parameters with less fuss than unrolling @_ by hand, so it's pretty much a moot point.

Dave Sherohman
+1 for Moose info
DVK
+1  A: 

Params::Validate works great,but of course checking args slows things down. Tests are mandatory(at least in the code I write).

xxxxxxx
+1  A: 

I want to mention two points here. The first are the tests, the second the performance question.

1) Tests

You mentioned that tests can do a lot and that tests are the only way to be sure that your code is correct. In general i would say this is absolutly correct. But tests itself only solves one problem.

If you write a module you have two problems or lets say two different people that uses your module.

You as a developer and a user that uses your module. Tests helps with the first that your module is correct and do the right thing, but it didn't help the user that just uses your module.

For the later, i have one example. i had written a module using Moose and some other stuff, my code ended always in a Segmentation fault. Then i began to debug my code and search for the problem. I spend around 4 hours of time to find the error. In the end the problem was that i have used Moose with the Array Trait. I used the "map" function and i didn't provide a subroutine function, just a string or something else.

Sure this was an absolutly stupid error of mine, but i spend a long time to debug it. In the end just a checking of the input that the argument is a subref would cost the developer 10 seconds of time, and would cost me and propably other a lot of more time.

I also know of other examples. I had written a REST Client to an interface completly OOP with Moose. In the end you always got back Objects, you can change the attributes but sure it didn't call the REST API for every change you did. Instead you change your values and in the end you call a update() method that transfers the data, and change the values.

Now i had a user that then wrote:

$obj->update({ foo => 'bar' })

Sure i got an error back, that update() does not work. But sure it didn't work, because the update() method didn't accept a hashref. It only does a synchronisation of the actual state of the object with the online service. The correct code would be.

$obj->foo('bar');
$obj->update();

The first thing works because i never did a checking of the arguments. And i don't throw an error if someone gives more arguments then i expect. The method just starts normal like.

sub update {
  my ( $self ) = @_;
  ...
}

Sure all my tests absolutely works 100% fine. But handling these errors that are not errors cost me time too. And it costs the user propably a lot of more time.

So in the end. Yes, tests are the only correct way to ensure that your code works correct. But that doesn't mean that type checking is meaningless. Type checking is there to help all your non-developers (on your module) to use your module correctly. And saves you and others time finding dump errors.

2) Performance

The short: You don't care for performance until you care.

That means until your module works to slow, Performance is always fast enough and you don't need to care for this. If your module really works to slow you need further investigations. But for these investigions you should use a profiler like Devel::NYTProf to look what is slow.

And i would say. In 99% slowliness is not because you do type checking, it is more your algorithm. You do a lot of computation, calling functions to often etc. Often it helps if you do completly other solutions use another better algorithm, do caching or something else, and the performance hit is not your type checking. But even if the checking is the performance hit. Then just remove it where it matters.

There is no reason to leave the type checking where performance don't matters. Do you think type checking does matter in a case like above? Where i have written a REST Client? 99% of performance issues here are the amount of request that goes to the webservice or the time for such an request. Don't using type checking or MooseX::Declare etc. would propably speed up absolutly nothing.

And even if you see performance disadvantages. Sometimes it is acceptable. Because the speed doesn't matter or sometimes something gives you a greater value. DBIx::Class is slower then pure SQL with DBI, but DBIx::Class gives you a lot for these.

Sid Burn