My apologies if this is a duplicate; I may not know the proper terms to search for.

I am tasked with analyzing a Perl module file (.pm) that is a fragment of a larger application. Is there a tool, app, or script that will simply go through the code and pull out all the variable names, module names, and function calls? Even better would be something that identifies whether each one is declared within this file or is external.

Does such a tool exist? I have only the one file, so I can't execute it; what I'm after is basic static analysis, I guess.

+8  A: 

Check out Class::Sniff, which is new but well recommended.

From the docs:

use Class::Sniff;
my $sniff = Class::Sniff->new({class => 'Some::class'});

my $num_methods = $sniff->methods;
my $num_classes = $sniff->classes;
my @methods     = $sniff->methods;
my @classes     = $sniff->classes;

{
  my $graph    = $sniff->graph;   # Graph::Easy
  my $graphviz = $graph->as_graphviz();

  open my $DOT, '|dot -Tpng -o graph.png' or die("Cannot open pipe to dot: $!");
  print $DOT $graphviz;
}

print $sniff->to_string;
my @unreachable = $sniff->unreachable;
foreach my $method (@unreachable) {
    print "$method\n";
}

This will get you most of the way there. Some variables, depending on scope, may not be available.

Robert P
+1 Looks like I was beaten to the fastest gun award, and with a better answer. Curse my lack of CPAN knowledge!
Chris Lutz
Only reason I know was because I asked it a few months back :)
Robert P
Will this work on a file? The module in question (the one I have to analyze) has dependencies I can't satisfy, because I don't possess that code.
romandas
Good question. I haven't tried. At the very worst, you could create a pseudo-dependency that simply has the interface your file needs. Once the script compiles, Class::Sniff (or any of the other methods here) will work fine without your dependency.
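A minimal sketch of that pseudo-dependency idea, assuming the file under analysis says `use Missing::Dep qw(lookup);` (the package name and the `lookup` function are hypothetical stand-ins for whatever your file actually needs):

```perl
use strict;
use warnings;

# Hypothetical stand-in for a dependency we do not possess.
# Registering it in %INC makes a later `use Missing::Dep` skip the
# file lookup instead of dying with "Can't locate Missing/Dep.pm".
BEGIN {
    package Missing::Dep;
    $INC{'Missing/Dep.pm'} = __FILE__;
    sub import { }           # swallow any import list
    sub lookup { return }    # stub: right name, no behaviour
}

use Missing::Dep qw(lookup);    # now compiles without the real module

print "stub in place\n" if Missing::Dep->can('lookup');
```

The stub only has to satisfy the compiler, so empty subs are enough as long as nothing actually runs them.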
Robert P
+2  A: 

There are better answers to this question, but they aren't getting posted, so I'll claim the fastest gun in the West and go ahead and post a 'quick-fix'.

Such a tool exists, in fact, and is built into Perl. You can access the symbol table for any namespace by using a special hash variable. To access the main namespace (the default one):

for(keys %main::) { # alternatively %::
  print "$_\n";
}

If your module file is My/Package.pm, and its code is thus in the namespace My::Package, you would change %main:: to %My::Package:: to achieve the same effect. See the section on symbol tables in perldoc perlmod: it explains how they work and lists a few alternatives that may be better, or at least get you started on finding the right module for the job (that's the Perl motto - There's More Than One Module To Do It).
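Here is a rough sketch of walking a package's symbol table and reporting which slots (scalar, array, hash, code) each name has filled. The package My::Package and everything in it are made up for the example, and the package has to be loaded already for this to see anything:

```perl
use strict;
use warnings;

# Hypothetical package standing in for the module being inspected.
package My::Package {
    our $VERSION = '0.01';
    our @ITEMS   = (1, 2, 3);
    our %CONFIG  = (debug => 0);
    sub helper { return 42 }
}

# Map each symbol name in a package to the glob slots it populates.
sub list_symbols {
    my ($package) = @_;
    my %found;
    no strict 'refs';    # symbol-table access needs symbolic references
    for my $name (sort keys %{"${package}::"}) {
        next if $name =~ /::\z/;    # skip nested namespaces
        my $glob = \*{"${package}::${name}"};
        my @kinds;
        push @kinds, 'scalar' if defined ${ *{$glob}{SCALAR} };
        push @kinds, 'array'  if *{$glob}{ARRAY};
        push @kinds, 'hash'   if *{$glob}{HASH};
        push @kinds, 'code'   if *{$glob}{CODE};
        $found{$name} = \@kinds;
    }
    return %found;
}

my %symbols = list_symbols('My::Package');
printf "%-10s %s\n", $_, join(', ', @{ $symbols{$_} }) for sort keys %symbols;
```

Only package variables and named subs show up here; anything declared with `my` never reaches the symbol table.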

Chris Lutz
Wouldn't I need to load the module to use this?
romandas
If you want to do this without loading the module, you're probably going to have to use good old-fashioned `grep`. Or `ack`, a vastly extended rewrite of `grep` in Perl, using Perl regexes and having numerous improved features.
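For what it's worth, a grep-style pass can be sketched in Perl itself. This is a crude line-by-line regex scan, not a parser: strings, POD, and heredocs can produce false hits, `my ($x, $y)` lists are missed, and pragmas such as strict are reported as modules. The function name scan_source and the sample source are invented for the example:

```perl
use strict;
use warnings;

# Crude scan of Perl source text for sub names, used modules, and
# simple `my` declarations. Nothing is loaded or executed.
sub scan_source {
    my ($source) = @_;
    my (%subs, %modules, %lexicals);
    for my $line (split /\n/, $source) {
        next if $line =~ /^\s*#/;    # skip whole-line comments
        $subs{$1}++    if $line =~ /^\s*sub\s+(\w+)/;
        $modules{$1}++ if $line =~ /^\s*(?:use|require)\s+([A-Za-z_][\w:]*)/;
        $lexicals{$1}++ while $line =~ /\bmy\s+([\$\@%]\w+)/g;
    }
    return {
        subs     => [ sort keys %subs ],
        modules  => [ sort keys %modules ],
        lexicals => [ sort keys %lexicals ],
    };
}

# Made-up sample; in practice, slurp the .pm file into a string instead.
my $sample = <<'END_SOURCE';
package Demo;
use strict;
use List::Util qw(sum);
sub total {
    my @values = @_;
    my $sum = sum(@values);
    return $sum;
}
1;
END_SOURCE

my $report = scan_source($sample);
print "$_: @{ $report->{$_} }\n" for qw(subs modules lexicals);
```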
Chris Lutz
`%main::` will only have the package variables; lexical variables (i.e. the ones created with `my`) are not stored in it.
Chas. Owens
@Chas. - I'd be moderately scared of any task that required knowing every local loop variable. How would you tell apart different lexical variables with the same name?
Chris Lutz
Not all lexical variables are loop variables, obviously. There are many variables in this module declared with 'my'.
romandas
@Chris Lutz by where they appear in the source code (information which PPI provides).
Sinan Ünür
I know not all lexical variables are loop variables, I was just saying it could be hairy. But apparently not, with Sinan's input. I really need to get more CPAN modules...
Chris Lutz
You have to do a lot more work to distinguish which types are defined in each symbol. This isn't the way to go unless you want to do a lot of work yourself. Even then, you still have to compile the code to do it this way.
brian d foy
+7  A: 

Another CPAN tool available is Class::Inspector:

use Class::Inspector;

# Is a class installed and/or loaded
Class::Inspector->installed( 'Foo::Class' );
Class::Inspector->loaded( 'Foo::Class' );

# Filename related information
Class::Inspector->filename( 'Foo::Class' );
Class::Inspector->resolved_filename( 'Foo::Class' );

# Get subroutine related information
Class::Inspector->functions( 'Foo::Class' );
Class::Inspector->function_refs( 'Foo::Class' );
Class::Inspector->function_exists( 'Foo::Class', 'bar' );
Class::Inspector->methods( 'Foo::Class', 'full', 'public' );

# Find all loaded subclasses or something
Class::Inspector->subclasses( 'Foo::Class' );

This will give you similar results to Class::Sniff; you may still have to do some processing on your own.
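As an example of that extra processing, here is a sketch that uses the 'expanded' option of methods() to tell locally defined methods from inherited ones. Base::Thing and the methods in both classes are invented for the example, and the classes must already be loaded for this to work:

```perl
use strict;
use warnings;
use Class::Inspector;

# Hypothetical classes defined inline so the example is self-contained.
package Base::Thing { sub new { bless {}, shift } sub describe { 'base' } }
package Foo::Class  { our @ISA = ('Base::Thing'); sub describe { 'foo' } sub bar { 1 } }

# 'expanded' returns one entry per method:
# [ full name, owning class, method name, code ref ]
my $expanded = Class::Inspector->methods('Foo::Class', 'expanded')
    or die "Foo::Class does not appear to be loaded";

my %owner = map { $_->[2] => $_->[1] } @$expanded;
for my $method (sort keys %owner) {
    my $origin = $owner{$method} eq 'Foo::Class'
        ? 'defined locally'
        : "inherited from $owner{$method}";
    print "$method: $origin\n";
}
```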

Robert P
+6  A: 

If I understand correctly, you are looking for a tool to go through Perl source code. I am going to suggest PPI.

Here is an example cobbled together from the docs:

#!/usr/bin/perl

use strict;
use warnings;

use PPI::Document;
use HTML::Template;

my $Module = PPI::Document->new( $INC{'HTML/Template.pm'} );

my $sub_nodes = $Module->find(
    sub { $_[1]->isa('PPI::Statement::Sub') and $_[1]->name }
);

my @sub_names = map { $_->name } @$sub_nodes;

use Data::Dumper;
print Dumper \@sub_names;

Note that this will output:

     ...
     'new',
     'new',
     'new',
     'output',
     'new',
     'new',
     'new',
     'new',
     'new',
     ...

because multiple classes are defined in HTML/Template.pm. Clearly, a less naive approach would work with the PDOM tree in a hierarchical way.
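One way to sketch a slightly less naive version is to pick up the package statements along with the subs and keep a running "current package" while going through them in document order. The two-package source below is invented for the example, and a package scoped inside a block would still need a real walk of the tree:

```perl
use strict;
use warnings;
use PPI;

# Made-up source with two packages in one file.
my $source = <<'END_SOURCE';
package Alpha;
sub new { bless {}, shift }
sub output { 'alpha' }
package Beta;
sub new { bless {}, shift }
END_SOURCE

my $doc = PPI::Document->new(\$source)
    or die "Could not parse source";

# Collect package statements and named subs in document order.
my $nodes = $doc->find(sub {
    $_[1]->isa('PPI::Statement::Package') or $_[1]->isa('PPI::Statement::Sub')
}) || [];

my $package = 'main';
my @qualified;
for my $node (@$nodes) {
    if ($node->isa('PPI::Statement::Package')) {
        $package = $node->namespace;    # switch the current package
        next;
    }
    next unless $node->name;
    push @qualified, $package . '::' . $node->name;
}

print "$_\n" for @qualified;
```

Because the source is passed as a scalar reference, nothing is loaded or executed; to parse a file on disk, pass its filename to PPI::Document->new instead.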

Sinan Ünür
+2  A: 

If you want to do it without executing any of the code that you are analyzing, it's fairly easy to do this with PPI. Check out my Module::Use::Extract; it's a short bit of code that shows you how to extract any sort of element you want from PPI's PerlDOM.

If you want to do it with code that you have already compiled, the other suggestions in the answers are better.
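To give a flavor of the PPI route, here is a small sketch (not taken from Module::Use::Extract) that lists the modules a file pulls in and the variables it declares, without loading anything. The sample source is invented, and picking out function calls would take more work than is shown here:

```perl
use strict;
use warnings;
use PPI;

# Made-up source; to analyze a real file, pass its filename to new().
my $source = <<'END_SOURCE';
package Demo;
use strict;
use List::Util qw(sum);
our $VERSION = '0.01';
sub total {
    my ($first, @rest) = @_;
    return sum($first, @rest);
}
1;
END_SOURCE

my $doc = PPI::Document->new(\$source)
    or die "Could not parse source";

# Modules and pragmas named in use/require/no statements.
my @modules = grep { $_ }
              map  { $_->module }
              @{ $doc->find('PPI::Statement::Include') || [] };

# Variables declared with my/our/local/state, with where they appear.
my @declared;
for my $stmt (@{ $doc->find('PPI::Statement::Variable') || [] }) {
    push @declared, [ $stmt->type, $_, $stmt->line_number ]
        for $stmt->variables;
}

print "uses $_\n" for @modules;
printf "%s %s at line %d\n", @$_ for @declared;
```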

brian d foy
A: 

I found a pretty good answer to what I was looking for in this column by Randal Schwartz. He demonstrated using the B::Xref module to extract exactly the information I was looking for. Just replacing the evaluated one-liner he used with the module's filename worked like a champ, and apparently B::Xref comes with ActiveState Perl, so I didn't need any additional modules.

perl -MO=Xref module.pm
romandas