I am new to this and need a pointer on how to approach this task. I have a CSV file with the following sample data:

site,type,2009-01-01,2009-01-02,....
X,A,12,10,...
X,B,10,23,...
Y,A,20,33,...
Y,B,3,12,...

and so on....

I want to create a Perl script that reads data from the CSV file according to user input and creates XY (scatter) charts. For example, to create a chart for date 2009-01-01 and type B, the user would enter something like "2009-01-01 B", and the chart would be created with the matching values from the CSV file.

Can anyone suggest some code to start with?

+6  A: 

Don't start with code. Start with CPAN.

CSV and Scatter

David Dorward
+3  A: 

Here you go, some code to start with:

#!/usr/bin/perl -w
use strict;

use Text::CSV;
use GD;
use Getopt::Long;

Instead of GD you can, of course, use whatever module you'd like.
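
One way the `Getopt::Long` import above might be used is to take the date and type as named options rather than as a single "2009-01-01 B" string. The sketch below is only an illustration under assumptions: the option names `--date` and `--type`, the helper `parse_args`, and the file name `data.csv` are all hypothetical and do not come from the question. It relies on `GetOptionsFromArray`, which parses an array other than `@ARGV`.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: pull --date and --type out of an argument list
# and return them, plus the first leftover argument, in a hash reference.
sub parse_args {
    my @args = @_;
    my %opt;
    GetOptionsFromArray( \@args, \%opt, 'date=s', 'type=s' )
        or die "Usage: $0 --date YYYY-MM-DD --type TYPE file.csv\n";
    for my $name (qw(date type)) {
        die "Missing --$name\n" unless defined $opt{$name};
    }
    # Assumed format check: dates look like the CSV column headers.
    die "Date must look like YYYY-MM-DD\n"
        unless $opt{date} =~ /\A\d{4}-\d{2}-\d{2}\z/;
    $opt{file} = shift @args;   # first non-option argument, if any
    return \%opt;
}

# In a real script this would be parse_args(@ARGV).
my $opt = parse_args( '--date', '2009-01-01', '--type', 'B', 'data.csv' );
print "date=$opt->{date} type=$opt->{type} file=$opt->{file}\n";
```

Named options make the argument order irrelevant and give a natural place to validate the date before touching the CSV file.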

depesz
GD::Graph does not produce true scatter plots (http://en.wikipedia.org/wiki/Scatter_plot). See http://search.cpan.org/perldoc/GD::Graph::Cartesian and http://search.cpan.org/perldoc/GD::Graph#Options_for_graphs_with_a_numerical_X_axis
Sinan Ünür
@Sinan: true, fixed. GD can draw any kind of graph :)
depesz
http://search.cpan.org/perldoc/GD::Graph::Cartesian would be more appropriate for scatter plots.
Sinan Ünür
+2  A: 
Sinan Ünür
+1  A: 

I need to make some scatter plots of my own, so I played around with the module suggested in the other answers. For my taste, the data points produced by GD::Graph::Cartesian are far too large, and the module provides no methods to control this parameter, so I hacked my copy of Cartesian.pm (search for iconsize if you want to do the same).

use strict;
use warnings;
use Text::CSV;
use GD::Graph::Cartesian;

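# Parse CSV file and convert the data for the
# requested $type and $date into a list of [X,Y] pairs.
my ($csv_file, $type, $date) = @ARGV;
die "Usage: $0 csv_file type date\n" unless defined $date;
my @xy_points;
my %i = ( X => -1, Y => -1 ); # Index of the last point filled in for each site.
open(my $csv_fh, '<', $csv_file) or die "Cannot open $csv_file: $!";
my $parser = Text::CSV->new();
$parser->column_names( $parser->getline($csv_fh) ); # Header row gives the hash keys.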
while ( defined( my $hr = $parser->getline_hr($csv_fh) ) ){
    next unless $hr->{type} eq $type;
    my $xy = $hr->{site};
    $xy_points[++ $i{$xy}][$xy eq 'X' ? 0 : 1] = $hr->{$date};
}

# Make a graph.
my $graph = GD::Graph::Cartesian->new(
    width   => 400, # Image size (in pixels, not X-Y coordinates).
    height  => 400,
    borderx => 20,  # Margins (also pixels).
    bordery => 20,
    strings => [[ 20, 50, 'Graph title' ]],
    lines => [
        [ 0,0, 50,0 ], # Draw an X axis.
        [ 0,0,  0,50], # Draw a Y axis.
    ],
    points => \@xy_points, # The data.
);
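open(my $png_file, '>', 'some_data.png') or die "Cannot write some_data.png: $!";
binmode $png_file; # PNG data is binary.
print $png_file $graph->draw;
close $png_file or die "Cannot close some_data.png: $!";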
FM
Can you please explain this part a bit more: "so I hacked my copy of Cartesian.pm (search for iconsize if you want to do the same)"? Thanks.
Space
@Virus After you install `GD::Graph::Cartesian`, find the module (on my system it was in `perl/site/lib/GD/Graph/Cartesian.pm`) and search for `iconsize`. You'll see that the default size is 7. You can just change the default to some other value. Even better would be to modify the `initialize()` method to allow different sizes to be specified when you call `new()`.
FM
Thanks FM. Could you please also explain this line of code: `$xy_points[++ $i{$xy}][$xy eq 'X' ? 0 : 1] = $hr->{$date};`
Space
@Virus The `@xy_points` array is a list of points to be graphed, with each point stored as an array reference: `[X, Y]` (the X coordinate for the 1st point would be stored in `$xy_points[0][0]` and the Y coordinate in `$xy_points[0][1]`). If you're still having trouble, step through the code using Perl's debugger. Also, you can print out the `@xy_points` data structure using the `Data::Dumper` module. If you don't understand how to work with complex data structures in Perl, look here: http://perldoc.perl.org/index-tutorials.html (especially `perlreftut`, `perldsc`, and `perllol`).
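
To make the `[X, Y]` layout described above concrete, here is a minimal sketch that unpacks the one-line assignment into separate steps and prints the result with `Data::Dumper`. It runs on in-memory rows instead of a CSV file. The first X/Y pair uses the type B values from the question's sample data; the second pair is made up for illustration, and the names `@rows` and `%next_index` are hypothetical. It counts from 0 with a post-increment rather than from -1 with a pre-increment, which should fill the same slots.

```perl
use strict;
use warnings;
use Data::Dumper;

# Rows shaped like the hash references that getline_hr returns.
my @rows = (
    { site => 'X', type => 'B', '2009-01-01' => 10 },
    { site => 'Y', type => 'B', '2009-01-01' => 3  },
    { site => 'X', type => 'B', '2009-01-01' => 7  },  # made-up row
    { site => 'Y', type => 'B', '2009-01-01' => 15 },  # made-up row
);

my $date = '2009-01-01';
my @xy_points;
my %next_index = ( X => 0, Y => 0 );  # next free point slot for each site

for my $hr (@rows) {
    my $site   = $hr->{site};
    my $point  = $next_index{$site}++;   # which point this value belongs to
    my $column = $site eq 'X' ? 0 : 1;   # site X fills slot 0, site Y fills slot 1
    $xy_points[$point][$column] = $hr->{$date};
}

# Show the nested structure: one [X, Y] array reference per point.
$Data::Dumper::Indent = 1;
print Dumper( \@xy_points );
```

Each site keeps its own counter, so the nth X row and the nth Y row end up in the same point even if the rows are interleaved differently in the file.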
FM