Does anyone know how to prepare data for plotting a CDF? I have a bunch of floating-point numbers. I was planning to use gnuplot, and at first glance the Statistics::Descriptive module seemed like the best fit, but it looks like I might need some help here.

+3  A: 

Your question is a bit vague, but this might get you started:

use strict;
use warnings;

use Statistics::Descriptive;
my $stat = Statistics::Descriptive::Full->new;

# Generate some data.
my @data = map { rand 100 } 1 .. 10000;
$stat->add_data(@data);

# Put the data into a frequency distribution with 10 bins.
# The distribution will be represented as a hash, where a hash
# key represents the max value within a bin and the hash value
# is the frequency count for that bin (I'm fudging this a bit;
# see the documentation for more accurate details).
my $n_bins = 10;
my %dist = $stat->frequency_distribution($n_bins);
my @bin_maxes = sort {$a <=> $b} keys %dist;

# Check it out: print each bin's key and its frequency count.
for my $m (@bin_maxes) {
    printf "%6.3f %4d\n", $m, $dist{$m};
}
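
The snippet above gives per-bin counts rather than cumulative values. If what you want is an empirical CDF, one possible approach (a sketch, not taken from the module's documentation) is to skip the binning, sort the raw values, and pair each value with the fraction of points at or below it. The helper name empirical_cdf is made up for illustration, and the random data is only a stand-in for your own numbers:

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of numbers and returns a list of
# [value, cumulative fraction] pairs, sorted by value.
sub empirical_cdf {
    my @sorted = sort { $a <=> $b } @_;
    my $n = @sorted;
    # An empty input yields an empty list, so there is no division by zero.
    return map { [ $sorted[$_], ($_ + 1) / $n ] } 0 .. $#sorted;
}

# Stand-in data, generated the same way as in the snippet above.
my @data = map { rand 100 } 1 .. 10000;

# One "x F(x)" pair per line, a layout gnuplot can read directly.
printf "%.6f %.6f\n", @$_ for empirical_cdf(@data);
```

If the output is redirected to a file (say cdf.dat, a name assumed here), something like plot 'cdf.dat' using 1:2 with steps in gnuplot should draw it as a step function.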
FM
Thanks... Exactly what I was looking for...
Legend