Is there a package on CPAN that converts strings such as:

my $string = "54.4M";
my $string2 = "3.2G";

into the actual number in bytes:

54,400,000
3,200,000,000

And vice versa.

Ultimately, what I want to do is sum up all the memory sizes.

+4  A: 

For your first problem, I did not find a CPAN package, but this code snippet might do:

sub convert_human_size {
    my $size = shift;
    my @suffixes = ('', qw(k m g));
    for my $index (0..$#suffixes) {
        my $suffix = $suffixes[$index];
        if ( $size =~ /^([\d.]+)$suffix\z/i ) {
            return int($1 * (1024 ** $index));
        }
    }
    # No match
    die "Didn't understand human-readable file size '$size'";  # or croak
}

Pass the number through Number::Format's format_number function if you'd like thousands separators (e.g. "5,124" instead of "5124").

CPAN solves the second part of your problem:

Number::Bytes::Human

For example:

  use Number::Bytes::Human qw(format_bytes);
  $size = format_bytes(54_400_000);

You may provide an optional bs => 1000 parameter to change the base of the conversion to 1000 instead of 1024.
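To make the effect of the base concrete, here is a minimal core-Perl sketch of the bytes-to-string direction. It is not Number::Bytes::Human's implementation and its rounding may differ from the module's; human_size is a hypothetical helper name, and the one-decimal format is my own choice.

```perl
use strict;
use warnings;

# human_size is a hypothetical helper, not part of Number::Bytes::Human.
# $base defaults to 1024; pass 1000 for decimal units.
sub human_size {
    my ($bytes, $base) = @_;
    $base ||= 1024;
    my @suffixes = ('', qw(K M G T));
    my $index = 0;
    # Divide by the base until the value fits under the current suffix.
    while ( $bytes >= $base && $index < $#suffixes ) {
        $bytes /= $base;
        $index++;
    }
    return sprintf('%.1f%s', $bytes, $suffixes[$index]);
}

print human_size(54_400_000, 1000), "\n";    # 54.4M (decimal units)
print human_size(54_400_000), "\n";          # 51.9M (binary units)
```

The same byte count reads as a smaller number in binary units, which is why the choice of base matters when round-tripping values.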

rjh
Sounds like you should package the first case as Number::Bytes::Inhuman :-)
justintime
+1  A: 

Basically, to go from strings to numbers, all you need is a hash mapping units to multipliers:

#!/usr/bin/perl

use strict; use warnings;
my $base = 1000;

my %units = (
    K => $base,
    M => $base ** 2,
    G => $base ** 3,
    # etc
);

my @strings = qw( 54.4M 3.2G 1K 0.1M .);
my $pattern = join('|', sort keys %units);

my $total = 0;

for my $string ( @strings ) {
    while ( $string =~ /(([0-9]*(?:\.[0-9]+)?)($pattern))/g ) {
        my $number = $2 * $units{$3};
        $total += $number;
        printf "%12s = %12.0f\n", $1, $number;
    }
}

printf "Total %.0f bytes\n", $total;

Output:

       54.4M =     54400000
        3.2G =   3200000000
          1K =         1000
        0.1M =       100000
Total 3254501000 bytes
Sinan Ünür
Are you sure you would use powers of 10?
Alex Reynolds
@Alex I would not. IMNSHO, there is only one true measure of a kilobyte and that is 1024 bytes. I don't need no `KiB` nonsense ;-) However, the OP is using powers of 10 so I left it that way.
Sinan Ünür
A small improvement is to add a $base variable so you can set it once and still have the units all work out. :)
brian d foy
+1  A: 

This should get you started. You could add other factors, like kilobytes ("K") on your own, as well as formatting of output (comma separators, for example):

#!/usr/bin/perl -w

use strict;
use POSIX qw(floor);

my $string = "54.4M";

if ( $string =~ m/(\d*)\.(\d+)([MG])/ ) {
    my $mantissa = "$1.$2";
    if ( $3 eq "M" ) {
        $mantissa *= (2 ** 20);
    }
    elsif ( $3 eq "G" ) {
        $mantissa *= (2 ** 30);
    }
    print "$string = ".floor($mantissa)." bytes\n";
}

Output:

54.4M = 57042534 bytes
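As a rough sketch of those two extensions (more units and comma separators), something like the following might do. The names to_bytes and commify are hypothetical, the factor table keeps the 1024-based units used above, and commify only handles whole numbers.

```perl
use strict;
use warnings;

# Unit factors in one table, so adding a unit is a one-line change.
my %factor = ( K => 2**10, M => 2**20, G => 2**30 );

# to_bytes is a hypothetical helper name for this sketch.
sub to_bytes {
    my $string = shift;
    $string =~ /^(\d*\.?\d+)([KMG])$/
        or die "Didn't understand size '$string'";
    return int( $1 * $factor{$2} );
}

# commify is also a hypothetical name; it handles whole numbers only.
sub commify {
    my $number = shift;
    my $digits = reverse $number;
    $digits =~ s/(\d{3})(?=\d)/$1,/g;    # comma after every third digit
    return scalar reverse $digits;
}

print commify( to_bytes('54.4M') ), " bytes\n";    # 57,042,534 bytes
```

Looking the factor up in a hash also removes the if/elsif chain, so the matching code stays the same as units are added.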
Alex Reynolds
This is sorta the inverse of the problem I discuss in "Eliminate Needless Loops and Branching" (http://www.effectiveperlprogramming.com/blog/23), but you can make the same improvements. :)
brian d foy
+3  A: 

To get the exact output you asked for, use Number::FormatEng and Number::Format:

use strict;
use warnings;

use Number::FormatEng qw(:all);
use Number::Format qw(:subs);

my $string = "54.4M" ;
my $string2 = "3.2G" ;

print format_number(unformat_pref($string))  , "\n";
print format_number(unformat_pref($string2)) , "\n";

__END__
54,400,000
3,200,000,000

By the way, only unformat_pref is needed if you are going to perform calculations with the result.

Since Number::FormatEng was intended for engineering notation conversion (not for bytes), its prefix is case-sensitive. If you want to use it for kilobytes, you must use lower case k.

Number::Format on its own will also convert these strings into actual bytes (kinda, almost):

use Number::Format qw(:subs);

my $string = "54.4M" ;
my $string2 = "3.2G" ;

print round(unformat_number($string) , 0), "\n";
print round(unformat_number($string2), 0), "\n";

__END__
57042534
3435973837

The reason I said "kinda, almost" is that Number::Format treats 1K as 1024 bytes, not 1000 bytes. That is why the result looks odd unless it is rounded: 54.4M works out to 57042534.4, which includes a fractional byte.
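The arithmetic behind the two conventions can be checked in plain Perl, without either module. This is only an illustration; bytes_for is a hypothetical helper name.

```perl
use strict;
use warnings;

# bytes_for is a hypothetical helper: value * base ** power,
# where power is 1 for K, 2 for M, 3 for G.
sub bytes_for {
    my ($value, $base, $power) = @_;
    return $value * $base ** $power;
}

printf "%.1f\n", bytes_for(54.4, 1000, 2);    # 54400000.0
printf "%.1f\n", bytes_for(54.4, 1024, 2);    # 57042534.4
```

With base 1000 the result is a whole number of bytes, while base 1024 leaves the 0.4 of a byte that has to be rounded away.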

toolic
What is 0.4 of a byte?
Alex Reynolds
@Alex Reynolds: I updated my Answer with rounded numbers.
toolic
There's a module for everything. Sorry I doubted you, CPAN.
rjh