What is the best way to receive data from a socket in Perl, when the data length is unknown?

Right now, I read one character at a time in a loop, until I reach the '\0' character.

Is there a better way to do this?

+3  A: 

What is the best way to receive data from a socket in Perl, when the data length is unknown?

A sound solution to this is impossible, in any language. If you don't know how long the data is, and nothing in the stream marks its end, then you can't possibly know when you've finished receiving all of it from the socket.

Your only hope is to use some kind of a metric to determine if it's been "long enough" since data started coming in, to make the decision that data flow has stopped. But it won't be perfect.
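A minimal sketch of that "long enough" heuristic, assuming an idle timeout is an acceptable stop condition. The helper name read_until_idle and the 0.2 second threshold are made up for illustration, and a local socket pair stands in for the real connection. A slow sender can still be cut off mid-message, which is exactly why this is not perfect.

```perl
use strict;
use warnings;
use IO::Select;
use Socket;

# Hypothetical helper: keep reading until nothing new arrives for $idle
# seconds, or until the peer closes the connection.
sub read_until_idle {
    my ($sock, $idle) = @_;
    my $sel  = IO::Select->new($sock);
    my $data = '';
    while (my @ready = $sel->can_read($idle)) {
        my $n = sysread $sock, my $chunk, 4096;
        die "sysread: $!" unless defined $n;
        last if $n == 0;              # peer closed the connection
        $data .= $chunk;
    }
    return $data;
}

# Demo on a local socket pair so the sketch runs without a network.
socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
syswrite $writer, "hello, ";
syswrite $writer, "world";

my $msg = read_until_idle($reader, 0.2);   # 0.2 s is an arbitrary threshold
print "got: [$msg]\n";
```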

Shaggy Frog
I know that each message is ended with '\o', does that help?
Gal Goldman
As long as you know *for sure* that character can't be sent as part of your data stream, then it functions like an End-Of-Data marker, and you don't need to know the data length. In that case your solution is valid.
Shaggy Frog
@Gal: What is `\o`? Do you mean `\0`?
Svante
Yes... typo :-)
Gal Goldman
+2  A: 

The answer depends on the protocol. Since your protocol uses '\0' as a separator, you're doing the right thing. I'm pretty sure Perl handles buffering for you, so reading one character at a time is not inefficient.

Many network-oriented protocols precede strings with a length. To read a protocol like this, you read the length (usually one or two bytes, depending on the protocol spec), then read that many bytes into a string.
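As a rough illustration of the length-prefixed style, here is a sketch that assumes a hypothetical framing of a 2-byte big-endian length followed by the payload. The helper names read_exact and read_message are invented for this example, and an in-memory filehandle stands in for the socket.

```perl
use strict;
use warnings;

# Read exactly $len bytes, looping because one read may return fewer.
sub read_exact {
    my ($fh, $len) = @_;
    my $buf = '';
    while (length($buf) < $len) {
        my $n = read($fh, $buf, $len - length($buf), length($buf));
        die "read: $!" unless defined $n;
        die "unexpected EOF" if $n == 0;
    }
    return $buf;
}

# Assumed framing: 2-byte big-endian length, then that many payload bytes.
sub read_message {
    my ($fh) = @_;
    my $len = unpack 'n', read_exact($fh, 2);
    return read_exact($fh, $len);
}

# Demo: two framed messages in an in-memory handle instead of a socket.
my $wire = pack('n/a*', 'hello') . pack('n/a*', 'socket');
open my $fh, '<', \$wire or die "open: $!";
binmode $fh;

my @msgs = (read_message($fh), read_message($fh));
print "$_\n" for @msgs;
```

With a real socket the same two helpers apply unchanged, since they only rely on read.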

slim
PerlIO certainly does handle buffering, so 1-char reads don't incur a *syscall* overhead, but they still waste time in the Perl op loop (not to mention the number of string concatenations that might be happening, depending on the code). Not to micro-optimize, but the `$/` + `getline` approach is far more efficient and abundantly clear, so it wins :)
hobbs
+5  A: 

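Set the input record separator $/ to "\x{00}" (\0), be sure to localise it, and call getline on the handle, like so:

{
    local $/ = "\x{00}";   # record separator is NUL inside this block only
    # defined() keeps a final record such as "0" from ending the loop early
    while (defined(my $line = $sock->getline)) {
        chomp $line;       # strip the trailing "\0"
        print "$line\n";   # do whatever with your data here
    }
}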
MkV
+2  A: 

You could use FIONREAD with ioctl to ask how many bytes are currently waiting to be read. The program below connects to the SSH server on localhost and waits for its greeting:

#! /usr/bin/perl

use warnings;
use strict;

use subs 'FIONREAD';
require "sys/ioctl.ph";
use Socket;
socket my $s, PF_INET, SOCK_STREAM, getprotobyname "tcp"
  or die "$0: socket: $!";
connect $s, sockaddr_in 22, inet_aton "localhost"
  or die "$0: connect: $!";

my $rin = "";
vec($rin, fileno($s), 1) = 1;
my $nfound = select(my $rout = $rin, undef, undef, undef);
die "$0: select: $!" if $nfound < 0;

if ($nfound) {
  my $size = pack "L", 0;
  ioctl $s, FIONREAD, $size
    or die "$0: ioctl: $!";

  print unpack("L", $size), "\n";
  sysread $s, my $buf, unpack "L", $size
    or die "$0: sysread: $!";

  my $length = length $buf;
  $buf =~ s/\r/\\r/g;
  $buf =~ s/\n/\\n/g;
  print "got: [$buf], length=$length\n";
}

Sample run:

$ ./howmuch
39
got: [SSH-2.0-OpenSSH_5.3p1 Debian-3ubuntu4\r\n], length=39

But you'll probably prefer using the IO::Socket::INET and IO::Select modules, as in the code below, which talks to Google:

#! /usr/bin/perl

use warnings;
use strict;

use subs "FIONREAD";
require "sys/ioctl.ph";
use IO::Select;
use IO::Socket::INET;

my $s = IO::Socket::INET->new(PeerAddr => "google.com:80")
  or die "$0: can't connect: $@";

my $CRLF = "\015\012";
print $s "HEAD / HTTP/1.0$CRLF$CRLF" or warn "$0: print: $!";

my @ready = IO::Select->new($s)->can_read;
die "$0: socket not ready" unless @ready && $ready[0] == $s;

my $size = pack "L", 0;
ioctl $s, FIONREAD, $size
  or die "$0: ioctl: $!";

print unpack("L", $size), "\n";
sysread $s, my $buf, unpack "L", $size
  or die "$0: sysread: $!";

my $length = length $buf;
$buf =~ s/\r/\\r/g;
$buf =~ s/\n/\\n/g;
print "got: [$buf], length=$length\n";

Output:

573
got: [HTTP/1.0 200 OK\r\nDate: Sun, 18 Jul 2010 12:03:48 GMT\r\nExpires: -1\r\nCache-Control: private, max-age=0\r\nContent-Type: text/html; charset=ISO-8859-1\r\nSet-Cookie: PREF=ID=6742ab80dd810a95:TM=1279454628:LM=1279454628:S=ewNg64020FbnGzHR; expires=Tue, 17-Jul-2012 12:03:48 GMT; path=/; domain=.google.com\r\nSet-Cookie: NID=36=kn2wtTD4UJ3MYYQ5uvA4iAsrS2wcrb_W781pZ1hrVUhUDHrIJTMg_kOgVKhjQnO5SM6MdC_jrRdxFRyXwyyv5N3Xja1ydhVLWWaYqpMHQOmGVi2K5qRWAKwDhCVRd8WS; expires=Mon, 17-Jan-2011 12:03:48 GMT; path=/; domain=.google.com; HttpOnly\r\nServer: gws\r\nX-XSS-Protection: 1; mode=block\r\n\r\n], length=573
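One caveat worth sketching: FIONREAD reports only the bytes queued at that moment, which may be part of a message or several messages at once. For the '\0'-terminated protocol in the question, a sysread loop can accumulate chunks and peel off complete messages as they appear. The helper extract_messages is a made-up name, and a local socket pair stands in for the real connection.

```perl
use strict;
use warnings;
use Socket;

# Hypothetical helper: remove every complete NUL-terminated message from
# $$bufref, leaving any partial trailing message for the next read.
sub extract_messages {
    my ($bufref) = @_;
    my @msgs;
    while ($$bufref =~ s/\A([^\0]*)\0//) {
        push @msgs, $1;
    }
    return @msgs;
}

socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
syswrite $writer, "alpha\0beta\0gam";   # last message is split across writes
syswrite $writer, "ma\0";
close $writer;

my $pending = '';
my @messages;
while (1) {
    my $n = sysread $reader, my $chunk, 4096;
    die "sysread: $!" unless defined $n;
    last if $n == 0;                    # peer closed the connection
    $pending .= $chunk;
    push @messages, extract_messages(\$pending);
}
print "$_\n" for @messages;
```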
Greg Bacon