This is the weirdest thing ever. So I can see these files and cat them:

[jchen@host hadoop-0.20.2]$ bin/hadoop fs -ls /users/jchen/                         
Found 3 items
-rw-r--r--   1 jchen supergroup   26553445 2010-07-14 21:10 /users/jchen/20100714T192827^AS17.data
-rw-r--r--   1 jchen supergroup  461957962 2010-07-14 21:10 /users/jchen/20100714T192857^AS1.data
-rw-r--r--   1 jchen supergroup   14026972 2010-07-14 21:10 /users/jchen/20100714T192949^AS311.data

[jchen@q01-ba-sas01 hadoop-0.20.2]$ bin/hadoop fs -cat /users/jchen/20100714T192949^AS311.data | head
SOME DATA

When I ls the file specifically:

[jchen@q01-ba-sas01 hadoop-0.20.2]$ bin/hadoop fs -ls /users/jchen/20100714T192949^AS311.data | head
ls: Cannot access /users/jchen/20100714T192949^AS311.data: No such file or directory

What the frack is going on here? The only thing I can think of is that I used a custom method in org.apache.hadoop.fs.FileSystem to post these files:

public boolean writeStreamToFile(boolean overwrite,
                                 InputStream src, Path dst)
    throws IOException {
    Configuration conf = getConf();
    return FileUtil.writeStream(src, this, dst, overwrite, conf);
}
// which calls this static method I added to org.apache.hadoop.fs.FileUtil:
public static boolean writeStream(InputStream src,
                                  FileSystem dstFS, Path dst,
                                  boolean overwrite,
                                  Configuration conf) throws IOException {

    // Resolve the destination, refusing to clobber it unless overwrite is set
    dst = checkDest(dst.getName(), dstFS, dst, overwrite);

    OutputStream out = null;
    try {
        System.out.println("Started file creation");
        out = dstFS.create(dst, overwrite);
        System.out.println("completed file creation. starting stream copy");
        // The final 'true' tells copyBytes to close both streams when done
        IOUtils.copyBytes(src, out, conf, true);
        System.out.println("completed stream copy.");
    } catch (IOException e) {
        IOUtils.closeStream(out);
        IOUtils.closeStream(src);
        throw e;
    }

    return true;
}
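
One thing I may try: if the ^A that ls prints is actually a raw control character (U+0001) rather than a literal caret followed by A, it should show up when I dump each listed name's character codes. A quick sketch (the DumpNames class name is just for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DumpNames {
    public static void main(String[] args) throws Exception {
        // Assumes core-site.xml/hdfs-site.xml are on the classpath
        FileSystem fs = FileSystem.get(new Configuration());
        for (FileStatus status : fs.listStatus(new Path("/users/jchen/"))) {
            String name = status.getPath().getName();
            StringBuilder codes = new StringBuilder();
            for (char c : name.toCharArray()) {
                codes.append((int) c).append(' ');
            }
            // A raw control byte (the ^A shown by ls) would print as code 1 here
            System.out.println(name + " -> " + codes);
        }
    }
}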

I'm kind of at a total loss here.

A: 

According to this page, the cat command takes URIs while the ls command just takes paths.

Make sure the path you are passing to the ls command is correct, and, as matt b suggested, escape any potentially invalid characters where possible.
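
If escaping at the shell is awkward, one way to test whether the ^A in the listing is a raw U+0001 character (an assumption; the listing alone can't confirm it) is to build the exact Path in Java and stat it. A minimal sketch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ExistsCheck {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Embed the suspected control character directly rather than typing it
        Path exact = new Path("/users/jchen/20100714T192949\u0001S311.data");
        System.out.println(fs.exists(exact)); // true only if ^A really is U+0001
    }
}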

You could also try Hadoop's wildcard (glob) support, like so:

bin/hadoop fs -ls '/users/jchen/*AS311.data'
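
If the glob matches, you can also resolve the exact stored path programmatically with FileSystem.globStatus; a short sketch (the GlobLs class name is mine):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobLs {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Glob around the suspect characters instead of typing them literally
        FileStatus[] matches = fs.globStatus(new Path("/users/jchen/*AS311.data"));
        for (FileStatus status : matches) {
            // Prints the exact path as HDFS stored it
            System.out.println(status.getPath());
        }
    }
}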

I assume the cat command you were able to run lets you verify that the data is being written correctly, so writeStreamToFile itself is probably OK?
