I've found that my map tasks are currently inefficient when parsing one particular set of files (2 TB in total). I'd like to change the block size of those files in the Hadoop DFS from 64 MB to 128 MB. I can't find anything in the documentation about doing this for only one set of files rather than the entire cluster. Does anyone know the command that would change the block size when I upload the files (i.e., copy from local to the DFS)?

Thanks!

A: 

I've changed my answer! You just need to set the dfs.block.size configuration setting appropriately when you use the command line.

hadoop fs -D dfs.block.size=134217728 -put local_name remote_location
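
If you want to confirm the block size after the upload, something like hadoop fsck remote_location -files -blocks should list the file's blocks along with their lengths, so you can check that the 128 MB setting took effect.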

Original Answer

You can programmatically specify the block size when you create a file with the Hadoop API. Unfortunately, you can't do this on the command line with the hadoop fs -put command. To do what you want, you'll have to write your own code to copy the local file to a remote location: open a FileInputStream for the local file, create the remote OutputStream with FileSystem.create (which has an overload that takes a block size), and then use something like IOUtils.copy from Apache Commons IO to copy between the two streams.
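
For reference, a minimal sketch of that approach might look like the following (the class name, paths, and 4 KB buffer size are illustrative assumptions, not anything from the Hadoop docs):

import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.io.IOUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyWithBlockSize {
    public static void main(String[] args) throws Exception {
        // Illustrative paths; substitute your own.
        String localFile = "local_name";
        String remotePath = "remote_location";
        long blockSize = 128L * 1024 * 1024;   // 128 MB

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // FileSystem.create has an overload that takes the block size directly:
        // (path, overwrite flag, buffer size, replication factor, block size).
        OutputStream out = fs.create(new Path(remotePath), true, 4096,
                fs.getDefaultReplication(), blockSize);
        InputStream in = new FileInputStream(localFile);
        try {
            IOUtils.copy(in, out);   // Commons IO copies between the two streams
        } finally {
            IOUtils.closeQuietly(in);
            IOUtils.closeQuietly(out);
        }
    }
}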

Bkkbrad