Basically, I need to route data to the right Reducer. Each Reducer is going to be a TableReducer.

I have the following file:

vendor1, user1, xxxx=n
vendor1, user1, xxxx=n
vendor2, user2, xxxx=n
vendor2, user2, xxxx=n

I need to insert that into the following HBase tables:

Table vendor1:
[user1] => {data:xxxx = n}
[user2] => {data:xxxx = n}

Table vendor2:
[user1] => {data:xxxx = n}
[user2] => {data:xxxx = n}

Format is [ROW_ID] => {[FAMILY]:[COLUMN] = [VALUE]}

  • Each vendor has a different HBase table.
  • Rows need to go to different HBase tables based on a value in the line.

Is there a way to do that? With Cascading? Or is there another workaround?

Thanks, Federico

A: 

I found a way: let the reducer handle the tables itself. Instead of using a TableReducer, just use a plain Reducer. In setup(), load the tables (the tables should be properties), set auto flush to false on each one and set a write buffer size. In reduce(), just call table1.put(...), table2.put(...), and so on. In cleanup(), call flushCommits() on all the tables. The reducer's output key and value should both be NullWritable (unless you do want to output something).

The TableReducer implementation does something like this under the hood for a single table.
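To make the setup / reduce / cleanup flow above concrete, here is a minimal sketch in plain Java that runs without Hadoop or HBase on the classpath. BufferedTable, VendorRouter and VendorRoutingDemo are hypothetical names invented for illustration. BufferedTable only imitates a table handle with auto flush turned off, and the "data" family is taken from the format in the question. In a real job the three methods would be the Reducer's setup(), reduce() and cleanup(), and the comments note which call of the older HBase client API (HTable) each step is assumed to correspond to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an HBase table handle with client-side buffering.
// In a real job this role would be played by HTable (assumption).
class BufferedTable {
    final String name;
    final List<String> buffer = new ArrayList<>(); // puts not yet sent
    final List<String> stored = new ArrayList<>(); // puts already flushed

    BufferedTable(String name) {
        this.name = name;
    }

    // Comparable to table.put(put) after setAutoFlush(false): only buffers.
    void put(String rowId, String family, String column, String value) {
        buffer.add("[" + rowId + "] => {" + family + ":" + column + " = " + value + "}");
    }

    // Comparable to table.flushCommits(): sends everything buffered so far.
    void flushCommits() {
        stored.addAll(buffer);
        buffer.clear();
    }
}

// Mirrors the life cycle of a plain Reducer that manages its own tables.
class VendorRouter {
    // The tables are kept as a property of the reducer, keyed by vendor name.
    private final Map<String, BufferedTable> tables = new LinkedHashMap<>();

    // setup(): open one table per vendor.
    void setup(List<String> tableNames) {
        for (String tableName : tableNames) {
            tables.put(tableName, new BufferedTable(tableName));
        }
    }

    // reduce(): parse "vendor, user, column=value" and route by vendor.
    void reduce(String line) {
        String[] fields = line.split(",");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Malformed line: " + line);
        }
        String vendor = fields[0].trim();
        String user = fields[1].trim();
        String[] columnAndValue = fields[2].trim().split("=", 2);
        if (columnAndValue.length != 2) {
            throw new IllegalArgumentException("Malformed column=value: " + line);
        }
        BufferedTable table = tables.get(vendor);
        if (table == null) {
            throw new IllegalArgumentException("No table for vendor: " + vendor);
        }
        table.put(user, "data", columnAndValue[0].trim(), columnAndValue[1].trim());
    }

    // cleanup(): flush every table once at the end of the task.
    void cleanup() {
        for (BufferedTable table : tables.values()) {
            table.flushCommits();
        }
    }

    BufferedTable table(String name) {
        return tables.get(name);
    }
}

class VendorRoutingDemo {
    public static void main(String[] args) {
        VendorRouter router = new VendorRouter();
        router.setup(Arrays.asList("vendor1", "vendor2"));
        router.reduce("vendor1, user1, xxxx=n");
        router.reduce("vendor2, user2, xxxx=n");
        router.cleanup();
        System.out.println("vendor1: " + router.table("vendor1").stored);
        System.out.println("vendor2: " + router.table("vendor2").stored);
    }
}
```

Nothing reaches a table until cleanup() runs, which is the point of disabling auto flush: the writes for each table are batched instead of being sent one row at a time.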

Federico