hbase

Web interface for HBase data access

I want a web interface for viewing data in HBase (like phpMyAdmin for MySQL, but for HBase). Are there any existing solutions/tools out there? ...

Facing problems while updating rows in HBase

Hello, I've just started exploring HBase. I've run the samples SampleUploader, PerformanceEvaluation and rowcount as given in the Hadoop wiki: http://wiki.apache.org/hadoop/Hbase/MapReduce The problem I'm facing is: table1 is my table with the column family column: create 'table1','column' put 'table1','row1','column:address'...

Python HBase exception

When I use client.mutateRow(self.tableName, row, mutations) to write data to HBase, there is an exception: IOError: IOError(message="Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1, region=test,,1276665207312 for region test,,1276665207312, row 'hello', but f...

What is the best component stack for building a distributed log aggregator (like Splunk)?

I'm trying to find the best components I could use to build something similar to Splunk in order to aggregate logs from a large number of servers in a computing grid. It should also be distributed, because I have gigabytes of logs every day and no single machine will be able to store them. I'm particularly interested in something that will work wi...

Do you know any batch log processing tools for Hadoop (zohmg alternatives)?

Hello, since the zohmg project seems to be dead (no new commits since Nov 2009), I would like to know if any of you has used it (with successful results), or if you know anything about the future of this project. If not, is there any alternative to this project? I'm looking for a tool that will help to extract data from (Apache) logs ...

Adjacency List structure in HBase

I'm trying to implement the following graph reduction algorithm in The graph is an undirected weighted graph. I want to strip away all nodes with only two neighbors and update the weights. Have a look at the following illustration: The algorithm shall transform the upper graph into the lower one. Eliminate node 2 and update the weig...

Referential Integrity and HBase

One of the first sample schemas you read about in the HBase FAQ is the Student-Course example for a many-to-many relationship. The schema has a Courses column in the Student table and a Students column in the Course table. But I don't understand how, in HBase, you guarantee integrity between these two objects. If something were to crash bet...

Problem with zohmg data import into HBase

Hello, I have used zohmg and successfully created a mapper and a table in HBase, and test-imported my data (using the --local switch). But I have a problem inserting my data into HBase after mapping. This is the error I get: Exception in thread "main" java.lang.RuntimeException: class org.apache.hadoop.hbase.mapreduce.TableOutputFormat not org.apache.had...

What happens when an HBase node fails?

HBase is wonderful, but... What will happen to the data when a node (explodes/burns down/gets stolen/torn into pieces by mad IT on the loose)? Is the data lost? Can the cluster auto-recover? Can I add new nodes without downtime? Thanks guys, Maxim. ...

Sequential Row IDs in Column Oriented DBs (HBase, Cassandra)?

I've seen two contradictory pieces of advice when it comes to designing row IDs in HBase (specifically, but I think it applies to Cassandra as well). Group keys that you'll be aggregating together often to take advantage of data locality. (White, Hadoop: The Definitive Guide and I recall seeing it on the HBase site, but can't find it....

Java ORM for HBase

Does anyone know a good Java ORM implementation for HBase? This one looks really nice for Ruby: http://www.stanford.edu/~sqs/rhino/doc/ But I could not find one for Java. Thanks. ...

A Reducer per HBase table

Basically, I need to route data to the right Reducer. Each Reducer is going to be a TableReducer. I have the following file: vendor1, user1, xxxx=n vendor1, user1, xxxx=n vendor2, user2, xxxx=n vendor2, user2, xxxx=n I need to insert that into the following HBase tables: Table vendor1: [user1] => {data:xxxx = n} [user2] => {data:xx...

Send HBase batch Puts compressed?

Is there a way to send a batch of HBase Puts compressed? I'm setting auto flush to false and I'm using a bufferSize of 2 MB. Is there a way to compress the Puts in order to send more Puts in one batch? Thanks, ...

Retrieving information from aggregated weblogs data, how to do it?

Hello, I would like to know how to retrieve data from aggregated logs? This is what I have: - about 30GB daily of uncompressed log data loaded into HDFS (and this will grow soon to about 100GB) This is my idea: - each night this data is processed with Pig - logs are read, split, and custom UDF retrieves data like: timestamp, url, user_id...

What do you recommend for learning HBase?

What are recommended resources for learning HBase? The only ones I can think of are the HBase wiki and one chapter in the book "Hadoop: The Definitive Guide". Are there any other good resources? I'm looking for links, books, wikis, etc. Stuff about BigTable is also welcome. Thanks. ...

How does HBase partition a table across regionservers?

Hello, please tell me how HBase partitions a table across regionservers. For example, let's say my row keys are integers from 0 to 10M and I have 10 regionservers. Does this mean that the first regionserver will store all rows with keys 0 - 1M, the second 1M - 2M, the third 2M - 3M, ... the tenth 9M - 10M? I would like my row key to be ti...
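
The even range split that the question hypothesizes can be sketched as follows. This is only an illustration of the questioner's own example (integer keys from 0 to 10M, 10 regionservers); the function names are hypothetical, and the sketch does not claim to reproduce how HBase itself chooses split points or assigns regions to servers.

```python
from bisect import bisect_right

def range_boundaries(max_key, servers):
    """Split points for an even range split (hypothetical helper).

    Returns servers - 1 boundaries; server i holds keys in
    [boundaries[i-1], boundaries[i]).
    """
    step = max_key // servers
    return [step * i for i in range(1, servers)]

def server_for_key(key, boundaries):
    """Index of the server whose contiguous key range contains `key`."""
    return bisect_right(boundaries, key)

# The example from the question: keys 0..10M spread over 10 servers.
boundaries = range_boundaries(10_000_000, 10)
first = server_for_key(999_999, boundaries)     # last key of the first range
second = server_for_key(1_000_000, boundaries)  # first key of the second range
last = server_for_key(9_999_999, boundaries)    # falls in the tenth range
```

With contiguous ranges like these, keys that are close in value land on the same server, which is why the choice of row key matters for how load is spread.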

Problem with HBase in classpath

Hello, I'm having a problem using external jars in my file. I always get Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration at this line: Configuration config = HBaseConfiguration.create(); And this file is on the classpath, I'm setting options when running the jar: java -jar hbase.jar -c...

Multiple or bulk fetch or get in the HBase Java client?

Does the HBase Java client support multi-get (aka bulk fetch), kind of like memcached multi-get? I've scoured the docs and the Internet for at least an hour and couldn't find anything definitive. The HTable API offers multiple Delete and multiple Put, but does not have any obvious analogous call for Get. To clarify, I am trying to Get...

How to create a copy of a table in HBase on the same cluster? Or, how to serve requests using the original state while operating on a working state

Is there an efficient way to create a copy of table structure+data in HBase, in the same cluster? Obviously the destination table would have a different name. What I've found so far: The CopyTable job, which has been described as a tool for copying data between different HBase clusters. I think it would support intra-cluster operati...

Cassandra or Hadoop Hive or MySQL?

Hey. I am developing a web crawler. Which is good for storing data: Cassandra, Hadoop Hive or MySQL, and why? I have 1 TB of data from the past 6 months in my MySQL DB. I need to index it and I need to get the output in my search ASAP. I think it will store a much larger amount of data, like 10 petabytes, as my crawlers are working fast. I nee...