I'm interested in finding out how the recently released Hive (http://mirror.facebook.com/facebook/hive/hadoop-0.17/) compares to HBase in terms of performance. The SQL-like interface used by Hive is much preferable to the HBase API we have implemented.
...
I have been given a .Hive file from a registry which I have to parse, using the contents as part of an HTML report (from this I assume I have to convert it to text somehow). The whole thing must be done within the program, so I can't just convert the hive file and then run it through my program. I currently have no idea how to even start this ...
Hi all
Suppose that I have this RDBMS table (entity-attribute-value model):
col1: entityID
col2: attributeName
col3: value
and I want to use HBase due to scaling issues.
I know that the only way to access an HBase table is using a primary key (cursor). You can get a cursor for a specific key and iterate over the rows one by one.
The issue...
I'm trying to create tables programmatically using JDBC. However, I can't see the table I created from the hive shell. What's worse, when I access the hive shell from different directories, I see different contents in the database.
Is there any setting I need to configure?
Thanks in advance.
...
I'm trying to close the connection after executing a query. Previously, I just created a CachedRowSetImpl instance and it took care of releasing the resources for me. However, I am now using the Hive database driver from the Hadoop project. It doesn't support CachedRowSetImpl.execute(). I'm wondering whether there is any other way that allows me to copy the re...
Hello,
Does anyone have a small example of how to programmatically, in C/C++, load a user's registry hive? I would like to load a hive, set some values, and close the hive.
Thanks in advance for any help.
Tony
...
In the past I used to build WebAnalytics using OLAP cubes running on MySQL.
Now an OLAP cube, the way I used it, is simply a large table (OK, it was stored a bit smarter than that) where each row is basically a measurement or an aggregated set of measurements. Each measurement has a bunch of dimensions (i.e. which pagename, useragent, ip,...
I am trying out Apache Hive as per http://wiki.apache.org/hadoop/Hive/GettingStarted and am getting this error from Ivy:
Downloaded file size doesn't match expected Content Length for http://archive.apache.org/dist/hadoop/core/hadoop-0.19.0/hadoop-0.19.0.tar.gz. Please retry.
This error repeats 4 times for 4 different versions of ...
I would like to execute a Hive query on the server in an asynchronous manner. The Hive query will likely take a long time to complete, so I would prefer not to block on the call. I am currently using Thrift to make a blocking call (blocks on client.execute()), but I have not seen an example of how to make a non-blocking call. Here is the...
I've been experimenting with Hive for some data mining activities and would like to make it easily available to less command line orientated colleagues.
Hive does now ship with a web interface (http://wiki.apache.org/hadoop/Hive/HiveWebInterface) but it's very basic at this stage.
My question is does a visually polished and fully featu...
I'm running a Hadoop job (using Hive, actually) which is supposed to uniq lines across a lot of text files. More specifically, it chooses the most recently timestamped record for each key in the reduce step.
Does hadoop guarantee that every record with the same key, output by the map step, will go to a single reducer, even if there are many r...
To start with I'll say I've read the post here and I'm still having trouble.
I'm trying to create a CE6 image with a hive-based registry that actually stores results through a reboot.
I've ticked the hive settings in the catalog items.
In common.reg, I've set the location of the hive ([HKEY_LOCAL_MACHINE\init\BootVars] "SystemHive") ...
I have a lot of data in an Amazon SimpleDB domain. I want to start Hive on Elastic MapReduce (on top of Hadoop) and somehow either import data from SimpleDB, or connect to SimpleDB and run HiveQL queries on it. I am having issues importing the data. Any pointers?
...
I am trying a small Hadoop setup (for experimentation) with just 2 machines. I am loading about 13GB of data, a table of around 39 million rows, with a replication factor of 1 using Hive. My problem is that Hadoop always stores all this data on a single datanode. Only if I change the dfs_replication factor to 2 using setrep, hadoop copies dat...
Hi
My background - 4 weeks old in the Hadoop world. Dabbled a bit in Hive, Pig and Hadoop using Cloudera's Hadoop VM. Have read Google's paper on Map-Reduce and GFS.
I understand that:
Pig's language, Pig Latin, is a shift from (suits the way programmers think) the SQL-like declarative style of programming, and Hive's query language closely
...
I am working on a solution where I will have a Hadoop cluster with Hive running and I want to send jobs and hive queries from a .NET application to be processed and get notified when they are done. I can't find any solutions for interfacing with Hadoop other than directly from a Java app, is there an API I can access that I am just not f...
I'm looking into replacing a bunch of Python ETL scripts that perform a nightly / hourly data summary and statistics gathering on a massive amount of data.
What I'd like to achieve is
Robustness - a failing job / step should be automatically restarted. In some cases I'd like to execute a recovery step instead.
The framework must be ab...
So I have 5 rows like this
userid, col
--------------
1, a
1, b
2, c
2, d
3, e
How would I write a query so that the result looks like this:
userid, combined
1, a b
2, c d
3, e
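For reference, the result asked for above amounts to grouping by userid and joining each group's values with a space. A minimal Python sketch of that transformation follows; it is not a Hive query, it only illustrates the expected output, and the function name combine_by_user is hypothetical:

```python
from collections import OrderedDict


def combine_by_user(rows):
    """Group (userid, col) pairs by userid and join each group's values with a space."""
    grouped = OrderedDict()  # keeps userids in first-seen order
    for userid, col in rows:
        grouped.setdefault(userid, []).append(col)
    return [(userid, " ".join(cols)) for userid, cols in grouped.items()]


# The five sample rows from the question.
rows = [(1, "a"), (1, "b"), (2, "c"), (2, "d"), (3, "e")]

for userid, text in combine_by_user(rows):
    print(f"{userid}, {text}")
```

Any query that answers the question should produce the same three rows for this sample data.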
...
Hi,
My hive query has multiple outer joins and takes very long to execute. I was wondering if it would make sense to break it into multiple smaller queries and use pig to work the transformations.
Is there a way I could query hive tables or read hive table data within a pig script?
Thanks
...
I've been thinking of how to do it in Hive.
For example, I have a specific field in a log file that I want to extract (this is already possible in Hive), and then I want to map this field's value to something else. This mapping is determined by my own custom business logic that is coded up in a Java class.
How can I use this Java class in Hive...