I'm trying to run a Hadoop job (version 0.18.3) on my Windows machine, but I get the following error:
Caused by: javax.security.auth.login.LoginException: Login failed: CreateProcess: bash -c groups error=2
at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
at org.apache.hadoop.s...
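
From what I can tell (my assumption from the stack trace, not something I've confirmed in the Hadoop source), the login code shells out to "bash -c groups" to resolve the current user's group names, so on a Windows box without Cygwin's bash on the PATH the process launch itself fails with error=2 ("the system cannot find the file specified"). A minimal repro of that failure mode:

import java.io.IOException;

// Minimal repro (assumption: Hadoop's login code launches an external
// "bash -c groups" process to resolve the user's group names).
public class BashGroupsRepro {
    public static void main(String[] args) {
        try {
            // On Windows without bash on the PATH, CreateProcess cannot
            // find the executable and fails with error=2.
            new ProcessBuilder("bash", "-c", "groups").start();
        } catch (IOException e) {
            e.printStackTrace(); // CreateProcess: bash -c groups error=2
        }
    }
}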
Hello, I've just started exploring HBase.
I've run the samples SampleUploader, PerformanceEvaluation, and rowcount as given in
the Hadoop wiki: http://wiki.apache.org/hadoop/Hbase/MapReduce
The problem I'm facing is this: table1 is my table with the column family 'column':
create 'table1','column'
put 'table1','row1','column:address'...
I have a Hadoop job with tasks that are expected to run for a significant length of time (a few minutes). However, Hadoop starts speculative execution too soon. I do not want to turn speculative execution off completely, but I want to increase the length of time Hadoop waits before considering a task for speculative execution. Is there a config option...
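
For context, the only per-job switches I know about (in the 0.20-era org.apache.hadoop.mapred.JobConf API) toggle speculation on or off entirely; a sketch of what I'm doing now:

import org.apache.hadoop.mapred.JobConf;

// Sketch of the per-job speculation toggles I know about. These map to
// mapred.map.tasks.speculative.execution and
// mapred.reduce.tasks.speculative.execution, but they only turn the
// feature on or off; what I want is a knob for when speculation starts.
public class SpeculationConfig {
    public static void configure(JobConf conf) {
        conf.setMapSpeculativeExecution(true);
        conf.setReduceSpeculativeExecution(false);
    }
}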
How do I read video frames in Hadoop?
...
I'm trying to create a simple project with Hadoop. I am new to IntelliJ and am trying to set the classpath to org.apache.hadoop.io. But which jar contains this package?
...
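
For reference, the smallest thing I'm trying to get to compile is below. My understanding (which may be wrong) is that the org.apache.hadoop.io package ships in the hadoop-core jar, e.g. hadoop-0.20.2-core.jar at the top level of the distribution.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

// Smallest possible classpath check: if this compiles and runs, the jar
// providing org.apache.hadoop.io is on the classpath.
public class ClasspathCheck {
    public static void main(String[] args) {
        Text key = new Text("hello");
        IntWritable value = new IntWritable(1);
        System.out.println(key + "\t" + value);
    }
}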
Hi all.
I need to split my MapReduce jar file into two jobs in order to get two different output files, one from each of the two jobs' reducers.
I mean that the first job has to produce an output file that will be the input for the second job in the chain.
I read something about ChainMapper and ChainReducer in Hadoop version 0.20 (currently ...
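
What I have in mind, as an alternative to ChainMapper/ChainReducer (which as far as I understand chain mappers and a reducer within a single job rather than across two), is simply running two configured jobs back to back, with job 1's output directory wired in as job 2's input. A sketch against the 0.20 mapred API; IdentityMapper/IdentityReducer stand in for my real classes:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

// Sketch: run two jobs sequentially, job 1's output dir feeding job 2.
public class TwoJobChain {
    public static void main(String[] args) throws Exception {
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]); // output of job 1, input of job 2
        Path output = new Path(args[2]);

        JobConf job1 = new JobConf(TwoJobChain.class);
        job1.setJobName("job1");
        job1.setMapperClass(IdentityMapper.class);   // swap in real mapper 1
        job1.setReducerClass(IdentityReducer.class); // swap in real reducer 1
        FileInputFormat.setInputPaths(job1, input);
        FileOutputFormat.setOutputPath(job1, intermediate);
        JobClient.runJob(job1); // blocks until job 1 completes

        JobConf job2 = new JobConf(TwoJobChain.class);
        job2.setJobName("job2");
        job2.setMapperClass(IdentityMapper.class);   // swap in real mapper 2
        job2.setReducerClass(IdentityReducer.class); // swap in real reducer 2
        FileInputFormat.setInputPaths(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, output);
        JobClient.runJob(job2);
    }
}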
I'm just getting started with learning Hadoop, and I'm wondering the following: suppose I have a bunch of large MySQL production tables that I want to analyze.
It seems like I have to dump all the tables into text files, in order to bring them into the Hadoop filesystem -- is this correct, or is there some way that Hive or Pig or what...
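
One route I've come across (not sure whether it's the recommended one) is Hadoop's own DBInputFormat, which reads records over JDBC instead of going through text dumps. A sketch of how I understand it, with MyRecord as a hypothetical mapping of a users(id, name) table:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.db.DBConfiguration;
import org.apache.hadoop.mapred.lib.db.DBInputFormat;
import org.apache.hadoop.mapred.lib.db.DBWritable;

// Hypothetical record type for a users(id, name) table.
public class MyRecord implements Writable, DBWritable {
    long id;
    String name;

    public void readFields(ResultSet rs) throws SQLException { // JDBC -> record
        id = rs.getLong("id");
        name = rs.getString("name");
    }
    public void write(PreparedStatement st) throws SQLException { // record -> JDBC
        st.setLong(1, id);
        st.setString(2, name);
    }
    public void readFields(DataInput in) throws IOException { // Hadoop wire format
        id = in.readLong();
        name = in.readUTF();
    }
    public void write(DataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(name);
    }

    // Driver-side wiring: point the job at MySQL instead of at text files.
    public static void configure(JobConf job) {
        job.setInputFormat(DBInputFormat.class);
        DBConfiguration.configureDB(job, "com.mysql.jdbc.Driver",
                "jdbc:mysql://dbhost/mydb", "user", "password");
        // table, filter condition, order-by, then the columns to read
        DBInputFormat.setInput(job, MyRecord.class, "users", null, "id", "id", "name");
    }
}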
Can any one provide me with some sample Java code as how to go about storing the Lucene index in a HDFS( Hadoop File Sytem ), using Katta.
...
I have a requirement to parse both Apache access logs and Tomcat logs, one after another, using MapReduce. A few fields are extracted from the Tomcat log and the rest from the Apache log. I need to merge/map the extracted fields based on the timestamp and export these mapped fields into a traditional relational DB (e.g. MySQL).
I can parse and e...
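
The shape I have in mind is a reduce-side join keyed on the timestamp, with one mapper per log format wired up via MultipleInputs. This is only a sketch: the extractTimestamp methods are placeholders for the real Apache/Tomcat parsing, and I'm assuming a common, comparable timestamp string can be pulled out of both formats.

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.MultipleInputs;

// Sketch of a reduce-side join keyed on timestamp; parsing is stubbed out.
public class LogJoin {
    public static class ApacheLogMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text line,
                OutputCollector<Text, Text> out, Reporter r) throws IOException {
            String ts = extractTimestamp(line.toString()); // TODO: real Apache parsing
            out.collect(new Text(ts), new Text("A\t" + line)); // tag with source
        }
        private String extractTimestamp(String line) { return line.split(" ")[0]; }
    }

    public static class TomcatLogMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text line,
                OutputCollector<Text, Text> out, Reporter r) throws IOException {
            String ts = extractTimestamp(line.toString()); // TODO: real Tomcat parsing
            out.collect(new Text(ts), new Text("T\t" + line));
        }
        private String extractTimestamp(String line) { return line.split(" ")[0]; }
    }

    public static class JoinReducer extends MapReduceBase
            implements Reducer<Text, Text, Text, Text> {
        public void reduce(Text ts, Iterator<Text> values,
                OutputCollector<Text, Text> out, Reporter r) throws IOException {
            // All records from both logs with the same timestamp meet here;
            // merge the fields and emit one row (to be loaded into MySQL later).
            StringBuilder merged = new StringBuilder();
            while (values.hasNext()) merged.append(values.next()).append('|');
            out.collect(ts, new Text(merged.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(LogJoin.class);
        MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, ApacheLogMapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, TomcatLogMapper.class);
        job.setReducerClass(JoinReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        JobClient.runJob(job);
    }
}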
Hello,
Since the zohmg project seems to be dead (no new commits since Nov 2009), I would like to know if any of you has used or still uses it (with successful results), or if you know anything about the future of this project.
And if not, is there any alternative to this project? I'm looking for a tool that will help to extract data from (Apache) logs ...
I'm beginning to learn some Hadoop/MapReduce, coming mostly from a PHP background, with a little bit of Java and Python.
But it seems like most implementations of MapReduce out there are in Java, Ruby, C++, or Python.
I've looked, and it looks like there are some Hadoop/MapReduce implementations in PHP, but the overwhelming body of the literature se...
I'm trying to take a list, serialize each item, and put it into a CSV file with a key, to create a text file of key/value pairs. Ultimately this is going to run through Hadoop streaming, so before you ask: I think it really does need to be in a text file (but I'm open to other ideas). This all seemed pretty straightforward at f...
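
What I'm currently picturing is one key<TAB>value line per item, Base64-encoding the serialized bytes so they cannot collide with the tab and newline delimiters that streaming uses to split records (Base64 is my own choice here, not a streaming requirement):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.PrintWriter;
import java.io.Serializable;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

// Sketch: write one key<TAB>value line per item, Base64-encoding the
// serialized bytes so they stay tab- and newline-safe.
public class KeyValueWriter {
    static String serialize(Serializable item) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(item);
        out.close();
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        List<String> items = Arrays.asList("apple", "banana", "cherry");
        PrintWriter w = new PrintWriter("items.txt");
        int key = 0;
        for (String item : items) {
            w.println(key++ + "\t" + serialize(item)); // key<TAB>base64(value)
        }
        w.close();
    }
}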
I'm trying to implement the following graph reduction algorithm in MapReduce.
The graph is an undirected weighted graph.
I want to strip away all nodes with only two neighbors and update the weights.
Have a look at the following illustration:
The algorithm shall transform the upper graph into the lower one: eliminate node 2 and update the weig...
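
For reference, here is the sequential version of the reduction as I understand it: repeatedly contract any node with exactly two neighbors, replacing its two incident edges with one. Two assumptions of mine, since the description doesn't pin them down: the new edge's weight is the sum of the two removed weights, and if a parallel edge already exists the smaller weight wins. The MapReduce formulation of this loop is what I'm stuck on.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Sequential sketch: remove any node n with exactly two neighbors a and b,
// replacing edges (a,n) and (n,b) with a single edge (a,b).
public class DegreeTwoContraction {
    // adjacency: node -> (neighbor -> edge weight)
    static Map<Integer, Map<Integer, Integer>> adj = new HashMap<>();

    static void addEdge(int a, int b, int w) {
        adj.computeIfAbsent(a, k -> new HashMap<>()).put(b, w);
        adj.computeIfAbsent(b, k -> new HashMap<>()).put(a, w);
    }

    static void contract() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Integer n : new HashSet<>(adj.keySet())) {
                Map<Integer, Integer> nbrs = adj.get(n);
                if (nbrs == null || nbrs.size() != 2) continue;
                Integer[] ab = nbrs.keySet().toArray(new Integer[2]);
                int a = ab[0], b = ab[1];
                int w = nbrs.get(a) + nbrs.get(b); // assumed weight update: sum
                adj.get(a).remove(n);
                adj.get(b).remove(n);
                adj.remove(n);
                // assumed: keep the smaller weight if edge a-b already exists
                int merged = Math.min(w, adj.get(a).getOrDefault(b, Integer.MAX_VALUE));
                adj.get(a).put(b, merged);
                adj.get(b).put(a, merged);
                changed = true;
            }
        }
    }

    public static void main(String[] args) {
        addEdge(1, 2, 3); addEdge(2, 3, 4); addEdge(3, 4, 5); // path 1-2-3-4
        contract();
        System.out.println(adj); // contracts 2 (1-3, weight 7), then 3 (1-4, weight 12)
    }
}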
I'm desperate to find any DFS that supports Windows. The only such DFS is Hadoop HDFS, but it's very hard to deploy it on a big number of Windows machines because it requires Cygwin + SSH.
Almost all DFS systems work only on Linux; only one (HDFS) runs on Windows.
I would be very grateful if somebody could point me to oth...
I'm working on a Hadoop streaming workflow for Amazon Elastic MapReduce, and it involves serializing some binary objects and streaming those into Hadoop. Does Hadoop have a maximum line length for streaming input?
I started to just test with larger and larger lines but figured I would ask here first.
...
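
In case it's useful, the brute-force probe I've been using is nothing more than a generator for a single line of a chosen size:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Test-harness sketch: write one key<TAB>value line of roughly `bytes`
// bytes, so ever-larger lines can be fed through the streaming job to see
// where (or whether) it breaks.
public class LongLineGenerator {
    public static void main(String[] args) throws IOException {
        int bytes = Integer.parseInt(args[0]); // e.g. 1048576 for ~1 MB
        Writer w = new OutputStreamWriter(new FileOutputStream("longline.txt"), "US-ASCII");
        w.write("key\t");
        for (int i = 0; i < bytes; i++) w.write('x');
        w.write('\n');
        w.close();
    }
}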
I have a job in Hadoop 0.20 that needs to operate on large files, one at a time. (It's a pre-processing step to get file-oriented data into a cleaner, line-based format more suitable for MapReduce.)
I don't mind how many output files I have, but each Map's output can be in at most one output file, and each output file must be sorted.
...
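
The direction I'm considering (a sketch, not something I've verified is the intended approach) is to make the input format non-splittable, so each file becomes exactly one map task, and to run the job map-only (zero reduces), so each map writes exactly one output file. The sorting would then have to happen inside the map itself, which only works if one file's records fit in memory.

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

// Sketch: force one map per input file by refusing to split, and run with
// zero reduces so each map's output lands in exactly one file.
public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        return false; // one InputSplit (hence one map task) per file
    }

    public static void configure(JobConf job) {
        job.setInputFormat(WholeFileTextInputFormat.class);
        job.setNumReduceTasks(0); // map-only: one output file per map
    }
}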
Hello,
I would like to know what your Hadoop development environment looks like.
Do you deploy jars to a test cluster, or run them in local mode?
Which IDE do you use, and which plugins?
How do you deploy completed projects to be run on servers?
What are your other recommendations for setting up my own Hadoop development/test enviro...
Is there a way to work with the HDFS API from Ruby? As far as I can tell, there is no multi-language file API, and the only way is to use the native Java API. I tried using JRuby, but that solution is too unstable and not very native. I also looked at the HDFS Thrift API, but it's not complete and lacks many features (like writing to indexed files).
...
I'm interested in using Avro to save and read files from Hadoop HDFS. I saw some JIRAs in the Hadoop issue tracker about implementing support for Avro, but there were no examples of how to enable Avro support in Hadoop. Also, I'm not completely sure that the current 0.20 has Avro support, because some JIRAs were closed for 0.21. Is it pos...
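
For what it's worth, the route I've been trying so far is Avro's own Java API writing through a plain HDFS output stream, which as far as I can tell doesn't depend on whatever native integration those JIRAs track:

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: write an Avro container file onto HDFS using Avro's Java API
// directly, via a plain HDFS output stream.
public class AvroToHdfs {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Event\"," +
                "\"fields\":[{\"name\":\"msg\",\"type\":\"string\"}]}");

        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream out = fs.create(new Path("/tmp/events.avro"));

        DataFileWriter<GenericRecord> writer =
                new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>(schema));
        writer.create(schema, out); // writes the container-file header

        GenericRecord rec = new GenericData.Record(schema);
        rec.put("msg", "hello avro");
        writer.append(rec);
        writer.close(); // flushes blocks and closes the HDFS stream
    }
}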
Hello,
I have used zohmg and successfully created a mapper and a table in HBase, and test-imported my data (using the --local switch).
But I have a problem inserting my data into HBase after the mapping; this is the error I get:
Exception in thread "main" java.lang.RuntimeException: class org.apache.hadoop.hbase.mapreduce.TableOutputFormat not org.apache.had...