I'm trying to run a Hadoop job (version 0.18.3) on my Windows machine, but I get the following error:
Caused by: javax.security.auth.login.LoginException: Login failed: CreateProcess: bash -c groups error=2
at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
at org.apache.hadoop.s...
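
From what I can tell (my assumption from the stack trace, not something I've confirmed in the Hadoop source), the login code shells out to "bash -c groups" to resolve the current user's group names, so on a Windows box without Cygwin's bash on the PATH the process launch itself fails with error=2 ("the system cannot find the file specified"). A minimal repro of that failure mode:

import java.io.IOException;

// Minimal repro (assumption: Hadoop's login code launches an external
// "bash -c groups" process to resolve the user's group names).
public class BashGroupsRepro {
    public static void main(String[] args) {
        try {
            // On Windows without bash on the PATH, CreateProcess cannot
            // find the executable and fails with error=2.
            new ProcessBuilder("bash", "-c", "groups").start();
        } catch (IOException e) {
            e.printStackTrace(); // CreateProcess: bash -c groups error=2
        }
    }
}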
Hello, I've just started exploring HBase.
I've run the samples SampleUploader, PerformanceEvaluation, and rowcount as given in
the Hadoop wiki: http://wiki.apache.org/hadoop/Hbase/MapReduce
The problem I'm facing is this: table1 is my table with the column family 'column':
create 'table1','column'
put 'table1','row1','column:address'...
I have a Hadoop job with tasks that are expected to run for a significant length of time (a few minutes). However, Hadoop starts speculative execution too soon. I do not want to turn speculative execution off completely, but I want to increase the length of time Hadoop waits before considering a task for speculative execution. Is there a config option...
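
For context, the only per-job switches I know about (in the 0.20-era org.apache.hadoop.mapred.JobConf API) toggle speculation on or off entirely; a sketch of what I'm doing now:

import org.apache.hadoop.mapred.JobConf;

// Sketch of the per-job speculation toggles I know about. These map to
// mapred.map.tasks.speculative.execution and
// mapred.reduce.tasks.speculative.execution, but they only turn the
// feature on or off; what I want is a knob for when speculation starts.
public class SpeculationConfig {
    public static void configure(JobConf conf) {
        conf.setMapSpeculativeExecution(true);
        conf.setReduceSpeculativeExecution(false);
    }
}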
How do I read video frames in Hadoop?
...
I'm trying to create a simple project with Hadoop. I am new to IntelliJ and am trying to set the classpath to org.apache.hadoop.io. But which jar contains this package?
...
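
For reference, the smallest thing I'm trying to get to compile is below. My understanding (which may be wrong) is that the org.apache.hadoop.io package ships in the hadoop-core jar, e.g. hadoop-0.20.2-core.jar at the top level of the distribution.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

// Smallest possible classpath check: if this compiles and runs, the jar
// providing org.apache.hadoop.io is on the classpath.
public class ClasspathCheck {
    public static void main(String[] args) {
        Text key = new Text("hello");
        IntWritable value = new IntWritable(1);
        System.out.println(key + "\t" + value);
    }
}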
Hi all.
I need to split my MapReduce jar file into two jobs in order to get two different output files, one from each of the two jobs' reducers.
I mean that the first job has to produce an output file that will be the input for the second job in the chain.
I read something about ChainMapper and ChainReducer in Hadoop version 0.20 (currently ...
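
What I have in mind, as an alternative to ChainMapper/ChainReducer (which as far as I understand chain mappers and a reducer within a single job rather than across two), is simply running two configured jobs back to back, with job 1's output directory wired in as job 2's input. A sketch against the 0.20 mapred API; IdentityMapper/IdentityReducer stand in for my real classes:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

// Sketch: run two jobs sequentially, job 1's output dir feeding job 2.
public class TwoJobChain {
    public static void main(String[] args) throws Exception {
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]); // output of job 1, input of job 2
        Path output = new Path(args[2]);

        JobConf job1 = new JobConf(TwoJobChain.class);
        job1.setJobName("job1");
        job1.setMapperClass(IdentityMapper.class);   // swap in real mapper 1
        job1.setReducerClass(IdentityReducer.class); // swap in real reducer 1
        FileInputFormat.setInputPaths(job1, input);
        FileOutputFormat.setOutputPath(job1, intermediate);
        JobClient.runJob(job1); // blocks until job 1 completes

        JobConf job2 = new JobConf(TwoJobChain.class);
        job2.setJobName("job2");
        job2.setMapperClass(IdentityMapper.class);   // swap in real mapper 2
        job2.setReducerClass(IdentityReducer.class); // swap in real reducer 2
        FileInputFormat.setInputPaths(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, output);
        JobClient.runJob(job2);
    }
}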
I'm just getting started with learning Hadoop, and I'm wondering the following: suppose I have a bunch of large MySQL production tables that I want to analyze.
It seems like I have to dump all the tables into text files, in order to bring them into the Hadoop filesystem -- is this correct, or is there some way that Hive or Pig or what...
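
One route I've come across (not sure whether it's the recommended one) is Hadoop's own DBInputFormat, which reads records over JDBC instead of going through text dumps. A sketch of how I understand it, with MyRecord as a hypothetical mapping of a users(id, name) table:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.db.DBConfiguration;
import org.apache.hadoop.mapred.lib.db.DBInputFormat;
import org.apache.hadoop.mapred.lib.db.DBWritable;

// Hypothetical record type for a users(id, name) table.
public class MyRecord implements Writable, DBWritable {
    long id;
    String name;

    public void readFields(ResultSet rs) throws SQLException { // JDBC -> record
        id = rs.getLong("id");
        name = rs.getString("name");
    }
    public void write(PreparedStatement st) throws SQLException { // record -> JDBC
        st.setLong(1, id);
        st.setString(2, name);
    }
    public void readFields(DataInput in) throws IOException { // Hadoop wire format
        id = in.readLong();
        name = in.readUTF();
    }
    public void write(DataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(name);
    }

    // Driver-side wiring: point the job at MySQL instead of at text files.
    public static void configure(JobConf job) {
        job.setInputFormat(DBInputFormat.class);
        DBConfiguration.configureDB(job, "com.mysql.jdbc.Driver",
                "jdbc:mysql://dbhost/mydb", "user", "password");
        // table, filter condition, order-by, then the columns to read
        DBInputFormat.setInput(job, MyRecord.class, "users", null, "id", "id", "name");
    }
}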
Can any one provide me with some sample Java code as how to go about storing the Lucene index in a HDFS( Hadoop File Sytem ), using Katta.
...
I have a requirement to parse both Apache access logs and Tomcat logs, one after another, using MapReduce. A few fields are extracted from the Tomcat log and the rest from the Apache log. I need to merge/map the extracted fields based on the timestamp and export these mapped fields into a traditional relational DB (e.g. MySQL).
I can parse and e...
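
The shape I have in mind is a reduce-side join keyed on the timestamp, with one mapper per log format wired up via MultipleInputs. This is only a sketch: the extractTimestamp methods are placeholders for the real Apache/Tomcat parsing, and I'm assuming a common, comparable timestamp string can be pulled out of both formats.

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.MultipleInputs;

// Sketch of a reduce-side join keyed on timestamp; parsing is stubbed out.
public class LogJoin {
    public static class ApacheLogMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text line,
                OutputCollector<Text, Text> out, Reporter r) throws IOException {
            String ts = extractTimestamp(line.toString()); // TODO: real Apache parsing
            out.collect(new Text(ts), new Text("A\t" + line)); // tag with source
        }
        private String extractTimestamp(String line) { return line.split(" ")[0]; }
    }

    public static class TomcatLogMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text line,
                OutputCollector<Text, Text> out, Reporter r) throws IOException {
            String ts = extractTimestamp(line.toString()); // TODO: real Tomcat parsing
            out.collect(new Text(ts), new Text("T\t" + line));
        }
        private String extractTimestamp(String line) { return line.split(" ")[0]; }
    }

    public static class JoinReducer extends MapReduceBase
            implements Reducer<Text, Text, Text, Text> {
        public void reduce(Text ts, Iterator<Text> values,
                OutputCollector<Text, Text> out, Reporter r) throws IOException {
            // All records from both logs with the same timestamp meet here;
            // merge the fields and emit one row (to be loaded into MySQL later).
            StringBuilder merged = new StringBuilder();
            while (values.hasNext()) merged.append(values.next()).append('|');
            out.collect(ts, new Text(merged.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(LogJoin.class);
        MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, ApacheLogMapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, TomcatLogMapper.class);
        job.setReducerClass(JoinReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        JobClient.runJob(job);
    }
}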
Hello,
Since the zohmg project seems to be dead (no new commits since Nov 2009), I would like to know if any of you has used or still uses it (with successful results), or if you know anything about the future of this project.
And if not, is there any alternative to this project? I'm looking for a tool that will help to extract data from (Apache) logs ...
I'm beginning to learn some Hadoop/MapReduce, coming mostly from a PHP background, with a little bit of Java and Python.
But it seems like most implementations of MapReduce out there are in Java, Ruby, C++, or Python.
I've looked, and it looks like there are some Hadoop/MapReduce implementations in PHP, but the overwhelming body of the literature se...
I'm trying to take a list, serialize each item, and put it into a CSV file with a key, to create a text file of key/value pairs. Ultimately this is going to run through Hadoop streaming, so before you ask: I think it really does need to be in a text file (but I'm open to other ideas). This all seemed pretty straightforward at f...
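
What I'm currently picturing is one key<TAB>value line per item, Base64-encoding the serialized bytes so they cannot collide with the tab and newline delimiters that streaming uses to split records (Base64 is my own choice here, not a streaming requirement):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.PrintWriter;
import java.io.Serializable;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

// Sketch: write one key<TAB>value line per item, Base64-encoding the
// serialized bytes so they stay tab- and newline-safe.
public class KeyValueWriter {
    static String serialize(Serializable item) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(item);
        out.close();
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        List<String> items = Arrays.asList("apple", "banana", "cherry");
        PrintWriter w = new PrintWriter("items.txt");
        int key = 0;
        for (String item : items) {
            w.println(key++ + "\t" + serialize(item)); // key<TAB>base64(value)
        }
        w.close();
    }
}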
I'm trying to implement the following graph reduction algorithm in MapReduce.
The graph is an undirected weighted graph.
I want to strip away all nodes with only two neighbors and update the weights.
Have a look at the following illustration:
The algorithm shall transform the upper graph into the lower one: eliminate node 2 and update the weig...
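
For reference, here is the sequential version of the reduction as I understand it: repeatedly contract any node with exactly two neighbors, replacing its two incident edges with one. Two assumptions of mine, since the description doesn't pin them down: the new edge's weight is the sum of the two removed weights, and if a parallel edge already exists the smaller weight wins. The MapReduce formulation of this loop is what I'm stuck on.

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Sequential sketch: remove any node n with exactly two neighbors a and b,
// replacing edges (a,n) and (n,b) with a single edge (a,b).
public class DegreeTwoContraction {
    // adjacency: node -> (neighbor -> edge weight)
    static Map<Integer, Map<Integer, Integer>> adj = new HashMap<>();

    static void addEdge(int a, int b, int w) {
        adj.computeIfAbsent(a, k -> new HashMap<>()).put(b, w);
        adj.computeIfAbsent(b, k -> new HashMap<>()).put(a, w);
    }

    static void contract() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Integer n : new HashSet<>(adj.keySet())) {
                Map<Integer, Integer> nbrs = adj.get(n);
                if (nbrs == null || nbrs.size() != 2) continue;
                Integer[] ab = nbrs.keySet().toArray(new Integer[2]);
                int a = ab[0], b = ab[1];
                int w = nbrs.get(a) + nbrs.get(b); // assumed weight update: sum
                adj.get(a).remove(n);
                adj.get(b).remove(n);
                adj.remove(n);
                // assumed: keep the smaller weight if edge a-b already exists
                int merged = Math.min(w, adj.get(a).getOrDefault(b, Integer.MAX_VALUE));
                adj.get(a).put(b, merged);
                adj.get(b).put(a, merged);
                changed = true;
            }
        }
    }

    public static void main(String[] args) {
        addEdge(1, 2, 3); addEdge(2, 3, 4); addEdge(3, 4, 5); // path 1-2-3-4
        contract();
        System.out.println(adj); // contracts 2 (1-3, weight 7), then 3 (1-4, weight 12)
    }
}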
I'm desperate to find any DFS that supports Windows. The only such DFS is Hadoop HDFS, but it's very hard to deploy it on a big number of Windows machines because it requires Cygwin + SSH.
Almost all DFS systems work only on Linux; only one (HDFS) runs on Windows.
I would be very grateful if somebody could point me to oth...
I'm working on a Hadoop streaming workflow for Amazon Elastic MapReduce, and it involves serializing some binary objects and streaming those into Hadoop. Does Hadoop have a maximum line length for streaming input?
I started to just test with larger and larger lines but figured I would ask here first.
...
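
In case it's useful, the brute-force probe I've been using is nothing more than a generator for a single line of a chosen size:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Test-harness sketch: write one key<TAB>value line of roughly `bytes`
// bytes, so ever-larger lines can be fed through the streaming job to see
// where (or whether) it breaks.
public class LongLineGenerator {
    public static void main(String[] args) throws IOException {
        int bytes = Integer.parseInt(args[0]); // e.g. 1048576 for ~1 MB
        Writer w = new OutputStreamWriter(new FileOutputStream("longline.txt"), "US-ASCII");
        w.write("key\t");
        for (int i = 0; i < bytes; i++) w.write('x');
        w.write('\n');
        w.close();
    }
}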
I have a job in Hadoop 0.20 that needs to operate on large files, one at a time. (It's a pre-processing step to get file-oriented data into a cleaner, line-based format more suitable for MapReduce.)
I don't mind how many output files I have, but each Map's output can be in at most one output file, and each output file must be sorted.
...
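
The direction I'm considering (a sketch, not something I've verified is the intended approach) is to make the input format non-splittable, so each file becomes exactly one map task, and to run the job map-only (zero reduces), so each map writes exactly one output file. The sorting would then have to happen inside the map itself, which only works if one file's records fit in memory.

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

// Sketch: force one map per input file by refusing to split, and run with
// zero reduces so each map's output lands in exactly one file.
public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        return false; // one InputSplit (hence one map task) per file
    }

    public static void configure(JobConf job) {
        job.setInputFormat(WholeFileTextInputFormat.class);
        job.setNumReduceTasks(0); // map-only: one output file per map
    }
}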
Hello,
I would like to know what your Hadoop development environment looks like.
Do you deploy jars to a test cluster, or run them in local mode?
Which IDE do you use, and which plugins?
How do you deploy completed projects to be run on servers?
What are your other recommendations for setting up my own Hadoop development/test enviro...
Is there a way to work with the HDFS API from Ruby? As far as I can tell, there is no multi-language file API, and the only way is to use the native Java API. I tried using JRuby, but that solution is too unstable and not very native. I also looked at the HDFS Thrift API, but it's not complete and lacks many features (like writing to indexed files).
...
I'm interested in using Avro to save and read files from Hadoop HDFS. I saw some JIRAs in the Hadoop issue tracker about implementing support for Avro, but there were no examples of how to enable Avro support in Hadoop. Also, I'm not completely sure that the current 0.20 has Avro support, because some JIRAs were closed for 0.21. Is it pos...
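
For what it's worth, the route I've been trying so far is Avro's own Java API writing through a plain HDFS output stream, which as far as I can tell doesn't depend on whatever native integration those JIRAs track:

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: write an Avro container file onto HDFS using Avro's Java API
// directly, via a plain HDFS output stream.
public class AvroToHdfs {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Event\"," +
                "\"fields\":[{\"name\":\"msg\",\"type\":\"string\"}]}");

        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream out = fs.create(new Path("/tmp/events.avro"));

        DataFileWriter<GenericRecord> writer =
                new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>(schema));
        writer.create(schema, out); // writes the container-file header

        GenericRecord rec = new GenericData.Record(schema);
        rec.put("msg", "hello avro");
        writer.append(rec);
        writer.close(); // flushes blocks and closes the HDFS stream
    }
}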
Hello,
I have used zohmg and successfully created a mapper and a table in HBase, and test-imported my data (using the --local switch).
But I have a problem inserting my data into HBase after the mapping; this is the error I get:
Exception in thread "main" java.lang.RuntimeException: class org.apache.hadoop.hbase.mapreduce.TableOutputFormat not org.apache.had...