I have a User Defined Function (UDF) written in Java to parse lines in a log file and return information back to pig, so it can do all the processing.
It looks something like this:
public abstract class Foo extends EvalFunc<Tuple> {
public Foo() {
super();
}
public Tuple exec(Tuple input) throws IOException {
try {
// do stuff with input
} catch (Exception e) {
throw WrappedIOException.wrap("Error with line", e);
}
}
}
My question is: if it throws the IOException, will it stop completely, or will it return results for the rest of the lines that don't throw an exception?
Example: I run this in pig
REGISTER myjar.jar
DEFINE Extractor com.namespace.Extractor();
logs = LOAD '$IN' USING TextLoader AS (line: chararray);
events = FOREACH logs GENERATE FLATTEN(Extractor(line));
With this input:
1.5 7 "Valid Line"
1.3 gghyhtt Inv"alid line"" I throw an exceptioN!!
1.8 10 "Valid Line 2"
Will it process the two lines and will 'logs' have 2 tuples, or will it just die in a fire?