Design patterns for managing results of large operations

+1 Q:

We frequently have objects that perform multi-part operations/tasks. An example would be refreshing internal state of a list of objects based on a query of the file system and reading data from the found files. These things are also often long running, and tend to be performed in background threads (we do a lot of Swing work, but I think this sort of thing would apply to multi-tier apps as well).

In this sort of operation, it's quite possible that part of the operation will fail (e.g. we are unable to read 5 or 6 of the 5000 files that we are processing). In this case, it would be inappropriate to throw an exception, but we still want to give the user feedback about what didn't happen.

We've always used ad-hoc approaches here (e.g. logging, or have the operation return a list of exceptions), but I've decided to crack down and really think this through.

One motivating concept is Validation (jgoodies has a good implementation). The idea here is that when validating the values in a form or data structure, you construct a ValidationResults object, which can contain ValidationResult objects. Each ValidationResult has a state (OK, WARN, ERROR) as well as descriptive text (and other things as necessary).
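To make the idea concrete, here is a minimal sketch of a validation-style result object reused for task results. All names (Severity, TaskMessage, TaskResult) are hypothetical and chosen for illustration; this is not the JGoodies Validation API, only the same shape: a container of messages, each with a state and descriptive text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical names throughout; this is not the JGoodies Validation API.
enum Severity { OK, WARN, ERROR }

// One outcome of one sub-task: a severity plus text the user can read.
final class TaskMessage {
    final Severity severity;
    final String text;

    TaskMessage(Severity severity, String text) {
        this.severity = severity;
        this.text = text;
    }
}

// Collects messages for a whole operation and reports the worst severity seen.
final class TaskResult {
    private final List<TaskMessage> messages = new ArrayList<>();

    void add(Severity severity, String text) {
        messages.add(new TaskMessage(severity, text));
    }

    List<TaskMessage> getMessages() {
        return Collections.unmodifiableList(messages);
    }

    // Relies on the enum declaration order OK < WARN < ERROR.
    Severity getSeverity() {
        Severity worst = Severity.OK;
        for (TaskMessage m : messages) {
            if (m.severity.compareTo(worst) > 0) {
                worst = m.severity;
            }
        }
        return worst;
    }
}

class TaskResultDemo {
    public static void main(String[] args) {
        TaskResult result = new TaskResult();
        result.add(Severity.WARN, "a.dat: unknown header, using defaults");
        result.add(Severity.ERROR, "b.dat: could not be read");
        System.out.println(result.getSeverity() + " with "
                + result.getMessages().size() + " message(s)");
    }
}
```

The overall state of the task is simply the worst state of any message, which is what lets a caller decide quickly whether there is anything to show the user.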

I am highly tempted to just use the Validation framework as-is for task results.

Another option would be to expose the current operation result state of the task as a bound property (probably using a GlazedLists event list).

I was wondering if anyone else has opinions on or experience with this sort of thing, or knows what design patterns have evolved to deal with long running tasks whose sub-tasks may fail or not complete perfectly (i.e. the WARN state)?

EDIT: Clarification on purpose of the question

A lot of folks have suggested logging the problems and telling the user to check with an administrator. To clarify the problem a bit, consider an application like Eclipse. If the build process had warnings or errors, and Eclipse just gave an error dialog saying 'there were problems, check the logs', I think the user would be less than satisfied.

Presenting the user with a list of the current warnings and errors is a much more interactive solution, and provides the user a mechanism for quickly evaluating and addressing the issues.

The two approaches that we've tried so far are:

1. Logging and tossing a dialog box at the end of the process
2. Exposing a property containing the exceptions that have occurred

We are trying to move away from option 1 because of the user experience.

Option 2 feels extremely ad-hoc to me. It certainly works, but I'm trying to find a good, standardized methodology for addressing this situation.

Exceptions are good for routines that have 'stop the world' failures (the routine either succeeds or fails in some way), and throwing exceptions is a very nice cross-cutting mechanism for handling those situations.

But when a routine doesn't fall into a simple pass/fail categorization, things get a bit fuzzy - the routine itself can partially succeed.

So, is it better to pass in a 'ProblemListener' that we report issues to? Or is it better to have the routine return a list of problems? Or have the object expose a property of issues from the last run of the routine?
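As a rough sketch of the first of those options, assuming a hypothetical ProblemListener interface and a hypothetical FileRefreshTask that reads a list of files, the listener version might look like this. The method names and the choice to return a count of successfully read files are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical callback; the name follows the 'ProblemListener' idea above.
interface ProblemListener {
    void problem(Path source, String description);
}

// Hypothetical task: reads every file it can and reports the ones it cannot.
final class FileRefreshTask {
    // Returns the number of files read successfully. Failures go to the
    // listener as they happen instead of aborting the whole run.
    int refresh(List<Path> files, ProblemListener listener) {
        int loaded = 0;
        for (Path file : files) {
            try {
                Files.readAllBytes(file);
                loaded++;
            } catch (IOException e) {
                listener.problem(file, "Could not read file: " + e.getMessage());
            }
        }
        return loaded;
    }
}

class ProblemListenerDemo {
    public static void main(String[] args) throws IOException {
        Path good = Files.createTempFile("refresh", ".dat");
        Path missing = good.resolveSibling("no-such-file.dat");

        // The listener here just collects text; a UI could add rows to a table.
        List<String> problems = new ArrayList<>();
        int loaded = new FileRefreshTask().refresh(
                Arrays.asList(good, missing),
                (source, description) ->
                        problems.add(source.getFileName() + ": " + description));

        System.out.println(loaded + " loaded, " + problems.size() + " problem(s)");
        Files.delete(good);
    }
}
```

The same task could return a list of problems just by passing in a listener that appends to a list, so the listener form covers the "return a list" option as well. In a Swing application the listener is called on the background thread, so a listener that touches components would need to hand off to the EDT, for example with SwingUtilities.invokeLater.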

There are many potential solutions - I'm wondering if anyone has experience with trying these, and whether they have found a best practice (and their reasons for accepting or rejecting a given approach).

+1 A:

Why not just log the errors, and when the operation is done let the user know that an error occurred and that the sysadmin should look at the event log?

Any technical errors will probably not be understood by a casual user. Non-casual users should be savvy enough to look at a log file and interpret the results.

Larry Watanabe 2009-06-14 01:36:36

Unfortunately, warnings and errors from this sort of thing are rarely technical in nature - they are things that users could and should be dealing with. Eclipse build warnings are a good example.

Kevin Day 2009-06-15 14:45:49

Why not just have a public property that holds a list of the errors, and return a boolean, so that if there is an error the application knows to get the list of errors?

You can also send out an event for each error, which is caught by a subscriber that displays the error, so users see the errors as they happen. If the application takes an hour to process, they don't have to wait that long to find out that there was an error early on.

What would be nice is to allow the user to then fix the problem and add it back to the queue to be reprocessed. :)

James Black 2009-06-14 02:29:10

I agree - it would seem like the object that represents the error could have sufficient info in it that it could actually *help* the user to fix the problem, right? But what are the pros and cons of the property w/ boolean response or the listener approach?

Kevin Day 2009-06-15 14:47:12

The listener approach would allow the application to display information as errors arise, but the application may be interrupted at times due to being event-driven. The boolean one is simple to implement, but if processing takes a long time then the user may not know for quite a while that there is a problem. I like the idea of a user being able to fix a problem and resubmit it during a long run. Ideally, the best approach would be to use something like a lambda expression, IMO, which is simpler than using listeners.

James Black 2009-06-16 03:55:06

What you are trying to do is essentially logging and for that reason I agree with the response that says you should simply log the results.

I would go one step further and say that if you use log4j, you can create a custom appender that knows how to handle this type of log event and acts accordingly (if it is a WARN don't do anything, if it is an ERROR alert the user, etc). What's more, if you design things the right way, you can keep things configurable via log4j.properties (or log4j.xml).
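A comparable sketch of that idea is below. To keep it free of external dependencies it uses a java.util.logging Handler in place of a log4j appender, so this is a swapped-in stand-in and not log4j code. The class name ProblemCollectingHandler and the logger name are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical handler: keeps WARNING and SEVERE records in memory so a UI
// could list them after (or while) the task runs.
final class ProblemCollectingHandler extends Handler {
    private final List<LogRecord> problems = new ArrayList<>();

    @Override
    public synchronized void publish(LogRecord record) {
        if (record.getLevel().intValue() >= Level.WARNING.intValue()) {
            problems.add(record);
        }
    }

    @Override
    public void flush() {
        // Nothing is buffered outside the list.
    }

    @Override
    public void close() {
        // No resources to release.
    }

    // Returns a copy so callers can iterate without holding the lock.
    synchronized List<LogRecord> getProblems() {
        return new ArrayList<>(problems);
    }
}

class ProblemHandlerDemo {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("refresh.demo"); // hypothetical logger name
        log.setUseParentHandlers(false);               // keep the console quiet
        ProblemCollectingHandler handler = new ProblemCollectingHandler();
        log.addHandler(handler);

        log.info("started");                           // below WARNING, not collected
        log.warning("a.dat: unknown header");
        log.severe("b.dat: could not be read");

        for (LogRecord r : handler.getProblems()) {
            System.out.println(r.getLevel() + ": " + r.getMessage());
        }
    }
}
```

Note that this only collects problems; it says nothing about how an entry would be removed once the user has resolved it.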

neesh 2009-06-14 02:30:31

hmmm. I'm not entirely sure that this is logging, but it does feel that way. So we try to use an appender to build out a list of problems for the user in the UI? How would we remove those items once they've been resolved? I'm not certain that a logging framework is appropriate here...

Kevin Day 2009-06-15 14:48:26

The way I understand it: logging the error is just one side of your problem. The other side of the problem is what you do once the error has been logged. I suggested using logging so you can utilize log4j's logging functionality for what essentially is a logging function. The custom appender can be used to solve the problem of what happens once the errors have been logged - put them in a queue to be displayed to the users who can then look at them and delete them as appropriate(?)

neesh 2009-06-15 15:18:54

Exceptions are usually the right way of signaling that some operation failed in Java. I think that whatever your approach is, throwing an exception when something goes wrong is the way to go. In our company we use a third-party library to save objects in batch, and if something goes wrong the entire operation fails, and the exception that it throws contains the details of each error and the list of objects that were not imported. We then try again with the healthy objects. If you can't do that because your operation is not transactional, I would go with one of these options:

- Fail with an Exception that contains a list of exceptions and a list of the faulty objects;
- Fail with a list of objects, each object containing:
  - The error level (ERROR, WARN);
  - The error message;
  - The faulty object.
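A minimal sketch of those options combined, with hypothetical names (ItemFailure, BatchFailedException, BatchImporter) and a made-up rule that blank strings cannot be imported, might look like the following. It is only meant to show the shape of an exception that carries per-item details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

enum FailureLevel { WARN, ERROR }

// Hypothetical record of one failed item: the level, the message, the item.
final class ItemFailure {
    final FailureLevel level;
    final String message;
    final Object item;

    ItemFailure(FailureLevel level, String message, Object item) {
        this.level = level;
        this.message = message;
        this.item = item;
    }
}

// Thrown once at the end of the batch, carrying every failure that occurred.
final class BatchFailedException extends Exception {
    private final List<ItemFailure> failures;

    BatchFailedException(List<ItemFailure> failures) {
        super(failures.size() + " item(s) failed");
        this.failures = Collections.unmodifiableList(new ArrayList<>(failures));
    }

    List<ItemFailure> getFailures() {
        return failures;
    }
}

final class BatchImporter {
    // Hypothetical rule for illustration: blank strings cannot be imported.
    void importAll(List<String> items) throws BatchFailedException {
        List<ItemFailure> failures = new ArrayList<>();
        for (String item : items) {
            if (item.trim().isEmpty()) {
                failures.add(new ItemFailure(FailureLevel.ERROR, "blank item", item));
            }
        }
        if (!failures.isEmpty()) {
            throw new BatchFailedException(failures);
        }
    }
}

class BatchImportDemo {
    public static void main(String[] args) {
        try {
            new BatchImporter().importAll(Arrays.asList("a", " ", "b"));
        } catch (BatchFailedException e) {
            for (ItemFailure f : e.getFailures()) {
                System.out.println(f.level + ": " + f.message);
            }
        }
    }
}
```

Because the exception is checked, the compiler forces callers to deal with it, which addresses the concern about developers forgetting to check a return value.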

I don't like the approach of not throwing an exception (returning an error value, for example), because it relies on developers checking for a return value in your method, which is not the standard way of treating errors in Java. I would also not go with the logging approach as it would force the user to dig into the log files to find what went wrong.

I once worked at a company that had client devices that had to perform synchronization tasks. Each client device had a record in a "devices" table, and this table contained information about the last synchronization - whether it succeeded or not, and if it didn't, what went wrong. This is a nice approach if your task is performed by a service, and not as the result of a method invocation - or, to put it better, if your task is asynchronous. Someone asks for the task to start, may or may not receive an event about its completion, and later checks for its results.

Ravi Wallau 2009-06-14 04:31:04

I need to give some thought to this. I'm pretty sure it runs counter to Joshua Bloch's 'Use exceptions only for exceptional conditions'. In the current case, we completely expect that some of the operations may fail. I'm beginning to lean towards passing in a ProblemListener to the operation... That explicitly tells the developer that problems could occur (no return value checking), and abstracts the problem handling from the core object. This would work for both synchronous and asynchronous scenarios.

Kevin Day 2009-06-15 14:54:38

Where you say "Objects" I read Batches (at least as one implication), and I agree that logging is the appropriate conventional way to handle this, but with a Batch ID, either as a field in the log record, or maybe more likely a separate log file per batch, if the monitoring process, batch size, and object count make that workable.

le dorfier 2009-06-14 04:48:02

Yup - kind of like batches (as much as an Eclipse build is considered to be a 'batch' of build operations on each individual .java file). But in our app anyway, this is something initiated by the user and usually takes less than 5 seconds to complete (it's running on a background thread, but just to keep the EDT free).

Kevin Day 2009-06-15 14:57:00
