views:

273

answers:

8

Imagine I have a document (word document).

I have an enumeration which will indicate how to extract data from the document. So if I want just text, the images, or both (3 members of the enumeration).

I have a case statement based on this enumeration, but without falling into a code smell, how can I write code which isn't too repetitive? For every condition in the switch, should I have a seperate method (the easiest way), or a method accepting a paremeter (like the value of the enumeration), and then use if statements to say if(xyz) do abc, and so on.

Or is there a quicker, more efficient way?

Thanks

+6  A: 

I would use a Strategy pattern coupled with a factory to create the appropriate strategy based on the value of the enumeration. EDIT As others have pointed out you could also determine the correct strategy via a Map as well. Factory is my choice because it only encapsulates the logic and doesn't require any data storage.

public interface IExtractionStrategy
{
    object Extract( Document doc );  // or what ever result is best
}

public class TextExtractionStrategy : IExtractionStrategy
{
    public object Extract( Document doc )
    {
     .... algorithm for extracting text...
    }
}

public class ImageExtractionStrategy : IExtractionStrategy
{
    public object Extract( Document doc )
    {
     .... algorithm for extracting images...
    }
}


public static class StrategyFactory
{
     IExtractionStrategy GetStrategy( ExtractionEnum strategyType )
     {
         switch (strategyType)
         {
             case ExtractionEnum.Text:
                 return new TextExtractionStrategy();
                 break;
             case ExtractionEnum.Image:
                 return new ImageExtractionStrategy();
                 break;

             ...
         }
     }
}
tvanfosson
Always like to see the use of a design pattern.
Chris
A: 

A solution could be having one class for every way you want to extract the data. Each class would do all the extraction work and they could have a common IExtract interface. You would only have to return the interface based on the enumeration and then just call the methods on the interface.

gcores
+1  A: 

Use a strategy pattern to decouple the method of data processing from its selection.

You could then use a straight Map<Enum, Strategy> to hold the relationship from enumeration to extraction strategy, or even have each instance of the enumeration hold a reference to its own strategy.

Dan Vinton
+2  A: 

Hi,

Since it is hard to give a general statement without knowing your exact code I can only recommend you to read this article:

Back to Basics - Life After If, For and Switch - Like, a Data Structures Reminder by Scott Hanselman.

Since you only have 3 values in your enum any pattern like the strategy pattern or a map might be an architectural overkill (don't get me wrong, these patterns can be very useful on larger enums). It's all about choosing the right solution for a specific problem.

Regards, divo

0xA3
Disagree on overkill. Strategy separates out the algorithms in to separate containers and allows you to implement via an interface. Both good practices to decrease coupling and improve testability. Factory vs. Map is an open question, but factory is more flexible.
tvanfosson
And two months down the line his boss says "can you write something to strip Footnotes|Bibliography|URLs|etc from the document too. And any combination thereof, in order of appearance in the document. Not that this might break all of the individual strategy patterns above.
JeeBee
At that point, I refactor them to make them composable, perhaps using Decorator, and change my factory to return the correctly composed strategy. Why is the fact that future requirements might break the current solution a problem if you are using a good pattern?
tvanfosson
JeeBee, Google for YAGNI and stop stressing about the unknown. :)
Mike Woodhouse
YAGNI, exaclty that was my point, nothing else ;-) Thanks.
0xA3
+1  A: 

How many classes does it take to change a lightbulb?

Perhaps it's just a terminology difference, but this seems to me a simple case of needing a dispatch table:

use constant EXTRACT_TEXT => 1, EXTRACT_IMAGES => 2, EXTRACT_BOTH => 3;
my %extractor = (
    (EXTRACT_TEXT) => \&extract_text,
    (EXTRACT_IMAGES) => \&extract_images,
    (EXTRACT_BOTH) => \&extract_both,
);
...
die "no extractor found for $enum_value" if ! $extractor{ $enum_value };
$extractor{ $enum_value }->( $document_info );
ysth
Assuming he's using Perl (?) of course, which he didn't say he was. His other Qs and As tend to suggest he's a C# guy, though.
Mike Woodhouse
No, I wasn't assuming that. I was following others' lead in picking an arbitrary language for an example. Adjust to suit (for languages where that's possible).
ysth
+1  A: 

I would do (pseudocode):


switch(enum)
   case images:
       extractImages();
       break;
   case text:
       extractText();
       break;
   case both:
        extractImages();
        extractText();
        break;
Kevin
A: 

Thanks for the advice guys, that definately answers my question and helps my design! :)

dotnetdev
A: 

If you're using Java enumerations, then you can make use of the fact that they are actually objects:

enum ExtractionEnum {
    IMAGE {
        public byte[] extract(InputStream is) { ... }
    },

    TEXT {
        public byte[] extract(InputStream is) { ... }
    };

    public abstract byte[] extract(InputStream is);
}

// ...
public void doSomething(ExtractionEnum type) {
    byte[] data = type.extract(getInputStream());
    ...
}

I'm not sure if I like this better than the strategy pattern or good-ol' switch/case, but it works.

Daniel Spiewak