If I read it right, you're talking about the problem addressed by the Interpreter pattern, but sort of going in both directions.
There are some easy ways to get nice generic interfaces, so you can get the rest of the thing running. My recommendation on that is something like:
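    public interface Interpreter<OutputType> {
        // Sets the format code that decode and encode will follow.
        public void setCode(String coding);
        // Parses formatted text into raw data.
        public OutputType decode(String formattedData);
        // Renders raw data back into formatted text.
        public String encode(OutputType rawData);
    }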
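To show the interface working in both directions, here is a minimal sketch of one implementation. It assumes the "code" is a java.time pattern string (such as "d/M/yy") rather than a notation of your own, and the class name PatternDateInterpreter is made up for illustration. The interface is repeated (without the redundant modifiers) only so the sketch compiles on its own.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Same shape as the interface above, repeated so this compiles standalone.
interface Interpreter<OutputType> {
    void setCode(String coding);
    OutputType decode(String formattedData);
    String encode(OutputType rawData);
}

// Hypothetical implementation: the code is assumed to be a java.time pattern.
class PatternDateInterpreter implements Interpreter<LocalDate> {
    private DateTimeFormatter formatter;

    @Override
    public void setCode(String coding) {
        // Fix the locale so month names do not depend on the host machine.
        formatter = DateTimeFormatter.ofPattern(coding, Locale.ENGLISH);
    }

    @Override
    public LocalDate decode(String formattedData) {
        return LocalDate.parse(formattedData, formatter);
    }

    @Override
    public String encode(LocalDate rawData) {
        return formatter.format(rawData);
    }
}
```

With setCode("d/M/yy"), decode("9/9/09") should give the ninth of September 2009 and encode should turn that date back into the same string. The rest of the app only ever sees Interpreter<LocalDate>, so the parsing strategy behind it can be swapped later.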
However, there are a couple of hurdles with concrete implementations. For your date example, you might need to deal with "9/9/09", "9 SEP 09", and "September 9th, 2009". The first kind of date is straightforward - just numbers and fixed divider symbols - but either of the other two is pretty nasty. Honestly, writing something totally generic (if it isn't already available canned) probably isn't reasonable, so I recommend the following.
I'd attack it on two levels. The first is pretty straightforward with regex and a format string: chop the data string up into the pieces that are going to become raw data. You'd supply something like "D*/M*/YY" (or "M*/D*") for the first one, "D* MMM YY" for the second, and "Mm+ D*e*, YYYY" for the last. Here you've defined some reserved symbols for your data type (D, M, Y, with the obvious interpretations) and some that apply to all data types (* for "multiple characters possible", + for "full" output, e for defined extraneous characters); the exact symbols are obviously specific to your application. Your regex code would then chop the string up, feeding everything associated with each reserved character to the individual data fields, and saving the decoration (commas, etc.) in some formatting string.
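The first level could be sketched roughly like this. It is only an illustration under some assumptions of mine: it handles just D, M, Y and the * modifier (not +, e, or mixed-case tokens like "Mm"), it treats every other character as literal decoration, and it expects each reserved symbol to appear at most once in a code. The class name FormatSplitter is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FormatSplitter {
    // Reserved symbols for the date type (assumed set for this sketch).
    private static final String RESERVED = "DMY";

    // Turns a code such as "D*/M*/YY" into a regex with one named group per symbol.
    static Pattern toRegex(String code) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < code.length()) {
            char c = code.charAt(i);
            if (RESERVED.indexOf(c) >= 0) {
                int start = i;
                while (i < code.length() && code.charAt(i) == c) i++;
                int width = i - start;
                // A trailing '*' means "one or more characters" instead of a fixed width.
                boolean open = i < code.length() && code.charAt(i) == '*';
                if (open) i++;
                regex.append("(?<").append(c).append(">")
                     .append(open ? "\\w+" : "\\w{" + width + "}")
                     .append(")");
            } else {
                // Decoration (dividers, spaces, commas) must match literally.
                regex.append(Pattern.quote(String.valueOf(c)));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Returns the raw text captured for each reserved symbol used in the code.
    static Map<Character, String> split(String code, String data) {
        Matcher m = toRegex(code).matcher(data);
        if (!m.matches()) {
            throw new IllegalArgumentException("'" + data + "' does not fit code '" + code + "'");
        }
        Map<Character, String> fields = new LinkedHashMap<>();
        for (char c : RESERVED.toCharArray()) {
            if (code.indexOf(c) >= 0) fields.put(c, m.group(String.valueOf(c)));
        }
        return fields;
    }
}
```

For example, split("D* MMM YY", "9 SEP 09") hands "9" to the day field, "SEP" to the month field and "09" to the year field. This sketch throws the decoration away after matching; a fuller version would keep it so encode can put it back.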
This first level can all be fairly generic - each data type (e.g., date, coordinate, address) has reserved symbols (which don't overlap with any formatting characters), and all data types have some shared symbols. Perhaps the Interpreter interface would also have

    public List<Character> reservedSymbols();
    public void splitCode(List<String> splitcodes);

methods, or perhaps guaranteed fields, so that you can make the divider an external class and pass in the results.
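As a rough sketch of what that external divider might look like: it asks the interpreter for its reserved symbols, cuts the code into tokens, and hands the tokens back. The names SplittableInterpreter and CodeDivider are hypothetical, and I'm assuming * is the only shared modifier that needs handling here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical extension carrying the two extra methods suggested above.
interface SplittableInterpreter {
    List<Character> reservedSymbols();
    void splitCode(List<String> splitcodes);
}

class CodeDivider {
    // Cuts a code into reserved-symbol tokens ("D*", "YY") and decoration runs ("/", ", ").
    static List<String> divide(String code, List<Character> reserved) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < code.length()) {
            int start = i;
            char c = code.charAt(i);
            if (reserved.contains(c)) {
                // A token is a run of one reserved symbol plus an optional '*'.
                while (i < code.length() && code.charAt(i) == c) i++;
                if (i < code.length() && code.charAt(i) == '*') i++;
            } else {
                // Decoration runs until the next reserved symbol.
                while (i < code.length() && !reserved.contains(code.charAt(i))) i++;
            }
            tokens.add(code.substring(start, i));
        }
        return tokens;
    }

    // Generic wiring: works for any data type that declares its symbols.
    static void configure(SplittableInterpreter interpreter, String code) {
        interpreter.splitCode(divide(code, interpreter.reservedSymbols()));
    }
}
```

Dividing "D*/M*/YY" with reserved symbols D, M, Y gives the tokens "D*", "/", "M*", "/", "YY". The divider knows nothing about dates, which is the point: a coordinate or address interpreter would reuse it with its own symbol list.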
The second level is less easy, because it gets at the part that can't be generic. Based on the format of the reserved symbols, the individual fields need to know how to present themselves. In the date example, MM would tell the month to print as (01, 02, ... 12), M* as (1, 2, ... 12), MMM as (JAN, FEB, ... DEC), Mmm as (Jan, Feb, ... Dec), etc. If your company has been somewhat consistent, or doesn't venture too far from standard representations, then hand coding each of these shouldn't be too bad (and there are probably smart ways within each data type to reduce duplicated code).
But I don't think it's practical to make all of this generic. You would have to describe, in general terms, data that can be presented as either a number or characters (like months), whole data that can be inferred from partial data (e.g., century from year), and how to get truncated representations (e.g., a year truncates to its last two digits, whereas most numbers truncate to their two leading digits). That is probably going to take as long as handwriting those cases, though I can imagine applications where the trade-off would be worth it. Date is a really tricky example, but I can certainly see equally tricky things coming up for other sorts of data.
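To give a feel for the hand-coded part, here is a small sketch of month and year fields presenting themselves based on their token. The class and method names are hypothetical, the English month abbreviations are hard-coded on purpose, and only the tokens mentioned above are handled.

```java
import java.util.Locale;

class DateFieldPresenter {
    private static final String[] SHORT_MONTHS = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    // month is 1-based; token is the piece of the format code for this field.
    static String presentMonth(int month, String token) {
        if (month < 1 || month > 12) {
            throw new IllegalArgumentException("Month out of range: " + month);
        }
        switch (token) {
            case "MM":  return String.format(Locale.ROOT, "%02d", month);
            case "M*":  return Integer.toString(month);
            case "MMM": return SHORT_MONTHS[month - 1].toUpperCase(Locale.ROOT);
            case "Mmm": return SHORT_MONTHS[month - 1];
            default:
                throw new IllegalArgumentException("Unsupported month token: " + token);
        }
    }

    // Years truncate to their LAST two digits, unlike most numbers.
    static String presentYear(int year, String token) {
        switch (token) {
            case "YYYY": return String.format(Locale.ROOT, "%04d", year);
            case "YY":   return String.format(Locale.ROOT, "%02d", year % 100);
            default:
                throw new IllegalArgumentException("Unsupported year token: " + token);
        }
    }
}
```

Each case is trivial on its own; the tedium is in enumerating them per field and per data type. Going the other way (decoding "09" back to 2009) also needs a decision about which century to assume, which is exactly the kind of rule that resists a generic treatment.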
Summary:
- There's an easy generic face you can put on your problem, so the rest of your app can be coded around it.
- There's a fairly easy and generic first-pass parsing step: have universal reserved symbols, plus reserved symbols for each data type, and make sure these don't collide with symbols that will appear in formatting.
- There's a somewhat tedious final coding stage for the individual data fields.