views:

50

answers:

2

I'm currently writing a Roman Numeral Converter for the fun of it. The problem works up to the aforementioned character precedence.

Since Roman Numerals are not positional, i.e. III does not symbolize 1*whatever base^2 + 1*whatever base^1 + 1*whatever base^0.

That of course makes it hard when somebody types in XIV and I need to make sure that the I is not added in this case, but rather subtracted. I'm not sure how to do this. What would be the best approach to tackle this?

I have both the Roman Symbols and their respective decimal numbers stored in arrays:

const char cRomanArray[] = "IVXLCDM";
const int romanArray[]   = { 1, 5, 10, 50, 100, 500, 1000 };

so it wouldn't be too hard for me to brute-force the damn thing by simply checking the precedence within the array, i.e. if the symbol is smaller than the next symbol, i.e. in the exampe 'XIV' if 'I' is smaller than 'V', in which case it would be because I have ordered them in the array, then I could make it subtract the value instead of add.

But this seems like a very ugly solution. Are there perhaps any better ones? I was thinking something along the lines of Regular Expressions maybe( forgive me if that sounds like a horrible idea, I haven't used RegExp yet, but it sounds like it could do what I need, and that's to determine the characters in a string.)

+1  A: 

Start from the right. Moving to the left, add the values as long as they increase (or remain the same), and subtract when they decrease.

e.g. for XLIV

Start at V, add 5.
Move to I, it's less, so subtract 1.
Move to L, it's greater, add 50.
Move to X, it's less, so subtract 10.

And you get 44, which is correct.


Alternatively, you can actually treat it as base 10, except exchange 1...9 with I, II, III, IV... IX and 10...90 with X, XX, XXX, LX....XL, L and so on.

Read in the I&V characters, convert them to 1-9, then read in the X&L characters, covert them to 10-90, and so on.

Peter Alexander
Gauthier
Oops, yeah you're right. I suppose you could just read from the right until it doesn't match a pattern, but imo my first suggestion is by far the easiest to code.
Peter Alexander
A: 

Roman numerals are positional in a sense, it is merely that each position may have multiple characters representing the decimal digit, the symbols change with decimal magnitude, and that zeroes as place holders are not used.

So use look-up tables thus:

// Decimal digit lookup tables
static const char* thousands[] = { "", "M", "MM", "MMM" } ;
static const char* hundreds[] = { "", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM" } ;
static const char* tens[] = { "", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC" } ;
static const char* units[] = { "", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX" } ;
static const char** digits[] = { thousands, hundreds, tens, units } ;

Then for each decimal digit left to right, (starting from thousands), lookup 'digits[magnitude][decimal]' in the above lookup table and append teh substring to teh output. Note that there is no Roman numeral for zero, so zeroes are simply omitted.

So for example 1234 will be translated to "M" + "CC" + "XXX" + "IV" = "MCCXXXIV".

To understand how this works see the explanation of Roman numerals at Wikipedia: Roman Numerals - Symbols specifically the last table in the "Symbols" section (i.e. do the research - even if you think you understand a subject!). Note that numbers greater than 3999 require non-ASCII symbols, so my tables are limited to 1 to 3999, but you might fix that with a Unicode solution perhaps.

Having done this for a job interview once, I have a fully worked implementation based on the above tables, but have omitted it since you are doing it for fun, and there would be no fun perhaps in having it done for you. However if you want any more pointers or hints, or even the whole solution, just ask.

Clifford
I think he's going Roman Numerals -> Decimal, rather than the other way round, but yeah, that's how you'd do it that way.
Peter Alexander
@Peter Alexander: Doh! Should have read the post more carefully. Oh well, you can still use these tables and a maximal substring search on each decade; but that may be the brute force approach he is trying to avoid.
Clifford