Roman numerals are composed of letters, so it makes sense that their “values” are input and displayed as strings. To translate from that string into an integer, your program must convert each character into its corresponding decimal value. Sounds simple, right?
Well, it is rather simple! For each valid Roman numeral character, you increment an integer value by a given amount. Assume that ch is a character in a string and rn is an int variable initialized to zero:
if( ch == 'I' ) rn += 1;
if( ch == 'V' ) rn += 5;
if( ch == 'X' ) rn += 10;
if( ch == 'L' ) rn += 50;
if( ch == 'C' ) rn += 100;
if( ch == 'D' ) rn += 500;
if( ch == 'M' ) rn += 1000;
From my books, you learn that a series of single character comparisons translates into a switch-case statement. So the basis of a roman2arabic() function would be a giant switch-case structure that examines each character in a Roman numeral string and creates a decimal tally. Here is just such a function:
int roman2arabic(char *roman)
{
    int value;
    char *r;

    r = roman;
    value = 0;
    while(*r != '\0')
    {
        switch(*r)
        {
            case 'M':
                value += 1000;
                break;
            case 'D':
                value += 500;
                break;
            case 'C':
                value += 100;
                break;
            case 'L':
                value += 50;
                break;
            case 'X':
                value += 10;
                break;
            case 'V':
                value += 5;
                break;
            case 'I':
                value += 1;
                break;
            /* terminate on whitespace */
            case '\n':
            case ' ':
            case '\t':
                return(value);
                break;
            default:
                /* invalid character */
                return(0);
        }
        r++;
    }
    return(value);
}
The roman2arabic() function accepts a string, roman, as input. The int variable value holds the accumulated total, which is also the return value. Variable value is initialized to zero, and pointer r is set to the start of roman for use inside the function.
The while loop processes string r. For valid Roman numerals (which are assumed to be uppercase only), variable value is increased. If the string contains any whitespace, value is returned immediately. If the string contains any other character, zero is returned, which is an error condition. (Remember, a Roman numeral zero doesn’t exist.)
This function works for any string of Roman numerals written without abbreviations, and the characters can be in any order. I could modify the function to ensure that the format is proper, but this code is more about translation than syntax. The only things that the function doesn’t handle are the abbreviations: CM, CD, XC, XL, IX, and IV. As written, IV returns 6 (1 plus 5) instead of 4. To deal with those abbreviations, the case statements for characters I, X, and C require additional processing, such as:
case 'C':
    if( *(r+1) == 'M')
    {
        value += 900;
        r++;
    }
    else if( *(r+1) == 'D')
    {
        value += 400;
        r++;
    }
    else
    {
        value += 100;
    }
    break;
For each of these characters, such as C above, the following character in the string, *(r+1), is checked. After C, the characters M and D hold special meaning, so the if-else if-else structure adds 900 or 400 to variable value instead of 100. Pointer r is also incremented when one of these suffix characters is found, so that the suffix isn’t counted a second time on the next pass through the loop.
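To show how the same pattern extends to the X and I cases, here is a minimal sketch of the function with all three cases filled in. It is assembled from the fragments above and is not the full listing; the const qualifier on the parameter and the compact one-line formatting are assumptions made for this example.

```c
/* Sketch: roman2arabic() with the C, X, and I cases extended to
   recognize CM, CD, XC, XL, IX, and IV. Returns 0 on an invalid
   character, as in the function shown earlier. */
int roman2arabic(const char *roman)
{
    int value = 0;
    const char *r = roman;

    while (*r != '\0') {
        switch (*r) {
        case 'M': value += 1000; break;
        case 'D': value += 500; break;
        case 'C':
            if (*(r+1) == 'M')      { value += 900; r++; }
            else if (*(r+1) == 'D') { value += 400; r++; }
            else                    { value += 100; }
            break;
        case 'L': value += 50; break;
        case 'X':
            if (*(r+1) == 'C')      { value += 90; r++; }
            else if (*(r+1) == 'L') { value += 40; r++; }
            else                    { value += 10; }
            break;
        case 'V': value += 5; break;
        case 'I':
            if (*(r+1) == 'X')      { value += 9; r++; }
            else if (*(r+1) == 'V') { value += 4; r++; }
            else                    { value += 1; }
            break;
        /* terminate on whitespace */
        case '\n': case ' ': case '\t':
            return value;
        default:
            /* invalid character */
            return 0;
        }
        r++;   /* advance past the character just processed */
    }
    return value;
}
```

With these cases in place, the string MCMLXXIV is read as M (1000), CM (900), L (50), X (10), X (10), and IV (4), for a total of 1974.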
Adding three if-else if-else structures to the roman2arabic() function increases the code’s length substantially. A main() function is also required (duh); its job is to read the string and then output the result. You can view the entire code here.
Here is sample output:
Type a Roman Numeric value: MMCCCLXII
The value is: 2362
And:
Type a Roman Numeric value: MCMLXXIV
The value is: 1974
I also wrote a smaller version (89 lines versus 109), which uses a specific function to process the CM, CD, XC, XL, IX, and IV values. You can view that code here.
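For a rough idea of what moving the pair handling into its own function can look like, here is a sketch. It is not the 89-line version mentioned above: the names pair_value(), single_value(), and roman2arabic_compact() are hypothetical, invented for this example, and the approach (check for a two-character pair first, then fall back to a single character) is only one possible design.

```c
/* Hypothetical helper: value of the two-character pair a,b if it is
   one of CM, CD, XC, XL, IX, or IV; otherwise 0. */
static int pair_value(char a, char b)
{
    switch (a) {
    case 'C': return b == 'M' ? 900 : b == 'D' ? 400 : 0;
    case 'X': return b == 'C' ? 90  : b == 'L' ? 40  : 0;
    case 'I': return b == 'X' ? 9   : b == 'V' ? 4   : 0;
    default:  return 0;
    }
}

/* Hypothetical helper: value of a single Roman numeral character,
   or 0 if the character is not a valid uppercase numeral. */
static int single_value(char c)
{
    switch (c) {
    case 'M': return 1000;
    case 'D': return 500;
    case 'C': return 100;
    case 'L': return 50;
    case 'X': return 10;
    case 'V': return 5;
    case 'I': return 1;
    default:  return 0;
    }
}

/* Sketch of the conversion loop built on the two helpers. Stops at
   whitespace and returns 0 on an invalid character. */
int roman2arabic_compact(const char *roman)
{
    int value = 0;
    int v;
    const char *r = roman;

    while (*r != '\0' && *r != '\n' && *r != ' ' && *r != '\t') {
        /* r[1] is safe to read: at worst it is the terminator */
        v = pair_value(r[0], r[1]);
        if (v != 0) {
            value += v;
            r += 2;          /* consume both characters of the pair */
            continue;
        }
        v = single_value(*r);
        if (v == 0)
            return 0;        /* invalid character */
        value += v;
        r++;
    }
    return value;
}
```

The trade-off is that the switch-case structure in the main loop disappears, while the six special pairs are listed in one place.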
Just as you can translate Roman numerals to decimal, you can also translate the other way. That topic is covered in next week’s Lesson.