From Roman to Decimal

Roman numerals are composed of letters, so it makes sense that their “values” are input and displayed as strings. To translate from that string into an integer, your program must convert each character into its corresponding decimal value. Sounds simple, right?

Well, it is rather simple! For each valid Roman numeral character, you increment an integer value by a given amount. Assume that ch is a character in a string and rn is an int variable initialized to zero:

if( ch == 'I' ) rn += 1;
if( ch == 'V' ) rn += 5;
if( ch == 'X' ) rn += 10;
if( ch == 'L' ) rn += 50;
if( ch == 'C' ) rn += 100;
if( ch == 'D' ) rn += 500;
if( ch == 'M' ) rn += 1000;

From my books, you learn that a series of single-character comparisons translates into a switch-case statement. So the basis of a roman2arabic() function would be a giant switch-case structure that examines each character in a Roman numeral string and creates a decimal tally. Here is just such a function:

int roman2arabic(char *roman)
{
    int value;
    char *r;

    r = roman;
    value = 0;
    while(*r != '\0')
    {
        switch(*r)
        {
            case 'M':
                value += 1000;
                break;
            case 'D':
                value += 500;
                break;
            case 'C':
                value += 100;
                break;
            case 'L':
                value += 50;
                break;
            case 'X':
                value += 10;
                break;
            case 'V':
                value += 5;
                break;
            case 'I':
                value += 1;
                break;
            /* terminate on whitespace */
            case '\n':
            case ' ':
            case '\t':
                return(value);
            default:
                /* invalid character */
                return(0);
        }
        r++;
    }
    return(value);
}

The roman2arabic() function accepts a string, roman, as input. The int variable value holds the accumulated total, which is also the function’s return value; it’s initialized to zero. Pointer r is set to the start of the string and does the walking, so roman itself is never changed.

The while loop walks through the string one character at a time by way of pointer r. For each valid Roman numeral character (assumed to be upper case only), variable value is increased. When the loop reaches a whitespace character, the value tallied so far is returned immediately. If it reaches any other character, zero is returned, which is an error condition. (Remember, a Roman numeral zero doesn’t exist.)
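To make those rules concrete, here is a small sketch that runs a condensed copy of the function above against a few inputs. The samples table and the count_mismatches() function are my own illustrations, not part of the original code, and I've assumed a const char * parameter so that string literals can be passed without complaint.

```c
#include <stdio.h>

/* Condensed copy of the additive roman2arabic() shown above
   (const char * is an assumption made for this sketch) */
int roman2arabic(const char *roman)
{
    int value = 0;
    const char *r;

    for (r = roman; *r != '\0'; r++)
    {
        switch (*r)
        {
            case 'M': value += 1000; break;
            case 'D': value += 500;  break;
            case 'C': value += 100;  break;
            case 'L': value += 50;   break;
            case 'X': value += 10;   break;
            case 'V': value += 5;    break;
            case 'I': value += 1;    break;
            /* terminate on whitespace */
            case '\n': case ' ': case '\t':
                return value;
            default:
                return 0;   /* invalid character */
        }
    }
    return value;
}

/* Hypothetical sample table: an input and the result I expect */
struct sample { const char *input; int expected; };

static const struct sample samples[] = {
    { "MMCCCLXII", 2362 },  /* plain additive numeral */
    { "IIX",       12   },  /* order is not checked: 1 + 1 + 10 */
    { "XII\n",     12   },  /* tally stops at the newline */
    { "XII IV",    12   },  /* tally stops at the first space */
    { "xii",       0    },  /* lower case counts as invalid */
    { "IV",        6    }   /* abbreviations are not handled: 1 + 5 */
};

/* Hypothetical checker: returns how many samples gave a different result */
int count_mismatches(void)
{
    int i, bad = 0;
    int n = sizeof(samples) / sizeof(samples[0]);

    for (i = 0; i < n; i++)
    {
        int got = roman2arabic(samples[i].input);
        if (got != samples[i].expected)
        {
            printf("Sample %d: got %d, expected %d\n",
                   i, got, samples[i].expected);
            bad++;
        }
    }
    return bad;
}
```

Calling count_mismatches() from a main() function should return zero if the function behaves as described. The last sample is the interesting one: IV tallies as 6 because each character is counted on its own.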

This function works for any string of Roman numeral characters that is written additively, such as MMCCCLXII. The characters can also appear in any order: I could modify the function to ensure that the format is proper, but this code is more about translation than syntax. The one thing the function doesn’t handle is the set of abbreviations: CM, CD, XC, XL, IX, and IV. As written, IV comes out as 6 rather than 4. To deal with those abbreviations, the case statements for characters I, X, and C require additional processing, such as:

            case 'C':
                if( *(r+1) == 'M')
                {
                    value += 900;
                    r++;
                }
                else if( *(r+1) == 'D')
                {
                    value += 400;
                    r++;
                }
                else
                {
                    value += 100;
                }
                break;

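For a character such as C above, the following character in the string, *(r+1), is checked. Characters M and D hold special meaning when they follow C, so the if-else if-else structure adds 900, 400, or 100 to variable value accordingly. Pointer r is also incremented when one of those suffix characters is found, which keeps the suffix from being counted a second time on the next pass through the loop. Peeking at *(r+1) is safe because r never points at the terminating null character inside the loop.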
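The same pattern extends to X (followed by C or L) and I (followed by X or V). Below is a sketch of how the function might look with all three structures in place. It's my own condensed arrangement, with a const char * parameter assumed for convenience, so don't take it as a line-for-line copy of the linked code.

```c
/* Sketch: roman2arabic() with the CM, CD, XC, XL, IX, and IV
   abbreviations handled. Layout and const-ness are assumptions. */
int roman2arabic(const char *roman)
{
    int value = 0;
    const char *r = roman;

    while (*r != '\0')
    {
        switch (*r)
        {
            case 'M': value += 1000; break;
            case 'D': value += 500;  break;
            case 'C':
                if (*(r+1) == 'M')      { value += 900; r++; }
                else if (*(r+1) == 'D') { value += 400; r++; }
                else                    { value += 100; }
                break;
            case 'L': value += 50; break;
            case 'X':
                if (*(r+1) == 'C')      { value += 90; r++; }
                else if (*(r+1) == 'L') { value += 40; r++; }
                else                    { value += 10; }
                break;
            case 'V': value += 5; break;
            case 'I':
                if (*(r+1) == 'X')      { value += 9; r++; }
                else if (*(r+1) == 'V') { value += 4; r++; }
                else                    { value += 1; }
                break;
            /* terminate on whitespace */
            case '\n': case ' ': case '\t':
                return value;
            default:
                return 0;   /* invalid character */
        }
        r++;    /* step past the character just tallied */
    }
    return value;
}
```

With this version, roman2arabic("MCMLXXIV") tallies 1000 + 900 + 50 + 10 + 10 + 4, which is 1974.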

Adding three if-else if-else structures to the roman2arabic() function increases the code’s length substantially. A main() function is also required (duh): Its job is to read the string and then output the result. You can view the entire code here.

Here is sample output:

Type a Roman Numeric value: MMCCCLXII
The value is: 2362

And:

Type a Roman Numeric value: MCMLXXIV
The value is: 1974

I also wrote a smaller version (89 lines versus 109), which uses a specific function to process the CM, CD, XC, XL, IX, and IV values. You can view that code here.
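To give a feel for that approach, here is one way such a helper function could be arranged. This is a sketch of the idea only: the names digit_value(), pair_value(), and roman2arabic_pairs() are hypothetical, and the linked code may be organized differently.

```c
/* Hypothetical helper: value of a single Roman numeral character,
   or 0 if the character is not a Roman numeral */
static int digit_value(char ch)
{
    switch (ch)
    {
        case 'M': return 1000;
        case 'D': return 500;
        case 'C': return 100;
        case 'L': return 50;
        case 'X': return 10;
        case 'V': return 5;
        case 'I': return 1;
        default:  return 0;
    }
}

/* Hypothetical helper: value of a two-character abbreviation,
   or 0 if the pair is not one of CM, CD, XC, XL, IX, IV */
static int pair_value(char first, char second)
{
    if (first == 'C' && second == 'M') return 900;
    if (first == 'C' && second == 'D') return 400;
    if (first == 'X' && second == 'C') return 90;
    if (first == 'X' && second == 'L') return 40;
    if (first == 'I' && second == 'X') return 9;
    if (first == 'I' && second == 'V') return 4;
    return 0;
}

/* Sketch of the translation loop built on the two helpers */
int roman2arabic_pairs(const char *roman)
{
    int value = 0;
    int v;
    const char *r = roman;

    /* stop at the end of the string or at whitespace */
    while (*r != '\0' && *r != '\n' && *r != ' ' && *r != '\t')
    {
        v = pair_value(*r, *(r+1));
        if (v != 0)
        {
            value += v;
            r += 2;     /* skip both characters of the abbreviation */
        }
        else
        {
            v = digit_value(*r);
            if (v == 0)
                return 0;   /* invalid character */
            value += v;
            r++;
        }
    }
    return value;
}
```

Moving the abbreviations into their own function keeps the main loop short, at the cost of checking for a pair on every pass, even for characters that can never start one.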

Just as you can translate Roman numerals to decimal, you can also translate the other way. That topic is covered in next week’s Lesson.
