The Ghost of Octal

Programmers are exposed to multiple counting bases as well as ways of representing values. As a human, you work in base 10, probably thanks to your 10 fingers. When you program, you learn about base 2 binary and base 16 hexadecimal. You study exponential notation as well, especially for crunching very large or very small values.

Somewhere in the mire, you encounter base 8, octal. You nod appropriately at the information as it’s glossed over, then you move on and never return to discover what was once a huge deal.

Programmers of the 1960s and 1970s ate octal for breakfast. They knew octal codes the way today’s nerds have memorized hexadecimal values and their binary bit patterns. That’s because many computers of the era didn’t use 8-bit bytes. Minicomputers and mainframes of the day used word sizes such as 12, 18, and 36 bits, all evenly divisible by three.

A 36-bit word can be broken up into 12 chunks of 3 bits each. And in binary, 3 bits represent the values 0 through 7, the digits of base 8, octal:

Bits  Octal  Decimal
000    00      0
001    01      1
010    02      2
011    03      3
100    04      4
101    05      5
110    06      6
111    07      7

What comes after 7? Well, 8 of course! But in base 8, the value 8 is written as 10. (In base 2, the value 2 is written as 10; in base 10, the value 10 is written as 10; in base 16, the value 16 is written as 10.)
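To see the three-bits-per-digit relationship in code, here’s a minimal sketch of my own (the variable names are invented for illustration) that peels a value apart 3 bits at a time and prints one octal digit per group:

#include <stdio.h>

int main()
{
    unsigned int word = 493;    /* binary 111 101 101 */
    int shift;

    /* print six octal digits, high group first */
    for(shift=15;shift>=0;shift-=3)
        printf("%o",(word>>shift)&7);   /* mask off 3 bits */
    putchar('\n');

    return(0);
}

The output is 000755. Each octal digit maps directly onto one 3-bit group, which is why octal suited word sizes that divide evenly by three.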

To avoid confusing octal with decimal values, in the C language octal values are prefixed with a zero: 010 is octal 10, decimal 8. If you use a color-coded editor, the initial zero in an octal value may be given a different color to help identify the value as octal, not decimal. (If the initial zero is followed by a period, the value is a decimal real number, not octal.)
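As a quick illustration (my own sketch, not from the original text), the following program shows how the compiler treats each of these literals:

#include <stdio.h>

int main()
{
    int oct = 010;      /* leading zero: octal, decimal 8 */
    int dec = 10;       /* plain decimal 10 */
    double real = 0.10; /* zero followed by a period: decimal real */

    printf("%d %d %.2f\n",oct,dec,real);

    return(0);
}

The output is 8 10 0.10; the leading zero changed the base the compiler used to read the literal.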

Rules regarding octal values were established in the B programming language and inherited by C when it was developed in the 1970s. Because those programmers worked on word-oriented DEC hardware (B was first implemented on the 18-bit PDP-7), octal was handy and used frequently. In fact, because C is the ancestor of nearly every popular programming language today, octal remains a valid number format. Some languages even borrow C’s leading-zero octal notation.

This ancestry also explains why C’s character escape sequences use octal instead of decimal or hexadecimal. For example:

a = '\100';

The above statement assigns octal value 100 (0100, decimal 64) to the variable a. That’s the '@' character.
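Here is that assignment in a complete program, a sketch of my own; the hexadecimal escape \x40 mentioned in the comment is the modern equivalent:

#include <stdio.h>

int main()
{
    char a;

    a = '\100';     /* octal escape; '\x40' is the hex equivalent */
    printf("Character '%c' has decimal value %d\n",a,a);

    return(0);
}

The output: Character '@' has decimal value 64.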

The following code generates a table of the first 64 octal values, along with their decimal and hexadecimal equivalents:

#include <stdio.h>

int main()
{
    int a;

    puts("Oct Dec Hex\tOct Dec Hex\tOct Dec Hex\tOct Dec Hex");
    for(a=00;a<=017;a++)    /* loop from 0 through octal 17 (decimal 15) */
    {
        printf("%3o %3d %3X\t",a,a,a);
        printf("%3o %3d %3X\t",a+020,a+020,a+020);  /* second column, +020 (16) */
        printf("%3o %3d %3X\t",a+040,a+040,a+040);  /* third column, +040 (32) */
        printf("%3o %3d %3X\n",a+060,a+060,a+060);  /* fourth column, +060 (48) */
    }

    return(0);
}

All values in the above code are expressed as octal; they feature a leading zero. The %o placeholder in the printf() statements generates octal output; %X generates uppercase hexadecimal.

Here’s a sample run:

Oct Dec Hex	Oct Dec Hex	Oct Dec Hex	Oct Dec Hex
  0   0   0	 20  16  10	 40  32  20	 60  48  30
  1   1   1	 21  17  11	 41  33  21	 61  49  31
  2   2   2	 22  18  12	 42  34  22	 62  50  32
  3   3   3	 23  19  13	 43  35  23	 63  51  33
  4   4   4	 24  20  14	 44  36  24	 64  52  34
  5   5   5	 25  21  15	 45  37  25	 65  53  35
  6   6   6	 26  22  16	 46  38  26	 66  54  36
  7   7   7	 27  23  17	 47  39  27	 67  55  37
 10   8   8	 30  24  18	 50  40  28	 70  56  38
 11   9   9	 31  25  19	 51  41  29	 71  57  39
 12  10   A	 32  26  1A	 52  42  2A	 72  58  3A
 13  11   B	 33  27  1B	 53  43  2B	 73  59  3B
 14  12   C	 34  28  1C	 54  44  2C	 74  60  3C
 15  13   D	 35  29  1D	 55  45  2D	 75  61  3D
 16  14   E	 36  30  1E	 56  46  2E	 76  62  3E
 17  15   F	 37  31  1F	 57  47  2F	 77  63  3F

At this point, octal is mostly a curiosity. Unlike in days of yore, you gain no distinct advantage from using octal in a program over hexadecimal, which fits better with today’s 8-bit-byte architecture. But back in the day, octal was handy: three octal digits encode 9 bits of information, and octal let you easily slice up a 36-bit word. It was the default lingo of the programmers and hackers of the era.
