Integer Size Values – Again

In a post from 2020, I wrote about various integer size values. Seems they don’t consistently sport the same bit width. Just recently, I’ve obtained some updated information and valuable insight into the bit width question.

As a review, the integer data types in C are:

  • char
  • short
  • int
  • long
  • long long

Each of these types has a bit width and therefore holds signed and unsigned values within a certain range. But as I pointed out in the earlier post, these widths have changed over the years. I’ve come to learn that they’re inconsistent on various machines even to this day.
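If you’re curious how these types shake out on your own machine, the sizeof operator settles it. Here’s a minimal sketch; the output depends entirely on your compiler and platform:

#include <stdio.h>

int main(void)
{
    /* report the width of each integer type on this system;
       the results vary by compiler and platform, which is the point */
    printf("char      : %zu bytes\n", sizeof(char));
    printf("short     : %zu bytes\n", sizeof(short));
    printf("int       : %zu bytes\n", sizeof(int));
    printf("long      : %zu bytes\n", sizeof(long));
    printf("long long : %zu bytes\n", sizeof(long long));

    return 0;
}

Typical 64-bit Linux compilers report 8 bytes for a long, for example, while 64-bit Windows compilers keep long at 4 bytes — which underscores just how inconsistent things remain.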

A very knowledgeable coder and mathematician recently reviewed my upcoming book, Tiny C Projects. In the book, I pontificated about integer data type sizes. The coder, Diane, had some marvelous insights into the topic of bit widths, which are shared here:

In Chapter 9 of the book, there is a discussion about data sizes. It is, alas, incorrect in its representation of word size and type. The only two types with a mandated size are char (8 bits) and long (the size of a word). Even getting char to 8 bits was a point of controversy at the time, since there was at least one committee member whose employer’s product had 6-bit bytes in a 36-bit word.

It’s fascinating for me to read Diane confirming that a char (a byte) is 8 bits. I’ve read for years never to assume that a char is a byte, let alone 8 bits. But she confirms that at least one machine out there features a 36-bit word size, chopped into 6-bit bytes. It’s a detail I wouldn’t have considered, but obviously a reality in the realm of C programming.
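For the record, a standard C compiler will tell you the byte width itself: the CHAR_BIT macro in <limits.h> gives the number of bits in a char, and modern C requires that value to be at least 8. A quick check:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* CHAR_BIT is the number of bits in a char (a byte) on this system;
       the C standard guarantees at least 8, and on most hardware today
       it's exactly 8 */
    printf("Bits in a char: %d\n", CHAR_BIT);

    return 0;
}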

The data types are implementation-dependent but must adhere to the value range if specified. That is why longs and ints were often the same size. A supercomputer of the time, say, a Cray X-MP, had 64-bit words and 128-bit doubles. Other supercomputer makers, such as ETA Systems and some unnamed computer being developed at some unnamed facility of the US Department of Energy, had similar sizes, although there was at least one system with 128-bit words in hardware, at least in the design stage. IBM mainframes, on the other hand, had 32-bit words, which is why longs were guaranteed to be word size while integers ended up as values.

A short exists because DEC — especially — was insistent that the notion of a short integer be maintained, and microprocessors were commonly 8-bit machines. Since it saved a lot of space on those extremely expensive hard disks, the other manufacturers’ members were quick to agree. It also made porting Unix a lot easier. That was a very big deal in the late ’80s/early ’90s.
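Diane’s point about value ranges is easy to see for yourself: the <limits.h> header spells out the range each integer type supports on a given implementation. Here’s a short sketch that prints them; the specific numbers depend on your compiler:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* each type's range is implementation-defined, but it must cover
       at least the minimums the C standard requires */
    printf("short     : %d to %d\n", SHRT_MIN, SHRT_MAX);
    printf("int       : %d to %d\n", INT_MIN, INT_MAX);
    printf("long      : %ld to %ld\n", LONG_MIN, LONG_MAX);
    printf("long long : %lld to %lld\n", LLONG_MIN, LLONG_MAX);

    return 0;
}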

I find all these details to be fascinating history. Diane’s comments were appreciated, and I felt the need to share them, as her insight is valuable when it comes to understanding data types, their history, and the rationale behind the values chosen.
