To Form a More Perfect union

Next to enum, one of the more curious C language keywords is union. It’s tremendously unpopular. I would offer that it’s also not needed, but no one is talking about deprecating it.

The union keyword has a close cousin in the Pascal language. Pascal used something called a variant record, which was a container into which you could place different types of variables. The union keyword lets you create a similar type of container, although for the life of me I can’t describe a situation where it’s the only solution.

When you create a union variable, you specify the multiple variable types it can contain. For example, a union could contain a float or an int value. Or it could contain two or more structures. The compiler allocates enough space for the widest member. After that, it’s up to the programmer to keep track of which member currently holds a value and to access it appropriately.

The C language union is yet another example of how C is considered dangerous by modern standards. In many stricter languages, you can’t plop down storage for an any-type variable. The compiler requires that the variable types always be known and strictly adhered to.

As you know, variables in C are allocated storage based on their type. A char is one byte, an int could be 4 bytes on some machines, and so on. The sizeof operator returns the actual variable size.

When you define a union in your code, it lists several potential variable types. The compiler allocates storage for only one of them at a time, sized to fit the widest type. So if the union consists of a single char and a float, the union will be at least the size of the float, the wider variable.

If that’s not confusing enough, unions are declared much like a structure. In fact, I could argue that a structure would be a better tool, and I’m sure many other programmers simply come to that conclusion.

The following declaration is for a union that can hold either a char or a float:

union jack {
    char c;
    float f;
} u;

The union is named jack, similar to how a structure type is declared. The jack union can hold either a char or float variable, named c or f, respectively. The variable u is declared as a jack union.

Unlike a structure, the union doesn’t contain both the char and float. Storage is available for only one. So you can use either u.c for the char variable or u.f for the float, but not both at the same time: assigning a value to one member overwrites whatever was stored in the other.

So what’s the point of a union?

Seriously, you got me.

Back in the dark days of DOS, programmers used a union of structures to set the PC’s CPU registers. The union allowed access to the registers as both 8-bit bytes and 16-bit words, which came in handy. Since then, I’ve not seen any code where a union was required.

I’ve concocted the following source code as an example of how a union could be used:

#include <stdio.h>

int main()
{
    union length {
        float fsize;
        int isize;
    };
    union length l;

    l.fsize = 2.09;
    printf("The table is exactly %.2f meters long.\n",
            l.fsize);
    l.isize = 2;
    printf("The table is about %d meters long.\n",
            l.isize);

    return(0);
}

I split the union declaration into two parts, similar to how I declare structures. The union length is defined at Lines 5 through 8. Line 9 declares the length union variable l.

At Line 11, a value is assigned to the union’s float member, l.fsize. That member is then used in the printf() function at Lines 12 and 13.

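At Line 14, a value is assigned to the union’s int member, l.isize. It occupies the same memory as l.fsize, so the assignment overwrites the float value stored earlier. The union simply lets that storage be treated as a different type.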

Here’s sample output:

The table is exactly 2.09 meters long.
The table is about 2 meters long.

I confess that this table example is silly, mostly because a union isn’t the only solution available; two ordinary variables would do the same job. If I discover a scenario where a C language union is either absolutely indispensable or used in a wildly inventive way, I’ll pass it along in a future Lesson.
