Finding Those Pesky Null Characters!

You’ve crafted a brilliant function, ensuring that it properly processes words and generates the needed output. Is the code perfect? Well, it looks perfect. But how do you know for certain?

The following program outputs the word pneumonoultramicroscopicsilicovolcanoconiosis, reportedly the longest word in the English language. It uses a clever while statement where the condition is neatly obfuscated — something I avoid doing in my own code.

2025_09_20-Lesson.c

#include <stdio.h>

int main()
{
    char word[] = "pneumonoultramicroscopicsilicovolcanoconiosis";
    int x = 0;

    while( putchar( word[x++] ) )
        ;
    putchar('\n');

    return 0;
}

Here is the output, exactly as planned — or is it?

pneumonoultramicroscopicsilicovolcanoconiosis

Well, you know that the output doesn’t appear exactly as planned or I wouldn’t have asked the question. The point is that I’ve often written code where the output looks great, yet flaws in the code go unnoticed until I examine the output more closely. Especially with text output, it’s easy to overlook things you cannot see, specifically null characters that are part of the output but do not print. For the code above, this extra output is exactly what happens.

This month’s Exercise provides a good example. One of my first solutions looked great! But when I examined the output more closely, it was pockmarked with null characters.

One way to check whether the program’s output is proper is to run it through the hexdump filter, a Linux command line utility. For example:

$ ./a.out | hexdump -C

The program’s name is a.out, the default. It’s prefixed with ./, which directs the command interpreter to look for and run the file found in the current directory. The -C switch directs the hexdump utility to use the Canonical format, which includes the ASCII portion (the third column). Here is the command’s output.

00000000  70 6e 65 75 6d 6f 6e 6f  75 6c 74 72 61 6d 69 63  |pneumonoultramic|
00000010  72 6f 73 63 6f 70 69 63  73 69 6c 69 63 6f 76 6f  |roscopicsilicovo|
00000020  6c 63 61 6e 6f 63 6f 6e  69 6f 73 69 73 00 0a     |lcanoconiosis..|
0000002f

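Pay attention to the last row of bytes in the output. Byte 0x73 is the final ‘s’ in pneumonoultramicroscopicsilicovolcanoconiosis. Then comes a null character, 0x00, then finally the newline, 0x0a. The null character doesn’t print, so you don’t see it in the output, but it’s there! It appears because the while loop’s condition calls putchar() on each character before testing the return value, so the string’s terminating null character is written before the loop stops.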

When you code a program that outputs text, always double-check the output to confirm that it’s free from unwanted null characters. For this month’s Exercise, an early solution looked good, but it was polka dotted with null characters. I noticed the flaw in my logic, and fixed the code to prevent the null characters from appearing.

I can think of a number of ways to fix the sample code, most of which involve deconstructing the while loop’s clever condition. This alternative works:

while( word[x] )
    putchar(word[x++]);

This version keeps the while loop to just two lines and retains the increment operator inside the brackets. My preferred way to write this code is to set the x++ as its own statement, which is more readable, but whatever.

Any way you code it, always confirm that the output is free from null characters. Yes, you may not see them, and users wouldn’t notice them, but they may have unintended consequences for non-obvious uses of the program, such as when its output is piped into another program or saved to a file.

One thought on “Finding Those Pesky Null Characters!”

  1. It would be an interesting project to write a home-made version of hexdump. A useful enhancement would be to display the descriptions of “invisible” characters, eg “Space” for decimal 32. Just use an array indexed with the codes.
