Tending Toward Obfuscation

I’m always mindful of the beginner when I write code for this blog. Even with my own code, I often go to pains to write things “the long way” just because I’m in the habit. Not every coder is so thoughtful, and the reason is that C tends toward obfuscation. We must revel in this capability.

Of course, I can write delightfully cryptic code. It’s compact. But many programmers know that obfuscated code is a tough knot to unwind when you need to revisit it later. It’s the “what the heck was I thinking?” phenomenon. For me, the process often runs the opposite way: I write the code unwound first and tighten it afterward.

Consider this code from an earlier post on a filter I wrote to convert spaces to the non-breakable space HTML code, &nbsp;:

2023_04_01-Lesson.c

#include <stdio.h>

int main()
{
    int ch;

    while(1)
    {
        ch = getchar();
        if( ch==EOF )
            break;
        if( ch==' ' )
            printf("&nbsp;");
        else
            putchar(ch);
    }

    return(0);
}

This code works and is understandable by most beginning C programmers, but it can be shortened. The key for me is that the while loop terminates when getchar() returns EOF, which is a special value signaling the end of input rather than a character. Therefore, the loop can be rewritten with EOF as its terminating condition, saving several lines of code.

You can write while(ch!=EOF) to start the loop, provided that you first initialize variable ch:

2024_05_04-Lesson-a.c

#include <stdio.h>

int main()
{
    int ch = '\0';

    while(ch!=EOF)
    {
        ch = getchar();
        if( ch==' ' )
            printf("&nbsp;");
        else
            putchar(ch);
    }

    return(0);
}

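In this update, variable ch is initialized to the null character. The while condition works on the first pass because ch already holds a value, one that isn’t EOF. The rest of the loop then runs as before, converting space characters from standard input into the HTML non-breakable space code. One flaw: when getchar() returns EOF, the value still reaches putchar() before the condition is tested again, so a stray byte is output at the end. Yet more compacting is possible.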

The getchar() function reads a single character from standard input. Its return value is an int rather than a char, which lets it report EOF as a value distinct from every valid character. But the expression ch = getchar() also generates a value, the character returned. Therefore, this expression can be used in the while loop’s condition:

2024_05_04-Lesson-b.c

#include <stdio.h>

int main()
{
    int ch;

    while( (ch=getchar()) != EOF )
    {
        if( ch==' ' )
            printf("&nbsp;");
        else
            putchar(ch);
    }

    return(0);
}

The while loop’s condition now both reads a character from standard input and compares it with EOF. When the two are equal, the loop stops.

Reading the expression from the inside out: ch=getchar() reads a character from standard input and stores it in variable ch. This expression is enclosed in parentheses, where it evaluates as the character returned, the value in variable ch. The parentheses are required because the != operator binds more tightly than the = operator. This value is then compared with EOF. If it’s not equal, the loop repeats: The character read is compared with a space, replaced with &nbsp; if it matches, output unchanged otherwise. When the value is EOF, the loop stops.

This approach is how I believe most experienced C programmers would write the loop. First, it takes fewer statements. Second, it’s cryptic. When teaching, I would need to explain all the mechanics, which is what I did above. It’s easier for me, and better for the beginner, to show things done the long way first and then worry about tightening the code later.
