Properly Padding Spaces

I’ve written two programs specifically for this blog. The first converts a C source code file into HTML. The second translates program output into HTML. Both of these programs are time-savers, helping me prepare and present the code and output without having to hand-code everything.

The source code to HTML program is called chtml, for C-to-HTML. It scans for line breaks, tabs, spaces, comments, ampersands, and less-than/greater-than characters in the text. Each of these is converted to an HTML code. The program also converts the filename and adds the GitHub link for me. The result is output as text, which I copy-paste into the post.

The chtml program saves me from having to convert all the code myself, and it catches things such as <stdio.h>, which a browser would otherwise consume as an HTML tag.

I wrote chtml back in 2019 and have updated it a few times since, most recently just last year. But I was missing its companion program, which I coded only a few months back. This program, sconvert, is a filter that consumes a program’s text output and converts it into HTML code.

The issue sconvert resolves is spaces: HTML collapses any run of two or more into a single space. For example, if the code outputs a matrix as a table, spaces line up the columns. Here is how such output looks when pasted into an HTML document:

5 8 3 8 10 9 13 18 12
5 10 4 6 2 9 11 12 13
6 2 1 1 4 9 7 6 10

Here is how it looks when I convert it to HTML, replacing the spaces with &nbsp; (non-breaking space) entities:

  5   8   3       8  10   9      13  18  12 
  5  10   4       6   2   9      11  12  13 
  6   2   1       1   4   9       7   6  10 

Big difference!

Before I wrote sconvert, I had to manually replace the spaces with &nbsp; entities to ensure everything lined up. The sconvert program saves me oodles of time by performing the conversion for me. Here it is, presented and converted thanks to the chtml program I wrote:

2023_04_01-Lesson.c

#include <stdio.h>

int main()
{
    int ch;

    while(1)
    {
        ch = getchar();
        if( ch==EOF )
            break;
        if( ch==' ' )
            printf("&nbsp;");
        else
            putchar(ch);
    }

    return(0);
}

This filter is a simple replacement. The while loop spins eternally, while(1). The statement ch = getchar() fetches characters from standard input.

Remember that variable ch must be an int in order to detect the end-of-file marker, EOF. This test is performed to terminate the while loop:

if( ch==EOF )
    break;

The second if test checks for the space character, if( ch==' ' ). When found, the text of the &nbsp; HTML character entity is output. Otherwise, the character read is output by putchar(ch).

To capture the program’s output, I use a command like:

$ a.out | sconvert

The properly-formatted text, with the spaces converted for me, is then copied and pasted into a blog post.

After using this program for a few weeks, I discovered a flaw. Can you guess what it is? It’s not that the output can contain less-than, greater-than, or ampersand characters. It’s more subtle than that. In next week’s Lesson, I reveal the flaw and the updated sconvert utility.

2 thoughts on “Properly Padding Spaces”

  1. Is the bug that it doesn’t add br tags? Or that it doesn’t pick up anything written to stderr? Or maybe it gets confused by Unicode?

  2. The code is evolving. I wrote it originally as a quick hack. But then I started to refine it, as you’ll see over the next few lessons.
