The task for last week’s Lesson was to convert tabs as well as spaces. The problem is that tab stops aren’t considered: On the terminal, a tab character generates a variable number of spaces based on where the next tab stop position is located. It isn’t a fixed value.
The default tab stop on a Linux terminal is set to eight spaces. As on a typewriter or in a word processor, when a tab character is encountered, the terminal generates the spaces necessary to move the cursor to the next tab stop. In Figure 1, you see how the tab stops are set, and how spacing is calculated to position the cursor when a tab character is encountered.
It’s possible to alter the terminal’s tab stops. The tabs command sets the interval, which means a user can adjust the tab stops to a value other than eight. For my sconvert utility, however, I assume the default value. Various trickery can be employed to obtain the current tab stop value, but that isn’t the point of this exercise.
Two updates are required for the code to calculate the proper number of spaces to output to line up text at a tab stop. The first is to track the current cursor position or offset from the start of the line. The second is to determine how many spaces are necessary to move the cursor to the next tab stop.
For the cursor offset, I use int variable offset. It’s initialized to zero at the start of each line, then incremented each time a character is output, which includes the &nbsp; HTML code for a non-breakable space. The newline ('\n') and carriage return ('\r') characters are also intercepted to reset the offset value to zero.
The expression I use to calculate spaces to the next tab stop is:
spaces = TAB_STOP-(offset%TAB_STOP);
The offset MOD TAB_STOP calculation must be subtracted from the TAB_STOP value to obtain the number of spaces to insert for the cursor to reach the next tab stop. The spaces value is then used to generate space characters (&nbsp;) to fill the gap. Here is my updated sconvert code:
2023_04_15-Lesson.c
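```c
#include <stdio.h>

/* assumptions */
#define TAB_STOP 8
#define LINE_LEN 80

int main()
{
    int ch,offset,spaces,x;

    /* cursor position from the start of the line */
    offset = 0;
    while(1)
    {
        ch = getchar();
        if( ch==EOF )
            break;

        switch(ch)
        {
            /* a space becomes one non-breaking space */
            case ' ':
                printf("&nbsp;");
                offset++;
                break;
            /* a tab becomes enough spaces to reach the next tab stop */
            case '\t':
                spaces = TAB_STOP-(offset%TAB_STOP);
                for( x=0; x<spaces; x++ )
                {
                    offset++;
                    printf("&nbsp;");
                }
                break;
            /* a new line resets the cursor position */
            case '\n':
            case '\r':
                putchar(ch);
                offset = 0;
                break;
            default:
                putchar(ch);
                offset++;
        }
        /* a long line wraps at the terminal's right edge */
        if( offset==LINE_LEN )
            offset = 0;
    }

    return(0);
}
```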
The two defined constants, TAB_STOP and LINE_LEN, are assumptions based on standard terminal settings: eight-character tab stops and an 80-character line width.
Variable offset is initialized before the endless while loop.
The code’s switch-case structure is now more involved. Each character output modifies the offset value, with the newline and carriage return characters resetting the value.
For the tab, variable spaces holds the number of spaces needed to move the cursor forward, calculated by using the expression outlined earlier in this post. It’s important to use this value in the loop and not an expression based on offset directly, because variable offset is modified within the for loop that outputs the spaces (or &nbsp; HTML codes).
After the switch-case structure, a test is made to see whether offset is equal to the line length, which happens for a long line that overflows. If so, variable offset is again reset to zero.
Here’s a sample run, filtering output from the days program used in last week’s Lesson:
Monday 0
Tuesday 1
Wednesday 2
Thursday 3
Friday 4
Saturday 5
Sunday 6
For reference, Figure 1 is shown nearby. The output is properly converted, which means the sconvert program is one step closer to being complete. All that remains is accounting for the special characters: <, >, and &. I cover this final update in next week’s Lesson.