Safe Coding Practices – String Handling I

C offers a smattering of string-manipulation functions, but it leaves many of the critical issues up to you. Specifically, you must ensure that a string doesn’t overflow its buffer and that all strings are capped with the null character, '\0'.

I’m guilty of violating both these rules and I’m not alone. Unsafe code often runs just fine under some circumstances. The compiler doesn’t check for buffer overflows and sometimes you’re lucky that a null character is lurking in memory right where a string ends. Fortune’s winds don’t always blow in a favorable direction, which is why you must avoid these two unsafe conditions.

In C, the programmer must perform the error-checking for strings, a process that’s automatic in many other programming languages.

The two functions at issue here are strcpy(), which copies a string, and strcat(), which sticks one string on the end of another. Both functions require that the strings they work on end with the null character, '\0'. Both also assume, dangerously, that the destination buffer can hold all the characters copied, and neither one checks.

Consider the following code that uses the strcpy() function to copy a string from one buffer to another:

#include <stdio.h>
#include <string.h>

int main()
{
    char buf1[] = "Goodbye!";
    char buf2[3];

    strcpy(buf2,buf1);

    printf("'%s' and '%s'\n",
            buf1,
            buf2
          );

    return(0);
}

Array buf1[] holds nine characters: eight for the string Goodbye! and one for the null character at the end (which you don’t see). The buf2[] array, however, has room for only three characters. The strcpy() function doesn’t detect the overflow or report an error; it writes past the end of buf2[], which is undefined behavior.

When I run this code, I see this output:

'dbye!' and 'Goodbye!'

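The output looks backward, but overflowing a buffer is undefined behavior, so no particular result is guaranteed. I saw this output on both the PC and my Linux systems. A likely explanation is that the compiler placed buf2[] immediately before buf1[] in memory: The first three characters, Goo, fill buf2[], and the rest of the string, dbye! plus the null character, spills over the start of buf1[]. So buf1 now prints as dbye!, and printing buf2 runs past its three bytes and through buf1[] until it hits the null character. On my Mac, however, I get the message Abort trap: 6. Apparently the Mac build includes a runtime check that catches the overflow and stops the program.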

It’s up to you to ensure that a buffer doesn’t overflow. Only when the size of buf2[] is changed to fully accommodate the string in buf1[] does the code run safely. Yet, imagine a large program where you innocently change the size of one buffer and forget to change the size of another: You’ve just written risky code.

In the first example, the string size is known; it won’t change at runtime. So, you could manually set both buffers to the same size or to something large enough to accommodate the known text. When the string size is unknown, you must properly size the buffer at runtime. In the following code, the second buffer is a pointer. Its size is set when the code runs:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char buf1[] = "Goodbye!";
    char *buf2;

    /* allocate storage */
    buf2 = (char *)malloc( strlen(buf1) + 1 );
    if( buf2 == NULL)
    {
        fprintf(stderr,"Unable to allocate buffer\n");
        exit(1);
    }

    strcpy(buf2,buf1);

    printf("'%s' and '%s'\n",
            buf1,
            buf2
          );

    /* release the allocated storage */
    free(buf2);

    return(0);
}

At Line 11, the statement (char *)malloc( strlen(buf1) + 1 ) allocates the proper amount of storage for the string in buf1 and assigns that memory location to the buf2 pointer. The +1 is necessary because the strlen() function counts only characters in the string, not the null character at the end.

This same type of solution must be used when concatenating strings. I cover that problem and its solution in next week’s Lesson.
