The Most Curious Aspect of the scanf() Function

It’s incorrect to say that the format string for the printf() function is identical to the one used in the scanf() function. Both are similar, but scanf() has one major difference.

In last week’s Lesson, I showed how to use multiple placeholders in the scanf() input format string. In addition to the placeholders it shares with the printf() function, scanf() features two more: %[...] and %[^...]. These are input filters.

With the %[...] filter, the scanf() function reads text input and captures only the characters specified in the placeholder (the ellipsis). The following code demonstrates:

2020_09_12-Lesson-a.c

#include <stdio.h>

int main()
{
    char buffer[32];

    printf("Type: ");
    scanf("%[ABC]",buffer);
    printf("Input: '%s'\n",buffer);

    return(0);
}

The format string in the statement at Line 8 directs the scanf() function to accept only characters A, B, and C as input. They can appear in any order, in any quantity, but only these characters. Anything else typed terminates the input string. Here are several sample runs:

Type: ABC
Input: 'ABC'

Type: BAC
Input: 'BAC'

Type: AAAAABBBBBCCCCC
Input: 'AAAAABBBBBCCCCC'

Type: ABCDABC
Input: 'ABC'

In the final example, character D is typed but it doesn’t appear in the bracketed list. So when it’s encountered, scanf() stops reading; the D and everything typed after it remain unread in the input stream.
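To see where the filter stops, here is a minimal sketch that runs the same %[ABC] placeholder over a string with sscanf() rather than over keyboard input. The function name scan_abc() is my own invention for illustration, and the 31 in the format string is a field width that assumes the caller supplies a 32-byte buffer. The %n placeholder stores how many characters were consumed so far.

```c
#include <stdio.h>

/* Hypothetical helper: store the leading run of A, B, and C from text
   in out (at least 32 bytes) and return how many characters of text
   were consumed. Returns 0 when the first character is not A, B, or C. */
int scan_abc(const char *text, char *out)
{
    int used = 0;       /* stays 0 if the filter matches nothing */

    out[0] = '\0';      /* out is left empty when nothing matches */
    if (sscanf(text, "%31[ABC]%n", out, &used) != 1)
        return 0;
    return used;
}
```

Calling scan_abc("ABCDABC", out) returns 3 and leaves "ABC" in out; the D at offset 3 is the first character the filter rejects.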

Here’s a more practical example:

2020_09_12-Lesson-b.c

#include <stdio.h>

int main()
{
    char buffer[32];

    printf("Type: ");
    scanf("%[$1234567890.,]",buffer);
    printf("Input: '%s'\n",buffer);

    return(0);
}

The format string "%[$1234567890.,]" limits input to the characters specified. Here are sample runs:

Type: $1,567,039.12
Input: '$1,567,039.12'

Type: 100
Input: '100'

Type: $1,000 dollars
Input: '$1,000'

Of course, the code isn’t “practical” because nothing in the format string limits how much text scanf() stores: input longer than 31 characters overflows the buffer. This admonition holds true for all the format strings shown in this Lesson.
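One hedge against the overflow is a maximum field width, which the bracket placeholder accepts just as %s does. The sketch below is an illustration only: scan_amount() is a made-up name, the 8-byte buffer is deliberately tiny to make the truncation visible, and sscanf() stands in for scanf() so the input is a known string.

```c
#include <stdio.h>

/* Hypothetical helper: copy a currency-like token from text into an
   8-byte buffer. Returns 1 when at least one character was stored,
   0 when the first character is not in the filter. */
int scan_amount(const char *text, char out[8])
{
    out[0] = '\0';
    /* 7 is the buffer size minus 1; the function adds the
       terminating null character itself */
    return sscanf(text, "%7[$1234567890.,]", out);
}
```

With the input "$1,567,039.12", only the first seven characters, "$1,567,", land in the buffer. The width prevents the overflow, but it also silently cuts the value short, so the code must still check what it received.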

The second version of this special placeholder is %[^...] where the ^ character acts as a logical NOT. The effect is that the input string is valid until one of the characters specified is encountered. Many coders use this trick to allow scanf() to input a line of text, as in:

2020_09_12-Lesson-c.c

#include <stdio.h>

int main()
{
    char buffer[128];

    printf("Type: ");
    scanf("%[^\n]",buffer);
    printf("'%s'\n",buffer);

    return(0);
}

The format string at Line 8 is "%[^\n]", which reads: Accept all characters except for the newline. Here’s a sample run:

Type: This is a test
'This is a test'

Effectively, scanf() can read an entire line of text with this format, though input longer than the buffer still overflows it.
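The "%[^\n]" trick has two further quirks: the newline itself is left unread, and an empty line matches nothing, so the buffer is never filled. Here is a minimal sketch of one way to cope with both. The function read_line() is hypothetical, it assumes a buffer of at least 128 bytes to match the width of 127, and it takes a FILE pointer so that stdin or any other stream can be passed.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from fp into line, which must
   hold at least 128 bytes. Returns 1 when a line was read (it may be
   empty) and 0 at the end of input. */
int read_line(FILE *fp, char *line)
{
    int ch;

    line[0] = '\0';     /* an empty line stores nothing, so preset it */
    if (fscanf(fp, "%127[^\n]", line) == EOF)
        return 0;
    /* discard the rest of an overlong line and the newline itself */
    while ((ch = fgetc(fp)) != EOF && ch != '\n')
        ;
    return 1;
}
```

A loop such as while( read_line(stdin,buffer) ) then processes input one line at a time. Characters beyond the 127th on a line are thrown away rather than written past the end of the buffer.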

The [...] and [^...] text filter placeholders find their way into many programming languages and scripting tools. They’re common meta-character filters, used to include or exclude characters. It’s rare that I’ve seen them used in C code, mostly because the scanf() function itself is weak.

If you truly want valid input, I recommend that you code your own function or use an existing function like fgets() and thoroughly examine the input to ensure that it’s what the code needs.
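As a rough sketch of that approach, the function below reads a line with fgets(), which takes the buffer size as an argument, and then uses strspn() to confirm that every character belongs to an allowed set. The name read_filtered() and its parameters are assumptions made for illustration, not a standard function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buffer (size bytes)
   and accept it only if it is not empty and every character appears
   in allowed. Returns 1 when the line is valid, 0 otherwise. */
int read_filtered(FILE *fp, char *buffer, int size, const char *allowed)
{
    if (fgets(buffer, size, fp) == NULL)
        return 0;                           /* end of input or error */
    buffer[strcspn(buffer, "\n")] = '\0';   /* drop the newline, if any */
    /* strspn() counts the leading characters found in allowed; the
       line is valid only when that count reaches the end of the string */
    return buffer[0] != '\0' && buffer[strspn(buffer, allowed)] == '\0';
}
```

Called with the allowed set "$1234567890.,", the input $1,000 is accepted and $1,000 dollars is rejected outright instead of being quietly trimmed. A line longer than the buffer is split across calls to fgets(), so production code would also need to handle that case.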

2 thoughts on “The Most Curious Aspect of the scanf() Function”

  1. I hardly ever write C programs that take keyboard input (apart from the occasional “hit any key to exit”) but if I did I think I would read characters into a buffer using getchar in a while loop, and then process it in a separate function.

    scanf is quite sophisticated but it potentially means hard-coding quite complex and fiddly formatting into a single line of code. I think for maintainability, readability, expandability (and a few other …abilities!) it would be better to split out the validation and parsing to a separate function.

    Maybe what we need is a function that works like main, taking and populating the arguments (int argc, char *argv[]).

  2. If only scanf() had a limit on input, it might be considered safe and sane to use. But you’re correct: the best solution is to process input in a loop to ensure that it’s valid and doesn’t overflow.
