String Parsing with strtok()

A handy tool for slicing up a string of text into chunks is the strtok() function. Understanding how strtok() works also helps you understand how more complex parsing functions work.

The strtok() function requires the string.h header file for its definition. The man page format is:

char * strtok(char *restrict str, const char *restrict sep);

The str argument is the string to search. The strtok() function modifies this string, so it must be a writable char array, not a string literal. The sep argument is a string consisting of one or more separator characters. The function returns a char pointer to the first character in the string that’s not a separator character, which is the start of the next chunk of text. And like most parsing functions I’ve seen, strtok() is called multiple times until the entire string is parsed.

How can you call it multiple times?

In a loop, of course: After the initial call to strtok(), you replace the str argument with the NULL constant, which tells the function to continue from where the previous call left off. As long as strtok() keeps returning non-NULL char pointers, you continue to call the function to search for more text.

When a chunk of the string is found (or separated), strtok() returns a pointer to it. The function overwrites the separator that ends the chunk with a null character, so the pointer references only that specific chunk of text, not the rest of the string.

Here is sample code:

#include <stdio.h>
#include <string.h>

int main()
{
    char string[] = "Hello there, peasants!";
    char *found;

    printf("Original string: '%s'\n",string);

    found = strtok(string," ");
    if( found==NULL)
    {
        printf("\t'%s'\n",string);
        puts("\tNo tokens found");
        return(1);
    }
    while(found)
    {
        printf("\t'%s'\n",found);
        found = strtok(NULL," ");
    }

    return(0);
}

The strtok() function at Line 11 scans the text in variable string. It’s looking for the space character as a separator. A pointer to the first chunk of text is returned to variable found.

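At Line 12, the found variable is tested against NULL. On the first call, strtok() returns NULL only when the string contains no chunks at all, which means it’s empty or consists entirely of separator characters. A string with no separators comes back whole, as a single chunk. In the NULL case, the string is displayed and the program exits.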

At Line 18, it’s assumed variable found points to the first chunk of text. The while loop continues as long as found is not NULL. Inside the loop, at Line 20, the found text is displayed. At Line 21, the strtok() function is called again to fetch the next parsed chunk of text. The loop continues until the string is fully parsed.

Here’s sample output:

Original string: 'Hello there, peasants!'
	'Hello'
	'there,'
	'peasants!'

The strtok() function can also be applied as a solution to this month’s Exercise. Click here to see code modified from my pointer solution to the Exercise, which uses the strtok() function to parse the input string.
