Finding Text with strstr()

The programming universe is teeming with search algorithms. In fact, searching for stuff has become so commonplace that computer users casually take it for granted. Things weren’t always that way, but the programming algorithms behind the search remain the same.

The basic way to find text is to look for it one character at a time. That’s not very efficient, and I’m sure Google uses far more complex algorithms (although they might not), but it’s how I code search routines.
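To show what one character at a time looks like, here’s a minimal sketch of that brute-force approach. The function name find_text() is my own invention for illustration, not a library function, and I’m not claiming this is how any C library implements its search. It tries each starting position in the text and compares characters until the search string either runs out (a match) or a character differs.

```c
#include <stddef.h>

/* find_text() is a hypothetical name, not a library function.
   It returns a pointer to the first match of needle within
   haystack, or NULL when there is no match. */
const char *find_text(const char *haystack, const char *needle)
{
    const char *start;
    size_t i;

    /* try every starting position, including the terminator,
       so that an empty needle matches an empty haystack */
    for( start = haystack; ; start++ )
    {
        /* compare one character at a time */
        for( i = 0; needle[i] != '\0' && start[i] == needle[i]; i++ )
            ;
        if( needle[i] == '\0' )     /* end of needle: a match */
            return(start);
        if( *start == '\0' )        /* end of haystack: no match */
            return(NULL);
    }
}
```

In the worst case the inner loop restarts at nearly every position, which is why this approach isn’t very efficient on long text.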

The standard C library tool to locate one string within another is strstr(), which I pronounce string-string. Here’s the man page format:

char * strstr(const char *s1, const char *s2);

The strstr() function locates the first occurrence of string s2 within string s1. That location is returned as a char pointer, the address within s1 where the matching text begins. When s2 isn’t found, the function returns NULL. The return value is not an offset, but a memory location.
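Because the return value is a real address inside s1, you can use it like any other string pointer. As a small sketch, the helper below (text_after() is a hypothetical name I made up for this example) steps past the match and returns whatever text follows it.

```c
#include <string.h>

/* text_after() is a hypothetical helper for illustration.
   It returns a pointer to the text in s1 that follows the
   first occurrence of s2, or NULL if s2 isn't found. */
const char *text_after(const char *s1, const char *s2)
{
    const char *match;

    match = strstr(s1, s2);         /* address of the match inside s1 */
    if( match == NULL )
        return(NULL);
    return(match + strlen(s2));     /* skip over the matched text */
}
```

No copying takes place: the pointer returned references the same memory as s1, just farther along.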

Like other string functions, strstr() is defined in the string.h header file. Here’s some sample code to chew on:

#include <stdio.h>
#include <string.h>

int main()
{
    const char *haystack = "Was this the face that launch'd a thousand ships";
    const char *needle = "face";
    char *location;

    location = strstr(haystack,needle);
    if( location == NULL)
        fprintf(stderr,"String not found\n");
    else
    {
        printf("String '%s' was found at position %d in string '%s'.\n",
            needle,
            (int)(location-haystack)+1,
            haystack);
    }

    return(0);
}

Lines 6 and 7 declare the text and search string, s1 and s2, respectively. Line 8 declares the pointer location, which stores the address where string needle is found in string haystack.

The strstr() function is used at Line 10. The value of location is examined at Line 11 and if it’s NULL, then the string wasn’t found. The program displays an error message and is effectively done at that point.

If the string is found, the printf() statement at Lines 15 through 18 is executed. I split the format string and its three arguments across separate lines for readability.

Line 17 calculates the found string’s offset. Subtracting pointer haystack from pointer location yields the number of characters between them, and that difference is typecast to an int to match the %d placeholder. Typecasting each pointer to an int before subtracting is risky, because an int is often too small to hold an address. The result is incremented by one because the first offset in the string is location 0, which a typical computer user wouldn’t expect.
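If you’d rather avoid the typecast altogether, the difference between two pointers already has its own type, ptrdiff_t, defined in stddef.h. Here’s a sketch of the offset calculation wrapped in a function. The name find_position() is hypothetical, and returning 0 for “not found” is my own convention, which works only because the positions are counted from 1.

```c
#include <stddef.h>
#include <string.h>

/* find_position() is a hypothetical helper for illustration.
   It returns the 1-based position of needle within haystack,
   or 0 when needle isn't found. */
ptrdiff_t find_position(const char *haystack, const char *needle)
{
    const char *location;

    location = strstr(haystack, needle);
    if( location == NULL )
        return(0);
    /* pointer subtraction counts the characters between
       the two addresses; add 1 for a human-friendly position */
    return(location - haystack + 1);
}
```

To display the result, use the %td placeholder in printf(), which matches a ptrdiff_t value.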

Here’s the output:

String 'face' was found at position 14 in string 'Was this the face that launch'd a thousand ships'.

The strstr() function has some useful brothers, which I’ll discuss in next week’s Lesson.