More String-Searching Functions

The C library is teeming with string-searching functions. The most basic is strstr(), which I discussed in last week’s Lesson. That function has some brothers and sisters.

Here are the strstr() variations:

strcasestr() This function is identical to strstr(), but it ignores case. So it matches face and Face and FACE as the same.

strnstr() This function adds a limit on the number of characters searched. So it peeks into the source string only n characters deep. This function isn’t found in every C language library.

strcasestr_l() This is a variation of the strcasestr() function. It uses the operating system’s locale information to search non-ASCII alphabets. (That’s an L at the end of the function name, not a one.)

Collectively these are known as the substring functions. They find the offset of one string within another.

Here’s an example of how strcasestr() would be used to search for matching text regardless of case:

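#define _GNU_SOURCE     /* strcasestr() is an extension; some libraries, such as glibc, may need this to declare it */
#include <stdio.h>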
#include <string.h>

int main()
{
    char *haystack = "Very secret Hidden text";
    char *needle = "hidden";
    char *location;

    location = strstr(haystack,needle);
    if(location == NULL)
        puts("Unable to find string with strstr().");
    else
        printf("strstr() found '%s' in '%s'.\n",
            needle,
            haystack);

    location = strcasestr(haystack,needle);
    if(location == NULL)
        puts("Can't find the string with strcasestr() either!");
    else
        printf("strcasestr() found '%s' in '%s'.\n",
            needle,
            haystack);

    return(0);
}

Here’s sample output:

Unable to find string with strstr().
strcasestr() found 'hidden' in 'Very secret Hidden text'.

The strnstr() function limits how deep into the “haystack” you search for a string. Here’s the man page format:

char * strnstr(const char *s1, const char *s2, size_t n);

Basically, it’s the same as the strstr() function, where string s1 is searched for the text in s2. The n value limits the search to that many characters.

Behold some sample code:

#include <stdio.h>
#include <string.h>

#define LIMIT 15

int main()
{
    char *haystack = "Eeny meeny miny moe!";
    char *needle = "moe";
    char *location;

    location = strnstr(haystack,needle,LIMIT);
    if(location == NULL)
        printf("Can't find '%s' within %d characters of '%s'\n",
            needle,
            LIMIT,
            haystack);
    else
        printf("Found '%s' within %d characters of '%s'\n",
            needle,
            LIMIT,
            haystack);

    return(0);
}

The above code uses the strnstr() function to limit the character search. The constant LIMIT is set to 15 characters, and “moe” doesn’t appear in full within the first 15 characters of the string, so the function returns NULL for no match. Here’s sample output:

Can't find 'moe' within 15 characters of 'Eeny meeny miny moe!'

One reason to use strnstr() instead of strstr() would be to save time. Obviously searching only n characters is faster than searching what could be a very long string. I can imagine other reasons to use it, but mostly I’ve used strstr() in my code.

Beyond the string-searching functions, the C library also supports various character-searching functions. These are all declared in the string.h header file:

strchr() Locate a character within a string, returning a pointer to that character.

strrchr() Locate the final occurrence of a character in the string. Essentially this is the strchr() function, but it returns a pointer to the last matching character instead of the first.

strspn() A weird little function, strspn() returns the number of characters at the start of one string that all appear in a second string. The count stops at the first character not found in the second string.

strcspn() This function does the reverse of strspn(), returning the number of characters at the start of one string that don’t appear in a second string.

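strsep() This Gordian knot of a function unravels a single string into separate strings, splitting it wherever one of the separator characters you specify appears, such as a tab or comma. It’s highly useful, but hideously complex. This function isn’t found in every C language library.

strtok() This function is similar to strsep() and also accepts a string of separator characters, or “delimiters,” to split the string. Unlike strsep(), it treats a run of adjacent separators as one, so it never returns an empty string.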

The most important thing to remember about these functions is that they exist. If your code needs to tear through a string, review these and other functions to ensure that you’re not trying to recreate them on your own.