{"id":892,"date":"2014-08-16T00:01:09","date_gmt":"2014-08-16T07:01:09","guid":{"rendered":"http:\/\/c-for-dummies.com\/blog\/?p=892"},"modified":"2014-08-02T08:48:50","modified_gmt":"2014-08-02T15:48:50","slug":"more-string-searching-functions","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=892","title":{"rendered":"More String-Searching Functions"},"content":{"rendered":"<p>The C library is teeming with string-searching functions. The most basic is <em>strstr()<\/em>, which I discussed in <a href=\"http:\/\/c-for-dummies.com\/blog\/?p=889\">last week&#8217;s Lesson<\/a>. That function has some brothers and sisters.<br \/>\n<!--more--><br \/>\nHere are the <em>strstr()<\/em> variations:<\/p>\n<p><strong><em>strcasestr()<\/em><\/strong> This function is identical to <em>strstr()<\/em>, but it ignores case. So it matches <code>face<\/code> and <code>Face<\/code> and <code>FACE<\/code> as the same.<\/p>\n<p><strong><em>strnstr()<\/em><\/strong> This function adds a limit on the number of characters searched. So it peeks into the source string only <em>n<\/em> characters deep. This function isn&#8217;t found in every C language library.<\/p>\n<p><strong><em>strcasestr_l()<\/em><\/strong> This is a variation of the <em>strcasestr()<\/em> function. It uses the operating system&#8217;s locale information to search non-ASCII alphabets. (That&#8217;s an L at the end of the function name, not a one.)<\/p>\n<p>Collectively these are known as the substring functions. 
They find the location of one string within another.<\/p>\n<p>Here&#8217;s an example of how <em>strcasestr()<\/em> would be used to search for matching text regardless of case:<\/p>\n<pre class=\"screen\">\r\n#define _GNU_SOURCE  \/* may be needed to declare strcasestr() with glibc *\/\r\n#include &lt;stdio.h&gt;\r\n#include &lt;string.h&gt;\r\n\r\nint main()\r\n{\r\n    char *haystack = \"Very secret Hidden text\";\r\n    char *needle = \"hidden\";\r\n    char *location;\r\n\r\n    location = strstr(haystack,needle);\r\n    if(location == NULL)\r\n        puts(\"Unable to find string with strstr().\");\r\n    else\r\n        printf(\"strstr() found '%s' in '%s'.\\n\",\r\n            needle,\r\n            haystack);\r\n\r\n    location = strcasestr(haystack,needle);\r\n    if(location == NULL)\r\n        puts(\"Can't find the string with strcasestr() either!\");\r\n    else\r\n        printf(\"strcasestr() found '%s' in '%s'.\\n\",\r\n            needle,\r\n            haystack);\r\n\r\n    return(0);\r\n}<\/pre>\n<p>Here&#8217;s sample output:<\/p>\n<pre><code>Unable to find string with strstr().\r\nstrcasestr() found 'hidden' in 'Very secret Hidden text'.<\/code><\/pre>\n<p>The <em>strnstr()<\/em> function limits how deep into the &#8220;haystack&#8221; you search for a string. Here&#8217;s the man page format:<\/p>\n<p><code>char * strnstr(const char *s1, const char *s2, size_t n);<\/code><\/p>\n<p>Basically, it&#8217;s the same as the <em>strstr()<\/em> function, where string <em>s1<\/em> is searched for the text in <em>s2<\/em>. 
The <em>n<\/em> value limits the search to that many characters.<\/p>\n<p>Behold some sample code:<\/p>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n#include &lt;string.h&gt;\r\n\r\n#define LIMIT 15\r\n\r\nint main()\r\n{\r\n    char *haystack = \"Eeny meeny miny moe!\";\r\n    char *needle = \"moe\";\r\n    char *location;\r\n\r\n    location = strnstr(haystack,needle,LIMIT);\r\n    if(location == NULL)\r\n        printf(\"Can't find '%s' within %d characters of '%s'\\n\",\r\n            needle,\r\n            LIMIT,\r\n            haystack);\r\n    else\r\n        printf(\"Found '%s' within %d characters of '%s'\\n\",\r\n            needle,\r\n            LIMIT,\r\n            haystack);\r\n\r\n    return(0);\r\n}<\/pre>\n<p>The above code uses the <em>strnstr()<\/em> function to limit the character search. The constant <code>LIMIT<\/code> is set to 15 characters, so if the text isn&#8217;t found within the first 15 characters, the function returns <code>NULL<\/code> for no match. Here&#8217;s sample output:<\/p>\n<pre><code>Can't find 'moe' within 15 characters of 'Eeny meeny miny moe!'<\/code><\/pre>\n<p>One reason to use <em>strnstr()<\/em> instead of <em>strstr()<\/em> would be to save time. Obviously, searching only <em>n<\/em> characters is faster than searching what could be a very long string. I can imagine other reasons to use it, but mostly I&#8217;ve used <em>strstr()<\/em> in my code.<\/p>\n<p>Beyond the string-searching functions, the C library also supports various character-searching functions. These are all defined in the <code>string.h<\/code> header file:<\/p>\n<p><strong><em>strchr()<\/em><\/strong> Locate a character within a string, returning a pointer to that character.<\/p>\n<p><strong><em>strrchr()<\/em><\/strong> Locate the final occurrence of a character in the string. 
Essentially this is the <em>strchr()<\/em> function, but it searches backward from the end of the string.<\/p>\n<p><strong><em>strspn()<\/em><\/strong> A weird little function, <em>strspn()<\/em> returns the length of the initial part of one string that consists only of characters found in a second string. <a href=\"http:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2014\/08\/0816c.c\">Click here<\/a> to view some sample code.<\/p>\n<p><strong><em>strcspn()<\/em><\/strong> This function does the reverse of <em>strspn()<\/em>, returning the length of the initial part of one string that contains none of the characters found in a second string.<\/p>\n<p><strong><em>strsep()<\/em><\/strong> This Gordian knot of a function unravels a single string into separate strings based on one or more separator characters, such as a tab or comma. It&#8217;s highly useful, but hideously complex.<\/p>\n<p><strong><em>strtok()<\/em><\/strong> This function is similar to <em>strsep()<\/em>: It splits a string into pieces, or &#8220;tokens,&#8221; at any of a set of separator characters, though it treats a run of consecutive separators as a single one.<\/p>\n<p>The most important thing to remember about these functions is that they exist. If your code needs to tear through a string, review these and other functions to ensure that you&#8217;re not trying to recreate them on your own.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The C library offers plenty of text-finding functions. 
<a href=\"https:\/\/c-for-dummies.com\/blog\/?p=892\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-892","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/892","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=892"}],"version-history":[{"count":7,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/892\/revisions"}],"predecessor-version":[{"id":913,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/892\/revisions\/913"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=892"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=892"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=892"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}