{"id":1758,"date":"2016-02-13T00:01:32","date_gmt":"2016-02-13T08:01:32","guid":{"rendered":"http:\/\/c-for-dummies.com\/blog\/?p=1758"},"modified":"2016-02-06T09:34:07","modified_gmt":"2016-02-06T17:34:07","slug":"string-parsing-with-strtok","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=1758","title":{"rendered":"String Parsing with <em>strtok()<\/em>"},"content":{"rendered":"<p>A handy tool for slicing up a string of text into chunks is the <em>strtok()<\/em> function. Understanding the <em>strtok()<\/em> function also helps you understand how more complex parsing functions work.<br \/>\n<!--more--><br \/>\nThe <em>strtok()<\/em> function requires the <code>string.h<\/code> header file for its prototype. The man page format is:<\/p>\n<pre><code>char * strtok(char *restrict str, const char *restrict sep);<\/code><\/pre>\n<p>The <code>*restrict str<\/code> is the string to search. The <code>*restrict sep<\/code> is a string consisting of one or more separator characters. The function returns a <em>char<\/em> pointer to the first character in the string that&#8217;s not a separator character, which is the start of the next chunk of text, or <code>NULL<\/code> when no text remains. And like most parsing functions I&#8217;ve seen, <em>strtok()<\/em> is called multiple times until the entire string is parsed.<\/p>\n<p>How can you call it multiple times?<\/p>\n<p>In a loop, of course: After the initial call to <em>strtok()<\/em>, you replace the <code>*restrict str<\/code> argument with the <code>NULL<\/code> constant. As long as <em>strtok()<\/em> keeps returning non-NULL <em>char<\/em> pointers, you continue to call the function to search for more text.<\/p>\n<p>When a chunk of the string is found (or separated), <em>strtok()<\/em> returns a pointer. 
The pointer references only a specific chunk of text, not the rest of the string, because <em>strtok()<\/em> overwrites the separator that follows the chunk with a null character.<\/p>\n<p>Here is sample code:<\/p>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n#include &lt;string.h&gt;\r\n\r\nint main()\r\n{\r\n    char string[] = \"Hello there, peasants!\";\r\n    char *found;\r\n\r\n    printf(\"Original string: '%s'\\n\",string);\r\n\r\n    found = strtok(string,\" \");\r\n    if( found==NULL)\r\n    {\r\n        printf(\"\\t'%s'\\n\",string);\r\n        puts(\"\\tNo text found\");\r\n        return(1);\r\n    }\r\n    while(found)\r\n    {\r\n        printf(\"\\t'%s'\\n\",found);\r\n        found = strtok(NULL,\" \");\r\n    }\r\n\r\n    return(0);\r\n}<\/pre>\n<p>The <em>strtok()<\/em> function at Line 11 scans the text in variable <code>string<\/code>. It&#8217;s looking for the space character as a separator. A pointer to the first chunk of text is returned and stored in variable <code>found<\/code>.<\/p>\n<p>At Line 12, the <code>found<\/code> variable is tested to see whether any text was found. The <em>strtok()<\/em> function returns NULL on its first call only when the string is empty or contains nothing but separator characters. In that case, the original string is displayed and the program exits.<\/p>\n<p>At Line 18, variable <code>found<\/code> is known to point to the first chunk of text. The <em>while<\/em> loop continues as long as <code>found<\/code> is not NULL. Inside the loop, at Line 20, the found text is displayed. At Line 21, the <em>strtok()<\/em> function is called again to fetch the next parsed chunk of text. The loop continues until the string is fully parsed.<\/p>\n<p>Here&#8217;s sample output:<\/p>\n<pre><code>Original string: 'Hello there, peasants!'\r\n\t'Hello'\r\n\t'there,'\r\n\t'peasants!'<\/code><\/pre>\n<p>The <em>strtok()<\/em> function can also be applied as a solution to <a href=\"http:\/\/c-for-dummies.com\/blog\/?p=1742\">this month&#8217;s Exercise<\/a>. 
<a href=\"http:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2016\/01\/0213b.c\"rel=\"\">Click here<\/a> to see code modified from my pointer solution to the Exercise, which uses the <em>strtok()<\/em> function to parse the input string.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The <em>strtok()<\/em> function parses a string based on separator characters you specify. <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=1758\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-1758","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1758","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1758"}],"version-history":[{"count":4,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1758\/revisions"}],"predecessor-version":[{"id":1775,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1758\/revisions\/1775"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1758"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1758"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dum
mies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1758"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}