{"id":5105,"date":"2021-12-25T00:01:14","date_gmt":"2021-12-25T08:01:14","guid":{"rendered":"https:\/\/c-for-dummies.com\/blog\/?p=5105"},"modified":"2022-01-01T08:39:23","modified_gmt":"2022-01-01T16:39:23","slug":"a-tally-of-unique-words-part-iii","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=5105","title":{"rendered":"A Tally of Unique Words, Part III"},"content":{"rendered":"<p>From <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=5099\">last week&#8217;s Lesson<\/a>, the text in a buffer is parsed, creating pointers to each word in the string. Alas, the addresses of these words (the pointers) aren&#8217;t saved, which is stupid. To handle the job, and to keep the Unique Words project moving forward, a dynamic array of pointers must be allocated.<br \/>\n<!--more--><br \/>\nThe C language doesn&#8217;t let you reallocate an array; you&#8217;re stuck with the size chosen at build time. So to handle an array of unknown size, you must use pointers and dynamically allocate storage. Discussing this topic often sends C programmers into a tizzy.<\/p>\n<p>For my unique words project, the number of words in the buffer is unknown. Back when I tried to avoid pointers, I would just allocate a monstrous array to handle all potential pointers:<\/p>\n<p><code>char *words[1000];<\/code><\/p>\n<p>The size is a best guess, chosen to handle (hopefully) anything. This approach may work. But if the word count falls well short of the guess, you&#8217;ve wasted memory. If the guess isn&#8217;t big enough, the program writes past the end of the array and probably crashes. Dynamically allocating storage through a pointer is the way to keep track of an unknown number of values.<\/p>\n<p>The data type required to store pointers is one of these monsters:<\/p>\n<p><code>char **list;<\/code><\/p>\n<p>Ugh.<\/p>\n<p>An initial size must be set for the allocation. 
I use the value 100, which is assigned to <em>int<\/em> constant <code>size<\/code>:<\/p>\n<p><code>const int size = 100;<\/code><\/p>\n<p>Here are the statements to initially allocate storage for 100 <em>char<\/em> pointers in the <code>list<\/code> memory chunk:<\/p>\n<pre class=\"screen\">\r\nlist = malloc( sizeof(char *) * size );\r\nif( list==NULL )\r\n{\r\n    fprintf(stderr,\"Memory allocation error\\n\");\r\n    exit(1);\r\n}<\/pre>\n<p>To store the pointers to the parsed words, the code&#8217;s <em>while<\/em> loop is modified. The pointer returned from the <em>strtok()<\/em> function is saved at an offset within the <code>list<\/code> buffer. Variable <code>count<\/code>, which already exists in the code, provides the offset:<\/p>\n<pre class=\"screen\">\r\ncount = 0;\r\nword = strtok(buffer,separators);\r\nwhile( word )\r\n{\r\n    *(list+count) = word;\r\n    word = strtok(NULL,separators);\r\n    count++;\r\n}<\/pre>\n<p>As words are parsed from <code>buffer<\/code>, the <code>word<\/code> pointer is saved into the <code>list<\/code> dynamic array. To deal with overflow &mdash; because the <code>list<\/code> buffer initially stores only 100 items &mdash; the value of <code>count<\/code> is tested against constant <code>size<\/code> with the modulus operator:<\/p>\n<p><code>count%size<\/code><\/p>\n<p>When this expression evaluates to zero, the number of items in the <code>list<\/code> buffer is evenly divisible by 100: 100, 200, 300, and so on. The buffer is full, and because the loop is still spinning, room for 100 more items must be added to storage. 
The <em>realloc()<\/em> function does the job:<\/p>\n<p><code>list = realloc(list,sizeof(char *)*(count+size));<\/code><\/p>\n<p>Along with the error testing that follows the reallocation, here is the full <em>while<\/em> loop:<\/p>\n<pre class=\"screen\">\r\ncount = 0;\r\nword = strtok(buffer,separators);\r\nwhile( word )\r\n{\r\n    \/* save this word pointer at offset count *\/\r\n    *(list+count) = word;\r\n    word = strtok(NULL,separators);\r\n    count++;\r\n    \/* list is full: make room for size more pointers *\/\r\n    if( count%size==0 )\r\n    {\r\n        list = realloc(list,sizeof(char *)*(count+size));\r\n        if( list==NULL )\r\n        {\r\n            fprintf(stderr,\"Unable to reallocate memory\\n\");\r\n            exit(1);\r\n        }\r\n    }\r\n}<\/pre>\n<p>The text output function is removed from the <em>while<\/em> loop because I want to prove that the word pointers stored in the <code>list<\/code> buffer work. Therefore, a <em>for<\/em> loop now outputs the word list. These updates are found in the <a href=\"https:\/\/github.com\/dangookin\/C-For-Dummies-Blog\/blob\/master\/2021_12_25-Lesson.c\" rel=\"noopener\" target=\"_blank\">source code file on GitHub<\/a>.<\/p>\n<p>The code from this week&#8217;s Lesson has the same output as shown last week. The difference is that pointers to all the words are now indexed and stored in a buffer. The next step is to find the unique words and duplicates. This topic is covered in <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=5113\">next week&#8217;s Lesson<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This step in my unique words program involves retaining the address of each word in the string. 
<a href=\"https:\/\/c-for-dummies.com\/blog\/?p=5105\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-5105","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5105","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5105"}],"version-history":[{"count":4,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5105\/revisions"}],"predecessor-version":[{"id":5143,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5105\/revisions\/5143"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5105"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5105"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5105"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}