{"id":1978,"date":"2016-06-25T00:01:17","date_gmt":"2016-06-25T07:01:17","guid":{"rendered":"http:\/\/c-for-dummies.com\/blog\/?p=1978"},"modified":"2016-06-18T09:49:42","modified_gmt":"2016-06-18T16:49:42","slug":"string-storage-mysteries","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=1978","title":{"rendered":"String Storage Mysteries"},"content":{"rendered":"<p>String storage is one of those frustrating things in the C language. Specifically, it&#8217;s that null character, <code>\\0<\/code>, that appears at the end of every string. Is that character counted when you input a string? Copy a string? Create storage for a string? It&#8217;s a mystery that could drive you nuts.<br \/>\n<!--more--><br \/>\nThe null character, <code>\\0<\/code>, is the final character in a string, which is really a special type of <em>char<\/em> array. The character is required. If your code manually constructs a string, you must remember to append the <code>\\0<\/code>. Plus, you must account for that character when you allocate string storage. That means if you need up to 32 characters for a string, you allocate 33 characters of storage, with the extra byte for the null character.<\/p>\n<p>As an example, the <em>fgets()<\/em> function reads at most <em>size<\/em> characters <em>minus one<\/em>. So if you have a 32-character buffer, you might specify the following <em>fgets()<\/em> statement to read text into that buffer:<\/p>\n<pre><code>fgets(buffer,32,stdin);<\/code><\/pre>\n<p>The above statement reads up to 31 characters into <code>buffer<\/code>, reserving the final character for the <code>\\0<\/code>.<\/p>\n<p>The <em>snprintf()<\/em> function works like <em>printf()<\/em>, but generates a string as output. 
The <em>n<\/em> is a size value: the total size of the output buffer. At most that many characters <em>minus one<\/em> are written, with the final byte reserved for the null character.<\/p>\n<p>And then you have the <em>strlen()<\/em> function.<\/p>\n<p>The <em>strlen()<\/em> function returns the length of a string, but it doesn&#8217;t count the null character. A few programmers allocate storage based on the value returned from <em>strlen()<\/em>, forgetting about the <code>\\0<\/code> tagging along like the caboose on a train.<\/p>\n<p>In the following code, the <em>strlen()<\/em> function returns a string&#8217;s length. I then use a pointer to march through the string, stopping after the null character. Subtracting the string&#8217;s base address from this pointer&#8217;s address yields how much actual storage the string uses.<\/p>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n#include &lt;string.h&gt;\r\n\r\nint main()\r\n{\r\n    const char *string = \"This string is 34 characters long.\";\r\n    const char *s;\r\n\r\n    printf(\"The string '%s' is %zu characters long.\\n\",\r\n            string,\r\n            strlen(string));\r\n\r\n    <span class=\"comments\">\/* advance s to just past the null character *\/<\/span>\r\n    s = string;\r\n    while(*s++)\r\n        ;\r\n    printf(\"Storage size is %td.\\n\",s - string);\r\n\r\n    return(0);\r\n}<\/pre>\n<p>Here&#8217;s sample output:<\/p>\n<pre><code>The string 'This string is 34 characters long.' is 34 characters long.\r\nStorage size is 35.<\/code><\/pre>\n<p>To further examine what&#8217;s going on, I ran the program through the Code::Blocks debugger. Figure 1 illustrates the string&#8217;s memory dump, showing how it&#8217;s stored inside the PC.<\/p>\n<div id=\"attachment_1979\" style=\"width: 578px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1979\" src=\"http:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2016\/06\/0625-figure1.png\" alt=\"Figure 1. 
The string stored in memory, as viewed in the Code::Blocks debugger.\" width=\"568\" height=\"181\" class=\"size-full wp-image-1979\" srcset=\"https:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2016\/06\/0625-figure1.png 568w, https:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2016\/06\/0625-figure1-300x96.png 300w, https:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2016\/06\/0625-figure1-500x159.png 500w\" sizes=\"auto, (max-width: 568px) 100vw, 568px\" \/><p id=\"caption-attachment-1979\" class=\"wp-caption-text\">Figure 1. The string stored in memory, as viewed in the Code::Blocks debugger.<\/p><\/div>\n<p>A dump of the variables, <code>string<\/code> and <code>s<\/code>, is shown in Figure 2.<\/p>\n<div id=\"attachment_1980\" style=\"width: 458px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1980\" src=\"http:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2016\/06\/0625-figure2.png\" alt=\"Figure 2. The values stored in pointers string and s at Line 17 in the code.\" width=\"448\" height=\"181\" class=\"size-full wp-image-1980\" srcset=\"https:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2016\/06\/0625-figure2.png 448w, https:\/\/c-for-dummies.com\/blog\/wp-content\/uploads\/2016\/06\/0625-figure2-300x121.png 300w\" sizes=\"auto, (max-width: 448px) 100vw, 448px\" \/><p id=\"caption-attachment-1980\" class=\"wp-caption-text\">Figure 2. The values stored in pointers <code>string<\/code> and <code>s<\/code> at Line 17 in the code.<\/p><\/div>\n<p>The memory locations shown in Figures 1 and 2 aren&#8217;t the same for all computers or even the same computer when running the code, but the calculations remain constant: Pointer <code>s<\/code> is incremented until it roosts after the location of the null character in the string. 
The math calculates the true storage used, as opposed to the <em>strlen()<\/em> function, which returns only the number of actual characters in the string.<\/p>\n<p>In Figure 2, address 0x403047 minus address 0x403024 equals 0x23, which is 35 decimal.<\/p>\n<p>The bottom line is that you must remember to account for the null character when you deal with a string. Functions such as <em>fgets()<\/em> and <em>snprintf()<\/em> reserve room for it automatically, but <em>strlen()<\/em> does not count it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The difference between a string&#8217;s length and its storage size in memory is +1. <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=1978\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-1978","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1978","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1978"}],"version-history":[{"count":3,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1978\/revisions"}],"predecessor-version":[{"id":1986,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1978\/revisions\/1986"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1978"}],"wp:term":[{"taxonomy":
"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1978"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1978"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}