{"id":2816,"date":"2017-11-25T00:01:41","date_gmt":"2017-11-25T08:01:41","guid":{"rendered":"http:\/\/c-for-dummies.com\/blog\/?p=2816"},"modified":"2017-12-02T08:48:10","modified_gmt":"2017-12-02T16:48:10","slug":"safe-coding-practices-string-handling-i","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=2816","title":{"rendered":"Safe Coding Practices &#8211; String Handling I"},"content":{"rendered":"<p>C offers a smattering of string-manipulation functions, but it leaves many of the critical issues up to you. Specifically, you must ensure that a string doesn&#8217;t overflow its buffer and that all strings are capped with the null character, <code>'\\0'<\/code>.<br \/>\n<!--more--><br \/>\nI&#8217;m guilty of violating both these rules and I&#8217;m not alone. Unsafe code often runs just fine under some circumstances. The compiler doesn&#8217;t check for buffer overflows and sometimes you&#8217;re lucky that a null character is lurking in memory right where a string ends. Fortune&#8217;s winds don&#8217;t always blow in a favorable direction, which is why you must avoid these two unsafe conditions.<\/p>\n<blockquote><p>In C, the programmer must perform the error-checking for strings, a process that&#8217;s automatic in other programming languages.<\/p><\/blockquote>\n<p>The two functions at issue here are <em>strcpy()<\/em> to copy a string and <em>strcat()<\/em> to stick one string on the end of another. Both functions require that the strings manipulated terminate with the null character, <code>'\\0'<\/code>. 
Both functions assume, in a deadly manner, that the buffers manipulated can hold all the characters copied.<\/p>\n<p>Consider the following code that uses the <em>strcpy()<\/em> function to copy a string from one buffer to another:<\/p>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n#include &lt;string.h&gt;\r\n\r\nint main()\r\n{\r\n    char buf1[] = \"Goodbye!\";\r\n    char buf2[3];\r\n\r\n    strcpy(buf2,buf1);\r\n\r\n    printf(\"'%s' and '%s'\\n\",\r\n            buf1,\r\n            buf2\r\n          );\r\n\r\n    return(0);\r\n}<\/pre>\n<p>Array <code>buf1[]<\/code> holds nine characters, eight for the string <code>Goodbye!<\/code> and one for the null character at the end (which you don&#8217;t see). The <code>buf2[]<\/code> array, however, has room for only 3 characters. The <em>strcpy()<\/em> function doesn&#8217;t report an error when the copy overflows; it keeps writing past the end of <code>buf2[]<\/code>.<\/p>\n<p>When I run this code, I see this output:<\/p>\n<p><code>'dbye!' and 'Goodbye!'<\/code><\/p>\n<p>I saw this output on both the PC and my Linux systems. Overflowing a buffer is undefined behavior, so I can&#8217;t say for certain what&#8217;s going on, but the likely explanation is that <code>buf2[]<\/code> sits just before <code>buf1[]<\/code> in memory: The first three characters, <code>Goo<\/code>, fill <code>buf2[]<\/code>, and the remaining six, <code>dbye!<\/code> plus the null character, spill over the start of <code>buf1[]<\/code>. On my Mac, however, I get the message <code>Abort trap: 6<\/code>. Obviously the Mac is more aggressive with its overflow checking.<\/p>\n<p>It&#8217;s up to you to ensure that a buffer doesn&#8217;t overflow. Only when the size of <code>buf2[]<\/code> is changed to fully accommodate the string in <code>buf1[]<\/code> does the code run safely. Yet, imagine a large program where you innocently change the size of one buffer and forget to change the size of another: You&#8217;ve just written risky code.<\/p>\n<p>In the preceding example, the string size is known; it won&#8217;t change at runtime. So, you could manually set both buffers to the same size or something large enough to accommodate the known text. When the string size is unknown, you must properly size the buffer at runtime. In the following code, the second buffer is a pointer. 
Its size is set when the code runs:<\/p>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n#include &lt;stdlib.h&gt;\r\n#include &lt;string.h&gt;\r\n\r\nint main()\r\n{\r\n    char buf1[] = \"Goodbye!\";\r\n    char *buf2;\r\n\r\n    <span class=\"comments\">\/* allocate storage *\/<\/span>\r\n    buf2 = (char *)malloc( strlen(buf1) + 1 );\r\n    if( buf2 == NULL)\r\n    {\r\n        fprintf(stderr,\"Unable to allocate buffer\\n\");\r\n        exit(1);\r\n    }\r\n\r\n    strcpy(buf2,buf1);\r\n\r\n    printf(\"'%s' and '%s'\\n\",\r\n            buf1,\r\n            buf2\r\n          );\r\n\r\n    return(0);\r\n}<\/pre>\n<p>At Line 11, the statement <code>(char *)malloc( strlen(buf1) + 1 )<\/code> allocates the proper amount of storage for the string in <code>buf1<\/code> and assigns that memory location to the <code>buf2<\/code> pointer. The <code>+1<\/code> is necessary because the <em>strlen()<\/em> function counts only characters in the string, not the null character at the end.<\/p>\n<p>This same type of solution must be used when concatenating strings. I cover that problem and its solution in <a href=\"http:\/\/c-for-dummies.com\/blog\/?p=2801\">next week&#8217;s Lesson<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When copying strings, be mindful of buffer overflows. 
<a href=\"https:\/\/c-for-dummies.com\/blog\/?p=2816\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-2816","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/2816","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2816"}],"version-history":[{"count":9,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/2816\/revisions"}],"predecessor-version":[{"id":2858,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/2816\/revisions\/2858"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2816"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2816"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2816"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}