{"id":7156,"date":"2025-09-20T00:01:00","date_gmt":"2025-09-20T07:01:00","guid":{"rendered":"https:\/\/c-for-dummies.com\/blog\/?p=7156"},"modified":"2025-09-13T10:53:16","modified_gmt":"2025-09-13T17:53:16","slug":"finding-those-pesky-null-characters","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=7156","title":{"rendered":"Finding Those Pesky Null Characters!"},"content":{"rendered":"<p>You&#8217;ve crafted a brilliant function, ensuring that it properly processes words and generates needed output. Is the code perfect? Well, it looks perfect. But how do you know for certain?<br \/>\n<!--more--><br \/>\nThe following program outputs the word pneumonoultramicroscopicsilicovolcanoconiosis, reportedly the longest word in the English language. It uses a clever <em>while<\/em> statement where the condition is neatly obfuscated &mdash; something I avoid doing in my own code.<\/p>\n<h3><a href=\"https:\/\/github.com\/dangookin\/C-For-Dummies-Blog\/blob\/master\/2025_09_20-Lesson.c\" rel=\"noopener\" target=\"_blank\">2025_09_20-Lesson.c<\/a><\/h3>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n\r\nint main()\r\n{\r\n    char word[] = \"pneumonoultramicroscopicsilicovolcanoconiosis\";\r\n    int x = 0;\r\n\r\n    while( putchar( word[x++] ) )\r\n        ;\r\n    putchar('\\n');\r\n\r\n    return 0;\r\n}<\/pre>\n<p>Here is the output, exactly as planned &mdash; or is it?<\/p>\n<p><code>pneumonoultramicroscopicsilicovolcanoconiosis<\/code><\/p>\n<p>Well, you know that the output doesn&#8217;t appear exactly as planned or I wouldn&#8217;t have asked the question. The point is that I&#8217;ve often written code where the output looks great, but flaws in the code go unnoticed until I examine the output more closely. Especially with text output, it&#8217;s easy to overlook things you cannot see, specifically null characters that are part of output but do not print. 
For the code above, this extra output is exactly what happens.<\/p>\n<p>This <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=7132\">month&#8217;s Exercise<\/a> provides a good example. One of my first solutions looked great! But when I examined the output more closely, it was pockmarked with null characters.<\/p>\n<p>One way to check whether the program&#8217;s output is proper is to run it through the <em>hexdump<\/em> filter, a Linux command line utility. For example:<\/p>\n<p><code>$ .\/a.out | hexdump -C<\/code><\/p>\n<p>The program&#8217;s name is <code>a.out<\/code>, the default. It&#8217;s prefixed with <code>.\/<\/code>, which directs the command interpreter to look for and run the file found in the current directory. The <code>-C<\/code> switch directs the <em>hexdump<\/em> utility to use the Canonical format, which includes the ASCII portion (the third column). Here is the command&#8217;s output.<\/p>\n<p><code>00000000&nbsp;&nbsp;70&nbsp;6e&nbsp;65&nbsp;75&nbsp;6d&nbsp;6f&nbsp;6e&nbsp;6f&nbsp;&nbsp;75&nbsp;6c&nbsp;74&nbsp;72&nbsp;61&nbsp;6d&nbsp;69&nbsp;63&nbsp;&nbsp;|pneumonoultramic|<br \/>\n00000010&nbsp;&nbsp;72&nbsp;6f&nbsp;73&nbsp;63&nbsp;6f&nbsp;70&nbsp;69&nbsp;63&nbsp;&nbsp;73&nbsp;69&nbsp;6c&nbsp;69&nbsp;63&nbsp;6f&nbsp;76&nbsp;6f&nbsp;&nbsp;|roscopicsilicovo|<br \/>\n00000020&nbsp;&nbsp;6c&nbsp;63&nbsp;61&nbsp;6e&nbsp;6f&nbsp;63&nbsp;6f&nbsp;6e&nbsp;&nbsp;69&nbsp;6f&nbsp;73&nbsp;69&nbsp;73&nbsp;<span style=\"color:red\">00<\/span>&nbsp;0a&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;|lcanoconiosis..|<br \/>\n0000002f<\/code><\/p>\n<p>Pay attention to the last word of output. Byte 0x73 is the final &#8216;s&#8217; in pneumonoultramicroscopicsilicovolcanoconiosis. Then comes a null character, 0x00, then finally the newline 0x0a. 
The null character doesn&#8217;t print, so you don&#8217;t see it in the output, but it&#8217;s there! It appears because <em>putchar()<\/em> returns the character it writes: the loop&#8217;s condition becomes zero only after the null character has already been sent to standard output.<\/p>\n<p>When you code a program that outputs text, always double-check the output to confirm that it&#8217;s free from unwanted null characters. For this month&#8217;s Exercise, an early solution looked good, but it was polka-dotted with null characters. I noticed the flaw in my logic and fixed the code to prevent the null characters from appearing.<\/p>\n<p>I can think of a number of ways to fix the sample code, most of which involve deconstructing the <em>while<\/em> loop&#8217;s clever condition. This alternative works:<\/p>\n<p><code>while( word[x] )<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;putchar(word[x++]);<\/code><\/p>\n<p>These statements show the <em>while<\/em> loop as just two lines, retaining the incrementing operator inside the brackets. My preferred way to write this code is to set the <code>x++<\/code> as its own statement, which is more readable, but whatever.<\/p>\n<p>Any way you code it, always confirm that the output is free from null characters. Yes, you may not see them, and users wouldn&#8217;t notice them, but they may have unintended consequences for less obvious uses of the program, such as piping its output into another utility.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Do you think that your code is perfect? Run it through <em>hexdump<\/em> and guess again! 
<a href=\"https:\/\/c-for-dummies.com\/blog\/?p=7156\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-7156","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/7156","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7156"}],"version-history":[{"count":5,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/7156\/revisions"}],"predecessor-version":[{"id":7168,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/7156\/revisions\/7168"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7156"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7156"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7156"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}