{"id":4022,"date":"2020-03-08T00:01:46","date_gmt":"2020-03-08T08:01:46","guid":{"rendered":"https:\/\/c-for-dummies.com\/blog\/?p=4022"},"modified":"2020-03-08T08:26:35","modified_gmt":"2020-03-08T15:26:35","slug":"dump-that-file-solution","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=4022","title":{"rendered":"Dump That File! &#8211; Solution"},"content":{"rendered":"<p>I&#8217;ve been coding <em>hexdump<\/em> utilities since the microcomputer era. They&#8217;re just so handy, especially when writing structures or other formatted data to a file. The dump assists with debugging, and it helps you figure out some undocumented data structures as well.<br \/>\n<!--more--><br \/>\nThe good news is that most of the standard C library functions can easily lend themselves to writing a <em>hexdump<\/em> utility, which is the challenge for <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4005\">this month&#8217;s Exercise<\/a>. For my solution, the code works as follows:<\/p>\n<p><strong>1. Check for command line arguments and open the file.<\/strong><\/p>\n<p>If the argument count is less than two, my code outputs an error message and quits. Otherwise, the command line argument is assumed to be a file and it&#8217;s opened for reading.<\/p>\n<p><strong>2. Main loop.<\/strong><\/p>\n<p>After opening the file, my solution wends its way into a <em>while<\/em> loop that repeats until the file pointer, <code>fp<\/code>, returns true for the end-of-file:<\/p>\n<pre class=\"screen\">\r\nwhile( !feof(fp) )\r\n<\/pre>\n<p><strong>3. 
Read a 16-byte chunk from the file.<\/strong><\/p>\n<p>The <em>hexdump<\/em> utility consumes file data 16 bytes at a time, so I use the <em>fread()<\/em> function to pull in a 16-byte chunk:<\/p>\n<pre class=\"screen\">\r\nr = fread(buffer,sizeof(unsigned char),16,fp);\r\n<span class=\"comments\">\/* break on EOF *\/<\/span>\r\nif( r==0 )\r\n    break;<\/pre>\n<p>Data read is stored in the <em>unsigned char<\/em> <code>buffer<\/code>. It must be <em>unsigned<\/em> or byte values in the upper range (<code>0x80<\/code> to <code>0xff<\/code>) aren&#8217;t output properly: On systems where plain <em>char<\/em> is signed, these values turn negative and the <code>%02X<\/code> placeholder outputs them with a run of leading <code>F<\/code> digits.<\/p>\n<p>Variable <code>r<\/code> holds the number of bytes actually read, which comes into play when outputting the final row. Further, the value of <code>r<\/code> is tested for zero to catch a condition where the file has been fully read but the <code>EOF<\/code> isn&#8217;t caught in the <em>while<\/em> loop condition. This happens, for example, when the file size is an exact multiple of 16: The <em>feof()<\/em> function doesn&#8217;t return true until a read attempts to go past the end of the file.<\/p>\n<p><strong>4. Generate output for the 16-byte row.<\/strong><\/p>\n<p>Output happens in three steps.<\/p>\n<p>Variable <code>offset<\/code> is initialized to zero, then increased by the number of bytes read for each iteration of the loop. But first, it&#8217;s output:<\/p>\n<pre class=\"screen\">\r\n<span class=\"comments\">\/* print offset *\/<\/span>\r\nprintf(\"%04X \",offset);<\/pre>\n<p>In step two, hex values are output:<\/p>\n<pre class=\"screen\">\r\n<span class=\"comments\">\/* print hex values *\/<\/span>\r\nfor( x=0; x&lt;r; x++ )\r\n{\r\n    printf(\" %02X\",buffer[x]);\r\n    if( x==7 )\r\n        printf(\" -\");\r\n}<\/pre>\n<p>See how variable <code>r<\/code> sets the repeat count for the loop? 
This is how the final row is kept only as long as the number of bytes read from the file.<\/p>\n<p>The <em>if<\/em> test outputs the dash after the eighth hex byte.<\/p>\n<p>If the value of variable <code>r<\/code> is less than 16, meaning the final line of data has been read from the file, a <em>for<\/em> loop fills the balance of the hex column with whitespace:<\/p>\n<pre class=\"screen\">\r\n<span class=\"comments\">\/* fill in the rest of a blank row *\/<\/span>\r\nif( r&lt;16 )\r\n{\r\n    for( x=r; x&lt;16; x++ )\r\n    {\r\n        printf(\"   \");\r\n        if( x==7 )\r\n            printf(\"  \");\r\n    }\r\n}<\/pre>\n<p>The third step is to output the ASCII column. An <em>if<\/em> test in the <em>for<\/em> loop generates the single dot output for unprintable characters that would otherwise mess up the display:<\/p>\n<pre class=\"screen\">\r\n<span class=\"comments\">\/* print ASCII *\/<\/span>\r\nprintf(\"   \");\r\nfor( x=0; x&lt;r; x++)\r\n{\r\n    if( buffer[x]&lt;32 || buffer[x]&gt;126 )\r\n        putchar('.');\r\n    else\r\n        putchar(buffer[x]);\r\n}\r\nputchar('\\n');\r\noffset += r;<\/pre>\n<p>And, finally, the <code>offset<\/code> variable is increased by the value of <code>r<\/code>, as shown in the final statement above.<\/p>\n<p>The loop continues until all bytes are read from the file.<\/p>\n<p><strong>5. Display the byte count total and close the file.<\/strong><\/p>\n<p>The code wraps up by outputting the value of variable <code>offset<\/code>, which now holds the total byte count, and closing the open file.<\/p>\n<p><a href=\"https:\/\/github.com\/dangookin\/C-For-Dummies-Blog\/blob\/master\/2020_03-Exercise.c\" rel=\"noopener noreferrer\" target=\"_blank\">Click here<\/a> to view the full code for my solution on GitHub. I hope you were able to devise a similar solution; writing the dump part isn&#8217;t really that difficult. 
No, the tough part is coding that final line so that the hex and ASCII columns line up evenly.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;ve been coding hexdump utilities since the microcomputer era. They&#8217;re just so handy, especially when writing structures or other formatted data to a file. The dump assists with debugging, and it helps you figure out some undocumented data structures as &hellip; <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4022\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[],"class_list":["post-4022","post","type-post","status-publish","format-standard","hentry","category-solution"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4022","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4022"}],"version-history":[{"count":7,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4022\/revisions"}],"predecessor-version":[{"id":4041,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4022\/revisions\/4041"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4022"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4022"},{"taxonomy":"post_tag","em
beddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4022"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}