Dump That File!

One of the many useful tools a programmer must have is a hexdump utility. The utility consumes a file’s raw bytes and outputs them in a human-readable manner. By examining the dump, you can determine whether a file contains the proper data in the correct format, as well as do other interesting, useful, and technerd things.

The traditional Unix utility is hexdump, which outputs a file in a format like this (by using the -C switch):

00000000  4f 6e 65 20 6f 66 20 74  68 65 20 74 72 61 64 69  |One of the tradi|
00000010  74 69 6f 6e 61 6c 20 55  6e 69 78 20 75 74 69 6c  |tional Unix util|
00000020  69 74 69 65 73 20 69 73  20 3c 65 6d 3e 68 65 78  |ities is <em>hex|
00000030  64 75 6d 70 3c 2f 65 6d  3e 2c 20 77 68 69 63 68  |dump</em>, which|
00000040  20 6f 75 74 70 75 74 73  20 66 69 6c 65 20 69 6e  | outputs file in|
00000050  20 61 20 66 6f 72 6d 61  74 20 6c 69 6b 65 20 74  | a format like t|
00000060  68 69 73 3a 0a                                    |his:.|
00000065

The last line shows the final offset, which is also the number of bytes in the file: 65 hex, or 101 decimal. The file examined was a text file, though any file can be output as raw data; for non-printable characters, a dot (period) appears in the text column. Further, the hexdump utility collapses runs of identical rows (rows of zero bytes are the common case) into a single asterisk line, which keeps the output short.
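The dot substitution in the text column can be sketched as a small helper. This is only one possible approach: the function name dump_char is made up for illustration, and the printable range 0x20 through 0x7E is an assumption based on plain ASCII (the standard isprint() function from ctype.h is an alternative, though its result depends on the locale).

```c
/* Return the character to show in the text column for one byte:
   the byte itself when it is printable ASCII, otherwise a dot.
   dump_char is a hypothetical name, not part of any library. */
char dump_char(unsigned char byte)
{
    if (byte >= 0x20 && byte <= 0x7E)
        return (char)byte;
    return '.';
}
```

Applied to the last row of the hexdump listing above, the newline byte 0a is what becomes the dot at the end of |his:.|.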

Your challenge for this month’s Exercise is to write a hexdump-like utility. Have it output the contents of any file, specified on the command line, in a multi-column format with 2-digit hex bytes and an ASCII/character column. At the end of output, display a total byte count.

Here is a sample run of my solution:

Dump of 'a.out':
0000  CF FA ED FE 07 00 00 01 - 03 00 00 80 02 00 00 00   ................
0010  10 00 00 00 58 05 00 00 - 85 00 20 00 00 00 00 00   ....X..... .....
0020  19 00 00 00 48 00 00 00 - 5F 5F 50 41 47 45 5A 45   ....H...__PAGEZE
0030  52 4F 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00   RO..............
0040  00 00 00 00 01 00 00 00 - 00 00 00 00 00 00 00 00   ................
0050  61 74 65 00                                         ate.
 84 total bytes

See how the final line, 0050, is terminated? Only four hex bytes are output, followed by enough whitespace to keep the columns aligned, and finally the character equivalents. Also, the total byte count is given in decimal, not hex. Duplicate this presentation in your program; it is similar to how the hexdump utility works.
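If you want a hint about the alignment before writing your own version, here is a minimal sketch of formatting a single row so that a short final row still lines up. It is not the solution referenced in this post. The names format_row, ROW_BYTES, and ROW_TEXT_MAX are hypothetical, the spacing is inferred from the sample run above, and leaving out the dash when the row has eight or fewer bytes is a guess.

```c
#include <stdio.h>
#include <stddef.h>

#define ROW_BYTES 16      /* bytes shown per row (assumed from the sample) */
#define ROW_TEXT_MAX 80   /* buffer size that safely holds one full row */

/* Format up to ROW_BYTES bytes as one dump row into out.
   Missing bytes in a short row are replaced by three spaces each,
   so the text column always starts at the same position.
   Returns the number of characters written, or -1 on bad arguments. */
int format_row(char *out, size_t outsize, unsigned long offset,
               const unsigned char *data, size_t count)
{
    size_t i;
    int n = 0;

    if (outsize < ROW_TEXT_MAX || count > ROW_BYTES)
        return -1;

    /* Offset column: four hex digits and two spaces */
    n += snprintf(out + n, outsize - (size_t)n, "%04lX  ", offset);

    /* Hex column: real bytes first, blank padding afterward */
    for (i = 0; i < ROW_BYTES; i++) {
        if (i < count)
            n += snprintf(out + n, outsize - (size_t)n, "%02X ",
                          (unsigned int)data[i]);
        else
            n += snprintf(out + n, outsize - (size_t)n, "   ");
        /* Separator between the two groups of eight bytes */
        if (i == 7)
            n += snprintf(out + n, outsize - (size_t)n, "%s",
                          count > 8 ? "- " : "  ");
    }

    /* Gap before the text column */
    n += snprintf(out + n, outsize - (size_t)n, "  ");

    /* Text column: printable ASCII as-is, everything else as a dot */
    for (i = 0; i < count; i++) {
        unsigned char c = data[i];
        n += snprintf(out + n, outsize - (size_t)n, "%c",
                      (c >= 0x20 && c <= 0x7E) ? c : '.');
    }
    return n;
}
```

A caller would read the file in chunks of up to ROW_BYTES, pass each chunk with its offset to format_row(), print the resulting buffer, and keep a running total of the bytes read for the final decimal count.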

Please try this Exercise on your own before you peek at my solution.
