{"id":6653,"date":"2024-11-09T00:01:14","date_gmt":"2024-11-09T08:01:14","guid":{"rendered":"https:\/\/c-for-dummies.com\/blog\/?p=6653"},"modified":"2024-11-16T09:22:41","modified_gmt":"2024-11-16T17:22:41","slug":"hexwords","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=6653","title":{"rendered":"HexWords"},"content":{"rendered":"<p>Hexadecimal, or counting base 16, uses letters A through F to represent values 10 through 15. This base &mdash; &#8220;hex&#8221; &mdash; is common in programming as it works as a shorthand for binary values. But these hex digits are also letters, which means that they can spell words.<br \/>\n<!--more--><br \/>\nA recent challenge on <a href=\"https:\/\/rosettacode.org\/wiki\/Hex_words\" rel=\"noopener\" target=\"_blank\">Rosetta Code<\/a> is to use a digital dictionary to find all hex words with four or more letters. These are words like FACE or CAFE, which as hex values equal decimal 64,206 and 51,966, respectively.<\/p>\n<p>The challenge went on to order the results and perform other magic, but it got me curious.<\/p>\n<p>Last year I wrote <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=6065\" rel=\"noopener\" target=\"_blank\">a series<\/a> about accessing the Linux dictionary file and then plundering it for various words. I can use the same techniques to find hexwords, or those words in the electronic dictionary that are composed only of letters A through F.<\/p>\n<p>I approach this problem in two steps.<\/p>\n<p>First, read every word in the digital dictionary. It&#8217;s found at <code>\/usr\/share\/dict\/words<\/code> for most Linux configurations.<\/p>\n<p>Second, scan each found word for matches with the letters A through F, both upper- and lowercase. This part may seem like a lot of work, but the <em>scanf()<\/em> function has a special mode that finds only specific letters. 
I <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4341\" rel=\"noopener\" target=\"_blank\">wrote about this letter filter<\/a> several years ago.<\/p>\n<p>Here is my code, which outputs all hexwords found in the Linux dictionary:<\/p>\n<h3><a href=\"https:\/\/github.com\/dangookin\/C-For-Dummies-Blog\/blob\/master\/2024_11_09-Lesson.c\" rel=\"noopener\" target=\"_blank\">2024_11_09-Lesson.c<\/a><\/h3>\n<pre class=\"screen\">\r\n<span class=\"comments\">\/* Hex Words *\/<\/span>\r\n#include &lt;stdio.h&gt;\r\n#include &lt;stdlib.h&gt;\r\n#include &lt;string.h&gt;\r\n\r\n<span class=\"comments\">\/* this code assumes the following path is valid *\/<\/span>\r\n#define DICTIONARY \"\/usr\/share\/dict\/words\"\r\n#define SIZE 32\r\n\r\nint main()\r\n{\r\n    FILE *dict;\r\n    char word[SIZE],hexword[SIZE],*r,*w;\r\n\r\n    <span class=\"comments\">\/* open the dictionary *\/<\/span>\r\n    dict = fopen(DICTIONARY,\"r\");\r\n    if( dict==NULL )\r\n    {\r\n        fprintf(stderr,\"Unable to open %s\\n\",DICTIONARY);\r\n        exit(1);\r\n    }\r\n\r\n    <span class=\"comments\">\/* read the dictionary *\/<\/span>\r\n    while( !feof(dict) )\r\n    {\r\n        <span class=\"comments\">\/* read a word from the dictionary *\/<\/span>\r\n        r = fgets(word,SIZE,dict);\r\n        if( r==NULL )    <span class=\"comments\">\/* no word, done *\/<\/span>\r\n            break;\r\n\r\n        <span class=\"comments\">\/* remove newline *\/<\/span>\r\n        w = word;\r\n        while(*w)\r\n        {\r\n            if( *w=='\\n' )\r\n            {\r\n                *w = '\\0';\r\n                break;\r\n            }\r\n            w++;\r\n        }\r\n\r\n        <span class=\"comments\">\/* pull out only hex characters; skip the word when none match,\r\n           as sscanf() then leaves hexword[] unset *\/<\/span>\r\n        if( sscanf(word,\"%[ABCDEFabcdef]\",hexword)!=1 )\r\n            continue;\r\n\r\n        <span class=\"comments\">\/* compare hexword with original word *\/<\/span>\r\n        if( strcmp(word,hexword)==0 )\r\n            printf(\"%s\\n\",hexword);\r\n    
}\r\n\r\n\r\n    <span class=\"comments\">\/* clean-up *\/<\/span>\r\n    fclose(dict);\r\n    return(0);\r\n}<\/pre>\n<p>The code&#8217;s <em>while<\/em> loop is stolen directly from the earlier post, which scans all words in the dictionary. The found word is stored in buffer <code>word[]<\/code>. An inner <em>while<\/em> loop replaces the newline (read by the <em>fgets()<\/em> function) with a null character, which makes for better matching later in the code.<\/p>\n<p>The <code>sscanf()<\/code> function scans the dictionary word and extracts only the leading portion that consists of upper- and lowercase letters A through F.<\/p>\n<p><code>sscanf(word,\"%[ABCDEFabcdef]\",hexword);<\/code><\/p>\n<p>This result is saved in buffer <code>hexword[]<\/code>. If <code>word[]<\/code> and <code>hexword[]<\/code> match (the result of the <em>strcmp()<\/em> function is zero), the entire word consists of hex letters and a true hexword is found. A <em>printf()<\/em> statement outputs the results.<\/p>\n<p>A sample run of the program generates 120 positive hits. Here&#8217;s a snapshot of the output:<\/p>\n<p><code>A<br \/>\nAA<br \/>\nAAA<br \/>\nAB<br \/>\nABC<br \/>\nAC<br \/>\nAF<br \/>\nAFC<br \/>\n...<br \/>\nfacade<br \/>\nface<br \/>\nfaced<br \/>\nfad<br \/>\nfade<br \/>\nfaded<br \/>\nfed<br \/>\nfee<br \/>\nfeed<\/code><\/p>\n<p>The original Rosetta Code challenge limited output to words four characters long or greater. The code above doesn&#8217;t apply this restriction, which I add <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=6666\">next week<\/a>, along with other updates to sate my inner nerd.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How many words in the dictionary can be written by using hexadecimal &#8220;digits&#8221; A through F? 
<a href=\"https:\/\/c-for-dummies.com\/blog\/?p=6653\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-6653","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/6653","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6653"}],"version-history":[{"count":5,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/6653\/revisions"}],"predecessor-version":[{"id":6696,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/6653\/revisions\/6696"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6653"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6653"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6653"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}