The Linux dictionary stores its words sorted alphabetically. Therefore, the output from the program presented in last week’s Lesson shows valid hexwords (letters A through F, four letters or longer) in alphabetic order. But what if I want the words output in numerical order?
No correlation exists between a hexadecimal value’s alphabetic sort and its numeric sort. For example DECAF is 912,559 but DEED is 57,069. Alphabetically, DEED is higher but numerically DEFAC is higher. If I desire to sort the hexword output by value, I must collect all the words first, then sort them by value before outputting the results. Accomplishing this task requires some rewriting of the code from last week’s Lesson.
To build a list of valid hexwords I define a structure:
struct entry {
char hex[SIZE];
long value;
}
The entry structure contains members for both the hexword as well as its value. This second member (value
) may seem redundant because the hex[SIZE]
member is a hex value, albeit in a string. However, storing only the hexword string requires a double-pointer if I’m creating a dynamic list. No, using a structure requires fewer brain cells.
Based on last week’s Lesson, I know that just over 40 valid hexwords exist in the Linux dictionary. Even so, I opted to dynamically allocate the structures: Each time a valid hexword is found, storage for a new structure is allocated in memory. I could have worked this operation as a linked list, but opted instead to use offsets within a single memory chunk — like a dynamic array.
Here’s the code:
2024_11_23-Lesson.c
/* Hex Words */ #include <stdio.h> #include <stdlib.h> #include <string.h> /* this code assumes the following path is valid */ #define DICTIONARY "/usr/share/dict/words" #define SIZE 32 int main() { FILE *dict; char word[SIZE],hexword[SIZE],*r,*w; int count,x; struct entry { char hex[SIZE]; long value; } *list; /* open the dictionary */ dict = fopen(DICTIONARY,"r"); if( dict==NULL ) { fprintf(stderr,"Unable to open %s\n",DICTIONARY); exit(1); } /* read the dictionary */ count = 0; while( !feof(dict) ) { /* read a word from the dictionary */ r = fgets(word,SIZE,dict); if( r==NULL ) /* no word, done */ break; /* remove newline */ w = word; while(*w) { if( *w=='\n' ) { *w = '\0'; break; } w++; } /* pull out only hex characters */ sscanf(word,"%[ABCDEFabcdef]",hexword); /* compare hexword with original word */ if( strcmp(word,hexword)==0 ) { if( strlen(hexword) > 3 ) { if( count==0 ) { list = malloc( sizeof(struct entry) ); if( list==NULL ) { fprintf(stderr,"Unable to allocate memory\n"); exit(1); } strcpy( list->hex,hexword ); list->value = strtol(hexword,NULL,16); } else { list = realloc( list, sizeof(struct entry) * count+1 ); if( list==NULL ) { fprintf(stderr,"Unable to reallocate memory\n"); exit(1); } strcpy( (list+count)->hex,hexword ); (list+count)->value = strtol(hexword,NULL,16); } count++; } } } /* output the list */ for( x=0; x<count; ++x ) printf("%3d: %-10s %10ld\n", x+1, (list+x)->hex, (list+x)->value ); /* clean-up */ fclose(dict); free(list); return(0); }
The entry structure is declared as a pointer, *list
. This pointer isn’t accessed until the first valid hexword is found. (Line 55 in the code.)
An if statement tests whether variable count
is equal to zero, which means the list is just starting. When true, the malloc() function initially allocates storage for one entry structure at address list
. The hexword is copied into the list->hex
member and its value converted and stored in list->value
.
The else condition uses the realloc() function to expand the list beyond the first entry. It adds new hexwords and their values as encountered:
list = realloc( list, sizeof(struct entry) * count+1 );
The word and its value are copied into the structure at an offset (count
) within the list
memory chunk.
After the dictionary is scanned, and the list built, a for loop outputs the contents, names and values of all the hexwords in the list. The list buffer is then freed. Stuff is cleaned up. The program ends.
Here is the output, which should match the same output as shown in last week’s Lesson:
1: Bede 48862
2: Beebe 782014
3: DECed 912621
4: Dacca 896202
5: Dada 56026
...
38: face 64206
39: faced 1027309
40: fade 64222
41: faded 1027565
42: feed 2314885530818453605
The output is almost the same . . . except for that final value. Obviously, hex value FEED
isn’t equal to 2,314,885,530,818,453,605!
See if you can find the error in the code. It took me a while, but it’s in there. I’ll reveal where in next week’s Lesson. I’ll also add the code that sorts the output numerically.