Sorting the Hexwords, Part II

The problem with the code from last week’s Lesson is obvious: The decimal value of FEED is 1,044,205, not 2,314,885,530,818,453,605 as shown in the output. Before I can sort the list numerically, this error must be addressed.

My first guess was that the strtol() function overflowed — which it did so when I tried to calculate a Look-and-Say sequence. But overflow isn’t the problem.

Scoping out the values in the code, I could see that the proper string is being translated, but not stored. So it’s most likely a silly pointer error.

Check out this line from the code:

list = realloc( list, sizeof(struct entry) * count+1 );

This statement seeks to increase the size of the list buffer by count items, plus one. But due to the order of precedence, the size is increased by count items plus one byte. The proper statement is:

list = realloc( list, sizeof(struct entry) * (count+1) );

By enclosing count+1 in parentheses, the calculation properly increases the size of buffer list by count items (entry structures), plus one. This new size creates enough room to store the next item in the dynamic array. The proper value is then saved, which makes the list valid. Problem solved.

With the memory properly allocated, the next (and final) task is to sort the hexwords by their values. The value member of the entry structure is used. Here is the qsort() function from my code:

qsort(list,count,sizeof(struct entry),compare);

The items to sort are at address list. There are count items in the list. Each item is the size of an entry structure. The final argument is the compare() function, which can prove to be a bit of a pickle to code.

The qsort() compare() function accepts two arguments, but const void pointers, a and b. In the function, the trick is to convert these const void pointers into entry structure pointers for comparison. I cover this technique in a blog post from two years ago. Here’s the compare() function required:

int compare(const void *a, const void *b)
{
    struct entry *x,*y;
 
    x = (struct entry *)a;
    y = (struct entry *)b;
 
    return( x->value - y->value );
}

Two entry structure pointers are declared, *x and *y. These are initialized to the const void pointers a and b (the address arguments), but typecast to entry structure pointers. The return value is the difference between a and b, but translated into the proper data type.

Here is the full code:

2024_11_30-Lesson.c

/* Hex Words */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* this code assumes the following path is valid */
#define DICTIONARY "/usr/share/dict/words"
#define SIZE 32

/* move this outside */
struct entry {
    char hex[SIZE];
    long value;
};

int compare(const void *a, const void *b)
{
    struct entry *x,*y;

    x = (struct entry *)a;
    y = (struct entry *)b;

    return( x->value - y->value );
}

int main()
{
    FILE *dict;
    char word[SIZE],hexword[SIZE],*r,*w;
    int count,x;
    struct entry *list;

    /* open the dictionary */
    dict = fopen(DICTIONARY,"r");
    if( dict==NULL )
    {
        fprintf(stderr,"Unable to open %s\n",DICTIONARY);
        exit(1);
    }

    /* read the dictionary */
    count = 0;
    while( !feof(dict) )
    {
        /* read a word from the dictionary */
        r = fgets(word,SIZE,dict);
        if( r==NULL )    /* no word, done */
            break;

        /* remove newline */
        w = word;
        while(*w)
        {
            if( *w=='\n' )
            {
                *w = '\0';
                break;
            }
            w++;
        }

        /* pull out only hex characters */
        sscanf(word,"%[ABCDEFabcdef]",hexword);

        /* compare hexword with original word */
        if( strcmp(word,hexword)==0 && strlen(hexword)>3 )
        {
            if( count==0 )
            {
                list = malloc( sizeof(struct entry) );
                if( list==NULL )
                {
                    fprintf(stderr,"Unable to allocate memory\n");
                    exit(1);
                }
                strcpy( list->hex,hexword );
                list->value = strtol(hexword,NULL,16);
            }
            else
            {
                list = realloc( list, sizeof(struct entry) * (count+1) );
                if( list==NULL )
                {
                    fprintf(stderr,"Unable to reallocate memory\n");
                    exit(1);
                }
                strcpy( (list+count)->hex,hexword );
                (list+count)->value = strtol(hexword,NULL,16);
            }
            count++;
        }
    }

    /* sort the list */
    qsort(list,count,sizeof(struct entry),compare);

    /* output the list */
    for( x=0; x<count; ++x )
        printf("%3d: %-10s %10ld\n",
                x+1,
                (list+x)->hex,
                (list+x)->value
              );

    /* clean-up */
    fclose(dict);
    free(list);
    return(0);
}

And the output:

  1: abed            44013
  2: aced            44269
  3: babe            47806
  4: bade            47838
  5: bead            48813
...
 38: efface       15727310
 39: facade       16435934
 40: acceded     181202413
 41: defaced     233811181
 42: effaced     251636973

You can’t see it in the abbreviated output above, but the value for feed is properly translated. Though it’s the last value output alphabetically, numerically hexword effaced has the highest value.

I truly enjoyed the process of creating this program. Finding and fixing the various puzzles and addressing the unintended errors was fun! This delight is why I enjoy programming. I hope that you do as well.

4 thoughts on “Sorting the Hexwords, Part II

  1. I’m unsure what you mean by “functional language.” If you’re referring to its application, the issue is that other languages are better suited for more rapid production. C is a mid-level language, which means it needs more overhead to get the job done, efforts unnecessary for Java or Python.

    On the other hand, C is very well suited for low-level work, direct hardware programming, operating systems, and so on. It’s also widely available and well-documented.

    C has its uses. It can program anything. But for many tasks, it’s not the best choice.

Leave a Reply