A Better camelCase to snake_case Conversion

My solution for this month’s Exercise made some terrible assumptions. First, that the strings are merely output and not stored. Second, that the strings are perfectly formed camelCase and snake_case. In this Lesson, I address the first concern.

To store the strings instead of outputting them directly, memory must be allocated. To determine the size, I need to know the original string size and then calculate the new string size based on the type of conversion. The issue here is whether to be spot-on accurate or to just provide enough room for the worst-case size of any given string.

It’s easier not to be spot-on accurate.

For the snake_case to camelCase conversion, the new string is always shorter than the original. Therefore, I can allocate storage based on the original string’s size.

For the camelCase to snake_case conversion, one character is added for each capital in the camelCase name. Here my cheat is to double the original string size, which more than handles a situation where every other character is capitalized (improbable but possible).
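To see how generous the doubling is, here is a minimal sketch that compares the worst-case size with the exact size for a camelCase name. The function names worst_case_size() and exact_size() are my own inventions for illustration; they don't appear in the listing below.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* worst case: every character is a capital, so each one
   gains an underscore; +1 for the terminating null */
size_t worst_case_size(const char *camel)
{
    return strlen(camel) * 2 + 1;
}

/* exact size: one extra character per capital letter,
   +1 for the terminating null */
size_t exact_size(const char *camel)
{
    size_t size = strlen(camel) + 1;

    while( *camel )
    {
        /* cast keeps isupper() well-defined for any char value */
        if( isupper((unsigned char)*camel) )
            size++;
        camel++;
    }
    return size;
}
```

For readInputMeter (14 characters, two capitals), exact_size() returns 17 and worst_case_size() returns 29. The extra bytes are wasted, but the calculation requires no scan of the string beyond strlen().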

Here is my updated solution for this month’s Exercise. Error-checking on the malloc() function is omitted to keep the source code file short:

2023_08_12-Lesson.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

int main()
{
    const int count = 7;
    char *variable[] = {
        "readInputMeter",
        "cyclical_redundancy_check",
        "bumpyRide",
        "search_for_node",
        "string_convert",
        "divideByZeroError",
        "giveUpAndExplode"
    };
    char *v[count];
    int x,c;
    char *n;

    for(x=0; x<count; x++ )
    {
        n = variable[x];        /* initialize pointer n */
        c = 0;                    /* initialize offset */

        /* test for the underscore */
        if( strchr(variable[x],'_') )
        {
            /* name is in snake_case */
            /* camelCase will be shorter, so just allocate
               the same storage */
            v[x] = malloc( strlen(variable[x]) + 1 );
            /* error checking goes here */
            while( *n )
            {
                if( *n=='_' )
                {
                    n++;
                    *(v[x]+c) = toupper(*n);
                }
                else
                {
                    *(v[x]+c) = *n;
                }
                n++;
                c++;
            }
        }
        else
        {
            /* name is in camelCase */
            /* allocate storage for worst case */
            v[x] = malloc( strlen(variable[x]) * 2 + 1 );
            /* error checking goes here */
            while( *n )
            {
                if( isupper(*n) )
                {
                    *(v[x]+c) = '_';
                    c++;
                    *(v[x]+c) = tolower(*n);
                }
                else
                {
                    *(v[x]+c) = *n;
                }
                n++;
                c++;
            }
        }
        /* cap the string */
        *(v[x]+c) = '\0';
    }

    /* output the result */
    for(x=0; x<count; x++ )
        printf("%25s -> %s\n",
                variable[x],
                v[x]
              );

    return(0);
}

I’ve added several new variables to help build and store the new strings:

*v[] is a char pointer array to store the new, converted strings.

c is an int variable used to calculate offsets within a string. This variable is initialized to zero at each turn of the for loop: c = 0

For the snake_case to camelCase conversion, storage is allocated based on the size of the original string: v[x] = malloc( strlen(variable[x]) + 1 );
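The listing marks the spot for error checking with a comment but leaves the check out. One possible way to fill it in is a small wrapper around malloc(). The name alloc_or_die() is hypothetical, and quitting the program on failure is an assumption about what's appropriate here, not the only option.

```c
#include <stdio.h>
#include <stdlib.h>

/* hypothetical helper, not part of the original listing:
   allocate size bytes or quit with an error message */
char *alloc_or_die(size_t size)
{
    char *p = malloc(size);

    if( p==NULL )
    {
        fprintf(stderr,"Unable to allocate %zu bytes\n",size);
        exit(EXIT_FAILURE);
    }
    return(p);
}
```

With such a helper, the allocation could read v[x] = alloc_or_die( strlen(variable[x]) + 1 ); and the code that follows can safely assume the buffer exists.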

Rather than being output (as in my original solution), characters are stored, using variable c as the offset within the freshly-allocated buffer: *(v[x]+c) = toupper(*n); and *(v[x]+c) = *n;

Variable n is incremented through the original string. Variable c is incremented through the new string.
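The two-pointer walk can also be packaged as a function, which may make the roles of n and c easier to follow. This is only a sketch: the name snake_to_camel() is hypothetical, and the guard against a trailing underscore is my addition, since the listing assumes well-formed names.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical helper: returns a newly allocated camelCase
   copy of a snake_case name, or NULL if allocation fails;
   the caller frees the result */
char *snake_to_camel(const char *snake)
{
    const char *n = snake;    /* walks the original string */
    int c = 0;                /* offset into the new string */
    char *camel;

    /* the result is never longer than the original */
    camel = malloc( strlen(snake) + 1 );
    if( camel==NULL )
        return(NULL);

    while( *n )
    {
        /* the second test (an addition to the listing's logic)
           keeps a trailing underscore from reading past the end;
           such an underscore is copied as-is */
        if( *n=='_' && *(n+1) )
        {
            n++;              /* skip the underscore */
            *(camel+c) = toupper((unsigned char)*n);
        }
        else
        {
            *(camel+c) = *n;
        }
        n++;
        c++;
    }
    *(camel+c) = '\0';        /* cap the string */

    return(camel);
}
```

Calling snake_to_camel("search_for_node") yields a buffer holding searchForNode. Pointer n advances by two whenever an underscore is skipped, while offset c advances by one for every character stored.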

For the camelCase to snake_case conversion, I allocate memory based on double the size of the original string: v[x] = malloc( strlen(variable[x]) * 2 + 1 ); This is a bit of overkill, which I’ll address in next week’s Lesson.

As with the other conversion, variable n plows through the original string while variable c helps build the new string in the allocated storage.
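The same idea works in the other direction. The sketch below wraps the camelCase branch in a function with the hypothetical name camel_to_snake(). It keeps the doubled allocation from the listing and, like the listing, assumes the name starts with a lowercase letter.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical helper: returns a newly allocated snake_case
   copy of a camelCase name, or NULL if allocation fails;
   the caller frees the result */
char *camel_to_snake(const char *camel)
{
    const char *n = camel;    /* walks the original string */
    int c = 0;                /* offset into the new string */
    char *snake;

    /* worst case: every character gains an underscore */
    snake = malloc( strlen(camel) * 2 + 1 );
    if( snake==NULL )
        return(NULL);

    while( *n )
    {
        if( isupper((unsigned char)*n) )
        {
            *(snake+c) = '_';     /* insert the separator */
            c++;
            *(snake+c) = tolower((unsigned char)*n);
        }
        else
        {
            *(snake+c) = *n;
        }
        n++;
        c++;
    }
    *(snake+c) = '\0';        /* cap the string */

    return(snake);
}
```

Here offset c is the one that jumps ahead: it advances twice for each capital letter, once for the underscore and once for the lowercase character, while n moves one character at a time.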

After each string is built, it’s capped with the null character as termination: *(v[x]+c) = '\0';

The output is the same as for the original solution, except that the new strings are stored and not output directly. I suppose this approach is better, though more improvement is possible. I cover this step in next week’s Lesson.
