From String to Binary

A function that reads a string of 1s and 0s to generate an integer value output just screams to be called binput(). Yet, because my new version of the binbin() function (to display a binary value as a string) is called binString(), I decided to call its companion function stringBin(). Sorry to disappoint you.

The stringBin() function works to process each character in an input string one after the other. The returned int value is initialized to zero, which means the processing needs only to set bits: When a '1' is encountered in the string, a bit is set; when a '0' is encountered, the bit is skipped or left at its current, zero value.

The bitwise OR operator, | (the pipe), sets the bits. I wrote a post on this topic a while back. The overall process works similarly to how bits are tested in the binString() function, which uses the bitwise AND operator, &. Here’s the meat of the function:

while( *b != '\n' )
{
    n <<= 1;
    bit = *b=='1' ? 1 : 0;
    if(bit)
        n |= 0x0001;
    b++;
    if( *b=='\0')
        break;
}

The input string is referenced by pointer b. Variable n holds the final integer value. It's shifted to the left one notch, which has no effect on the initial value (zero) for the while loop's first spin.

The ternary operator tests the character referenced by b. If it's a '1', variable bit is assigned the value 1, zero otherwise. At this point, I could do further testing to ensure that character *b is a '1' or '0', but like the scanf() function, invalid input is interpreted as zero. Remember, the scanf() function is handy, but not thorough.
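For comparison, the further testing mentioned above might look something like this minimal sketch. The function name stringBinStrict() and its -1 error return are hypothetical, chosen only for illustration. The sketch also assumes input of 16 digits or fewer, so a valid result is never negative:

```c
/* Hypothetical strict variant of stringBin(): returns -1 when the
   string holds anything other than '0' or '1' before the newline
   or the terminating null character */
int stringBinStrict(const char *bin)
{
    const char *b = bin;
    unsigned n = 0;

    while( *b != '\n' && *b != '\0' )
    {
        if( *b != '0' && *b != '1' )
            return(-1);         /* reject invalid input */
        n <<= 1;                /* make room for the next bit */
        if( *b == '1' )
            n |= 0x0001;        /* set the right-most bit */
        b++;
    }

    return( (int)n );
}
```

With this version, the string "4321" returns -1 instead of being read as 0001. An empty string still returns zero.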

If the character '1' was encountered and variable bit is equal to 1, the statement n |= 0x0001 is executed and the right-most bit in n is set. Pointer b is then incremented. A test is made for the null character (string termination), then the loop repeats with n shifted left one notch.

After the while loop stops, bits in variable n are set to match the '1' characters positioned in the string b.
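To watch the value build up one character at a time, here is a small tracing sketch. The traceBin() function is a hypothetical helper, not part of the original code. It performs the same shift-and-set steps but tests for the newline and the null character together at the top of the loop, and it prints n after each character:

```c
#include <stdio.h>

/* Hypothetical helper: same shift-and-set steps as stringBin(),
   but prints n after each character is processed */
unsigned traceBin(const char *bin)
{
    const char *b = bin;
    unsigned n = 0;

    /* stop at the newline or the terminating null character */
    while( *b != '\n' && *b != '\0' )
    {
        n <<= 1;                /* make room for the next bit */
        if( *b == '1' )
            n |= 0x0001;        /* set the right-most bit */
        printf("'%c' -> n = %u\n",*b,n);
        b++;
    }

    return(n);
}
```

Calling traceBin("1011") prints the values 1, 2, 5, and 11 in turn. Each shift doubles the running total, and each '1' adds one to it.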

Here's the full code:

#include <stdio.h>

int stringBin(char *bin)
{
    char *b;
    unsigned n,bit;

    b = bin;
    n = 0;

    while( *b != '\n' )
    {
        n <<= 1;
        bit = *b=='1' ? 1 : 0;
        if(bit)
            n |= 0x0001;
        b++;
        if( *b=='\0')
            break;
    }

    return(n);
}

int main()
{
    char input[18];
    int v;

    printf("Type up to a 16-digit binary value: ");
    /* exit if no input could be read */
    if( fgets(input,18,stdin) == NULL )
        return(1);
    v = stringBin(input);
    printf("That's %d\n",v);

    return(0);
}

And some sample runs:

Type up to a 16-digit binary value: 1111111111111111
That's 65535

Type up to a 16-digit binary value: 01110000110000001
That's 57729

Type up to a 16-digit binary value: 1111
That's 15

Type up to a 16-digit binary value: 11110000
That's 240

The code works when the string is shorter than 16 characters, accurately reflecting the value. And for nonsense values, the code tries:

Type up to a 16-digit binary value: hello
That's 0

Type up to a 16-digit binary value: 4321
That's 1

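The code can also overflow: I set the buffer and input size to account for a 16-character string, plus the newline and null character. Still, it's possible to type more than 16 digits. The fgets() function stores at most 17 characters, so the buffer itself is never overrun, but a 17th digit is processed like any other. The answer is displayed properly because an int is typically wider than 16 bits, but it's technically an overflow of the 16-digit limit. You can fix this issue on your own, if you like.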
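One possible fix is to count digits inside the loop and stop after the sixteenth. The sketch below assumes that "fixing" means silently ignoring any extra digits; rejecting the input outright is another reasonable choice. The name stringBin16() is hypothetical:

```c
/* Hypothetical variant of stringBin() that converts at most
   16 binary digits and ignores whatever follows them */
unsigned stringBin16(const char *bin)
{
    const char *b = bin;
    unsigned n = 0;
    int digits = 0;

    while( digits < 16 && *b != '\n' && *b != '\0' )
    {
        n <<= 1;                /* make room for the next bit */
        if( *b == '1' )
            n |= 0x0001;        /* set the right-most bit */
        b++;
        digits++;
    }

    return(n);
}
```

Given the 17-digit string 01110000110000001, this version converts only the first 16 digits and returns 28864. Any characters that fgets() left unread remain in the input stream; this sketch makes no attempt to discard them.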

It's rare that I've needed binary input for any of my code. But I wrote the stringBin() function anyway. You never know.

3 thoughts on “From String to Binary”

  1. I find I need to convert between bases occasionally, and usually throw something together in a hurry thinking “I’ll go back later and do something more efficient and elegant” but of course I never do!

    Can I make a suggestion – how about a post on converting to/from hexadecimal? And if you’re feeling ambitious how about base 64?

  2. Sure, I can do hex, which is easier!

    I’m intrigued about base 64! I’ll read up on it; apparently my nerd reading list must be improved.

  3. I had to base 64 encode something years ago (can’t remember why) and I was mostly using C# and .NET at the time so just used the relevant .NET class.

    I remember looking into it briefly at the time and finding that although 0-9 and the 52 lower and upper case letters were used for the first 62 values, there were several “standards” in use for the last two characters.
