Wide Characters and Unicode, Part III

Programming is a fun hobby because once you unlock and open a door, you immediately find another locked door. Normal humans would experience frustration, but a good programmer enjoys a challenge and even sees the humor in the situation. A case in point is learning how to program wide characters in C: Once you think you’ve cleared a huge hurdle, you find another, taller one right behind it.

For example, you think you understand wide characters and use the putwchar() function with ease. (See last week’s Lesson.) Then you try using the printf() equivalent, wprintf(), and you’re back to square one.

Don’t worry, I’ve been there, too.

The issue with the wprintf() function, as well as the wide-character input function wscanf(), is that the format string passed to the function must itself be a wide-character string. Unless you work with wide characters all the time, the solution isn’t apparent, and the error message doesn’t help.

Just tell me the solution!

You must specify wide-character strings and wide-character placeholders to make the wide-character input and output functions work. A wide-character string is prefixed by the letter L:

L"I am a wide string"

Prefixing a quoted string with an L (for long) makes it a wide string literal: each character is stored as a wchar_t integer value, a wide character, instead of as a single char.

The placeholder for a wide character is %lc, with a lowercase L before lowercase C. This placeholder is used in both the wprintf() and wscanf() functions.

#include <locale.h>
#include <wchar.h>

int main()
{
    wchar_t suits[4] = {
        0x2660, 0x2665, 0x2663, 0x2666
    };
    wchar_t s;
    int x;

    setlocale(LC_CTYPE,"en_US.UTF-8");

    wprintf(L"Enter suit: ( ");
    for(x=0;x<4;x++)
    {
        wprintf(L"%lc ",suits[x]);
    }
    wprintf(L"): ");

    wscanf(L"%lc",&s);
    wprintf(L"Suit set to %lc\n",s);

    return(0);
}

The wprintf() function at Line 17 uses the formatting string L"%lc " to output each wide character stored in array suits[]. The array is defined at Line 6 as the type wchar_t, wide characters. The formatting string must be composed of wide characters, so it’s prefixed with an L.

The wscanf() function at Line 21 reads input for a wide character, though you can also type plain ASCII text. Again, the input format string is prefixed with an L, and the %lc placeholder is used. Input variable s is of the wchar_t type.

Finally, Line 22 should make sense by now, with the wprintf() function using various wide-character doodads.

The sample run:

Enter suit: ( ♠ ♥ ♣ ♦ ): ♠
Suit set to ♠

Above, I copied and pasted the spade character because the keyboard lacks a spade character key and I don’t know any other ways to type a wide character in a terminal window.

In next week’s Lesson, I cover wide-character input functions beyond wscanf().

4 thoughts on “Wide Characters and Unicode, Part III”

  1. On my Ubuntu-based distro (gcc --version = (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) the code snippet gave me a few errors but the following works:

    #include
    #include

    int main()
    {
    wchar_t suits[4] = { 0x2660, 0x2665, 0x2663, 0x2666 };
    wchar_t s;
    int x;

    setlocale(LC_CTYPE,”en_US.UTF-8″);
    wprintf(L”Enter suit: ( “);
    for (x=0;x String Value, Type = REG_SZ.

  2. Bruce, a lot of your code was gobbled by HTML. The posting guidelines offer some suggestions: http://c-for-dummies.com/blog/?page_id=1274

    The biggest offenders are the less-than and greater-than characters, which must be specified using their ampersand-prefixed codes.

    I would enjoy seeing your solution. If you have further trouble posting it, email it to me and I’ll add it myself. Thanks!

  3. One more time … On my Ubuntu-based Linux (gcc --version = (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) the code snippet gave me a few errors but the following works:

    #include <wchar.h>
    #include <locale.h>

    int main()
    {
        wchar_t suits[4] = { 0x2660, 0x2665, 0x2663, 0x2666 };
        wchar_t s;
        int x;

        setlocale(LC_CTYPE,"en_US.UTF-8");

        wprintf(L"Enter suit: ( ");
        for (x=0;x<4;x++) {
            putwchar(suits[x]);
            putwchar(' ');
        }
        wprintf(L"): ");

        wscanf(L"%lc", &s);
        wprintf(L"Suit set to %lc\n", s);

        return(0);
    }

    For Unicode entry:
    On Linux press Shift+Ctrl+u, then enter the hexadecimal code, then press Enter.

    On Windows 7 press and hold the Alt key, then press ‘+’ on the number pad, then enter the hexadecimal code, then release Alt.
    +++
    Surprise! This requires the following registry change 🙁
    Add HKEY_CURRENT_USER\Control Panel\Input Method,
    set EnableHexNumpad to ‘1’
    If adding this key use New > String Value, Type = REG_SZ.

  4. I can confirm that your code also works on my Ubuntu Linux. Thank you so much for the update, as well as for the Windows 7 tip.
