More Than One String in a String

I’ve seen some oddball constructions in C. You may have as well, especially if you enjoy reading obfuscated C. Yet, the weirdness I just witnessed came from an online C course I was browsing. I’d never seen it before.

When you declare a string, you typically initialize a char array with a string literal, like this:

char string[] = "Hello!\n";

Array string[] contains the string literal, the characters enclosed in double quotes, complete with a terminating null character added automatically by the compiler. This statement represents one way to declare a string. It’s common.

But what about the construction in this code:

2021_10_23-Lesson.c

#include <stdio.h>

int main()
{
    char long_string[] = 
        "Hello!\n"
        "This is a long string\n"
        "How will it format?\n";

    printf("%s",long_string);

    return(0);
}

Array long_string[] contains a single string, though the string itself is defined as three separate string literals, each appearing on a line by itself, each enclosed in double quotes. This type of declaration is valid in C. I’ve never seen it before.

Check the listing again: You see that there are no commas between the strings. The array is a char array, not an array of pointers to strings. It’s just that the long string is split across several lines. The compiler joins the literals, one after the other, with only a single null character at the end.

Here is the output:

Hello!
This is a long string
How will it format?

What happens if you scrunch the entire expression onto one line?

char long_string[] = "Hello!\n" "This is a long string\n" "How will it format?\n";

The thing still compiles. It runs. The output is the same. After all, what’s missing between the above statement and the way it appears in the full code earlier is just whitespace. So does this mean a single string can be composed of multiple string chunks?

Yes.

According to the C standard:

Adjacent string literal tokens are concatenated.

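This rule applies only to string literal tokens that sit next to each other in the source code, and the joining happens during translation, before the program ever runs. It doesn’t mean you can concatenate string variables as is done in other languages, such as: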

string1 = string_a + string_b

The rule does, however, mean that you can use multiple string literals to declare a single string. The main advantage I can see is that you can spread a long string across several source lines without writing one overlong line or resorting to backslash line continuations. For example:

char sonnet18[] =
    "Shall I compare thee to a summer’s day?\n"
    "Thou art more lovely and more temperate:\n"
    "Rough winds do shake the darling buds of May,\n"
    "And summer’s lease hath all too short a date;\n";

The type of formatting shown above might be easier to type than trying to cram everything into a single, long string literal. Just remember not to separate the string literals with commas, and that the entire construction terminates with the required semicolon. It’s weird, yes, but it works.
