The alloca() Function

Any memory allocated in a function is retained when the function returns, unless you first free it. This rule applies to all memory allocation functions — except one.

Of course, memory allocated in the main() function is freed when it returns (the program exits). This is the reason why I’m often lazy and forget to free memory at the end of the main() function, a sin for which I repent. But an exception to the memory allocation/freeing rule is the alloca() function.

Like an auto variable, the alloca() function’s scope is local to the function it’s used in: Memory is allocated, but released when the next function is called. It’s not released upon return from the function (though this aspect may be undefined), but when you call another function the memory alloca() allocated is gone.

To demonstrate the alloca() function, I first wrote code that uses malloc() to allocate and use a buffer:

2024_03_09-Lesson-a.c

#include <stdio.h>
#include <stdlib.h>

char *astring(void)
{
    const int size = 12;
    char *a;
    int x;

    a = malloc(size+1);
    for( x=0; x<size; x++ )
        *(a+x) = 'A';
    *(a+x) = '\0';

    printf("Allocated string: '%s'\n",a);

    return a;
}

void header(void)
{
    puts("After temporary allocation");
}

int main()
{
    char *alpha;
    puts("Temporary allocation");
    alpha = astring();

    header();
    printf("Allocated string: '%s'\n",alpha);
    free(alpha);

    return 0;
}

The astring() function allocates 13 bytes of char storage, which is assigned characters and output. I don’t test the allocation for failure, which is something you must do in your own code. Here, allocating 13 bytes shouldn’t be an issue.
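The failure test skipped above could look like the following minimal sketch. The name astring_checked() and its size parameter are assumptions made for illustration; they are not part of the original listing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of astring(): returns a heap-allocated string of
   `size` 'A' characters, or NULL if malloc() fails.
   The caller is responsible for calling free() on the result. */
char *astring_checked(size_t size)
{
    char *a;

    assert(size < SIZE_MAX);    /* size + 1 must not wrap around */

    a = malloc(size + 1);       /* +1 for the terminating null character */
    if (a == NULL)              /* allocation failed: let the caller decide */
        return NULL;

    memset(a, 'A', size);       /* fill the buffer, as the loop above does */
    a[size] = '\0';

    return a;
}
```

A caller would test the return value before using the string, for example by printing an error message and exiting when astring_checked(12) returns NULL.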

The main() function saves the pointer returned from astring() in variable alpha. Before outputting the string, the header() function is called. Yes, this is weird code, but it’s designed to demonstrate how the alloca() function works: A second function must be called to free the memory (according to the docs). Here’s the output:

Temporary allocation
Allocated string: 'AAAAAAAAAAAA'
After temporary allocation
Allocated string: 'AAAAAAAAAAAA'

To update the code with alloca() instead of malloc(), these changes are made: stdlib.h is replaced by alloca.h; malloc() is replaced by alloca(); and the free() statement is removed from the main() function. The updated version of the code is available on GitHub.

Here is the output when clang builds the program:

Temporary allocation
Allocated string: 'AAAAAAAAAAAA'
After temporary allocation
Allocated string: ''

The string is zapped after the call to the header() function.

Being curious, I ran a few tests to confirm how the alloca() function works. If I don’t call the header() function, the allocated string is output:

Temporary allocation
Allocated string: 'AAAAAAAAAAAA'
After temporary allocation
Allocated string: 'AAAAAAAAAAAA'

I had assumed that the allocated storage was released when the function call returned. The man page says that alloca()’s memory is released when another function is called, which is why the original code called the header() function.

When I tested the code with the GNU compiler, the memory doesn’t seem to be released. Here is the program’s output when built with gcc:

Temporary allocation
Allocated string: 'AAAAAAAAAAAA'
After temporary allocation
Allocated string: 'AAAAAAAAAAAA'

Though it looks like the allocated memory is intact, if you attempt to pass the pointer to free() you get an invalid pointer message as the program runs. This error message appears for programs built by both clang and gcc. So I guess the memory is released when built with gcc but something quirky is keeping the contents around. It could just be my own system.

I can’t think of an application where the alloca() function is necessary, especially given that memory can always be freed. Yet this function is available in some C libraries and must have some purpose.
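One plausible purpose, offered as a hedged sketch rather than a recommendation, is a small scratch buffer that is needed only while a single function runs. The helper names median_of() and compare_ints() and the 1024-element cap are assumptions made for this illustration. The sketch relies on the behavior described in the first comment below: the alloca() storage stays valid for the rest of the function, including across the qsort() call, and goes away when the function returns.

```c
#include <alloca.h>   /* glibc; other systems may declare alloca() elsewhere */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparison callback for int values */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Hypothetical helper: returns the median of n ints without modifying v.
   The sorted copy lives in alloca() storage, so there is nothing to free. */
int median_of(const int *v, size_t n)
{
    int *scratch;

    /* arbitrary cap (an assumption) to keep the stack use small */
    assert(v != NULL && n > 0 && n <= 1024);

    scratch = alloca(n * sizeof *scratch);
    memcpy(scratch, v, n * sizeof *scratch);
    qsort(scratch, n, sizeof *scratch, compare_ints);

    return scratch[n / 2];      /* upper median when n is even */
}
```

The scratch pointer is never returned or stored anywhere that outlives the call; only the computed value leaves the function.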

5 thoughts on “The alloca() Function”

  1. To understand how alloca() works (on x86) one has to look at how stack frames are set up. When the compiler sees a function like char *astring(void) it will—if the cdecl “C” calling convention is in effect—essentially generate the following assembly code (shown for 32-bit / as described on Wikipedia):

    astring:  ; function label
       ; function prologue:
       push  ebp       ; backup EBP of calling function
       mov   ebp, esp  ; EBP points to its own backup
       sub   esp, 12   ; subtract size of local variables from ESP

       ; a = alloca(12+1);
       sub   esp, 13   ; move ESP down (at least) another 13 bytes

       ; function epilogue:
       mov   esp, ebp  ; set ESP to the value it had after ‘push  ebp’
       pop   ebp       ; restore callerʼs EBP
       ret             ; jump to saved return address
    Conceptually, the only thing alloca() does is to move down the machineʼs stack pointer by the requested number of bytes and to return the resulting address (on the stack).

    By moving the stack pointer, the stack memory in between is essentially reserved. This will be undone during the functionʼs epilogue, when ‘mov esp, ebp; pop ebp; ret’ will restore the stack pointer to the value it had before the function in question was called.

    The description given above,

    »The man page says that alloca()ʼs memory is released when another function is called«

    is wrong in this regard. This memory will be released when the function (in which alloca() was called) returns.

    To illustrate this, I added a get_stack_pointer() function to the given 2024_03_09-Lesson-b.c, and modified the astring() function as follows:
    static size_t sp_reg;  /* last read ESP/RSP register value */

    char *astring (void)
    {
      const int size = 12;
      char *a;
      int x;

      sp_reg = get_stack_pointer ();
      printf ("stack pointer == %llX (before alloca)\n",
        (unsigned long long)sp_reg);
      a = alloca(size+1);
      sp_reg = get_stack_pointer ();
      printf ("stack pointer == %llX (after alloca)\n",
        (unsigned long long)sp_reg);

      for( x=0; x<size; x++ )
        *(a+x) = 'A';
      *(a+x) = '\0';

      printf("Allocated string: ʼ%sʼ @ %p\n", a, (void *)a);

      return a;
    }
    Here is a sample run of the modified code:

    user@debian:~/alloca$ bin/so-utf8-x64-debug/alloca 
    Temporary allocation
    stack pointer == 7FFCC19267A0 (before alloca)
    stack pointer == 7FFCC1926780 (after alloca)
    Allocated string: ʼAAAAAAAAAAAAʼ @ 0x7ffcc1926780
    After temporary allocation
    stack pointer == 7FFCC19267C0 (in main)
    Allocated string: ʼ�h���ʼ
    user@debian:~/alloca$
    As can be seen, the RSP register had a value of 0x7FFCC19267A0 before alloca(12+1); was invoked and 0x7FFCC1926780 afterwards. Between those two addresses all the (stack) memory lay that was reserved during the alloca() call.

    The above also illustrates that the stack pointer had a value of 0x7FFCC19267C0 back in main(). Taking into account that the stack grows downwards on x86, this shows that the alloca()-reserved memory was no longer valid at this point (which is why only garbled byte values are shown for the last "Allocated string:" output).

    alloca() can be useful because—just like with VLAs in C99—the stack memory one reserves is released on return to the calling function.

    All is not roses, however, because stack space is usually quite limited. Even on x86 every thread usually only has a few MiB of stack space to play with (by default, user space threads will have a 1 MiB stack on Windows and [usually] 8 MiB on Linux). Thus the data structures one allocates relying on alloca() (or VLAs) had better not be too large… otherwise, a »Segmentation fault« exception will result.
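The size concern raised in the comment above is often handled by using alloca() only below some threshold and falling back to malloc() otherwise. The following is a minimal sketch of that idea; the is_anagram() and compare_chars() names and the 4096-byte STACK_LIMIT are assumptions chosen for illustration, not values taken from the discussion.

```c
#include <alloca.h>   /* glibc; other systems may declare alloca() elsewhere */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STACK_LIMIT 4096    /* assumed threshold, chosen arbitrarily */

/* qsort() comparison callback for single bytes */
static int compare_chars(const void *a, const void *b)
{
    unsigned char x = *(const unsigned char *)a;
    unsigned char y = *(const unsigned char *)b;

    return (x > y) - (x < y);
}

/* Hypothetical helper: returns 1 if a and b are anagrams of each other,
   0 if they are not, and -1 if the heap fallback could not allocate. */
int is_anagram(const char *a, const char *b)
{
    size_t len;
    char *sa, *sb;
    int on_heap, result;

    assert(a != NULL && b != NULL);

    len = strlen(a);
    if (strlen(b) != len)
        return 0;

    /* small inputs use the stack, large ones the heap */
    on_heap = (len > STACK_LIMIT);
    if (on_heap) {
        sa = malloc(2 * len + 1);
        if (sa == NULL)
            return -1;
    } else {
        /* alloca() must be called here, not in a helper function:
           its storage is released when the calling function returns */
        sa = alloca(2 * len + 1);
    }
    sb = sa + len;              /* second half holds the copy of b */

    memcpy(sa, a, len);
    memcpy(sb, b, len);
    qsort(sa, len, 1, compare_chars);
    qsort(sb, len, 1, compare_chars);
    result = (memcmp(sa, sb, len) == 0);

    if (on_heap)
        free(sa);               /* only heap storage is freed explicitly */

    return result;
}
```

Unlike a VLA, storage obtained from alloca() inside the else branch remains valid after that block ends and until the function returns, which is why the buffer can still be used further down.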

  2. Thanks for the clarification. You’re correct about releasing the memory, which was my misread of the man page. But still, in my tests, the memory was still accessible until I made a second function call. It could just be my machine.

    Thanks for the code! Excellent examples. I admire your skill and appreciate the feedback.

  3. Thank you for these kind words—getting the aforementioned get_stack_pointer(); function to work was a fun challenge⁽¹⁾!

    »But still, in my tests, the memory was still accessible until I made a second function call.«

    I think this can be explained by the fact that the stack memory reserved by alloca() continues to have the same contents even after returning to the calling function. This memory will only be overwritten after one starts to call further functions (with the stack pointer again moving towards lower addresses during their execution, overwriting previous stack contents with function arguments, return address values and whatnot).

    ⁽¹⁾ For full transparency I should probably add that the ‘#if defined(_WIN32)’ part of my get_stack_pointer() implementation was inspired by this StackOverflow discussion:

    #if defined(_WIN32)
      #define NOINLINE __declspec(noinline)
    #else /*defined(__linux__)*/
      #define NOINLINE __attribute__((noinline))
    #endif

    static NOINLINE size_t get_stack_pointer (void)
    {
      #if defined(_WIN32)
        return ((size_t)_AddressOfReturnAddress() + sizeof(void *));
                                             /* … + sizeof(<return address>) */
      #elif defined(__linux__)
        register size_t sp asm ("sp");

        #if defined(_DEBUG) /* … + push __BP + sizeof(<return address>) */
        return (sp + 2*sizeof(void *));
        #else /* … + sizeof(<return address>) */
        return (sp + sizeof(void *));
        #endif
      #endif
    }

  4. It’s a cool way to dig deep into the bowels of the code. I downloaded and detarred your program just to see how you did it. Most impressive!

  5. Thank you for looking at my sample code—Iʼm flattered!

    One last thing, if I may: depending on the chosen optimization level (-O0 up to -O3) GCC will generate a function prologue (push __BP; mov __BP, __SP)… or wonʼt do so. Originally I tried to factor this in by a simple ‘#if defined(_DEBUG)’ check (with _DEBUG only being set for -O0), but wasnʼt really too happy about this hack.

    I have since refactored the code one last time and came up with the following solution:

    #if defined(_WIN32)
      #define NOINLINE __declspec(noinline)
      #define OMITFRAMEPTR  /* unnecessary on Windows */
    #else
      #define NOINLINE __attribute__((noinline))
      #define OMITFRAMEPTR __attribute__((optimize("-fomit-frame-pointer")))
    #endif

    static NOINLINE OMITFRAMEPTR size_t get_stack_pointer (void);

    static size_t get_stack_pointer (void)
    {
      #if defined(_WIN32)  /* … + sizeof(<return address>) */
        return ((size_t)_AddressOfReturnAddress() + sizeof(void *));
      #elif defined(__linux__)
        register size_t sp asm ("sp");
        return (sp + sizeof(void *));
      #endif
    }

    In my tests, this should now work properly, for x86-64 (64 bit) as well as x86 (32 bit) builds, regardless of any chosen optimization level.
