Weighted Random Numbers

Random numbers are useful when simulating information, and they add a degree of unpredictability to computer games. The problem programmers run into, especially with simulation, is that not everything in the real world is evenly distributed. In many cases, some values need to come up more often than others. The solution is to generate weighted random numbers.

A weighted random number has a bias. The numbers generated are random, but some values pop up more frequently than others.

As an example, say a program recreates the three states of existence for a cat: Playing, Eating, and Sleeping.

Yeah, yeah: The cat also hunts and performs other activities. Classify them all as “play” for the sake of this Lesson.

Say you were to code a cat simulator using a standard random number generator. Over time, it would spit out values in an even distribution; the cat would play, eat, and sleep in equal quantities. That’s not really reflective of the reality you’re trying to recreate.

After observing cats for most of my life, I’ve noted that kitty spends a majority of her time sleeping. Therefore, a weighted random number generator would produce more sleeping state results and fewer playing and eating state results.

To gather data for your cat simulator program, you apply for a government grant to fund a study of cat activities during a 24-hour period. During each hour, it’s noted whether the cat was playing, eating, or sleeping. Figure 1 illustrates the results.

Figure 1. Cat statistics, courtesy of a government grant.


For your code to reflect the proper values, you need to weight the results shown in Figure 1: The cat spends 25% of its time playing (6 hours), 12.5% of its time eating (3 hours), and the remaining 62.5% sleeping (15 hours). The rand() function’s output cannot be changed. Therefore, you must represent the data in a way that makes the results of the rand() function useful.

The solution I devised is based on the hours, not the activities. I created an array of 24 elements, each representing an hour in the day, and populated it with the proper quantities of playing, eating, and sleeping activities. The random number generator produces an index in the range of 0 to 23, which it does nicely, and the value fetched from the array at that index is weighted based on the research.

Here’s the code:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main()
{
    enum { PLAY, EAT, SLEEP };
    int hours[24] = {
        PLAY, PLAY, PLAY, PLAY, PLAY, PLAY,
        EAT, EAT, EAT,
        SLEEP, SLEEP, SLEEP, SLEEP, SLEEP,
        SLEEP, SLEEP, SLEEP, SLEEP, SLEEP,
        SLEEP, SLEEP, SLEEP, SLEEP, SLEEP
    };
    int x,h;

    srand((unsigned)time(NULL));
    puts("Kitty activity generator");
    puts("Five hours:");
    for(x=0;x<5;x++)
    {
        printf("Hour %d: Kitty is ",x+1);
        h = rand() % 24;
        switch(hours[h])
        {
            case PLAY:
                printf("playing\n");
                break;
            case EAT:
                printf("eating\n");
                break;
            case SLEEP:
                printf("sleeping.\n");
        }
    }

    return(0);
}

In Line 7, I used the oddball enum keyword to generate three constants, PLAY, EAT, and SLEEP, assigned the values 0, 1, and 2, respectively.

The hours[] array at Line 8 is populated with the constant values in direct proportion to the results observed in Figure 1: 6 PLAY values, 3 EAT values, and 15 SLEEP values. This distribution reflects the weighted random results needed.
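If the observed counts change, typing out the initializer by hand gets tedious. As a rough sketch only, the array could be filled from a table of counts instead. The fill_weighted() function and the ACTIVITY_COUNT constant are hypothetical names invented for this illustration; they aren't part of the program above.

```c
#include <stddef.h>

/* ACTIVITY_COUNT is a hypothetical extra constant: the number of activities */
enum { PLAY, EAT, SLEEP, ACTIVITY_COUNT };

/* Hypothetical helper: fill table[] with each activity repeated
   counts[activity] times, in order. Returns the number of slots
   written, or 0 if a count is negative or the counts don't fit. */
size_t fill_weighted(int *table, size_t size, const int counts[ACTIVITY_COUNT])
{
    size_t n = 0;
    size_t total = 0;
    int a, i;

    /* confirm the counts are sane and fit in the table */
    for (a = 0; a < ACTIVITY_COUNT; a++)
    {
        if (counts[a] < 0)
            return 0;
        total += (size_t)counts[a];
    }
    if (total > size)
        return 0;

    /* write each activity's run of slots */
    for (a = 0; a < ACTIVITY_COUNT; a++)
        for (i = 0; i < counts[a]; i++)
            table[n++] = a;

    return n;
}
```

Calling fill_weighted() with a 24-element array and the counts 6, 3, and 15 would produce the same layout as the hand-written initializer: six PLAY values, three EAT values, and then fifteen SLEEP values.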

At Line 23, the expression rand() % 24 yields a random value between 0 and 23, matching the number of elements in the array and the number of hours in a day. The random value stored in variable h isn’t weighted, but the array is.

The switch-case structure between Lines 24 and 34 spews out the weighted random results.

Here’s sample output:

Hour 1: Kitty is sleeping.
Hour 2: Kitty is sleeping.
Hour 3: Kitty is eating
Hour 4: Kitty is sleeping.
Hour 5: Kitty is sleeping.

These results reflect the observed data more realistically than the results of an evenly distributed random selection would.
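Five hours is too small a sample to judge the weighting. One way to check it, offered here as a minimal sketch, is to draw many random hours and count how often each activity comes up. The tally_activities() function and the ACTIVITY_COUNT constant are hypothetical names made up for this example. The function assumes a 24-element array laid out like hours[] above and uses the same rand() % 24 expression.

```c
#include <stdlib.h>

/* ACTIVITY_COUNT is a hypothetical extra constant: the number of activities */
enum { PLAY, EAT, SLEEP, ACTIVITY_COUNT };

/* Hypothetical helper: draw `trials` random hours from the
   24-element table and count each activity in tally[]. */
void tally_activities(const int hours[24], long trials, long tally[ACTIVITY_COUNT])
{
    long t;
    int a;

    /* start every counter at zero */
    for (a = 0; a < ACTIVITY_COUNT; a++)
        tally[a] = 0;

    /* same selection as the main program: an unweighted index
       into a weighted array */
    for (t = 0; t < trials; t++)
        tally[hours[rand() % 24]]++;
}
```

Over a large number of trials, the three counts should settle near the 6:3:15 proportions stored in the array. The exact figures vary from run to run and depend on the quality of your C library’s rand() function.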

In next week’s Lesson, I cover a better way to represent large quantities of weighted random numbers.
