Find the Duplicates – Solution

I devised two solutions for this month’s Exercise. The basic solution takes two steps; the more involved solution takes three.

The first step for both solutions is to fill the array and output its values. This process can be done in a single loop:

for(a=0;a<SIZE;a++)
{
    array[a] = rand() % 50 + 10;
    printf(" %2d",array[a]);
}
putchar('\n');
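For reference, the expression rand() % 50 + 10 yields values from 10 through 59. Here is a minimal sketch of the same loop wrapped in a function. The names random_value() and fill_array() are hypothetical, chosen only for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: one random value in the range 10 through 59 */
int random_value(void)
{
    return rand() % 50 + 10;
}

/* Fill array[] with random values and print them on one line */
void fill_array(int *array, int size)
{
    int a;

    for (a = 0; a < size; a++)
    {
        array[a] = random_value();
        printf(" %2d", array[a]);
    }
    putchar('\n');
}
```

The snippet above doesn't show seeding. Presumably the full program calls something like srand((unsigned)time(NULL)) once in main() before filling the array, but that is an assumption.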

For my first solution, the second step is to chomp through the array, much as a bubble sort does, to find duplicates.

In the following snippet, a while loop steps variable a through array[] from element zero up to the next-to-last element. Defined constant SIZE, the number of elements, is set to 15:

/* find and display duplicates */
puts("Duplicates:");
a = 0;
while( a < SIZE-1 )
{
    for(b=a+1;b<SIZE;b++)
    {
        if( array[a] == array[b] )
        {
            printf("%d\n",array[a]);
            /* no need to continue */
            break;
        }
    }
    a++;
}

Variable b scans elements from a+1 through SIZE-1 looking for matches. When one is found, it’s displayed and the for loop is broken. Then variable a is incremented and the process continues. Here’s sample output:

Array:
 26 59 47 41 37 21 47 42 50 28 30 14 21 32 22
Duplicates:
47
21

And here’s the output when the same value is repeated more than twice:

Array:
 56 16 43 49 22 45 57 41 49 11 18 49 24 12 49
Duplicates:
49
49
49

The code doesn’t recognize previously-reported values, so the output looks silly.
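One way to avoid the repeated reports without sorting is to skip any value that already appeared earlier in the array. The following is only a sketch of that idea, not the code from the first solution. The function names seen_before() and report_duplicates() are hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if value appears in array[0] through array[end-1] */
static int seen_before(const int *array, int end, int value)
{
    int i;

    for (i = 0; i < end; i++)
        if (array[i] == value)
            return 1;
    return 0;
}

/* Print each duplicated value once; return how many distinct values repeat */
int report_duplicates(const int *array, int size)
{
    int a, b, found = 0;

    for (a = 0; a < size - 1; a++)
    {
        /* a value seen at a lower index was already handled */
        if (seen_before(array, a, array[a]))
            continue;
        for (b = a + 1; b < size; b++)
        {
            if (array[a] == array[b])
            {
                printf("%d\n", array[a]);
                found++;
                /* no need to continue */
                break;
            }
        }
    }
    return found;
}
```

With the second sample array above, this version would print 49 a single time. The cost is an extra backward scan for each element, which is fine for 15 values but grows quickly for larger arrays.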

For my second attempt, I decided to quicksort the array before I scanned for duplicates. With the array sorted, duplicates appear next to each other. Therefore, the loop to find repeated values works differently than the bubble-sort-like loop in my first solution:

/* sort and display duplicates */
qsort(array,SIZE,sizeof(int),compare);
puts("Duplicates:");
for(a=0;a<SIZE-1;a++)
{
    b = 1;
    while( array[a] == array[a+b] )
    {
        d = array[a];
        b++;
        a++;
        /* prevent overflow */
        if( a + b >= SIZE)
            break;
    }
    /* b is >1 only for duplicates */
    if( b > 1 )
        printf("%d\n",d);
}

Variables a and b compare elements. The for loop propels variable a through the entire array. Variable b is set equal to 1 and a while loop compares array[a] and array[a+b]. If they match, the value of array[a] is saved in variable d. Both a and b are incremented so the comparison moves farther along the sorted array, where more copies of the same value might sit. Along the way, an if test confirms that a+b doesn’t reference elements beyond the end of the array.

When the while loop is completed, an if statement checks the value of variable b, which will be greater than one when a duplicate is found. If so, the value (in variable d) is output:

Array:
 36 18 23 41 55 46 36 10 18 45 32 11 28 43 10
Duplicates:
10
18
36

By sorting, I helped eliminate the problem of reporting duplicate duplicates found in my first solution.
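The qsort() call relies on a compare() function that the snippet doesn’t show. Below is a sketch of what such a function might look like, assuming ascending integer order, together with an alternative scan that measures each run of equal values and jumps past it. The name report_sorted_duplicates() is hypothetical, and the scan is a variation on the loop above, not a copy of it:

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed comparison callback for qsort(): ascending int order */
int compare(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    /* yields -1, 0, or 1 without risking subtraction overflow */
    return (x > y) - (x < y);
}

/* Sort the array, then print each repeated value once.
   Returns the number of distinct repeated values. */
int report_sorted_duplicates(int *array, int size)
{
    int a, b, found = 0;

    qsort(array, size, sizeof(int), compare);
    a = 0;
    while (a < size)
    {
        /* b becomes the length of the run that starts at a */
        b = 1;
        while (a + b < size && array[a] == array[a + b])
            b++;
        /* b is >1 only for duplicates */
        if (b > 1)
        {
            printf("%d\n", array[a]);
            found++;
        }
        /* jump past the whole run */
        a += b;
    }
    return found;
}
```

Because the bounds test a + b < size comes first in the inner while condition, array[a + b] is never read past the end of the array, and a run of any length is reported exactly once.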

I hope you coded some clever solutions. Give yourself bonus points if you also counted the number of duplicates found.
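For the bonus, one possible approach is a tally table, which is a different technique from either solution above. Since the values come from rand() % 50 + 10, they fall in the range 10 through 59, so a 50-element array of counters is enough. In this sketch the constants LOW and RANGE and the function count_duplicates() are hypothetical names:

```c
#include <stdio.h>

#define LOW 10      /* smallest value produced by rand() % 50 + 10 */
#define RANGE 50    /* number of possible values: 10 through 59 */

/* Print how often each repeated value occurs.
   Returns the number of distinct repeated values. */
int count_duplicates(const int *array, int size)
{
    int tally[RANGE] = {0};
    int i, found = 0;

    /* count every occurrence of every value */
    for (i = 0; i < size; i++)
    {
        int v = array[i] - LOW;

        /* ignore anything outside the assumed range */
        if (v < 0 || v >= RANGE)
            continue;
        tally[v]++;
    }

    /* report values that occurred more than once */
    for (i = 0; i < RANGE; i++)
    {
        if (tally[i] > 1)
        {
            printf("%d appears %d times\n", i + LOW, tally[i]);
            found++;
        }
    }
    return found;
}
```

This needs no sorting and makes a single pass over the data, but it works only because the range of possible values is small and known in advance.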

Click here to view my first solution on GitHub.

Click here to view my second solution on GitHub.

I’m using GitHub now to host code for this blog. The reason is that WordPress’s latest security update prohibits me from uploading *.c files as media. I’ve tried working in some patches and permissions, but nothing has been effective. You’ll also find Lesson files on my GitHub page: https://github.com/dangookin/C-For-Dummies-Blog

3 thoughts on “Find the Duplicates – Solution”

  1. A suggestion for a future post: how about a dedupe function which takes an array and returns a new one with distinct values only?

    I don’t know how your hosting works but the company I use gives me Plesk so I can bypass WordPress. Uploaded stuff goes to wp-content/uploads/[year]/[month] so I can stick anything there whether WordPress likes it or not.

    No idea why they block C files. It’s not like browsers are going to compile and run it behind your back 🙂

  2. Another suggestion – might be clearer to give source code descriptive filenames.

    Yet another one – as an alternative to backdooring C files you could zip them. WordPress hasn’t (yet) blocked .zip files.

  3. I could also upload the file as plain text, with the txt extension. Still, I’ve been meaning to get on github for a while.

    The filenames make more sense in my local file structure. 🙂
    Good suggestion on the dedupe function. Thanks!
