Parsing the Command Line I

About a year ago, I wrote a post on reading command line arguments. Programs receive them all the time, not only when run in a terminal window but also when launched from a graphical operating system. Knowing how to manipulate command line arguments is important.

One thing I didn’t mention in that earlier post is that the two arguments for the main() function don’t always have to be named argc and **argv. Those are merely the traditional names, probably first used by Brian Kernighan and Dennis Ritchie when they wrote their seminal book, The C Programming Language.

If you like, you can name the main() function’s arguments anything, as long as the first is an integer and the second is a pointer to an array of strings (declared as char ** or, equivalently, char *[]). To wit:

#include <stdio.h>

int main(int c, char **s)
{
    printf("This program is named %s\n",*s);

    return(0);
}

The code compiles, runs, and spews out the program name, which is argument zero on the command line.

A more devilish trick is being able to read command line arguments and determine whether or not they’re valid. This type of processing, officially known as parsing the command line, can occupy a great amount of the code. It gets really hairy, especially when dozens of options are available. Worse: The user can type them in any order, lump them together, forget some, or do all kinds of mischief. Truly, parsing a command line can be an art form.

In the following code, exactly one argument must be specified: hello. It must match exactly, including letter case. The code checks that an argument is present and then confirms that it’s the proper one.

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    if(argc != 2)
    {
        fprintf(stderr,"Please specify a valid argument\n");
        return(1);
    }

    if( strcmp(argv[1],"hello")==0 )
    {
        puts("Valid argument specified");
        return(0);
    }
    else
    {
        fprintf(stderr,"Invalid argument specified\n");
        return(1);
    }
}

The first argument on any command line is, by convention, the program name. So when another argument is specified, the argument count (argc above) is greater than one. Specific to this code, the second argument is required, so argc must equal 2 or the program outputs an error message and quits.

The single command line argument dwells as a string referenced by variable argv[1]. Remember that argv[0] is the program’s name. One thing that throws even experienced programmers is keeping in mind that argument 2 is accessed via element one in the argv[] array.

Above, the strcmp() function checks for the proper argument. If it’s specified, the program is pleased; otherwise, an error message is output. The error output goes to the stderr device, which is typically the display. You don’t want to use the puts() or printf() function for an error message, because both write to standard output, which can be redirected, so the message could be overlooked.

Of course, there’s little reason for a program to require a single, specific argument; arguments are usually options that affect the program’s processing or output. In next week’s Lesson I’ll cover how to parse multiple options.
