Introduction
The C language has fairly standardized conventions about how to process command-line arguments, which I summarize here. I will also give some advice on the most effective ways to do this.
Conventions for command-line arguments
Here are the conventions:
-
Optional command-line arguments have a dash (-) before them.
Optional command-line arguments are identified by placing a dash (-) before the optional argument’s name. For instance, the ls command in Unix will give a long-form output if the command-line argument -l is provided (the $ is the Unix prompt):
$ ls -l
total 48
-rw-rw-r-- 1 mvanier cs11 16668 Apr 1 01:23 c_style_guide.html
-rwxr-xr-x 1 mvanier cs11 2296 Apr 1 01:21 c_style_check
-rw-r--r-- 1 mvanier cs11 755 Apr 1 15:46 cmdline_args.html
-rw-r--r-- 1 mvanier cs11 8077 Feb 11 22:23 gdb.html
-rw-rw-r-- 1 mvanier cs11 8290 Sep 25 2001 make.html
In general, if an argument doesn’t have a dash in front of it, it’s not optional unless it’s an argument to another command-line argument (see below). Note that programs in DOS or Windows typically use a forward slash (/) for command-line options. For this course, we will use the dash only (which is the Unix convention).
-
Optional command-line arguments may be located anywhere in the argument list and in any order.
Don’t assume that your user will always put optional arguments before non-optional arguments, or will put optional arguments in a particular order. Doing this invariably leads to convoluted code which is hard to read and often doesn’t work. I’ll show you how to do it the right way later on this page.
-
Optional command-line arguments may themselves have arguments.
Sometimes an optional argument may have arguments of its own. These arguments don’t usually have dashes in front of them, and are often numbers. It’s as if you’re saying "you may not need to do this optional task at all, but if you do, you’ll need to know these other argument values as well". For instance, a sort program may have an optional argument which tells which kind of sort routine to use:
$ sort -method bubble words.txt
Here, the argument bubble is an argument to the -method option and specifies a bubble sort (a particular kind of sorting algorithm). If it wasn’t included the default might be to use quicksort (another sorting algorithm):
$ sort words.txt
Note that here you omit both the -method optional argument and its argument bubble. Fortunately for you, none of the programs in this track have optional arguments that themselves have arguments, but you will certainly see this and/or have to implement this eventually.
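As a rough illustration of the sort example above, here is a minimal sketch of how a program might look up the argument of a -method option and fall back to a default when the option is omitted. The function name find_method and its NULL-on-error convention are assumptions made for this sketch, not part of any real sort program.

```c
#include <string.h>

/*
 * Hypothetical helper: return the argument that follows "-method" in
 * argv, or 'default_method' if "-method" was not given at all.
 * Returns NULL if "-method" is the last argument, i.e. its own
 * argument is missing.
 */
const char *find_method(int argc, char *argv[], const char *default_method)
{
    int i;

    for (i = 1; i < argc; i++)  /* Skip argv[0] (program name). */
    {
        if (strcmp(argv[i], "-method") == 0)
        {
            if (i + 1 < argc)
            {
                return argv[i + 1];  /* The option's own argument. */
            }

            return NULL;  /* "-method" with nothing after it. */
        }
    }

    return default_method;  /* Option omitted: use the default. */
}
```

A caller would typically treat a NULL result as an invalid command line, and otherwise compare the returned string against the method names it supports.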
Exceptions to the conventions
Some programs don’t use the dash in front of optional arguments (the tar program is an example; it’s often invoked as tar xvf <filename> where xvf are optional arguments). This is not recommended. Some programs allow several single-letter options to be preceded by a single dash, e.g. ps -elf instead of ps -e -l -f. This is also not recommended, at least not for the programs in this track.
How to process command-line arguments
Command-line arguments are always represented as an array of strings. This array is called argv (for "argument vector") and there is also an integer called argc (for "argument count") which is the number of command-line arguments, including the program name. That’s why the main function looks like this:
int main(int argc, char *argv[])
Here, argc is declared to be an int, whereas argv is an array of char *'s (i.e. strings). Remember that argv[0] is the program’s name, so you normally won’t want to use that except in a usage statement (see below).
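To make the layout of argc and argv concrete, here is a small sketch that prints every element of the array. The helper name print_args is an assumption for illustration; it takes the output stream as a parameter only so that it is easy to try out.

```c
#include <stdio.h>

/* Hypothetical helper: print argc and every element of argv to 'out'. */
void print_args(FILE *out, int argc, char *argv[])
{
    int i;

    fprintf(out, "argc = %d\n", argc);

    for (i = 0; i < argc; i++)  /* Start at 0 to show the program name too. */
    {
        fprintf(out, "argv[%d] = %s\n", i, argv[i]);
    }
}
```

Calling print_args(stdout, argc, argv) as the first statement of main and running the program with a few different argument lists is a quick way to see exactly what the program receives.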
Your first task in main is to process the optional argument values, if any. The standard way to walk through the argv array is like this:
int i;
int quiet = 0;  /* Value for the "-q" optional argument. */
for (i = 1; i < argc; i++)  /* Skip argv[0] (program name). */
{
    /*
     * Use the 'strcmp' function (declared in <string.h>) to compare
     * the argv values to a string of your choice (here, it's the
     * optional argument "-q"). When strcmp returns 0, it means that
     * the two strings are identical.
     */
    if (strcmp(argv[i], "-q") == 0)  /* Process optional arguments. */
    {
        quiet = 1;  /* This is used as a boolean value. */
    }
    else
    {
        /* Process non-optional arguments here. */
    }
}
Note that the "-q" optional argument could be located anywhere on the command line and the program would still work. If the optional argument has arguments of its own the code is trickier:
int i;
int opt = 0;
int optarg1 = 0;
int optarg2 = 0;
for (i = 1; i < argc; i++)  /* Skip argv[0] (program name). */
{
    if (strcmp(argv[i], "-opt") == 0)  /* Process optional arguments. */
    {
        opt = 1;  /* This is used as a boolean value. */
        /*
         * The last argument is argv[argc-1]. Make sure there are
         * enough arguments.
         */
        if (i + 2 <= argc - 1)  /* There are enough arguments in argv. */
        {
            /*
             * Increment 'i' twice so that you don't check these
             * arguments the next time through the loop.
             */
            i++;
            optarg1 = atoi(argv[i]);  /* Convert string to int (<stdlib.h>). */
            i++;
            optarg2 = atoi(argv[i]);  /* Ditto. */
        }
        else
        {
            /* Print usage statement and exit (see below). */
        }
    }
    else
    {
        /* Process non-optional arguments here. */
    }
}
In some cases, command-line processing can get quite hairy. Fortunately for you, the above examples are more than sufficient for the kinds of programs we do in this course.
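One limitation of atoi is that it cannot report that its input was not a number: it simply returns 0 for a string like "abc". If that matters, a slightly more defensive conversion can be sketched with the standard strtol function. The helper name parse_int and its return convention are assumptions made for this sketch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert 's' to an int.  Returns 1 and stores
 * the value in '*result' on success.  Returns 0 (leaving '*result'
 * untouched) if 's' contains no digits, has trailing characters
 * after the number, or does not fit in an int.
 */
int parse_int(const char *s, int *result)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(s, &end, 10);  /* Base 10; 'end' marks where parsing stopped. */

    if (end == s || *end != '\0')  /* No digits at all, or trailing junk. */
    {
        return 0;
    }

    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    {
        return 0;  /* Out of range for an int. */
    }

    *result = (int) value;
    return 1;
}
```

In the loop above, a call such as parse_int(argv[i], &optarg1) could stand in for atoi(argv[i]), with a return value of 0 treated like any other invalid command line.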
Usage statements
Your program has to be able to handle the case when invalid command-line arguments are provided to it without crashing (core dumping etc.). The correct way to handle this is:
-
Display a usage statement.
-
Exit the program.
For instance, let’s say that your program expects exactly three arguments in addition to the program name, and can take another optional argument. You could write this:
if (argc < 4)
{
    fprintf(stderr, "usage: %s filename word count [-w]\n", argv[0]);
    exit(1);
}
There are several parts to this:
-
The usage message: it always starts with the word usage, followed by the program name and the names of the arguments. Argument names should be descriptive if possible, telling what the arguments refer to, like filename above. Argument names should not contain spaces! Optional arguments are put between square brackets, like -w above. Do not use square brackets for non-optional arguments! Always print to stderr, not to stdout, to indicate that the program has been invoked incorrectly.
-
The program name: always use argv[0] to refer to the program name rather than writing it out explicitly. This means that if you rename the program (which is common) you won’t have to re-write the code.
-
Exiting the program: use the exit function, which is defined in the header file <stdlib.h>. Any non-zero argument to exit (e.g. exit(1)) signals an unsuccessful completion of the program (a zero argument to exit (exit(0)) indicates successful completion of the program, but you rarely need to use exit for this). If you’re truly anal you can use EXIT_FAILURE and EXIT_SUCCESS (which are defined in <stdlib.h>) instead of 1 and 0 as arguments to exit.
If you have to write out a usage statement more than once, make it a separate function called (obviously) usage and pass it the program name (argv[0]) as an argument. Then call it from main whenever the program has invalid arguments.
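A separate usage function along these lines might look like the sketch below. The usage message is the one from the example above; the check_arg_count helper and the choice to have usage only print (leaving the call to exit to main) are assumptions made for this sketch.

```c
#include <stdio.h>

/* Print the usage message to stderr.  'progname' should be argv[0]. */
void usage(const char *progname)
{
    fprintf(stderr, "usage: %s filename word count [-w]\n", progname);
}

/*
 * Hypothetical helper: return 1 if there are at least three arguments
 * after the program name; otherwise print the usage message and
 * return 0.
 */
int check_arg_count(int argc, char *argv[])
{
    if (argc < 4)
    {
        usage(argv[0]);
        return 0;
    }

    return 1;
}
```

In main, this could be used as: if (!check_arg_count(argc, argv)) exit(1); with <stdlib.h> included for exit. Other checks that find a bad argument later on can call usage(argv[0]) in the same way before exiting.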
Dos and don’ts
-
Always print a usage message to stderr if the program receives incorrect arguments.
-
Don’t assume that optional arguments will be located in any particular place in the argument list. (This was discussed above.)
-
Don’t try to process all the command-line arguments in a single pass if it isn’t convenient to do so.
I’ve seen a lot of C code that tied itself in knots trying to process the entire argument list in one pass. Typically, the code has a dense nest of if statements to handle every possible combination of arguments in every possible order. This is completely unnecessary and is simply bad programming. Most program invocations have very few command-line arguments, so even if you just process one of them per pass through the argument list you still won’t be wasting much time.
Having said that, the command-line argument processing for lab 3 can be done in one pass through the argv array with no difficulty.
-
Don’t alter the argv array!
Some programmers do strange manipulations to the argv array involving pointer arithmetic, moving arguments around, trying to delete arguments, etc. The usual reason for this is to get rid of the arguments that have already been processed, particularly optional arguments. It’s really easy to screw up when doing this, and it’s never necessary, so don’t do it! It’s OK to copy some of the arguments to a separate array and/or separate variables if you need to.
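The last two points can be combined in one rough sketch: make one pass over argv for the optional arguments and a second pass for everything else, copying the pointers into a separate array and leaving argv itself untouched. The names struct options, parse_args and MAX_ARGS are all hypothetical, and the sketch assumes that any argument starting with a dash is meant to be an option (so a non-optional argument such as -5 would be rejected).

```c
#include <string.h>

#define MAX_ARGS 16  /* Assumed upper bound on non-optional arguments. */

/* Hypothetical container for the parsed command line. */
struct options
{
    int quiet;                   /* 1 if "-q" was given, else 0. */
    int nargs;                   /* Number of non-optional arguments. */
    const char *args[MAX_ARGS];  /* Copied pointers; argv is not modified. */
};

/*
 * Returns 1 on success, or 0 if an unknown option is found or there
 * are more than MAX_ARGS non-optional arguments.
 */
int parse_args(int argc, char *argv[], struct options *opts)
{
    int i;

    opts->quiet = 0;
    opts->nargs = 0;

    /* Pass 1: look only at the optional arguments. */
    for (i = 1; i < argc; i++)
    {
        if (strcmp(argv[i], "-q") == 0)
        {
            opts->quiet = 1;
        }
        else if (argv[i][0] == '-')
        {
            return 0;  /* Unrecognized option. */
        }
    }

    /* Pass 2: collect the non-optional arguments, in order. */
    for (i = 1; i < argc; i++)
    {
        if (argv[i][0] == '-')
        {
            continue;  /* Already handled in pass 1. */
        }

        if (opts->nargs == MAX_ARGS)
        {
            return 0;  /* Too many arguments for the array. */
        }

        opts->args[opts->nargs] = argv[i];
        opts->nargs++;
    }

    return 1;
}
```

A return value of 0 would normally lead to a usage message and an exit. The second pass costs almost nothing for realistic argument lists, and keeping the two concerns apart avoids the nest of if statements described above.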