CS 11 C track: coding style guide

Introduction

The C programming language offers a number of ways to format code. Many programmers abuse this freedom and write unreadable (and thus incomprehensible and unmaintainable) code. While there is more than one way to properly format code, here is a set of guidelines which have been found useful in practice. Note that marks will be taken off for poor formatting (more marks deducted as the term goes on). Some of these guidelines may seem amazingly anal, but they really make a difference when reading code. Remember: you are writing code not just for the compiler, but for other people to read as well. The other person reading your code will most likely be you six months from now, so making sure that your code is readable is extremely important.

In the following, each item has a code associated with it to the left of the description of the item. This code will be used to specify the problem when correcting your code. It is up to you to match the code with the item. Hopefully this will encourage you to read this style guide :-) As a general rule, the earlier items in a section are more important than the later ones and/or represent more common errors.

In order to make life easier on you, we are supplying you with an automatic C code style checker. It won’t catch all of these errors by any means, but it will catch a lot of them. You will be asked to run your code through the style checker before you submit it; if it fails, you will lose marks. The C style checker is written in Python, which should be available on any modern computer (or which is easily installed if not).

How to get the style checker working

Requirements

The style checker is an executable Python script which is intended to be run from a Unix-style terminal. Examples of Unix-style terminals are:

The macOS Terminal application (and improved alternatives such as iTerm2)
Any Linux terminal program.
Any Windows terminal which is running WSL (Windows Subsystem for Linux), which has to be installed first. A good terminal for Windows is Windows Terminal.

Also, you can run a terminal inside the Visual Studio Code (VS Code) editor. If you are on Windows and have WSL installed, you can use WSL as one of your terminal options inside VS Code.

Instructions

Getting the style checker

Download the style checker from this file onto your computer. The way to do this is to select the "Save Page As" option in the "File" menu (this works for Firefox; other browsers have similar commands). It will suggest you save the file under the name c_style_check and you should agree. This will put the style checker program into one of your directories (probably your home directory). DO NOT CUT AND PASTE THE CODE INTO A FILE; THAT WILL NOT WORK! Alternatively, you can right-click on the link to the file and select "Save Link As", which is simpler.

Installing the style checker

Start up a terminal program and use the cd (change directory) command to go to the directory containing the style checker you just downloaded. Make it into an executable program by typing:

$ chmod +x ./c_style_check

(where $ is the terminal prompt; yours may be different).

Make a subdirectory of your home directory called bin (which just means a location for programs; the name really doesn’t matter but it’s traditional). You can do this by typing

$ mkdir ~/bin

at the prompt (the tilde (~) character is shorthand for your home directory). Note: if you already have a ~/bin directory, obviously you don’t have to do this step.

Move the style checker into the newly-created bin directory:

$ mv c_style_check ~/bin

Setting up your path

Now comes the tricky part. You have to adjust your path so that it includes your bin directory. The path is just a list of directories that contain programs; it’s the way the computer knows where to find programs. There is a default path set up for you when you start your computer, but it may not include the bin directory (because most users don’t have one).

To adjust your path you have to do these steps.

First check to see if you already have the bin directory in your path. Type:

$ echo $PATH

at the prompt. If the result includes /home/<your-login-name>/bin (or /Users/<your-login-name>/bin on a Mac) then you’re all set and you can skip the rest of the steps in this section.

Otherwise, you need to adjust the path in your terminal initialization file. There are two kinds of "shells" (command interpreter programs) that are used in Unix-style terminals: one is called bash and the other is called zsh. On Windows or Linux, you will probably be running bash, whereas on a Mac you will probably be running zsh. To find out which one you are using, you can type this:

$ which bash
$ which zsh

If one of these gives a "not found" error message, you are using the other one. If both are found, type echo $SHELL to see which one is your login shell. If both give a "not found" error message, you need to talk to a TA. The initialization file for bash is called .bashrc; the one for zsh is called .zshrc. We’ll assume you are using bash in what follows; the changes for zsh are analogous. Check if you have a file called .bashrc in your home directory:

$ ls ~/.bashrc

If this file doesn’t exist, you might try this:

$ ls /etc/bash.bashrc

(or /etc/zshrc for zsh.) If this exists, copy it to your home directory:

$ cp /etc/bash.bashrc ~/.bashrc

If that file isn’t there, you need to see the TAs. Otherwise, continue.

Open up the .bashrc file in a text editor. Go to the end of the file and add this line:

export PATH=$PATH:$HOME/bin

Save the file, get out of the text editor and do this at the prompt:

$ source ~/.bashrc
$ hash -r

Now you’re all set. You will be able to use the style checker from any directory you’re in, and you’ll never have to download it again or do any of these steps again.

Using the style checker

To style check a file called e.g. foo.c, do:

$ c_style_check foo.c

You can also style check multiple files all at once:

$ c_style_check foo.c bar.c baz.c

Don’t be alarmed if there are a lot of errors reported; just go through the file and fix them. Some lines will probably have multiple style violations; you should fix all of them. You’ll probably be very annoyed with the style checker at first, but your code will become much more readable as a result. If you think you’ve found a bug in it, let us know at once; this is a (permanent) work in progress. Note that the style checker will sometimes be too stupid to know when it’s in the middle of a comment or a literal string, so it may report errors that aren’t really errors in those cases. If so, just disregard them.

The most common style mistakes

These mistakes occur so often that they’re almost universal. Therefore, please pay particular attention to avoiding them. Follow the links to get to the descriptions below.

[TABS] Using tab characters in your code.
[OPERATOR_SPACE] Not putting spaces between operators.
[COMMA_SPACE] Not putting a space after a comma.
[LINE_LENGTH] Writing lines longer than 80 characters.
[USAGE_STMT] Missing or inadequate usage statement.
[EMPTY_LINES] Using too many or too few empty (blank) lines in functions.
[COMMENTS_FULL_SENTENCES] Writing comments that are not full sentences.
[COMMENTS_GRAMMATICAL] Writing comments that are not grammatically correct or are misspelled.
[COMMENT_SPACE] Not putting a space after the open-comment symbol /* and/or before the close-comment symbol */.
[COMMENT_HEADER] Not writing a proper comment at the head of a function.
[FUNCTION_PROTOTYPES] Not writing prototypes for all the functions defined in a file.
[FUNCTION_BLANK_LINES] Not separating function definitions by blank lines.

Catalog of style mistakes

General

[TABS]

Never, ever, ever use the tab character (ASCII 0x9)! Different people use different tab settings in their editors, and code that looks just fine with a tab width of 2 becomes unreadable with a tab width of 8. Unfortunately, many text editing programs will stick in tab characters without making it obvious that they’re doing it.

Almost all text editors allow you to force tabs to actually print spaces. Some examples:

If you use emacs for text editing put the following lines in your .emacs or init.el startup file:

(setq c-mode-hook
  '(lambda ()
     (progn
        (set-variable 'indent-tabs-mode nil)
        ;; other customizations, if any, go here
        )))

This is actually Emacs Lisp code, but don’t worry about that. Then exit and restart emacs. Now when you hit the tab key while editing C code, emacs won’t actually put any tab characters in your code, but instead will just put in spaces.

If you’re using the vim or neovim editors instead of emacs ^[1], you can see all the tab characters by typing:

:set list

into the editor while in command mode. This will make all tab characters look like ^I (a circumflex accent followed by a capital I). This makes it easy to go through a file and replace all tabs by e.g. four spaces. Better still, you can put this into your ~/.vimrc file:

set expandtab

and when you hit the tab key, spaces will be printed instead.

With other editors (Atom, Sublime Text, Visual Studio Code) look in the preferences; there will almost always be an option to replace tabs with spaces. If your editor doesn’t do that, use a different editor!

If you don’t like removing tabs from your code manually, here’s a trick that will help. Let’s say you have a file called foo.c and you’ve run the style checker on it, and every other line has tabs in it. Just do this from the unix prompt ($ in this example):

$ sed -e 's/\t/    /g' < foo.c > foo.c.notabs
$ mv foo.c.notabs foo.c

and your file will no longer have any tabs. On the other hand, this can mess up the indentation, so you should go over it afterwards to make sure it looks presentable and add spaces if necessary. If you don’t, and the result is unreadable, you will probably have to redo it.

The sed in the command line is a program called sed (which means "stream editor"). It does simple editing on files on a line-by-line basis. ^[2] So when you type

sed -e 's/\t/    /g' < foo.c > foo.c.notabs
             ^^^^ 4 space characters here

it executes a command called s/\t/    /g on each line of the file foo.c, putting the results into a new file called foo.c.notabs. The command s/\t/    /g means "substitute (s), for every tab character (\t), four space characters (which is what’s between the last two / characters), and do it for every tab in the line (g, which means global)". Note that you have to type this in exactly as described or it won’t work.

If you want, you can use more or fewer than four space characters per tab. Most editors use eight space characters for a tab by default, so that might be a good alternative. That would look like this:

sed -e 's/\t/        /g' < foo.c > foo.c.notabs
             ^^^^^^^^ 8 space characters here

[OPERATOR_SPACE]

Use a single space to separate variable names from operators, i.e. write

a = b + c * d;

instead of

a=b+c*d;

The only exception to this rule is for array subscripts e.g.

b = a[i-1];  /* not a[i - 1] */

but you can put the spaces in here too if you want. Unfortunately the style checker currently complains if you don’t put the spaces in. Don’t worry about the warning in this case.

[COMMA_SPACE]

Always put a space after a comma. There are no exceptions to this rule.
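Here is a small sketch of the comma rule in a prototype, a definition and (in the header comment) a call. The function name `largest_of_three` is made up for this example. The version to avoid would read `int largest_of_three(int a,int b,int c)`.

```c
int largest_of_three(int a, int b, int c);


/*
 * largest_of_three:
 *    Return the largest of three integers.  A typical call looks like
 *    `largest_of_three(1, 5, 3)`, with a space after each comma.
 *
 *    Arguments:
 *    - a, b, c: the integers to compare
 *
 *    Return value: the largest of the three arguments.
 */

int largest_of_three(int a, int b, int c)
{
    int largest = a;

    if (b > largest)
    {
        largest = b;
    }

    if (c > largest)
    {
        largest = c;
    }

    return largest;
}
```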

[PAREN_CURLY_SPACE]

If you are using a formatting style where the opening curly brace of a block is on the same line as an if, while, or for statement (which we discourage; it’s better to put the curly brace on a separate line), make sure that there is a space between the close paren on the line and the open curly brace, e.g. do

for (i = 0; i < n; i++) {
    /* code goes here */
}

instead of:

for (i = 0; i < n; i++){
    /* code goes here */
}

because the latter is hard to read. Similarly, leave a space between an else keyword and an opening curly brace if they’re on the same line.

[LINE_LENGTH]

Don’t write lines that are longer than 80 characters. Long lines tend to be wrapped, or worse, to be truncated when printing out the source. Printing out source code is a valuable way to review your code. It is almost never necessary to have long lines, even for long strings; you can always break up a string like this:

printf("this is a really, really, really, really, really, really, "
       "really, really, really, really, really, really long string.\n");

and the two strings will be concatenated together. This will work for any number of consecutive strings. Note that this trick only works for literal strings, not for variables which contain (point to) strings.

[ANSI_VIOLATION]

For portability, you should restrict yourself to pure ANSI-compliant C code exclusively. Note that gcc will not do this for you. If you want to be safe you need to use several compiler flags:

gcc -Wall -Wstrict-prototypes -ansi -pedantic

and make sure that your program doesn’t generate any warnings. (Code which generates warnings will be severely penalized.)
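As a rough illustration of what these flags are for: ANSI C (C89) expects all declarations in a block to come before the first statement, and it has no `//` comments. The function below is our own example (the name `sum_array` is not from any assignment). It is written in the style that should compile without complaints under the flags above.

```c
int sum_array(int arr[], int size);


/*
 * sum_array:
 *    Add up the elements of an array.
 *
 *    Arguments:
 *    - arr:  the array to be summed
 *    - size: the number of elements in the array
 *
 *    Return value: the sum of the elements.
 */

int sum_array(int arr[], int size)
{
    /*
     * ANSI C: all declarations come first in the block.  Writing
     * `for (int i = 0; ...)`, or declaring `total` after a statement,
     * is only allowed in C99 and later.
     */
    int i;
    int total = 0;

    for (i = 0; i < size; i++)
    {
        total += arr[i];
    }

    return total;
}
```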

[MAGIC_NUMBER]

Avoid putting a large number that has no obvious relevance to the surrounding code into a file. This is known as a "magic number" and is often found when setting the size of arrays, e.g.:

int my_array[4096];  /* 4096 is a magic number */

The reason for avoiding this is twofold:

It’s not usually clear from the context what the significance of the number is.
The same number tends to occur several times in the file, which causes problems when you want to change the value.

The right thing to do is this:

#define BUFSIZE 4096  /* size of buffer */

...

int my_array[BUFSIZE];

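Alternatively, you can declare an enumeration constant. (A const int variable may look like the natural choice, but in ANSI C it is not a compile-time constant, so it can’t be used as an array size.)

enum { BUFSIZE = 4096 };  /* size of buffer */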

...

int my_array[BUFSIZE];

[USELESS_CODE]

Don’t put in code that has no function or no effect. If it’s code that was only used for debugging purposes, it should be removed before you submit your assignment.

[USAGE_STMT]

If a program is called with incorrect arguments, it should detect that and print a usage statement to the terminal. The usage statement should include the program name. The easiest way to do that is to use argv[0] i.e.

char usage[] = "usage: %s input_filename output_filename\n";

if (argc != 3)
{
    fprintf(stderr, usage, argv[0]);
    exit(1);
}

Note that the arguments have mnemonic names. Don’t write the usage message multiple times. If necessary, you can define a usage function:

void usage(char *progname)
{
    fprintf(stderr, "usage: %s input_filename output_filename\n", progname);
}

and then call it like this:

int main(int argc, char **argv)
{
    /* code omitted */
    if (/* arguments are incorrect */)
    {
        usage(argv[0]);
        return 1;
    }

    /* more code omitted */
    return 0;
}

Alternatively, you could put a call to the exit() function in the usage() function:

void usage(char *progname)
{
    fprintf(stderr, "usage: %s input_filename output_filename\n", progname);
    exit(1);
}

and then call it like this:

#include <stdlib.h>   /* declaration of exit() function */

int main(int argc, char **argv)
{
    /* code omitted */
    if (/* arguments are incorrect */)
    {
        usage(argv[0]);  /* no return needed */
    }

    /* more code omitted */
    return 0;
}

In this example, the return 1; line wasn’t needed because when exit(1) is called from the usage() function the program will exit with a return value of 1.

Also, make sure that you use fprintf and print to stderr (the error output stream) instead of using printf, which prints to stdout (the normal output stream).

As a general rule, any error that involves the user supplying invalid command-line arguments should give rise to a usage statement like the ones described above. You should try to make your usage statements comprehensive enough so that one statement will work for all such errors.

For more on the correct format of usage statements, see this page.

[STMTS_ON_LINE]

Never put more than one statement on a line. It makes for unreadable code. The only exception is in the for line of a for loop, which typically has three statements.
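For example, a swap squeezed onto one line as `tmp = *a; *a = *b; *b = tmp;` reads much better with one statement per line. The function name `swap_ints` below is just an illustration.

```c
void swap_ints(int *a, int *b);


/*
 * swap_ints:
 *    Exchange the values of two integers.
 *
 *    Arguments:
 *    - a, b: pointers to the integers to be swapped
 *
 *    Return value: none.
 */

void swap_ints(int *a, int *b)
{
    int tmp;

    tmp = *a;
    *a = *b;
    *b = tmp;
}
```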

[PRECEDENCE]

Use parentheses to make operator precedence explicit in all cases except two: the precedence of multiplication/division over addition/subtraction, and assignment statements.
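A classic case where the parentheses matter is mixing bitwise and comparison operators. The sketch below uses a hypothetical helper called `is_flag_clear`. By contrast, something like `a = b + c * d;` needs no extra parentheses under this rule.

```c
int is_flag_clear(unsigned int value, unsigned int mask);


/*
 * is_flag_clear:
 *    Test whether none of the bits in a mask are set in a value.
 *
 *    Arguments:
 *    - value: the value to be tested
 *    - mask:  the bits to look for
 *
 *    Return value: 1 if no bit of `mask` is set in `value`, else 0.
 */

int is_flag_clear(unsigned int value, unsigned int mask)
{
    /*
     * Without the inner parentheses this would be parsed as
     * `value & (mask == 0)`, because `==` binds tighter than `&`.
     */
    return ((value & mask) == 0);
}
```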

[EMPTY_LINES]

Do not put large numbers of empty lines (> 2) between code sections unless there is a clear need to distinguish different sections of the code. Conversely, do put an empty line between logical sections in a single function. An example of this is between the type declarations and the first line of actual code. Another example is at the end of a block in curly braces (though this is a judgment call). Long functions that have no blank lines in them are really hard to read.
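To make this concrete, here is a short function of our own invention (`count_in_range` is not part of any assignment). It has one blank line after the declarations and one after the loop, so the three logical sections stand out.

```c
int count_in_range(int arr[], int size, int low, int high);


/*
 * count_in_range:
 *    Count the array elements that lie between two bounds (inclusive).
 *
 *    Arguments:
 *    - arr:  the array to be examined
 *    - size: the number of elements in the array
 *    - low:  the lower bound
 *    - high: the upper bound
 *
 *    Return value: the number of elements in the range.
 */

int count_in_range(int arr[], int size, int low, int high)
{
    int i;
    int count = 0;

    for (i = 0; i < size; i++)
    {
        if ((arr[i] >= low) && (arr[i] <= high))
        {
            count++;
        }
    }

    return count;
}
```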

[BLOCK_CURLY_BRACES]

Use curly braces for the body of all if statements, even if the body is only a single statement. Do the same for else, else if, for, and while statements. The reason for this is twofold: first, it makes the code more readable, and second, it makes it easier to add printf statements for debugging in the body of the expression (which you will frequently have to do).
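Here is a sketch of why the braces help with debugging. The function name `count_negatives` is hypothetical. If the if statement had no braces, adding the fprintf line in front of `count++` would silently move `count++` out of the if, so it would run on every iteration. With braces, the extra line is harmless (and is easy to remove again before you submit).

```c
#include <stdio.h>

int count_negatives(int arr[], int size);


/*
 * count_negatives:
 *    Count the negative elements of an array.
 *
 *    Arguments:
 *    - arr:  the array to be examined
 *    - size: the number of elements in the array
 *
 *    Return value: the number of elements that are less than zero.
 */

int count_negatives(int arr[], int size)
{
    int i;
    int count = 0;

    for (i = 0; i < size; i++)
    {
        if (arr[i] < 0)
        {
            /* This is a temporary debugging line. */
            fprintf(stderr, "negative value at index %d\n", i);
            count++;
        }
    }

    return count;
}
```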

[MATCH_CURLY_BRACES]

If you are using a formatting style where the curly braces of a block are on separate lines (which we encourage), make sure that the columns of the curly braces match e.g. do this:

for (i = 0; i < n; i++)
{
    /* code goes here */
}

instead of:

for (i = 0; i < n; i++)
  {
    /* code goes here */
}  /* braces don't line up */

[CODE_ON_CURLY_BRACE_LINE]

Don’t put code on the same line as an open curly brace. For instance, this is bad:

if (a != 0)
{  a = b + c;  /* ugly */
   printf("a is now: %d\n", a);
}

Keep the curly braces on their own lines; this makes the code easier to read. Unfortunately, you often see code written like that in books about programming; the reason is that they have to cram as much code as possible onto a single page. You don’t. Instead, write this as:

if (a != 0)
{
    a = b + c;
    printf("a is now: %d\n", a);
}

[IF_FOR_WHILE_DO_SPACE]

Put a single space between the keywords if, for, while, or do and the opening parenthesis on the same line e.g. do

for (i = 0; i < n; i++)
{
    /* code goes here */
}

and not

for(i = 0; i < n; i++)
{
    /* code goes here */
}

The logic is that keywords are not function calls and should not look like them. Similarly, leave a space after a return keyword that is followed by an expression.

[BLOCK_ON_SINGLE_LINE]

Do not put an entire block on a single line, and most especially do not put it on the same line as an if, while, for etc. For instance, change

if (i < 10) { break; }

to

if (i < 10)
{
    break;
}

[INADEQUATE_INDENTING]

Lines within a block should be indented relative to lines outside a block. Also, don’t indent one space; two spaces is the bare minimum. Four spaces looks nice.

[INCONSISTENT_INDENTING]

Lines at the same level of a block should start at the same column.

[FOR_LOOP_COMPUTATIONS]

Do not try to do complex calculations in the testing or increment parts of for loops. Don’t try to impress everyone with how clever you are; clever code is a maintenance disaster.
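As an illustration, an array can be reversed with everything crammed into the loop header: `for (i = 0, j = size - 1; i < j; tmp = arr[i], arr[i++] = arr[j], arr[j--] = tmp);`. That works, but nobody will enjoy reading it. Here is a plainer version; the name `reverse_array` is ours, not part of any assignment.

```c
void reverse_array(int arr[], int size);


/*
 * reverse_array:
 *    Reverse the order of the elements of an array in-place.
 *
 *    Arguments:
 *    - arr:  the array to be reversed
 *    - size: the number of elements in the array
 *
 *    Return value: none.
 */

void reverse_array(int arr[], int size)
{
    int i;
    int j;
    int tmp;
    int half = size / 2;

    for (i = 0; i < half; i++)
    {
        j = size - 1 - i;

        tmp = arr[i];
        arr[i] = arr[j];
        arr[j] = tmp;
    }
}
```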

[VARIABLE_NAMES]

Make variable names descriptive as much as possible; avoid one or two character names unless it’s for something trivial like a loop index. It’s perfectly OK (and usually desirable) to have longer descriptive names for variables. When you do this with names that are actually multiple words, use one of two conventions:

the underscore convention: long_variable_name
the capwords convention: longVariableName

Either convention is OK as long as you’re consistent.
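As a small sketch using the underscore convention (all the names here are invented for the example), compare the function below with an equivalent one named `f` that takes `h`, `m` and `s` and multiplies by bare numbers.

```c
#define SECONDS_PER_MINUTE 60
#define SECONDS_PER_HOUR 3600

int total_seconds(int hours, int minutes, int seconds);


/*
 * total_seconds:
 *    Convert a duration in hours, minutes and seconds to seconds.
 *
 *    Arguments:
 *    - hours:   the number of whole hours
 *    - minutes: the number of whole minutes
 *    - seconds: the number of seconds
 *
 *    Return value: the total duration in seconds.
 */

int total_seconds(int hours, int minutes, int seconds)
{
    int seconds_from_hours = hours * SECONDS_PER_HOUR;
    int seconds_from_minutes = minutes * SECONDS_PER_MINUTE;

    return seconds_from_hours + seconds_from_minutes + seconds;
}
```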

[IMPLICIT_CONVERSIONS]

Avoid using implicit int to float or int to double conversions (or vice-versa) as much as possible. It’s hard to keep track of the types of the results otherwise, and C compilers tend not to be very strict about this, which often leads to unexpectedly wrong results. Instead, use explicit type casts when you want to convert an int to a double etc. For instance, this:

int a = 10;
double b;

b = a;  /* implicit conversion */

should be written as:

int a = 10;
double b;

b = (double) a;  /* explicit conversion */

Yes, it’s a bit more verbose, but it’s absolutely unambiguous.
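A common place where this bites is division. In the sketch below (the function name `average` is just for illustration), leaving out the casts would give an integer division, so 7 / 2 would come out as 3 before being converted to a double.

```c
double average(int total, int count);


/*
 * average:
 *    Compute the average of a set of integers from their sum.
 *
 *    Arguments:
 *    - total: the sum of the values
 *    - count: the number of values, which must not be zero
 *
 *    Return value: the average as a double.
 */

double average(int total, int count)
{
    /*
     * The casts force a floating-point division.  Without them,
     * `total / count` would be truncated to an integer first.
     */
    return (double) total / (double) count;
}
```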

Comments

[COMMENTS_FULL_SENTENCES]

This is the single most common style mistake. If a comment is a full sentence, its first word should be capitalized, unless it is an identifier that begins with a lower case letter (never alter the case of identifiers!), and it should end in a period. We prefer comments that are complete sentences. You should use two spaces after a sentence-ending period.

Bad:

/* go through the loop and make sure that all the array elements
 * have been set to zero */

Good:

/*
 * Go through the loop and make sure that all the array elements
 * have been set to zero.
 */

That wasn’t so hard, was it?

When you need to refer to identifiers in the code, put them inside backticks e.g.

/* The variable `nitems` represents the number of items in the stack. */

(This is a convention borrowed from a format called "markdown"; it looks a bit weird but it’s better than the alternatives.)

If a comment is very short, it doesn’t have to be a full sentence or end in a period e.g.

i = 1;  /* loop index */

This is called an "inline comment". Use these only when describing something i.e. in the above code snippet you’re saying "The variable i represents a loop index." (Actually, this comment should probably be omitted altogether, since it should be obvious what the loop index is.)

[COMMENT_GRAMMATICAL]

Comments should be grammatically correct. In particular, incorrect spelling is unacceptable. We hate to sound like your high school English teacher, but it’s a pain to read code with tons of spelling mistakes. Use a spell checker if you have to.

[COMMENT_SPACE]

Put a space after the open-comment symbol and before the close-comment symbol i.e. do this:

/* This is a comment that is easy to read. */

and not this:

/*This is a comment that is harder to read.*/

[COMMENT_C++]

Do not use C++ style comments i.e. comments that start with // and go to the end of the line. It is true that most C compilers (including gcc) accept them, and they are part of the C99 standard. But for this track, don’t use them.

[COMMENT_MULTI_LINE]

Use this style for multi-line comments:

/*
 * This is a multi-line comment.
 * Spiffy, isn't it?
 */

Most especially, do not use this style:

/* This is a bad way to write multi-line comments. */
/* You comment out every line individually. */
/* Ugly, isn't it? */

People who write comments this way may not be aware of the fact that comments can span more than one line. Well, they can, so take advantage of it.

[COMMENT_BLOCK]

Block comments generally apply to some (or all) code that follows them, and are indented to the same level as that code. Each line of a block comment starts with a * and a single space (unless it is indented text inside the comment). Paragraphs inside a block comment are separated by a line containing a single *. Block comments are best surrounded by a blank line above and below them (or two lines above and a single line below for a block comment at the start of a new section or function definition). We prefer to start and end block comments with a line containing a single *. In other words, a block comment looks like this:

/*
 * The first line comes after an empty line.
 *
 * Separate paragraphs are also separated by an empty line,
 * and there's an empty line at the end.
 *
 */

[COMMENT_NON_OBVIOUS]

Write comments for anything that isn’t completely obvious from the context. In particular, write comments for any tricky algorithm or code you are using. When in doubt, comment more rather than less.
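Here is the kind of code we have in mind: a one-line bit trick that is correct but far from obvious without the comment. The function name `is_power_of_two` is our own choice for this sketch.

```c
int is_power_of_two(unsigned int n);


/*
 * is_power_of_two:
 *    Test whether a number is an exact power of two.
 *
 *    Arguments:
 *    - n: the number to be tested
 *
 *    Return value: 1 if `n` is a power of two, otherwise 0.
 */

int is_power_of_two(unsigned int n)
{
    /*
     * A power of two has exactly one bit set.  Subtracting 1 clears
     * that bit and sets all the bits below it, so `n & (n - 1)` is
     * zero for a power of two and nonzero for any other positive
     * number.  Zero has no bits set, so it is checked separately.
     */
    return ((n != 0) && ((n & (n - 1)) == 0));
}
```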

[COMMENT_REDUNDANT]

Conversely, don’t make completely redundant comments, e.g.

i = 1;  /* Set i to 1. */

What constitutes redundancy is often a judgment call. If in doubt, comment more rather than less.

[COMMENT_MEANINGLESS]

Don’t make meaningless comments e.g.

/* i */
i = 1;

Don’t laugh; we’ve actually seen this sort of thing.

[COMMENT_HEADER]

You should almost always put a comment at the beginning of each function describing what it does. The only exception is when you have a series of very similar functions which are written out one after another, and where the first comment applies (suitably modified) to all of them. This kind of "header comment" (not to be confused with header files) is by far the most important kind of comment, because even if the person reading your code has no idea how a given function works, the header comment will at least tell them what it does and how to use it. You should state what each of the arguments represents and what the function returns. You may also want to describe the algorithm used, its efficiency, and any other relevant facts. Here’s an example:

/*
 * bubble_sort:
 *    This function takes an array and sorts it in-place using the bubble
 *    sort algorithm.  This algorithm has a time complexity of O(n^2)
 *    where `n` is the size of the array, which is not very efficient.
 *    Therefore, for large arrays use a more efficient algorithm such as
 *    quicksort.
 *
 *    Arguments:
 *    - arr:  the array to be sorted
 *    - size: the length of the array to be sorted
 *
 *    Return value: none.
 */

void bubble_sort(int arr[], int size)
{
    /* code */
}

[COMMENTS_CONSISTENT_WITH_CODE]

Comments that contradict the code are worse than no comments. Always make a priority of keeping the comments up-to-date when the code changes!

[COMMENT_INDENT]

Always indent your comments to the same degree as the surrounding code.

[COMMENT_ALIGN_INLINE]

Try to line up inline comments where convenient. In other words, don’t do this:

x = x + 1;      /* some cool comment about x */
y = y + 1;  /* some even cooler comment about y */

Instead, do this:

x = x + 1;  /* some cool comment about x */
y = y + 1;  /* some even cooler comment about y */

Some people like to line up the close-comment token as well. Use your own judgment.

[COMMENT_PRECEDING]

Do not write comments that apply to the preceding code if you can possibly avoid it. Try to write comments that refer to the current line of code or to the lines of code which immediately follow. For instance, this is bad:

int res;
/*
 * 'res' contains the result of the program.  It will normally be 0,
 * unless an error occurs, in which case it will be 1.
 */

and this is good:

/*
 * 'res' contains the result of the program.  It will normally be 0,
 * unless an error occurs, in which case it will be 1.
 */
int res;

This is really bad:

int res;  /* 'res' contains the result of the program.
           * It will normally be 0, unless an error occurs,
           * in which case it will be 1.
           */

We hope this is obvious, but we see it in students' submissions all the time. It’s OK to put a comment on the same line as the code, but only if the entire comment will fit on that line e.g.:

int res;  /* 'res' contains the result of the program. */

If what you have to say won’t fit on the line, put the comment on the lines above the line of code. (Don’t use a very long line to try to fit the comment on one line, of course; see [LINE_LENGTH] above.)

Functions

[FUNCTION_PROTOTYPES]

Always write function prototypes at the top of a file for every function whose definition is in that file. This is not only good documentation, it enables you to use these functions anywhere in the file without having to worry about putting them in strict definition order. In case you don’t know already, this is a function prototype:

int foo(double bar, char *baz);

It’s just the first line of a function (return type, name and argument list) followed by a semicolon instead of a body.
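To show why the definition order stops mattering, here is a sketch with two functions that call each other (both names are made up for the example, and this is a deliberately silly way to test parity). Because both prototypes appear at the top, `is_even` can call `is_odd` even though `is_odd` is defined later in the file.

```c
/* Function prototypes. */
int is_even(unsigned int n);
int is_odd(unsigned int n);


/*
 * is_even:
 *    Test whether a number is even.
 *
 *    Arguments:
 *    - n: the number to be tested
 *
 *    Return value: 1 if `n` is even, otherwise 0.
 */

int is_even(unsigned int n)
{
    if (n == 0)
    {
        return 1;
    }

    return is_odd(n - 1);
}


/*
 * is_odd:
 *    Test whether a number is odd.
 *
 *    Arguments:
 *    - n: the number to be tested
 *
 *    Return value: 1 if `n` is odd, otherwise 0.
 */

int is_odd(unsigned int n)
{
    if (n == 0)
    {
        return 0;
    }

    return is_even(n - 1);
}
```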

[FUNCTION_BLANK_LINES]

Please separate your function definitions by at least two blank lines. Otherwise it’s hard to find where a function definition begins.

[FUNCTION_STARTING_COLUMN]

Start the line that begins a function in column 0 (the leftmost column).

Finally…

Don’t worry if you can’t remember all of these rules; we don’t expect you to. At this point it’s more important that you develop an intuition for what is good and what is bad style, and if you aren’t sure, you can refer back to this page later.

1. you have good taste! :)

2. Actually, it has been proved that sed is a Turing-complete programming language, but that’s a long, weird story.