Introduction

Debugging C programs is often extremely challenging. The direct pointer manipulations permitted by the C language give rise to bugs that can’t happen in most other computer languages. Often, these bugs manifest themselves in strange ways, such as the program printing interesting messages like core dump or bus error with no additional information. This is the price we pay for the efficiency and low-level control that the C language provides.

Debugging is a big subject, and we can only scratch the surface here. In general, here are three approaches you can use for debugging:

  1. When you get a bug, put lots of printf statements in code likely to have caused the bug so that you can monitor the values of variables which may not be what they should be.

  2. Add lots of assert statements so that when something goes wrong the program halts right away instead of continuing. If you don’t know about assert, do man assert. We will talk more about this later in the course.

  3. Use a "debugger" like gdb to find out where your code went wrong.

These approaches are not mutually exclusive and almost every programmer uses a combination of all three (plus others). The first two methods are pretty self-explanatory. The third needs a bit more explanation, which we provide below. You can also do man gdb and/or info gdb to get much more information.

In what follows, we’ll assume you are working in a Unix-like environment such as Linux, macOS, or (under Windows) the Windows Subsystem for Linux or Cygwin.

GDB basics

GDB stands for GNU Debugger. It is an interactive environment in which you can run a C program in a controlled way (stopping it, stepping through it, and inspecting its variables), which makes it much easier to identify bugs.

To use gdb, do the following:

  • Compile your C program with the -g flag e.g.

$ gcc -Wall -Wstrict-prototypes -ansi -pedantic -g myprog.c -o myprog

(Note that we’re also using several warning-related options: -Wall -Wstrict-prototypes -ansi -pedantic. These make the compiler complain if your code isn’t ANSI-compliant or if it has other suspicious features. It’s a good habit to always use these options.) The -g option puts debugging information into the executable. Most importantly, it records which line of which source file each machine instruction came from, along with the names and types of your functions and variables. The source text itself is not copied into the executable; gdb reads it from the source files on disk, so leave them where they were when you compiled. This is what lets you examine the source code as the program executes (we’ll see how below).

  • Type gdb myprog (for the example above). This will start the interactive debugger. It’s basically an interpreter-like environment in which you can run your program line-by-line and do useful debugging tasks as well.

After starting the debugger, you will end up with this prompt:

(gdb)

From here, you have a choice of lots of commands. Type help to get a list of the command categories. Here are some of the most important commands:

  • run: runs the program

  • where: tells you where you are in the program when you have stopped at some point. Also tells you the call history of the program up to that point (i.e. which functions have been called to get you where you are).

  • p <variable>: prints the value of a variable

  • break <file>:<line>: causes the program to stop at a particular line in a particular source code file

  • break <function>: causes the program to stop when entering a particular function.

  • n: executes the next statement and then stops. If the statement contains a function call, the whole call is executed without stopping inside it, and you end up at the next statement in the current function.

  • s: executes the next statement, possibly entering a new function, and then stops.

  • l: lists lines in a source code file.

  • c: continues executing the program.

  • q: exits (quits) gdb.

Several of these commands have longer names that you can use as well:

  • print for p

  • next for n

  • step for s

  • list for l

  • continue (or cont) for c

  • quit for q

For more information about any of these, type help <cmdname> at the gdb prompt, where <cmdname> is the name of the command listed above.

Things to try when things go wrong

Let’s say that you’re running a C program and it crashes. The error message you get is unlikely to be helpful; it will probably be something like

segmentation violation (core dumped)

with no other information (no file name, no line number, nothing).

First, let’s identify what that cryptic phrase means. A "segmentation violation" means that your program tried to access memory that it wasn’t allowed to. Since Unix is a multitasking operating system, each process lives in its own little world, with its own little hunk of memory that it’s allowed to play with. The operating system knows what hunk belongs to your process and what doesn’t; if your process tries to access memory that it doesn’t have the right to access, then it violates the (memory) segment boundaries and you get a segmentation violation, which (normally) causes your program to abort.

A "core dump" refers to the fact that, on systems where core dumps are enabled, a "core" file will be "dumped" into the directory from which you ran your program. The file is traditionally called "core" and can be very large (several megabytes or more). That’s because it’s a dump of what the memory contained when your program crashed. It is possible to use the core file to debug your program, but there are much easier ways to debug, so we won’t cover that here. Most Unix shells (i.e. the command interpreter like bash) allow you to put a statement in the initialization file (.bashrc for bash) that restricts the size of core dumps (ideally to zero bytes, in which case no core file is dumped). In bash, that statement is ulimit -c 0, and many Linux systems already set this limit to zero by default. Ask your local Unix guru for more information on this.

OK. Now what you need to know is where the segmentation violation occurred. To do this, compile your program with the -g option described above, invoke gdb as gdb myprog (where myprog is the name of your program), and then type run at the gdb prompt. Alternatively, you can start gdb with no arguments, type file myprog to load the program, and then type run. Either way, this will run your program until the segmentation violation occurs.

If your program needs command-line arguments, supply them after the run command (without repeating the program name) e.g. run arg1 arg2 arg3.

GDB will tell you that the segmentation violation occurred and then wait for your command. It will look something like this:

Program received signal SIGSEGV, Segmentation fault.
0x4006cb26 in free () from /lib/libc.so.6

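This means that the segmentation violation (also known as a segmentation fault or segfault for short) occurred in the library function free. This is weird; does this mean that there is a bug in free? Almost certainly not. Instead, your program did something bad that caused free to fail (possibly by freeing the same pointer twice, by freeing a pointer that didn’t come from malloc, or by writing past the end of an allocated block and corrupting the allocator’s bookkeeping). Note that freeing a NULL pointer is not one of these: free(NULL) is defined to do nothing.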

Type where and you will get a stack backtrace. This is probably the single most useful thing you can have when something goes wrong. A stack backtrace is a list of function names in your program and associated data. It looks something like this:

(gdb) where
#0  0x4006cb26 in free () from /lib/libc.so.6
#1  0x4006ca0d in free () from /lib/libc.so.6
#2  0x8048951 in board_updater (array=0x8049bd0, ncells=2) at 1dCA2.c:148
#3  0x80486be in main (argc=3, argv=0xbffff7b4) at 1dCA2.c:44
#4  0x40035a52 in __libc_start_main () from /lib/libc.so.6

The stack is a data structure which holds information about functions which have partially finished executing. When a function calls another function, information about the new function being called is "pushed" onto a stack. This includes information such as the arguments to the function, the contents of local variables, etc. This information is referred to as a stack frame. When the function has finished its work it "pops" the frame off the stack and returns to the previous stack frame, which belongs to the function that called it.

The backtrace lists the most recent frame first. In the above backtrace, we see that the function __libc_start_main called main which called board_updater which called free, and that there is a second frame labelled free on top of that. That could be a recursive call, but more likely it is an internal helper inside the C library which gdb can only label with the nearest name it knows. In this case, the functions __libc_start_main and free are C library functions which you didn’t write. main is the good old main function that you write in every C program. What seems to have happened here is that something went wrong in the board_updater function, and gdb even tells you what line it happened on (which is what the -g option did). You should look at that line, and perhaps set a breakpoint there:

break 1dCA2.c:148

Now, when you run the program again, gdb will stop it on that line, and you will be able to print out the values of any relevant variables before free is called.

There is much, much more to debugging than we have time to go into here, but this should get you started. Reading the gdb info documentation (type info gdb at the shell prompt, or read it online) will be a good place to go for more information, as will asking your Unix guru friends.

macOS notes: using lldb instead of gdb

macOS is incredibly strict with respect to security policies, and one unusual effect of this is that it can be very hard to run gdb on a Mac. [1] If you are installing software on macOS using Homebrew, you can easily install gdb just by typing

$ brew install gdb

However, the gdb that gets installed probably won’t allow you to debug your code. Instead, you will get a cryptic error message telling you that gdb needs to be "codesigned". There are ways of getting around this (do an internet search if you’re curious) but they are very complicated.

A perhaps better alternative is simply to use a different debugger! macOS by default uses the lldb debugger for C code; this debugger is quite similar to gdb. Annoyingly, many of the same commands have different names or are invoked differently. For instance,

(gdb) break foo

(to break as soon as the function foo is entered) becomes

(lldb) breakpoint set --name foo

Printing the contents of an array is also different. Instead of

(gdb) print *arr@10

you have to write

(lldb) parray 10 arr

A brief tutorial on using lldb is available on the lldb website (lldb.llvm.org).

Note also that lldb is expected to be used on files compiled with the clang C compiler (which is the standard C compiler used on macOS), but it also seems to work for files compiled with gcc. If you run into problems, you could try using clang instead of gcc, but be aware that some of the gcc command-line options don’t work with clang.

Both clang and lldb are part of the "LLVM" project, which is an extremely interesting project. Many computer languages other than C (e.g. C++, Rust) can be compiled using an LLVM backend. The LLVM home page is llvm.org.

References


1. The reasons for this are beyond the scope of the present discussions.