Introduction
Debugging C programs is often extremely challenging. The direct pointer
manipulations permitted by the C language give rise to bugs that can’t happen
in most other computer languages. Often, these bugs manifest themselves in
strange ways, such as the program printing interesting messages like "core
dump" or "bus error" with no additional information. This is the price we pay
for the efficiency and low-level control that the C language provides.
Debugging is a big subject, and we can only scratch the surface here. In general, here are three approaches you can use for debugging:
- When you get a bug, put lots of printf statements in code likely to have caused the bug so that you can monitor the values of variables which may not be what they should be.
- Add lots of assert statements so that when something goes wrong the program halts right away instead of continuing. If you don’t know about assert, do man assert. We will talk more about this later in the course.
- Use a "debugger" like gdb to find out where your code went wrong.
These approaches are not mutually exclusive and almost every programmer
uses a combination of all three (plus others). The first two methods are
pretty self-explanatory. The third needs a bit more explanation, which we
provide below. You can also do man gdb and/or info gdb to get much more
information.
In what follows, we’ll assume you are working in a Unix-like environment such as Linux, MacOS, or Windows Subsystem for Linux or Cygwin under Windows.
GDB basics
GDB stands for GNU Debugger. It is an interactive environment in which you can run a C program in such a way as to make it much easier to identify bugs.
To use gdb, do the following:
- Compile your C program with the -g flag, e.g.
$ gcc -Wall -Wstrict-prototypes -ansi -pedantic -g myprog.c -o myprog
  (Note that we’re using a lot of warning options as well, which are the -Wall
  -Wstrict-prototypes -ansi -pedantic options; these force the compiler to
  complain if your code isn’t ANSI-compliant or if it has other suspicious
  features. It’s a good habit to always use these options.) The -g option
  puts debugging information into the executable. Most importantly, it records
  which source file and line each piece of the compiled program came from, so
  you can examine the source code as the program executes (we’ll see how below).
- Type gdb myprog at the shell prompt (for the example above). This will start the interactive debugger. It’s basically an interpreter-like environment in which you can run your program line-by-line and do useful debugging tasks as well.
After starting the debugger, you will end up with this prompt:
(gdb)
From here, you have a choice of lots of commands. Type help to get a list of
the command categories. Here are some of the most important commands:
- run: runs the program.
- where: tells you where you are in the program when you have stopped at some point. Also tells you the call history of the program up to that point (i.e. which functions have been called to get you where you are).
- p <variable>: prints the value of a variable.
- break <file>:<line>: causes the program to stop at a particular line in a particular source code file.
- break <function>: causes the program to stop when entering a particular function.
- n: executes the next statement and then stops. If that statement calls a function, the whole call is executed without stopping inside it, so you stay in the current function.
- s: executes the next statement, possibly entering a new function, and then stops.
- l: lists lines in a source code file.
- c: continues executing the program.
- q: exits (quits) gdb.
Several of these commands have longer names that you can use as well:
- print for p
- next for n
- step for s
- list for l
- continue (or cont) for c
- quit for q
For more information about any of these, type help <cmdname> at the gdb
prompt, where <cmdname> is the name of the command listed above.
Things to try when things go wrong
Let’s say that you’re running a C program and it crashes. The error message you get is unlikely to be helpful; it will probably be something like
segmentation violation (core dumped)
with no other information (no file name, no line number, nothing).
First, let’s identify what that cryptic phrase means. A "segmentation
violation" means that your program tried to access memory that it wasn’t
allowed to. Since Unix is a multitasking operating system, each process lives
in its own little world, with its own little hunk of memory that it’s allowed
to play with. The operating system knows what hunk belongs to your process and
what doesn’t; if your process tries to access memory that it doesn’t have the
right to access, then it violates the (memory) segment boundaries and you get a
segmentation violation, which (normally) causes your program to abort. A "core
dump" refers to the fact that by default, a "core" file will be "dumped" into
the directory from which you ran your program. The file is actually called
"core" and can be very large (several megabytes or more). That’s because it’s
a dump of what the memory contained when your program crashed. It is possible
to use the core file to debug your program, but there are much easier ways to
debug, so we won’t cover that here. Most Unix shells (i.e. command
interpreters like bash) allow you to put a statement in the initialization
file (.bashrc for bash) that restricts the size of core dumps (ideally to
zero bytes, in which case no core file is dumped). In bash the statement is
ulimit -c 0; ask your local Unix guru for more information on other shells.
OK. Now what you need to know is where the segmentation violation occurred.
To do this, compile your program with the -g option described above, start up
gdb, and type file myprog followed by run (where myprog is the name of your
program). Alternatively you can invoke gdb as gdb myprog and then just type
run at the gdb prompt. This will run your program until the segmentation
violation occurs.
If your program needs command-line arguments, you
should supply them after the run command.
GDB will tell you that the segmentation violation occurred and then wait for your command. It will look something like this:
Program received signal SIGSEGV, Segmentation fault.
0x4006cb26 in free () from /lib/libc.so.6
This means that the segmentation violation (also known as a segmentation fault
or segfault for short) occurred in the library function free. This is weird;
does this mean that there is a bug in free? Almost certainly not. Instead,
your program did something bad that caused free to fail (possibly by asking
it to free the same pointer twice, or a pointer that didn’t come from malloc;
freeing a NULL pointer, by contrast, is harmless).
Type where and you will get a stack backtrace. This is probably the single
most useful thing you can have when something goes wrong. A stack backtrace is
a list of function names in your program and associated data. It looks
something like this:
(gdb) where
#0 0x4006cb26 in free () from /lib/libc.so.6
#1 0x4006ca0d in free () from /lib/libc.so.6
#2 0x8048951 in board_updater (array=0x8049bd0, ncells=2) at 1dCA2.c:148
#3 0x80486be in main (argc=3, argv=0xbffff7b4) at 1dCA2.c:44
#4 0x40035a52 in __libc_start_main () from /lib/libc.so.6
The stack is a data structure which holds information about functions which
have partially finished executing. When a function calls another function,
information about the new function being called is "pushed" onto the stack. This
includes information such as the arguments to the function, the contents of
local variables, etc. This information is referred to as a stack frame.
When the function has finished its work it "pops" the frame off the stack and
returns to the previous stack frame, which belongs to the function that called
it. Reading the above backtrace from the bottom up, we see that the function
__libc_start_main called main, which called board_updater, which called free,
which apparently called itself recursively. In this case, the functions
__libc_start_main and free are C library functions which you didn’t
write. main is the good old main function that you write in every C
program. What seems to have happened here is that something went wrong in the
board_updater function, and gdb even tells you what line it happened on
(which is what the -g option did). You should look at that line, and perhaps
set a breakpoint there:
break 1dCA2.c:148
Now, when you run the program again, gdb will stop it on that line, and you
will be able to print out the values of any relevant variables before free is
called.
There is much, much more to debugging than I have time to go into here, but
this should get you started. Reading the gdb info documentation (type info gdb
at the shell prompt, or read it online at the link given below) will be a
good place to go for more information, as will asking your Unix guru friends.
MacOS X notes: using lldb instead of gdb
MacOS is incredibly strict with respect to security policies, and one unusual
effect of this is that it can be very hard to run gdb on a Mac. [1]
If you are installing software on MacOS using homebrew, you can easily
install gdb just by typing
$ brew install gdb
However, the gdb that gets installed probably won’t allow you to debug your
code. Instead, you will get a cryptic error message telling you that gdb
needs to be "codesigned". There are ways of getting around this (do an
internet search if you’re curious) but they are very complicated.
A perhaps better alternative is simply to use a different debugger! MacOS by
default uses the lldb debugger for C code; this debugger is quite similar to
gdb. Annoyingly, many of the same commands have different names or are invoked
differently. For instance,
(gdb) break foo
(to break as soon as the function foo is entered) becomes
(lldb) breakpoint set --name foo
Printing the contents of an array is also different. Instead of
(gdb) print *arr@10
you have to write
(lldb) parray 10 arr
Here is a brief tutorial on using lldb.
Note also that lldb is expected to be used on files compiled with the clang
C compiler (which is the standard C compiler used on MacOS), but it also seems
to work for files compiled with gcc. If you run into problems, you could try
using clang instead of gcc, but be aware that some of the gcc command-line
options don’t work with clang.
References
- The LLVM home page.