NOTE: This is an old essay (probably from about 2003), reposted by popular demand.
I have been a serious computer programmer for many years. Until I joined the Caltech computer science department, however, I was not fully aware of how rich and varied the field of computer science really is. Call it the fervor of the newly converted religious zealot if you will, but I believe that computer science is an outstandingly beautiful field with a great deal to offer the rest of science. This page is my attempt to describe why I think CS is such a beautiful science. I will avoid all the usual reasons why people might think CS is good, e.g. because so much of modern society depends on computers, or because it's easy to get a job when you have a CS degree. I'm after more substantive reasons here. I will add to this page periodically as I come up with new ideas and/or clarify my older ones.
The reason I think computer science is a beautiful science is that it is about the execution of programs on computers. Computers are conceptually very simple machines which are, by design, finite and deterministic. The requirement that computer programs run correctly on a finite and deterministic machine forces intellectual honesty on the field of computer science: there are fewer gaps, holes, loose ends, and pieces of implicit knowledge than in other fields. I believe that other fields, notably mathematics, logic (yes, logic!), and physics, can benefit from this kind of rigor.
On a purely emotional level, I really loathe vagueness in science (or, for that matter, in philosophy and argumentation in general). A vague argument is one that you can't be sure you understand, and if you can't understand it you can't make progress. Furthermore, the pernicious effects of vagueness tend to be cumulative; when you don't understand one argument well, you are even less prepared to understand another argument that purports to build on the first one. Most sciences include large amounts of vagueness. The so-called "soft sciences" like psychology and biology don't even try to be precise except inasmuch as they use statistics to establish their points (and even then they usually don't use statistics very well). Hard sciences like physics aren't much better (physicists are notorious for being sloppy with their mathematics). But computer science is different, as I hope to show.
If you ask most people what the most fundamental science is, they will probably say physics. Physics is the science that attempts to understand the fundamental laws of the universe. Assuming a sufficiently complete understanding of physics, you could hope to understand all of chemistry, biology, psychology, sociology, etc. from that foundation (although I'm not saying that it's practical to do so). What is the foundation of physics? Aside from experiments, it's mathematics. What is the foundation of mathematics? Most people would say set theory and logic. But logic is a very abstract field indeed, and very few people, even mathematicians, study it seriously. For instance, most mathematicians do not need a profound understanding of logic in order to do their mathematical work.
However, computer science also depends on logic for its foundations. There are many levels to this. At the most obvious level, a computer program proceeds in a logical manner from its beginning to its end. At a deeper level, logic guides the development of type systems for programming languages (see the book by Pierce in the references), and abstract logical ideas like decidability become extremely tangible in this context. For instance, if a new computer language has an undecidable type system it will probably not be practically usable, so you have to know whether it's decidable or not; it's not an abstract exercise. Similarly, it's possible to use computer programs to clarify issues in logic that might otherwise be very obscure. Gregory Chaitin shows how a simple Lisp program can demonstrate a result related to Gödel's Incompleteness Theorem much more elegantly than a formal logical proof (actually, the Lisp program is the proof, but you get the idea). These are not isolated examples. Any computer program is an abstraction, just like any mathematical proof is an abstraction. However, a computer program is an executable abstraction, and that makes it very different from a mathematical proof, which is not executable. This leads me neatly into my next argument...
Let's continue with the comparison between a mathematical proof and a computer program. To do this, let's look at why mathematical proofs are so difficult to understand for most people. I'll ignore trivial issues like notation.
First, any realistic mathematical proof will leave out a great many steps, which are considered to be the "required background knowledge" for anyone who wants to understand the proof. By the way, a very interesting project called Metamath is trying to create an online archive of mathematical proofs which are specified all the way down to the bottom, starting from set theory. But this is a very rare exception to the general rule.
Second, the arguments presented in a mathematical proof may be very rigorous, or they may occasionally have a "hand-waving" quality to them. From the standpoint of logic, this means that the inference rules are not always specified accurately, or are not believable when they are. This is why you will sometimes read a mathematical proof and say "I understand the reasoning, but I just don't buy it". Sometimes the reasoning employed conceals or glosses over very difficult material; for instance, Newton's original calculus used infinitesimals, which have long since been abandoned because it is difficult to formulate a rigorous mathematical theory using them. [Side note: Abraham Robinson's nonstandard analysis did exactly that, but it took over two centuries after Newton's death to get that far.]
Finally, it is rare to find a mathematical proof that doesn't have at least one significant typographical error in it (especially if it's written on a blackboard!).
All of these problems make mathematics much harder to learn than it should be; the learner is required to fill in the gaps, fix the errors, and take the dubious reasoning methods on faith in order to make any progress.
Now let's look at all these issues as they relate to computer programs.
First of all, a computer program has to be self-contained. The only background knowledge required is the syntactic and semantic rules of the language (which are, or should be, precisely specified and few in number) and a knowledge of how the available functions and objects work (which is generally available in the documentation for the language). If all this information isn't available in some form, the program simply will not work, because the interpreter or compiler will not know what to do with it. This forces a certain intellectual honesty on the process of executing a program; nothing can be left unspecified.
The process of executing a computer program is completely deterministic. If you know the rules of the game, you know exactly how to get from a particular point in the program to all later points. This is a good thing, too: when the program doesn't do what you want it to do, you can use a debugger to trace its execution in as much detail as you like.
A typographical error in a computer program will have one of two effects. It may cause the program to fail during compilation or execution; either way, you know that something is wrong. Alternatively, the program may execute to completion but produce wrong results. Assuming you have some way of verifying the results (which is not always the case), you will again be alerted that something went wrong. The point is that there is built-in protection against various kinds of errors, particularly typographical errors, which isn't there in a mathematical proof.
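To make this concrete, here is a trivial OCaml sketch (my own example, invented for illustration): a misspelled identifier is rejected by the compiler before the program can run at all, instead of silently producing a wrong answer.

    (* The misspelled name in the commented-out line would make the OCaml
       compiler reject the whole program with an "Unbound value hieght" error;
       there is no way to run it and quietly get a wrong area. *)
    let width = 3
    let height = 4

    (* let area = width * hieght    <- uncommenting this is a compile-time error *)
    let area = width * height

    let () = Printf.printf "area = %d\n" area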
There are many clever ways of specifying extremely simple abstract computational paradigms that are "Turing-complete", i.e. that can be used in principle to write any computer program (if you believe the Church-Turing thesis, which I won't cover here). The most famous of these is the Turing machine. An even simpler one is the untyped lambda calculus, which has just three kinds of expressions (a small OCaml sketch of these follows the list):
Variables: a, b, c
Functions, also known as lambda expressions: lambda x: e, where x is a variable and e is an expression. If you supply a value for x, e is then evaluated with the value you supplied substituted for every occurrence of x in e. Got that? Good!
Function applications: e1 e2, which means applying the function that e1 evaluates to, to the expression e2. e1 has to evaluate to a function or it's an error.
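To make these three forms concrete, here is a minimal OCaml sketch (my own illustration, not part of the original essay): a datatype with exactly the three cases above, plus the naive substitution just described. It deliberately ignores variable capture, which a serious evaluator would have to handle, e.g. by renaming bound variables.

    (* The three expression forms of the untyped lambda calculus. *)
    type expr =
      | Var of string              (* variables: a, b, c, ...  *)
      | Lam of string * expr       (* functions: lambda x: e   *)
      | App of expr * expr         (* applications: e1 e2      *)

    (* subst x v e replaces every free occurrence of the variable x in e with v
       (naively: it does not rename bound variables to avoid capture). *)
    let rec subst x v e =
      match e with
      | Var y -> if y = x then v else e
      | Lam (y, body) -> if y = x then e else Lam (y, subst x v body)
      | App (e1, e2) -> App (subst x v e1, subst x v e2)

    (* One evaluation step: applying a lambda expression to an argument. *)
    let beta = function
      | App (Lam (x, body), arg) -> subst x arg body
      | e -> e

    let () =
      (* (lambda x: x) a  evaluates to  a *)
      match beta (App (Lam ("x", Var "x"), Var "a")) with
      | Var v -> print_endline v       (* prints "a" *)
      | _ -> print_endline "stuck"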
Combinatory logic is a closely related system which does away with lambda expressions entirely and builds everything from a few primitive combinators, S, K and I; their defining rules appear in the derivation below. Now, as if that wasn't simple enough, one of the most basic theorems of combinatory logic is that you actually only need the S and K combinators; the I combinator can be derived as follows:
    I = S K K

I've known about this for a long time, and occasionally I would play with the definitions to see if I could derive the theorem. Most of the time, I would make a mistake and get the wrong answer. I couldn't figure out what I was doing wrong until I wrote an interpreter for the SK system (using the language OCaml, which, parenthetically, is a wonderful language for building little interpreters like this; a sketch of such an interpreter appears after the derivation below). In the process of doing this, i.e. of building a computer program which could mechanically evaluate SKI combinator expressions, I figured out what my problem was. It is this: the process of applying functions is left-associative. In other words, the expressions:
    K x y = x
    S x y z = x z (y z)
    I = S K K

really mean
    K x y = ((K x) y) = x
    S x y z = (((S x) y) z) = ((x z) (y z))
    I = ((S K) K)

Knowing this, the derivation becomes trivial:
    S K K x = (((S K) K) x) = ((K x) (K x)) = x

because ((K x) <anything>) = x (here <anything> is (K x)). Since
    I x = x

by definition, then
    S K K x = I x
    ((S K) K) x = I x
    ((S K) K) = I        (this is technically called an eta reduction)
    S K K = I

and we have our result.
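For concreteness, here is a minimal sketch in OCaml of the kind of SK interpreter described above (a reconstruction for illustration, not the original program). The crucial point it makes mechanical is that application is left-associated.

    (* Terms of the SKI system: the primitive combinators, free variables,
       and applications.  App is left-associated, so "S K K x" is written
       App (App (App (S, K), K), Var "x"). *)
    type term =
      | S
      | K
      | I
      | Var of string
      | App of term * term       (* App (f, x) is the application (f x) *)

    (* Repeatedly apply the reduction rules
         I x     -> x
         K x y   -> x
         S x y z -> x z (y z)
       reducing the function part of an application first, then the argument. *)
    let rec reduce t =
      match t with
      | App (I, x) -> reduce x
      | App (App (K, x), _) -> reduce x
      | App (App (App (S, x), y), z) -> reduce (App (App (x, z), App (y, z)))
      | App (f, x) ->
          let f' = reduce f in
          if f' <> f then reduce (App (f', x)) else App (f', reduce x)
      | _ -> t

    (* Print a term with all the implicit parentheses made explicit. *)
    let rec show t =
      match t with
      | S -> "S"
      | K -> "K"
      | I -> "I"
      | Var v -> v
      | App (f, x) -> "(" ^ show f ^ " " ^ show x ^ ")"

    let () =
      (* S K K x reduces to x, i.e. S K K behaves exactly like I. *)
      let skkx = App (App (App (S, K), K), Var "x") in
      Printf.printf "%s  ~~>  %s\n" (show skkx) (show (reduce skkx))

Running it prints (((S K) K) x)  ~~>  x, which is exactly the derivation above, carried out mechanically.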
What is significant about this result is not the result itself (cool though it is) but how the use of the computer (specifically, the act of writing the interpreter) made it possible for me to understand the process by which the result is derived. Of course, you could say that maybe all this shows is how stupid I am (no argument there), but if the example were considerably more complex you can imagine how useful it would be to have a computer check every step of your work. This leads me nicely into the final section...
Consider logic, for instance: there are a number of different kinds of logic, and most (all?) of them can fairly easily be translated into programming languages of some form. Why isn't this a routine activity for logicians?
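As a tiny illustration of what "translating a logic into a program" could mean (my own example, and a deliberately simple one), here is classical propositional logic rendered as an OCaml datatype with a brute-force tautology checker. A richer logic would of course require far more machinery, but the flavor is the same: once the logic is a program, every claim can be checked mechanically.

    (* Formulas of classical propositional logic. *)
    type prop =
      | Atom of string
      | Not of prop
      | And of prop * prop
      | Or of prop * prop
      | Implies of prop * prop

    (* Evaluate a formula under a truth assignment for its atoms. *)
    let rec eval assign = function
      | Atom a -> assign a
      | Not p -> not (eval assign p)
      | And (p, q) -> eval assign p && eval assign q
      | Or (p, q) -> eval assign p || eval assign q
      | Implies (p, q) -> not (eval assign p) || eval assign q

    (* The atoms occurring in a formula, without duplicates. *)
    let rec atoms = function
      | Atom a -> [a]
      | Not p -> atoms p
      | And (p, q) | Or (p, q) | Implies (p, q) ->
          List.sort_uniq compare (atoms p @ atoms q)

    (* A formula is a tautology if it is true under every assignment. *)
    let tautology p =
      let rec check assign = function
        | [] -> eval (fun a -> List.assoc a assign) p
        | a :: rest ->
            check ((a, true) :: assign) rest && check ((a, false) :: assign) rest
      in
      check [] (atoms p)

    let () =
      (* Peirce's law, ((p -> q) -> p) -> p, is a classical tautology. *)
      let p, q = Atom "p", Atom "q" in
      Printf.printf "%b\n" (tautology (Implies (Implies (Implies (p, q), p), p)))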
As for the rest of mathematics, I would like to see computers used routinely to fill in the gaps in published mathematical proofs (or to alert us when those gaps can't be filled in), so that you never again have to say "I have no idea how the author got from point A to point B", although you might say "My God, what a lot of equations have to be written down to go from point A to point B!" This is reminiscent of Whitehead and Russell's Principia Mathematica, where hundreds of pages pass before the authors can prove that 1 + 1 = 2. Annoying though this might be, I think it's much better than hand-waving.
I can see that something similar might be possible in physics, but it would be much harder, since the level of rigor in most physics papers is quite low. Perhaps this means that computers can have an even bigger impact in physics than in mathematics. I don't know, but it's something that's worth thinking about.
Mike Vanier (mvanier@cs.caltech.edu)