Complex systems have complex errors. Real systems have a variety of mishandled corner cases triggered by intricate sequences of events. In practice, this leaves a residue of errors that cause system crashes but only after days or weeks of continuous execution. When detected, such problems are often very difficult to diagnose because the errors are not reproducible and the sequence of events leading to them cannot be reconstructed.
Formal verification methods are a possible way to find and diagnose such deep errors. One option is explicit model checking, which systematically enumerates the possible states of the system. A basic model checker starts from an initial state and recursively generates successive system states by executing the nondeterministic events of the system. States are stored in a hash table to ensure that each state is explored at most once. This process continues either until the whole state space is explored, or until the model checker runs out of resources. When it works, this style of state graph exploration can achieve the effect of impractically massive testing by avoiding the redundancy that would occur in conventional testing.
Conventional model checkers usually assume that the design is described at a high level that abstracts away many details of the actual implementation. Verifying actual code using such a tool requires reconstructing this abstract description from the code. This process requires a great deal of manual effort, hampering the use of model checking in actual system design. Moreover, human errors in the manual abstraction process result in missing bugs and cause false alarms during verification, further increasing the cost and reducing the usefulness of model checking. Such errors can be introduced both when constructing the model and as a result of drift as the actual system evolves.
This thesis proposes CMC (C Model Checker) to address these issues. CMC works on unmodified C or C++ implementations without resorting to any intermediate description. Like traditional model checkers, CMC explores large state spaces efficiently by storing states and achieves the equivalent of executing astronomical numbers of tests in reasonable time. By not requiring any high-level model of the code, CMC considerably reduces the manual effort involved in model checking real systems. More importantly, it finds the bugs that are actually in the implementation: it does not miss implementation bugs that would be omitted from a model, nor does it create false reports of errors that appear in the model but not in the implementation.
The primary goals of CMC are to scale to large systems and to effectively find errors in such systems. However, CMC is not able to provide any correctness guarantees about the input system. While providing such guarantees is desirable, doing so for even moderately sized systems is prohibitively difficult. Instead, CMC focuses its resources to enhance its scalability and effectiveness. For instance, CMC stores only a small signature of each state in the hash table rather than storing the entire state. This enables CMC to handle many orders of magnitude larger states but at the risk of missing errors due to hash collisions. Similarly, CMC employs many approximate, yet powerful state transformations techniques to improve its state space search. Such techniques are not available to model checkers that focus on absolute correctness.
While the approach of CMC is applicable to system software in general, it has been validated only on network protocol implementations. Network protocols are especially suited to model checking as they are mostly self-contained systems where few external, well defined inputs trigger many interleaving behaviors. The thesis applies CMC to two different network protocols of varying complexity: AODV, a loop free routing protocol for ad-hoc networks and TCP, the most widely used transport protocol in the Internet today.
CMC checked three different publicly available implementations of the AODV protocol and found 34 unique errors at a rate of roughly one bug per 300 lines of code. CMC also found an error in the underlying AODV specification.
For the TCP protocol, CMC checked the Linux TCP implementation from the stable kernel version 2.4.19. This implementation is around 50,000 lines of code. It is mature, thoroughly tested by its actual deployment in the Internet, and is visually inspected by many in the open source community. CMC found 4 errors in this implementation where it fails to satisfy the TCP specification. Also, CMC achieved over 90% protocol coverage while model checking this implementation. These results demonstrate that CMC is able to scale to systems as large and complex as the TCP implementation and successfully find errors in them.