 |
| DoubleCheck Static Analysis Tool |
|
| Coping with Complexity |
Complexity has become the most significant challenge to
meeting time to market and reliability demands for
software. Traditional debugging and testing methodologies
simply fall short when dealing with today’s sophisticated
code bases. Increased complexity reduces software quality,
reliability, safety, and security. Automated tools are needed
to cope with this complexity explosion, and static source
code analysis represents one of the most effective
strategies.
Static analyzers attempt to find code sequences that may
result in buffer overflows, resource leaks, and many other
security and reliability problems. Source code analyzers are
effective at locating a significant class of defects that are
not detected by compilers during standard builds and
often go undetected during run-time testing or typical field
operation. |
Case Study |
DoubleCheck improves quality for mature software as
well as new projects. DoubleCheck has been executed on
many large, important code bases, including the Apache
web server. Apache runs 70% of the world’s Internet
web sites. The Apache web server is an open source
application consisting of approximately 200,000 lines of
code and has been in production use for well over a
decade.
DoubleCheck found a number of serious defects in the
Apache web server. In fact, DoubleCheck discovered
significantly more defects than another static analysis
tool (whose results for Apache were published by the
tool vendor). The other tool’s analysis time for Apache
was reported to be 10 minutes. In comparison,
DoubleCheck’s execution time on the same hardware is
about 30 seconds. |
| DoubleCheck Source Code Analyzer |
Unlike other source code analyzers that run as separate
tools, DoubleCheck™ is an Integrated Static Analyzer (ISA).
DoubleCheck is built into the Green Hills™ C/C++ compiler,
taking advantage of accurate and efficient analysis
algorithms that have been tuned and field proven over the
past 25 years. DoubleCheck can be used as a single
integrated tool to perform compilation and defect analysis
in the same pass.
A typical compiler will issue warnings and errors for some
basic potential code problems, such as violations of the
language standard or use of implementation-defined
constructs. In contrast, DoubleCheck performs a full
program analysis, finding bugs caused by complex
interactions between pieces of code that may not even be
in the same source file.
Unlike other tools, DoubleCheck automatically uses the
exact same code configuration as used during the build
process. This allows developers to be certain that the code
executed is the same code that was checked.
DoubleCheck determines potential execution paths
through code, including paths into and across subroutine
calls, and how the values of program objects (such as
standalone variables or fields within aggregates) could
change across these paths.
DoubleCheck looks for many types of flaws, including:
- Potential NULL pointer dereferences
- Access beyond an allocated area (e.g. array or
dynamically allocated buffer); otherwise known as a
buffer overflow
- Potential writes to read-only memory
- Reads of potentially uninitialized objects
- Resource leaks (e.g. memory leaks and file descriptor
leaks)
- Use of memory that has already been deallocated
- Out of scope memory usage (e.g. returning the address
of an automatic variable from a subroutine)
- Failure to set a return value from a subroutine
- Buffer and array underflows
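A few of the flaw classes listed above can be illustrated with a short C sketch. Each comment names the flawed pattern, and each function body shows a corrected form. The function names are hypothetical, and this is generic C for illustration only, not DoubleCheck output or a DoubleCheck API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper names; nothing here is a DoubleCheck API. */

/* Flawed pattern (NULL dereference): returning s[0] without checking s.
 * Corrected form: test the pointer before dereferencing it. */
int first_char(const char *s)
{
    if (s == NULL) {
        return -1;
    }
    return (unsigned char)s[0];
}

/* Flawed pattern (buffer overflow): strcpy(dst, src) with an unchecked
 * source length. Corrected form: bound the copy by the destination size. */
int copy_name(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }
    size_t len = strlen(src);
    if (len >= dst_size) {
        return -1;              /* would not fit with its terminator */
    }
    memcpy(dst, src, len + 1);  /* copies the terminating '\0' as well */
    return 0;
}

/* Flawed pattern (out-of-scope memory): returning the address of a local
 * array. Corrected form: the caller owns the storage and passes it in. */
int *fill_squares(int *out, size_t n)
{
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = (int)(i * i);
    }
    return out;
}
```

Each corrected function fails safely (returning -1 or NULL) on input that would have triggered the flawed pattern.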
The analyzer understands the behavior of many standard
runtime library functions. For example, it knows that
subroutines like free should be passed pointers to memory
allocated by subroutines like malloc. The analyzer uses this
information to detect errors in code that calls or uses the
result of a call to these functions.
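As a minimal sketch of the malloc/free pairing described above, the following C code allocates, uses, and releases a buffer correctly. The comments list misuses of the kind an analyzer with library knowledge could flag. The helper names duplicate_text and copied_length are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of text, or NULL on failure.
 * The caller must release the result with free(). */
char *duplicate_text(const char *text)
{
    if (text == NULL) {
        return NULL;
    }
    size_t len = strlen(text);
    char *copy = malloc(len + 1);
    if (copy == NULL) {              /* malloc may fail; check before use */
        return NULL;
    }
    memcpy(copy, text, len + 1);     /* includes the terminating '\0' */
    return copy;
}

/* Returns the length of a temporary copy of text, or -1 on failure. */
long copied_length(const char *text)
{
    char *copy = duplicate_text(text);
    if (copy == NULL) {
        return -1;
    }
    long len = (long)strlen(copy);   /* read the buffer before freeing it */
    free(copy);                      /* exactly one free per malloc */
    /* Misuses that could be flagged at this point:
     *   - calling strlen(copy) here  -> use of deallocated memory
     *   - omitting free(copy) above  -> memory leak
     *   - calling free(text)         -> pointer not obtained from malloc
     */
    return len;
}
```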
|
|
| Customizing the Bug Search |
|
DoubleCheck can be taught about properties of user-defined
subroutines. For example, if a custom memory
allocation system is used, DoubleCheck can be taught to
look for misuses of this system, finding more bugs and
reducing false positives. DoubleCheck is highly accurate,
and much better at limiting false positives than traditional
UNIX analyzers like lint. In addition to flaws that lead
directly to program faults, DoubleCheck can detect
questionable constructs that should be fixed to improve
code clarity. A good example of this is a write to a variable
that is never subsequently read.
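The "write that is never subsequently read" case can be sketched in C as follows. The function is hypothetical; the comment describes the questionable variant, and the body shows the cleaned-up form.

```c
#include <stddef.h>

/* Hypothetical example function: sums the positive entries of an array. */
int sum_positive(const int *values, size_t count)
{
    int total = 0;
    /* A questionable variant would declare "int last = 0;" and assign
     * "last = values[i];" inside the loop without ever reading last.
     * The result would still be correct, but the store is dead and
     * obscures the intent of the code. */
    for (size_t i = 0; i < count; i++) {
        if (values[i] > 0) {
            total += values[i];
        }
    }
    return total;
}
```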
|
| Green Hills Coding Standard |
Many software development organizations employ an
internal coding standard which governs programming
practices to help ensure quality, maintainability, and
reliability. DoubleCheck helps automate the enforcement
of coding standards.
For example, DoubleCheck measures and, optionally,
limits software component complexity using standardized
metrics such as McCabe cyclomatic complexity. Keeping
these metrics in check helps make code easier to
understand, maintain, and test.
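For readers unfamiliar with the McCabe metric, the standard definition for a control-flow graph is M = E - N + 2P (edges, nodes, connected components). The sketch below computes it and applies a limit check. The function names and the limit mechanism are hypothetical illustrations, not DoubleCheck options.

```c
/* Cyclomatic complexity of a control-flow graph: M = E - N + 2P. */
int cyclomatic_complexity(int edges, int nodes, int components)
{
    return edges - nodes + 2 * components;
}

/* Example function with two decision points (the loop condition and the
 * if), so its cyclomatic complexity is 2 + 1 = 3. Its control-flow graph
 * has 6 nodes (init, loop test, if test, increment of negatives, i++,
 * return) and 7 edges. */
int count_negatives(const int *values, int count)
{
    int negatives = 0;
    for (int i = 0; i < count; i++) {
        if (values[i] < 0) {
            negatives++;
        }
    }
    return negatives;
}

/* Hypothetical limit check: nonzero when a component is too complex. */
int exceeds_limit(int complexity, int limit)
{
    return complexity > limit;
}
```

For count_negatives, cyclomatic_complexity(7, 6, 1) gives 3, which matches the "decision points plus one" shortcut for a single function.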
DoubleCheck also has a Green Hills Mode, incorporating
25 years of Green Hills experience in helping customers
develop high quality software. Green Hills Mode adds a
number of sensible quality controls to DoubleCheck’s bug
finding mission, including a number of MISRA compliance
checks, enforcement of optional but important language
standards, and more.
Since DoubleCheck is already traversing the
code tree to find bugs, metric computations and
enforcement of other coding rules do not incur significant
overhead. Because DoubleCheck can be configured to
generate a build error pointing out the offending code,
the developer is unable to accidentally submit software
that violates the coding rule. Using DoubleCheck as an
automated software quality control saves the time and
frustration typically associated with peer reviews. |
|
| Output of the Analyzer |
DoubleCheck is capable of emitting errors as part of the
build process as well as generating an intuitive set of
web pages, powered by an integrated web server. The
user can browse high level summaries of the different
flaws found by the analyzer (Figure 1) and then click on
hyperlinks to investigate specific problems. Within a
specific problem display, the error is displayed inline with
the surrounding code, making it easy to understand
(Figure 2). Function names and other objects are
hyperlinked for convenient browsing of the source code.
Since the web pages are running under a web server, the
results can easily be shared and browsed by any member
of the development team. |
| Analysis Time |
Analysis time is a gating factor in the adoption of source
code analyzers. Unlike other analyzers that are used
sporadically as a testing tool, DoubleCheck is fast enough
to be used by all developers, all the time. DoubleCheck
executes 5 times faster than other commercial analyzers.
This advantage increases to a factor of 20 or more when
DoubleCheck’s distributed build engine is used to
automatically parallelize the analysis across available
workstation resources on the developers’ network.
Furthermore, DoubleCheck uses sophisticated subroutine-level
dependency checking. With other analyzers, a simple
change to a single source file will result in a lengthy reanalysis.
With DoubleCheck, analysis time is limited to
portions of the code base affected by the edit, once again
ensuring that DoubleCheck can be used throughout the development cycle. |
| Return on Investment of 30:1 |
DoubleCheck reduces development cost by enabling
engineers to detect and resolve problems more efficiently
and earlier in the development cycle. By reducing
development time, products reach market faster and stay
in market longer, translating into higher sales and profits.
By increasing product quality, DoubleCheck reduces post-sales
cost (or “user” cost) associated with product failures,
recalls, and in-field maintenance. Furthermore, increased
quality improves market positioning and reputation,
enabling organizations to command higher prices which
filter directly to the bottom line.
Many studies have attempted to estimate the cost to
produce and deliver software to market. It is estimated
that each line of code on the space shuttle cost $1,000
to develop. Developing software to the stringent DO-178B
Level A standard (for critical aircraft systems) has
been estimated at hundreds of dollars per line. On the
lower end, Red Hat Linux has been estimated to cost $33
per line of code. Other estimates generally place the cost
of good quality commercial software in the range of $30
to $40 per line of code.
Yet other studies have estimated how this development
time is spent. Most concur that more than half of software
development time is spent debugging: identifying and
correcting software defects. If we use an estimate of $30
per line of code in total cost, this means that
organizations conservatively spend $15 to debug each line
of code.
Another commonly held belief is that the cost of
identifying and correcting defects grows dramatically as
the development cycle progresses. Some studies have
shown that the time to fix a bug grows from an average
of 2 to 3 hours during the coding phase to 16 to 18 hours
when a defect must be tracked down during post-integration
quality assurance testing. Author Steve
McConnell is often quoted for his estimates that defects
cost 10 to 100 times more to fix when they escape
detection during the coding phase.
DoubleCheck decreases defect resolution time
Now let’s consider the decrease in defect resolution time
enabled by DoubleCheck. Some studies have shown that
static analysis can reduce the number of defects found
relative to manual reviews by more than 40%. In addition
to new code, DoubleCheck has been run on mature,
production code, including the Apache web server, Linux
kernel, OpenSSL, and sendmail. DoubleCheck has found
many defects, including serious security vulnerabilities, in
these code bases. When a defect is identified using static
analysis, the most expensive part of defect resolution,
tracking down the bug, is reduced to a negligible amount:
the tool automatically locates defects and elucidates the
offending code sequence leading to the failure. Using a
conservative estimate of 10% for the decrease in bug
fixing time enabled by DoubleCheck, the $15 cost to
debug a line of code is reduced by $1.50.
Measured against the cost of the tool, a savings of $1.50
per line of code represents a return on investment of
approximately 30 to 1. This ROI calculation
completely ignores the aforementioned “user” cost
reduction, time to market benefits, pricing benefits, etc.
Studies have shown that the post-production cost of
software defects can be as high as, or even several times
higher than, the total development cost. This is certainly
common in the aerospace, medical, and automotive
industries.
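The return-on-investment arithmetic above can be reproduced with a short C sketch. The inputs are the figures quoted in the text ($30 per line in total cost, half of it spent debugging, a 10% reduction in bug fixing time, and a 30:1 ratio). The implied tool cost per line is an inference from those figures, not a number stated in the source, and the names are hypothetical.

```c
/* All amounts are in US cents per line of code, kept as integers so the
 * arithmetic is exact. */
struct roi_estimate {
    int debug_cost;         /* share of total cost spent debugging */
    int savings;            /* debugging cost avoided */
    int implied_tool_cost;  /* ASSUMPTION: the cost that yields the ratio */
};

struct roi_estimate estimate_roi(int total_cost, int debug_percent,
                                 int reduction_percent, int roi_ratio)
{
    struct roi_estimate r;
    r.debug_cost = total_cost * debug_percent / 100;
    r.savings = r.debug_cost * reduction_percent / 100;
    r.implied_tool_cost = r.savings / roi_ratio;
    return r;
}
```

Calling estimate_roi(3000, 50, 10, 30) gives a debugging cost of 1500 cents ($15), a savings of 150 cents ($1.50), and an implied tool cost of 5 cents per line.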
As software grows in complexity, integrated static
analyzers represent a powerful and cost-effective tool to
help manage and control that complexity.