Looking at section 3.1 (OOM):
OOM testing is accomplished by simulating OOM errors. SQLite allows an application to substitute an alternative malloc() implementation using the sqlite3_config(SQLITE_CONFIG_MALLOC,...) interface. The TCL and TH3 test harnesses are both capable of inserting a modified version of malloc() that can be rigged to fail after a certain number of allocations. These instrumented mallocs can be set to fail only once and then start working again, or to continue failing after the first failure.
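Roughly, the mechanism looks like this. The sketch below is mine, not the TCL or TH3 harness code; only sqlite3_config(), SQLITE_CONFIG_GETMALLOC, SQLITE_CONFIG_MALLOC, and sqlite3_mem_methods are real SQLite API, and every other name (installFaultAllocator, allocsUntilFailure, etc.) is illustrative:

    #include <sqlite3.h>

    /* Illustrative fault-injection state; these names are assumptions,
    ** not taken from the real harnesses. */
    static sqlite3_mem_methods defaultMethods; /* SQLite's built-in allocator */
    static int defaultsSaved = 0;       /* defaultMethods has been captured */
    static int allocsUntilFailure = 0;  /* successful allocations left before failing */
    static int failPersistently = 0;    /* 0: fail once then recover, 1: keep failing */
    static int hasFailed = 0;           /* set once the first simulated OOM occurs */

    /* Decide whether the next allocation should be refused. */
    static int shouldFail(void){
      if( allocsUntilFailure>0 ){ allocsUntilFailure--; return 0; }
      if( hasFailed && !failPersistently ) return 0; /* recovered after one failure */
      hasFailed = 1;
      return 1;
    }

    static void *faultMalloc(int n){
      return shouldFail() ? 0 : defaultMethods.xMalloc(n);   /* 0 simulates OOM */
    }
    static void *faultRealloc(void *p, int n){
      return shouldFail() ? 0 : defaultMethods.xRealloc(p, n);
    }

    /* Install the instrumented allocator. sqlite3_config() may only be
    ** called before sqlite3_initialize() or after sqlite3_shutdown();
    ** SQLite copies the sqlite3_mem_methods struct, so a local is fine. */
    static int installFaultAllocator(int failAfter, int persistent){
      sqlite3_mem_methods m;
      if( !defaultsSaved ){
        int rc = sqlite3_config(SQLITE_CONFIG_GETMALLOC, &defaultMethods);
        if( rc!=SQLITE_OK ) return rc;
        defaultsSaved = 1;
      }
      m = defaultMethods;   /* keep xFree, xSize, xRoundup, xInit, xShutdown */
      m.xMalloc = faultMalloc;
      m.xRealloc = faultRealloc;
      allocsUntilFailure = failAfter;
      failPersistently = persistent;
      hasFailed = 0;
      return sqlite3_config(SQLITE_CONFIG_MALLOC, &m);
    }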
OOM tests are done in a loop. On the first iteration of the loop, the instrumented malloc is rigged to fail on the first allocation. Then some SQLite operation is carried out and checks are done to make sure SQLite handled the OOM error correctly. Then the time-to-failure counter on the instrumented malloc is increased by one and the test is repeated. The loop continues until the entire operation runs to completion without ever encountering a simulated OOM failure. Tests like this are run twice, once with the instrumented malloc set to fail only once, and again with the instrumented malloc set to fail continuously after the first failure.
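And a sketch of that loop, reusing installFaultAllocator() and hasFailed from the previous fragment; runOperationAndVerify() is a hypothetical placeholder for whatever SQLite operation the test exercises plus its SQLITE_NOMEM checks, and the shutdown/initialize-per-iteration shape is my assumption about how the allocator gets swapped between passes:

    /* Placeholder for the actual test body: perform some SQLite operation
    ** and verify that any SQLITE_NOMEM result was reported and handled. */
    static void runOperationAndVerify(void){
      /* e.g. open a database, prepare and run a statement, check return codes */
    }

    /* One OOM test pass: failPoint is the time-to-failure counter described
    ** above, starting at zero (fail on the very first allocation) and
    ** increasing until the operation completes with no simulated OOM. */
    static void oomTestLoop(int persistent){
      int failPoint;
      for(failPoint=0; ; failPoint++){
        installFaultAllocator(failPoint, persistent);
        sqlite3_initialize();
        runOperationAndVerify();
        sqlite3_shutdown();       /* allocator can only change while shut down */
        if( !hasFailed ) break;   /* ran to completion without a simulated OOM */
      }
    }

    /* Run twice, matching the two modes described above:
    **   oomTestLoop(0);   transient: malloc recovers after one failure
    **   oomTestLoop(1);   persistent: malloc keeps failing once triggered */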
Note that section 7 explicitly states 100% coverage of the core as determined by gcov. I agree with Donal Fellows that the test framework is largely responsible for the test coverage beyond what a call graph would suggest. It's a very different thing to see malloc() entered N times and write a test for it than it is to write dozens of tests geared to simulate environments where malloc() is likely to fail.
Yes, the resulting coverage is an artifact of diligence; however, so is the selection of a test framework that enables that kind of diligence.
Finally, reiterating the obvious, malloc() takes only a single size argument. This suggests that the tests written around it are there by deliberate design, not automatically generated.