Code coverage != code quality

My current project has a new requirement that in order to be feature complete we must reach 80% code coverage with our unit tests. At first, this seems like a good idea. You want to ensure code quality. Unit tests can help with that. So you figure you’ll come up with a way of measuring how much of your code is being tested. Soon, however, deadlines get tighter and actual features need to be finished. The code coverage is short of the required 80%. So you take the following code:

   if(unreachable) {
      doTheUnreachable();
   }

   int codeCoverageRequired = 80;
   cout << "Feature complete requires " << codeCoverageRequired << "% code coverage" << endl;

This code has 4 executable lines of code, only 3 of which are being executed. You’re only at 75% code coverage. So you make the following change:

   if(unreachable) {
      doTheUnreachable();
   }

   int codeCoverageRequired = 80;
   cout << "Feature complete requires ";
   cout << codeCoverageRequired;
   cout << "% code coverage";
   cout << endl;

This code has 7 lines of executable code, 6 of which are being executed. Now you’re at about 86% code coverage. You’ve boosted your code coverage numbers by roughly 11 percentage points simply by splitting one statement across several lines. The quality is no better, but your code coverage is higher.
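To make the arithmetic concrete, here is a minimal sketch of the line coverage calculation. The function name lineCoveragePercent is my own, made up for illustration; it is not something gcov provides.

```cpp
#include <cassert>

// Line coverage as a percentage: executed lines divided by executable lines.
// Hypothetical helper for illustration only; not part of gcov.
double lineCoveragePercent(int executedLines, int executableLines) {
    // You cannot execute more lines than exist.
    assert(executedLines >= 0 && executedLines <= executableLines);
    if (executableLines == 0) {
        return 0.0;  // avoid dividing by zero for an empty file
    }
    return 100.0 * executedLines / executableLines;
}
```

With the first version of the snippet, lineCoveragePercent(3, 4) gives 75. With the second, lineCoveragePercent(6, 7) gives roughly 85.7. The unreachable line is just as untested in both.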

You can also play games by under-reporting the total lines of code in your project. As it turns out, this is easier than you might think. The code coverage tool we’re using (gcov) appears not to report the lines of code in files that don’t get tested at all, so those lines never count against the total.
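The effect of a shrinking denominator can be sketched like this. It is only a model of the arithmetic, not of how gcov works internally, and the struct, the function names, and any file names or line counts you feed it are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-file line counts, invented for illustration.
struct FileCoverage {
    std::string name;
    int executableLines;
    int executedLines;
};

// Coverage percentage over whatever set of files is passed in.
double coveragePercent(const std::vector<FileCoverage>& files) {
    int executable = 0;
    int executed = 0;
    for (const FileCoverage& f : files) {
        assert(f.executedLines <= f.executableLines);
        executable += f.executableLines;
        executed += f.executedLines;
    }
    return executable == 0 ? 0.0 : 100.0 * executed / executable;
}

// Keeps only files with at least one executed line, imitating a report
// that silently leaves out files no test ever touched.
std::vector<FileCoverage> onlyFilesWithHits(const std::vector<FileCoverage>& files) {
    std::vector<FileCoverage> reported;
    for (const FileCoverage& f : files) {
        if (f.executedLines > 0) {
            reported.push_back(f);
        }
    }
    return reported;
}
```

Take a made-up project with a 100-line file that has 90 lines executed and a 200-line file that no test touches. Over all files the coverage is 30%, but over the files that show up in the filtered report it is 90%.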

By gaming the system you’re able to give management the warm, fuzzy feeling that the code quality is high when, in truth, the opposite may be true. Even a high code coverage number without gaming the system doesn’t necessarily mean that the code quality is high. Take the following, for example:

   size_t writehandle(void *ptr, size_t size, size_t nmemb, void *stream) {
      // libcurl hands over each received chunk in ptr; append it to the caller's string
      ((string*)stream)->append((char*)ptr, size * nmemb);
      return size * nmemb;
   }

   string requestUrl(string url) {
      static string buffer;

      CURL* ch = curl_easy_init();
      curl_easy_setopt(ch, CURLOPT_URL, "http://unclehulka.com/ryan/blog/");
      curl_easy_setopt(ch, CURLOPT_WRITEDATA, &buffer);
      curl_easy_setopt(ch, CURLOPT_WRITEFUNCTION, writehandle);
      curl_easy_perform(ch);
      curl_easy_cleanup(ch);

      return buffer;
   }

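If you call requestUrl() from your unit test, you’ll end up with 100% code coverage, yet this code is as buggy as it gets. Look at the ’static string’ declaration: the buffer persists across calls and is never cleared, so every call after the first returns the earlier responses with the new one appended.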
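Here is a network-free sketch of the same bug, assuming a canned response body in place of a real libcurl transfer. The names appendChunk and fakeRequest are stand-ins I made up; only the static buffer mirrors the code above.

```cpp
#include <cassert>
#include <string>

// Stand-in for the libcurl write callback: appends a chunk to the caller's buffer.
void appendChunk(const std::string& chunk, std::string* out) {
    assert(out != nullptr);
    out->append(chunk);
}

// Hypothetical stand-in for requestUrl(): "fetches" the given body instead of
// touching the network. The static buffer is kept on purpose to reproduce the bug.
std::string fakeRequest(const std::string& body) {
    static std::string buffer;  // persists across calls and is never cleared
    appendChunk(body, &buffer);
    return buffer;
}
```

A unit test that calls fakeRequest("first") once executes every line and sees the expected result. Only a second call, fakeRequest("second"), exposes the problem by returning "firstsecond", and coverage numbers say nothing about whether that second call was ever tested.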

The lesson is if you want to ensure code quality, use something that actually measures code quality.

3 Responses to “Code coverage != code quality”

  1. Sam Says:

    Presumably your management believes that you won’t game the system and that you will write useful tests to reach the metric. If you want something that tests how good the tests are, look at things like Jester:

    http://jester.sourceforge.net/

    Don’t know if something like that is available for C/C++.

  2. Ryan Says:

    They can presume what they want, I know what I’ve seen. ;)

  3. Ryan Says:

    Also, I’m more upset about meaningless numbers being a requirement for feature complete. I’m quite happy that they care about code quality, I just think they’re going about it the wrong way.

    The fact that my project has had multiple serious data corruption bugs recently, both in code with 100% code coverage, illustrates the point.
