How to Narrow Down What to Test

An old friend told me that her company did no automated testing, or any kind of systematic testing for that matter, because financially they were better off doing ad hoc testing and bug fixing in the week before the delivery date. That led to an interesting discussion about the idea that the customer does not pay for tests; she pays for working software. I was really surprised by this attitude towards testing, so I had to find out what the experts think. In the end I found the following quote from Kent Beck on Stack Overflow (more on the topic is available on Hacker News):

I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it. I do tend to make sense of test errors, so I’m extra careful when I have logic with complicated conditionals. When coding on a team, I modify my strategy to carefully test code that we, collectively, tend to get wrong.

This was almost a game changer for me, but then I remembered why I liked [automated] test cases in the first place. They give me confidence that my change does not modify existing functionality, and I don’t have to do manual tests repetitively. So I’m still good, and I learnt a new thing: when I talk to people about testing or work for somebody else, then I have to consider the time (money) factor, but in my own projects I’ll write as many automated tests as I can.

I have to admit that testing, especially with automated test cases, costs a lot. It looks like an extra expense in the short term, but it saves a lot of trouble in the long run. Unfortunately, not every organisation is mature enough to realize this, or can afford to spend expensive coding time on things that have no value to the customer. Nevertheless, I know that deep down they want to do testing, so in this post I’m going to share several methods for finding the areas that are worth testing, so that companies do not have to spend more time on testing than is absolutely necessary.

## How to Narrow Down Where to Start

The best way to save money is to be effective, so test those parts of the code which really need to be tested. I recommend writing tests for those parts of the code which…

…are used most often
…change frequently
…change data or work with financial data
…are more likely to fail

I found one of my old projects, which I wrote before I started doing eXtreme Programming. The code is not nice and there aren’t many test cases, but it works just fine. My friend, who used it for a while, told me that he never had any problems with it. I put a lot of manual testing effort into it, which I don’t want to do again, but I would like to extend the code, and I need that confidence Kent Beck talked about. So I’m going to use the list above to create an ordered list of classes and methods which I have to test in order to have that confidence during development.

The Mythical 80 Percent Line Coverage

Before explaining the mentioned list in more detail, let’s talk a bit about test code coverage.

I intentionally left test code coverage out of the list, not because I have problems with code coverage metrics, but because they have been misinterpreted for a long time. Test code coverage results do not show how good a test case harness is, just that certain parts of the code have been visited during test case execution.

Some companies use the mythical 80 percent line coverage principle, which means that they require developers to have enough test cases to cover 80 percent of the code. It is mythical because nobody knows where this number came from; at least I was unable to find a reference to it after several hours of research. Alberto Savoia wrote a very good forum post about code coverage, and I really recommend reading it before going further.

Here is an example of how to gain more than 30 percent test code coverage in 2 minutes. Have a look at the following code snippet and its coverage results:

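import java.io.File;

import org.apache.tools.ant.Project; // assumption: Project is Ant's Project class
import org.junit.Test;               // assumption: JUnit 4

import uploader.ant.HarvesterTask;

public class CheaterTest {
    // Runs the whole harvesting task but never checks a result, so it
    // raises coverage without verifying any behaviour.
    @Test
    public void shouldIncreaseTheCoverage() {
        HarvesterTask harvester = new HarvesterTask();
        Project project = new Project();
        project.setBaseDir(new File("."));
        harvester.setProject(project);
        harvester.setRepository("../repository");
        harvester.setHistory("history");
        harvester.setTemplate("templates");
        harvester.execute();
    }
}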

Coverage with cobertura:

[Figure: cobertura coverage report on the cheater test case]

Coverage with emma:

[Figure: emma coverage report on the cheater test case]

As you can see, my test case does not check the result (it has no assertions), but it executes without any problems and, more importantly, it covers more than 30 percent of the code base. Additionally, the coverage results are not consistent: the two tools reported different results for the very same test case on the very same code base. Conclusion: do not depend too much on test code coverage metrics unless you review the test code as well.
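To make the point concrete, here is a minimal sketch that is not taken from the project above. The class and method names are hypothetical. Both checks execute every line of the deliberately buggy add() method, so a coverage tool would report the same coverage for each, but only the second one can notice the bug.

```java
/** Hypothetical example: same coverage, very different test quality. */
final class CoverageWithoutAssertions {

    /** Deliberately buggy: subtracts instead of adding. */
    static int add(int a, int b) {
        return a - b;
    }

    /** Executes the code but checks nothing, so it can never fail. */
    static boolean cheaterCheck() {
        add(2, 3);
        return true;
    }

    /** Executes the same code and verifies the result. */
    static boolean realCheck() {
        return add(2, 3) == 5;
    }

    public static void main(String[] args) {
        System.out.println("cheater check passes: " + cheaterCheck());
        System.out.println("real check passes:    " + realCheck());
    }
}
```

The coverage number alone cannot tell these two checks apart, which is why the test code has to be reviewed as well.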

Determine Which Parts of the Code are Really Used

According to the Standish Group Study:

Another interesting statistic that Jim quoted was the large proportion of features that aren’t used in a software product. He quoted two studies: a DuPont study quoted only 25% of a system’s features were really needed. A Standish study found that 45% of features were never used and only 20% of features were used often or always.

[Figure: usage pie diagram]

Somehow we have to find that 20 percent and write test cases to cover it. That is easier said than done, but there are tools capable of showing which parts of the code have been visited: code coverage tools. We can use them to find out which parts of the code were visited while the customer used our software. **Mind that I’m not using the “test” prefix here, in order to avoid confusion.** Instrument the code base and deliver it to the testers or the customer (I used cobertura in my examples).

Instrumentation means that the code gets enhanced with flags, and when the execution passes a certain flag, it gets set. When a test code coverage measurement is made, the tool:

instruments the code (places these flags)
runs the test cases (flags get set)
and finally prints out the result (how many flags have been set)

During usage coverage measurement, test execution is replaced by customer interaction.
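The flag mechanism can be sketched by hand. This is only an illustration under my own assumptions: real tools such as cobertura insert the probes automatically into the compiled classes, and the CoverageFlags and InstrumentedGreeter names below are hypothetical.

```java
import java.util.BitSet;

/** Hypothetical stand-in for the bookkeeping a coverage tool generates. */
final class CoverageFlags {
    private final BitSet hits;
    private final int probeCount;

    CoverageFlags(int probeCount) {
        this.probeCount = probeCount;
        this.hits = new BitSet(probeCount);
    }

    /** Called by instrumented code whenever execution passes a probe. */
    void hit(int probeId) {
        if (probeId < 0 || probeId >= probeCount) {
            throw new IllegalArgumentException("unknown probe: " + probeId);
        }
        hits.set(probeId);
    }

    int hitCount() {
        return hits.cardinality();
    }

    double percentCovered() {
        return 100.0 * hitCount() / probeCount;
    }
}

/** A tiny class with three hand-placed probes, one per branch. */
final class InstrumentedGreeter {
    private final CoverageFlags flags;

    InstrumentedGreeter(CoverageFlags flags) {
        this.flags = flags;
    }

    String greet(String name) {
        flags.hit(0); // method entry
        if (name == null || name.isEmpty()) {
            flags.hit(1); // rarely used branch
            return "Hello, stranger";
        }
        flags.hit(2); // common branch
        return "Hello, " + name;
    }
}
```

Whether the calls to greet() come from a test case or from a real user makes no difference to the flags, which is what makes usage coverage possible.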

Unfortunately, QA people hate approaches like this, because instrumentation changes the original code base. But if you develop your own web applications and deploy them quite often, you can occasionally deploy an instrumented build. With this approach, you will know exactly which parts of your code are really used. If your product cannot be deployed that often, or you deliver to another department, then release a beta version (no new features, and delivery limited to a specific group) and let the testers give you this information. Believe me, they know what the customer needs, and you will be able to map those needs to code.

I reviewed and executed the existing test cases of my legacy application (columns on the left), and manually tested through the main use case (columns on the right). Let’s see whether I managed to cover the main use case with my tests:

[Figure: class coverage comparison]

[Figure: line coverage comparison]

As you can see, the uploader.ant package could use some test cases, along with the uploader.FileBasedVerifier class.

My “list of classes I have to test” looks like this at the moment:

uploader.FileBasedVerifier (less effort, almost there)
uploader.ant package (still a lot of work left, but there are only two classes in there)
…

Find Out Which Parts of the Code Change Often

Philosophically, change can mean a great many things, but in software development, when a file changes, it mostly means that:

something has been added to it
it had a problem and got fixed

On top of that, when someone changes a file that has no proper regression tests behind it, there is a very good chance that the change will introduce a regression. As Kent mentioned before, it is all about confidence. We need confidence in order to do our job well. So write some test cases for these files.

With this in mind, find out which files change often, because it is quite certain that something interesting is happening with them. I used a simple script to find out which files were changed (committed) most often in the version control system, from the beginning until this very moment:

14, VerifierTask.java
13, index.jsp
11, FileBasedUserHome.java
11, FileBasedUser.java
11, FileBasedContentTracker.java
 8, IntegrityCheckTask.java
 7, MailSender.java
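My original script is not shown here, but a list like the one above can be produced with a few lines of code. The sketch below is an assumption-laden illustration: it assumes the history is in git and that the input lines come from `git log --name-only --pretty=format:`, which prints only the changed file paths, with blank lines between commits. The ChangeFrequency class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: counts how often each file appears in a change log. */
final class ChangeFrequency {

    /** Counts one change per non-blank line; each line is a file path. */
    static Map<String, Integer> countChanges(List<String> logLines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : logLines) {
            String path = line.trim();
            if (path.isEmpty()) {
                continue; // blank lines only separate commits
            }
            counts.merge(path, 1, Integer::sum);
        }
        return counts;
    }

    /** Formats the counts as "count, file", most frequently changed first. */
    static List<String> report(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            // Ties are broken alphabetically to keep the output stable.
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            lines.add(String.format("%2d, %s", entry.getValue(), entry.getKey()));
        }
        return lines;
    }
}
```

With another version control system only the way the log lines are obtained changes; the counting stays the same.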

And our winner is: VerifierTask.java, which belongs to the uploader.ant package. Excellent, now we know a bit more about the uploader.ant package, and we can also put some interesting classes on our list:

uploader.FileBasedVerifier
uploader.ant.VerifierTask (on the top of our change list)
uploader.ant.HarvesterTask (not changed that often, but used by customer)
index.jsp
uploader.FileBasedUserHome
uploader.FileBasedContentTracker
…

Determine Which Part of the Code Changes Data

Everybody hates losing data. For example, if your code lists certain items (read-only operation) and you allow the user to change their status (read-write operation), you have to test the status change first and maybe take care of the listing later. From the customer’s perspective, it is better to fail to list something than to lose or corrupt it. There is only one thing worse than data loss, and that is money loss. If your application handles money, then test it really thoroughly.

One way to find these parts of code is to perform a code review. I have the following classes:

uploader/admin/ChangePassword.java
uploader/ant/HarvesterTask.java
uploader/ant/VerifierTask.java
uploader/Checksum.java
uploader/FileBasedContentTracker.java
uploader/FileBasedMetadata.java
uploader/FileBasedUserHome.java
uploader/FileBasedUser.java
uploader/FileBasedVerifier.java
uploader/FileHelper.java
uploader/HtmlHelper.java
uploader/ListHelper.java
uploader/LockExpiredException.java
uploader/LoginBean.java
uploader/MailSender.java
uploader/ReportProcessor.java
uploader/Type.java
uploader/UserExistsException.java
uploader/UserHelper.java
uploader/UserHome.java
uploader/User.java
uploader/UserNotFoundException.java
uploader/Verifier.java

Based on their names, ChangePassword and FileHelper may change something. The case of ChangePassword is quite obvious, and during the code review I found that FileHelper has the following methods: setContent(), delete(), and copyFile(). Perfect, I put them on the list immediately. They might already be covered by existing test cases, but better safe than sorry. Additionally, it turned out during the code review that FileBasedUserHome and FileBasedContentTracker change data too, so they move up a little bit on the list:

uploader.FileBasedVerifier
uploader.ant.VerifierTask
uploader.ant.HarvesterTask
uploader.FileBasedUserHome (modifies the user’s metadata)
uploader.FileBasedContentTracker (also modifies the user’s metadata)
*uploader.FileHelper (modifies files on the file system, but hasn’t been changed that often)*
index.jsp
uploader.admin.ChangePassword (modifies user’s password)
…
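I did this review by hand, but the name-based first pass could be partly automated. The following is a rough sketch of that idea and not the method used above: it uses reflection to list methods whose names start with a prefix that hints at modification. The prefix list and the FileHelperStandIn class are hypothetical; the stand-in only borrows the three method names mentioned earlier.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: flags methods whose names suggest they change data. */
final class MutatorScan {
    // Assumed prefixes; adjust them to the naming habits of your code base.
    private static final List<String> MUTATING_PREFIXES =
            Arrays.asList("set", "delete", "remove", "copy", "write", "save", "update");

    /** Returns the sorted names of declared methods with a mutating prefix. */
    static List<String> suspectMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic()) {
                continue; // skip compiler-generated methods
            }
            for (String prefix : MUTATING_PREFIXES) {
                if (method.getName().startsWith(prefix)) {
                    names.add(method.getName());
                    break;
                }
            }
        }
        Collections.sort(names);
        return names;
    }
}

/** Stand-in class; the method bodies are intentionally empty. */
final class FileHelperStandIn {
    void setContent(String content) { }
    void delete() { }
    void copyFile() { }
    String getContent() { return ""; }
}
```

A heuristic like this only produces candidates. It will flag harmless methods and miss badly named ones, so it does not replace reading the code.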

Determine Where the Code Is Most Likely Going to Fail

Static code checkers are often used to find programming errors. When a programming error is obvious to a tool, then the real application will most likely fail when it executes that faulty code. It’s worth having test cases for parts of the code base which are more likely to fail, so that you don’t waste your precious programming time on fixing defects for free.

I executed findbugs and crap4j on my code. Findbugs does what its name says, and crap4j combines cyclomatic complexity with *test code coverage* to find problematic code snippets. It has a nice algorithm, but the point is that the more complex your code is and the fewer test cases you have to cover it, the more problematic your code is. It is pretty straightforward. It’s safe to use crap4j in this project, because I reviewed the test cases and they really do test something.
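For reference, the CRAP score of a method, as published by the crap4j authors, is comp² × (1 − cov/100)³ + comp, where comp is the cyclomatic complexity and cov is the test coverage in percent. Here is a small sketch of that formula; the CrapScore class is my own hypothetical helper and not part of crap4j.

```java
/** Hypothetical helper that evaluates the published CRAP formula. */
final class CrapScore {

    /**
     * @param complexity      cyclomatic complexity of the method (at least 1)
     * @param coveragePercent test coverage of the method, from 0 to 100
     */
    static double crap(int complexity, double coveragePercent) {
        if (complexity < 1 || coveragePercent < 0 || coveragePercent > 100) {
            throw new IllegalArgumentException("complexity >= 1 and 0 <= coverage <= 100 required");
        }
        double uncovered = 1.0 - coveragePercent / 100.0;
        return complexity * complexity * Math.pow(uncovered, 3) + complexity;
    }

    public static void main(String[] args) {
        System.out.println(crap(5, 0));   // untested method of moderate complexity
        System.out.println(crap(5, 100)); // same method, fully covered
        System.out.println(crap(10, 50)); // complex method, half covered
    }
}
```

A fully covered method scores exactly its complexity, while an untested one is penalised by the square of its complexity. That is why the score is only trustworthy when the tests behind the coverage number really check something.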

The result of the findbugs execution:

[Figure: findbugs result]

The result of the crap4j execution:

[Figure: crap4j result]

Ouch, the results aren’t pretty, but on the other hand I now have a more fine-grained list of test candidates:

uploader.FileBasedVerifier.setVerified() (crap4j)
*uploader.ant.HarvesterTask.execute() (findbugs, 2nd on the crap4j list)*
uploader.ant.VerifierTask.execute() (1st on the crap4j list)
uploader.FileBasedVerifier rest of the methods (previous steps)
uploader.FileHelper (thanks findbugs, but this isn’t that problematic after all)
uploader.FileBasedUserHome
uploader.FileBasedContentTracker
index.jsp
uploader.admin.ChangePassword
…

Until this point, I wasn’t sure how to handle the HarvesterTask and VerifierTask classes, but thanks to the static code checkers, I know now.

Determine Which Part of the Code is Commonly Used

This step is a bonus if you want to make your list more fine-grained.

When I code, I hardly ever write test cases for error scenarios (please don’t hold it against me). In the next step I’m going to do a quick code review and see what the code looks like.

A snippet from my uploader.ant.HarvesterTask class:

if (!user.getVersions(Type.JAVA).isEmpty()) {
    if (user.integrityCheck(Type.JAVA)) {
        resultFsckMessage = MESSAGE_OK;
        try {
            if (harvestJava(user)) {
                resultHarvestMessage = MESSAGE_OK;
            } else {
                resultHarvestMessage = MESSAGE_NONE;
            }
        } catch (IOException e) {
            resultHarvestMessage = MESSAGE_FAILED;
            log(userName + " JAVA IO error: " + e.getMessage());
        }
    } else {
        resultFsckMessage = MESSAGE_FAILED;
    }
} else {
    resultHarvestMessage = MESSAGE_NONE;
}

In this case, I will write test cases which cover the case when there are Type.JAVA versions, because based on the usage of the software, customers are mostly using Java classes. **Cover the most common scenarios in your test cases, and when you find classes or methods involved only in error handling, put them at the end of the list.**

Conclusion

I started with 23 classes and no guidance on where to begin. After approximately 1.5 hours of work (which I won’t have to repeat, because the scripts are already in place), I ended up with a prioritized list of classes and methods which I need to test first if I don’t want to cause myself too much trouble by breaking existing functionality or adding more defects to my code. The next time I have to perform a similar examination, I’ll just execute my scripts and evaluate the result in 5 minutes. With this approach, I can save hours of coding work.

I don’t recommend writing test cases for every item on the list above. The goal was to narrow down where to test. Check how much time you have and work through the list from top to bottom. For example, testing index.jsp will cost you a lot of time, because you cannot test it the usual way. You need a higher-level test framework like selenium, which is capable of testing web applications.

I hope these ideas help to find the places you may want to test. If you have a different method or opinion, please share it in the comment section.