Saturday, March 14, 2009

On the smallest possible condition

Consider how 'if' statement logic can frequently obscure the behavior of the code.

A simple 'if' statement frequently contains two or three conditions; sometimes more, rarely fewer.

This results in conditions that are hard to read and hard to maintain. In some cases a single condition bundles two or more distinct groups of logical tests.

As an example, consider the following simple condition:

if (name != null && name.length() < 20 && "CA".equals(address.country)) {
 ...
}

As programmers we may recognize that there are two intended conditions here. One deals with the validity of the name, while the other deals with the address being a domestic address (if you're in Canada). In trying to clean up this code we will frequently break this 'and' of two conditions into two 'if' statements like so:

if (name != null && name.length() < 20) {
 if ("CA".equals(address.country)) {
  ...
 }
}

This strikes me as no clearer than the first case. The dual 'if' statements correctly identify the two conditions, but their separation does not clarify the meaning of the code.

In my mind this clean up is heading in the wrong direction. I propose we take a step back to the original example.

If the conditions can be named descriptively, then maybe we can clear this up a little. I propose creating methods that express the two conditions separately: isNameValid() and isDomesticAddress().

This should result in the following condition:

boolean isNameValid() {
 return name != null && name.length() < 20;
}

boolean isDomesticAddress() {
 return "CA".equals(address.country);
}

...
if (isNameValid() && isDomesticAddress()) {
 ...
}

This now speaks volumes more than either of the previous examples. For one thing, it defines a clear logical combination of named conditions.

Arguably we could have applied the same refactoring to the two-if structure, and it would have looked like this:

if (isNameValid()) {
 if (isDomesticAddress()) {
  ...
 }
}

I don't think this helps clarify things as much as the combined if statement.

In keeping with the idea of condition methods, the and-ed content of the condition in the third code snippet should itself be regrouped into a method:

boolean isValidForDomesticShipping() {
 return isNameValid() && isDomesticAddress();
}

and subsequently the conditional would look like this:

if (isValidForDomesticShipping()) {
 ...
}

Now isValidForDomesticShipping() can be easily tested on its own, and the 'if' statement reads much better.

Negations

Continuing along the road to conditional clarity, let us consider the impact of negative conditions. In C-style languages such as C++, C# and Java, a boolean expression is negated with an exclamation point. The simplicity of this syntax is great, except that it really doesn't jump out when speed-reading the code. Python doesn't suffer from this problem: its negation is a resounding three-letter operator, "not".

An 'if' statement should not use a negative condition to invert the behavior of a conditional method. Instead, it is clearer to write a conditional method that represents the inversion explicitly. This forces the method to say what the condition means.

isNotValidForShipping stands out more clearly than !isValidForShipping.
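A minimal Java sketch of this idea follows. The ShippingSlip class, its fields, and its constructor are assumptions made for illustration; only the two method names come from the text above.

```java
// Hypothetical class for illustration; not part of the original example.
class ShippingSlip {
    private final String name;
    private final String country;

    ShippingSlip(String name, String country) {
        this.name = name;
        this.country = country;
    }

    private boolean isNameValid() {
        return name != null && name.length() < 20;
    }

    private boolean isDomesticAddress() {
        return "CA".equals(country);
    }

    boolean isValidForShipping() {
        return isNameValid() && isDomesticAddress();
    }

    // The negation gets its own name, so the call site needs no '!'.
    boolean isNotValidForShipping() {
        return !isValidForShipping();
    }
}

class NegationDemo {
    public static void main(String[] args) {
        ShippingSlip slip = new ShippingSlip(null, "CA");
        if (slip.isNotValidForShipping()) {
            System.out.println("cannot ship");
        }
    }
}
```

The single '!' still exists, but it is confined to one small method whose name states the inversion.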

Help, I'm drowning in methods

As the number of conditional methods increases, you may find your classes getting a little cluttered. This may be a sign that the classes have too many members.

To unclutter a class, push some of its members and their corresponding methods (including conditional methods) into smaller classes.

This means that instead of a string member called name, your shipping slip may have a member of type CustomerName, as well as a member of type Address.

An example of the above code with delegated responsibility could resemble this:

boolean isValidForDomesticShipping() {
 return customerName.isValid() && shippingAddress.isDomestic();
}

Now this is starting to make the code cleaner and clearer still.
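As a rough sketch of what the delegated classes might look like: the class names CustomerName and Address come from the text, while their fields, constructors, constants, and the ShippingSlip wrapper are assumptions.

```java
// Hypothetical value classes; the bodies are assumptions for illustration.
class CustomerName {
    private static final int MAX_LENGTH = 20;
    private final String value;

    CustomerName(String value) {
        this.value = value;
    }

    boolean isValid() {
        return value != null && value.length() < MAX_LENGTH;
    }
}

class Address {
    private static final String DOMESTIC_COUNTRY = "CA";
    private final String country;

    Address(String country) {
        this.country = country;
    }

    boolean isDomestic() {
        return DOMESTIC_COUNTRY.equals(country);
    }
}

class ShippingSlip {
    private final CustomerName customerName;
    private final Address shippingAddress;

    ShippingSlip(CustomerName customerName, Address shippingAddress) {
        this.customerName = customerName;
        this.shippingAddress = shippingAddress;
    }

    // The slip only combines answers; each member owns its own condition.
    boolean isValidForDomesticShipping() {
        return customerName.isValid() && shippingAddress.isDomestic();
    }
}
```

Each conditional method can now be tested through its own small class, without building a whole shipping slip.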

Conclusion

Rules

  1. 'if' statements should not include conditions, only a call to a conditional method.
  2. As much as possible, an 'if' statement should not use a negative condition to invert the behavior of a conditional method. Instead, it is clearer to write a conditional method that represents the negation in plain English (or whatever language you're programming in).
  3. A conditional method should use only one comparison operation (!=, >, <, ==, etc) or one combining operation (&&, ||). This forces combined conditions to use condition methods on the individual members.
  4. Conditional methods should never contain 'if' statements themselves; they should be composed of logical operators and conditional method calls only.
  5. Complex condition methods should be broken into a number of conditional methods.

This approach to condition evaluation helps to make code self-documenting. It can also facilitate testing of conditional logic as it separates the evaluation of the condition from the if statement.

If I find that a class is accumulating too many conditional methods, chances are that the class does too much. Some of its members need to be split off into new classes, and the conditional methods will follow.

Friday, February 13, 2009

On testing data objects, applied TDD

In a previous post I discussed how you might approach unit testing a data object. In a test-first approach to development, we first write the tests (obviously), and then write the simplest code that makes the tests pass. In this exercise I will propose the tests; writing the code will be up to you. This exercise is designed to be written in Java with JUnit 3, but can be easily adapted to another xUnit framework and language.

The context of the exercise is the reservation of tables at a fair. A person can reserve one or more tables on which to present their offerings.

First, let's imagine we want to create a reservation for a single table by 'bob'.

public void testSingleTableReservationBelongsToReserver() {
 Reservation bobsReservation = Reservation.forSingleTable("bob");
 assertEquals("bob", bobsReservation.getReservedBy());
}

Ask yourself what the simplest code to implement the reservation class would look like, and write it. Make sure your test passes.

Next we will validate that the reservation is specifically for one table. This is another independent test since I want to be able to easily identify the source of failure if a test fails. Chaining multiple assertions in a single test can cause one failure to hide another.

public void testSingleTableReservationReservesOneTable() {
 Reservation bobsReservation = Reservation.forSingleTable("bob");
 assertEquals(1, bobsReservation.getNumberOfTables());
}

Again, implement the simplest code to make the test pass. In this case the method getNumberOfTables could return the constant '1'.

Next let's validate that the cost of the reservation is the sum of the registration fee ($30) and the cost per table ($20). Again this method could return a hard-coded constant.

public void testCostOfSingleTableReservationIsCostPerTablePlusRegistrationFee() {
 Reservation bobsReservation = Reservation.forSingleTable("bob");
 assertEquals(50, bobsReservation.getTotalReservationCost());
}

Next let's do a negative test for the creation of a multi-table reservation. The maximum number of tables allowed is 4 per person.

public void testCannotCreateReservationForMoreThanMaxAllowedTables() {
 try {
  Reservation.forMultipleTable("jim", 5);
  fail();
 } catch (TooManyTablesRequestedException expected) {
  // expected: 5 tables exceeds the maximum of 4
 }
}

It seems that Jim's out of luck. This test forced us to create a new factory method as well as a new exception. If your new method has an 'if' statement, you've already gone too far. At this point your method should contain only a single statement: a throw of the new exception. I know it's a painfully small step to many newbies, but we are working our way up slowly, as would be done in TDD. It is an incremental development process.
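For comparison, here is a sketch of how small the class might be at this exact point in the exercise. It is one possible answer rather than the intended one, and it assumes TooManyTablesRequestedException is a checked exception with a no-argument constructor.

```java
// Assumption: a checked exception with no extra state.
class TooManyTablesRequestedException extends Exception {
}

class Reservation {
    private final String reservedBy;

    private Reservation(String reservedBy) {
        this.reservedBy = reservedBy;
    }

    static Reservation forSingleTable(String reservedBy) {
        return new Reservation(reservedBy);
    }

    // Simplest code that satisfies the only multi-table test so far:
    // no 'if', just a throw. Later tests will force real logic in here.
    static Reservation forMultipleTable(String reservedBy, int numberOfTables)
            throws TooManyTablesRequestedException {
        throw new TooManyTablesRequestedException();
    }

    String getReservedBy() {
        return reservedBy;
    }

    // Hard-coded constants are enough for the single-table tests.
    int getNumberOfTables() {
        return 1;
    }

    int getTotalReservationCost() {
        return 50;
    }
}
```

The method is obviously wrong for any valid request, and that is the point: only the next test gives us a reason to change it.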

Ok, maybe Jim can settle for reserving 4 tables.

public void testCreateReservationForMaximumNumberOfTables() throws Exception {
 Reservation jimsReservation = Reservation.forMultipleTable("jim", 4);
 assertEquals(4, jimsReservation.getNumberOfTables());
}

So far things are moving along well. Our business rules for the calculation of the cost of the reservation may need some elaborating. Let's ask for the cost of reserving two tables.

public void testCostOfReservationForTwo() throws Exception {
  Reservation chrisReservation = Reservation.forMultipleTable("chris", 2);
  assertEquals(70, chrisReservation.getTotalReservationCost());
}

At this point we may apply a 10% discount to Jim's reservation because he's reserving 4 tables.

public void testCostOfReservationForFourTablesIncludesDiscount() throws Exception {
  Reservation jimsReservation = Reservation.forMultipleTable("jim", 4);
  assertEquals(99, jimsReservation.getTotalReservationCost());
}

We will now add two tests that will target the updating of the data object's numberOfTables property.

public void testChangeNumberOfTablesReservedOverridesPreviousNumber() throws Exception {
  Reservation joesReservation = Reservation.forMultipleTable("joe", 4);
  joesReservation.setNumberOfTables(3);
  assertEquals(3, joesReservation.getNumberOfTables());
}
 
public void testChangeNumberOfTablesReservedChangesCostOfReservation() throws Exception {
  Reservation joesReservation = Reservation.forMultipleTable("joe", 4);
  joesReservation.setNumberOfTables(1);
  assertEquals(50, joesReservation.getTotalReservationCost());
}

In many baby steps we have developed a simple, well-tested implementation of Reservation. This particular exercise is trivial, and did not push us to do any complicated refactoring. In practice we would want to follow the three-step TDD cycle: red, green, refactor.

Usually the hardest part of test-first development is imagining what to test. I hope this exercise has shed some light on how to start writing tests.
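If you want something to compare your own result against, the sketch below is one possible end state that satisfies all of the tests above. The constant names and the private helper are assumptions, and so is the rule that the 10% discount applies from 4 tables up, since the tests only pin down the cost for 1, 2 and 4 tables.

```java
// Assumption: a checked exception with no extra state.
class TooManyTablesRequestedException extends Exception {
}

class Reservation {
    private static final int MAX_TABLES = 4;
    private static final int REGISTRATION_FEE = 30;
    private static final int COST_PER_TABLE = 20;
    private static final int DISCOUNT_PERCENT = 10;

    private final String reservedBy;
    private int numberOfTables;

    private Reservation(String reservedBy, int numberOfTables) {
        this.reservedBy = reservedBy;
        this.numberOfTables = numberOfTables;
    }

    static Reservation forSingleTable(String reservedBy) {
        return new Reservation(reservedBy, 1);
    }

    static Reservation forMultipleTable(String reservedBy, int numberOfTables)
            throws TooManyTablesRequestedException {
        if (numberOfTables > MAX_TABLES) {
            throw new TooManyTablesRequestedException();
        }
        return new Reservation(reservedBy, numberOfTables);
    }

    String getReservedBy() {
        return reservedBy;
    }

    int getNumberOfTables() {
        return numberOfTables;
    }

    // No test constrains the setter yet, so it performs no validation.
    void setNumberOfTables(int numberOfTables) {
        this.numberOfTables = numberOfTables;
    }

    int getTotalReservationCost() {
        int fullCost = REGISTRATION_FEE + COST_PER_TABLE * numberOfTables;
        if (qualifiesForDiscount()) {
            return fullCost * (100 - DISCOUNT_PERCENT) / 100;
        }
        return fullCost;
    }

    // Assumed rule: the discount starts at the maximum of 4 tables.
    private boolean qualifiesForDiscount() {
        return numberOfTables >= MAX_TABLES;
    }
}
```

Note that the cost is computed on demand rather than stored, which is what lets the last test (changing the number of tables changes the cost) pass without extra code.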

Sunday, February 8, 2009

On handling exceptions in happy path tests

I'd like to start this post with a reflection on the goal of unit testing.

The goal of unit tests is to validate the single, precise behavior of a block of code. This means that a test's value is directly dependent on its ability to spot deviations in that behavior.

So let's look at a simple test, the likes of which I have seen before:

void testSomeBizarreCondition() {
 try {
  op.someOperationThatShouldBeOK();
  assertEquals("abc", op.getValue());
 } catch (NullPointerException e) {
  fail("oops");
 } catch (SomeOperationException e) {
  fail("oops");
 }
}

What is wrong with this test? If the test should fail with an "oops" there is no way to differentiate between the two error conditions. Instead of reading the test report logs and quickly correcting the code, some developers will find themselves debugging the tests. The answer to that is not to change the failure messages for each catch, because this would lead us down the path of catching every imaginable exception type.

Another reason not to favor catching individual exception types is illustrated below:

void testSomeBizarreConditionThrowsAnOperationException() {
 try {
  op.someOperationThatShouldBeOK();
  fail("oops");
 } catch (NullPointerException e) {
  fail("oops");
 } catch (SomeOperationException e) {
 }
}

At a glance, can you tell what the goal of the test is? Maybe you can, but it isn't obvious, and you shouldn't have to think about it. This test is intended to validate that SomeOperationException is thrown during test execution. Unfortunately this intention is not evident.

In order to illustrate the solution to both of the above tests, let's look at the basic concept. Below is a test that represents a simplified version of the first example.

void testSomeOptimisticCondition() {
 try {
  op.someOperationThatShouldBeOK();
  assertEquals("abc", op.getValue());
 } catch (Throwable th) {
  fail("bad");
 }
}

This example still masks the reason for the failure: the cause of the exception is nowhere in evidence. The stack trace and exception details go a long way toward facilitating a quick code correction.

A second problem with this test is that it masks the details of the assertion failure. This could be corrected by changing the catch type from Throwable to Exception. This works because JUnit signals an assertion failure by throwing an AssertionFailedError, which is an Error rather than an Exception: a catch of Throwable intercepts it, while a catch of Exception lets it through.
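The difference between the two catch types can be seen without JUnit at all. The sketch below is an illustration under assumptions: a plain java.lang.AssertionError stands in for JUnit's assertion failure (both are subclasses of Error), and the class and method names are invented.

```java
class CatchScopeDemo {
    // Stand-in for a failing assertEquals("abc", op.getValue()).
    static void failingAssertion() {
        throw new AssertionError("expected:<abc> but was:<xyz>");
    }

    // catch (Throwable) intercepts the assertion failure and hides its message.
    static String reportWhenCatchingThrowable() {
        try {
            failingAssertion();
            return "passed";
        } catch (Throwable th) {
            return "bad";
        }
    }

    // catch (Exception) does not match an Error, so the failure propagates.
    // The outer catch plays the role of the test runner reporting it.
    static String reportWhenCatchingException() {
        try {
            try {
                failingAssertion();
                return "passed";
            } catch (Exception e) {
                return "bad";
            }
        } catch (AssertionError original) {
            return original.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(reportWhenCatchingThrowable());
        System.out.println(reportWhenCatchingException());
    }
}
```

The first report is just "bad", while the second carries the original expected/actual message, which is what you actually want to read in a test log.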

The solution

It's hard to imagine, but sometimes the best thing to do is let the exception propagate out of the test. For unchecked exceptions this is easy: the test will fail, and the cause will be plain as day in the test output.

For checked exceptions this is a little more complicated, but only slightly. Let the test method throw a generic exception.

void testSomeOptimisticCondition() throws Exception {
 op.someOperationThatShouldBeOK();
 assertEquals("abc", op.getValue());
}

Not only does the test now guarantee that all failure conditions are explicit, it is also a lot clearer.

Test cleanup

Now all this discussion on removing clutter may well have some readers thinking: "yeah, but I use a catch to clean up after the test". It is true that some tests require the cleaning up of external resources. The problem with doing so in a try/catch is that it can complicate test logic. There are two viable solutions: one is to use a try/finally, putting the cleanup in the finally block, and the other is to put the cleanup logic in the tear-down method. I tend to favor the tear-down, because it removes clutter from the test method.

Tests should always fail fast. So how do we handle exceptions in happy-path tests? We don't.
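As an illustration of the try/finally option, here is a dependency-free sketch. TempResource and CleanupDemo are hypothetical names, and the test body simply simulates a failure; in a JUnit 3 test case the same close() call could instead live in an overridden tearDown() method.

```java
// Hypothetical external resource that must be released after the test.
class TempResource {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    void close() {
        open = false;
    }
}

class CleanupDemo {
    // Kept in a field so the cleanup can be observed after the test body ran.
    static TempResource lastResource;

    // No catch block: the failure propagates, yet the cleanup still runs.
    static void testWithCleanup() {
        TempResource resource = new TempResource();
        lastResource = resource;
        try {
            throw new IllegalStateException("simulated failure in the code under test");
        } finally {
            resource.close();
        }
    }

    public static void main(String[] args) {
        try {
            testWithCleanup();
        } catch (IllegalStateException e) {
            System.out.println("failure reported: " + e.getMessage());
        }
        System.out.println("resource still open: " + lastResource.isOpen());
    }
}
```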

Sunday, January 18, 2009

On testing data objects, the first step towards TDD

The simplest class behavior to test is the basic mapping of an object's properties, either through its constructors or through its accessors. This translates to validating the correctness of the object's creation, or validating the correctness of the object's modification.

Validating object construction

The first and easiest test cases involve the construction of an object. Usually the first test case involves validating an object's default values through a default constructor. Remember to question what an accessor should return if an object field is null.

Subsequent tests validate that the construction parameters are correctly represented through the object's accessors.

I encourage validating a single object property per test. This makes it easier to identify what code is at fault when a test fails. It also makes it easier to define test names that describe the expected result of the test.

Validating object modification

The next step in object testing may well be the modification of properties. I am personally of the opinion that this should cause us to question our object modeling, but there are some valid uses for this, see below.

These tests involve constructing an object with one set of values, and validating that modifying a property truly reassigns its value. It is important to use test data that clearly demonstrates the modification of the field. So if the initial value is '1', the modification should assign '2'.

There is little interest in validating multiple subsequent modifications on a specific object instance unless the object supports some special caching or history behavior. In most cases an assignment is an assignment is an assignment. Don't clutter your tests with sequential validations.
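The construction and modification tests described above could be sketched as follows. The Customer class and its properties are invented for illustration, and the small assertEquals helper is only a stand-in for the JUnit method of the same name so that the example has no dependencies.

```java
// Hypothetical data object used only for this illustration.
class Customer {
    private final String name;
    private int tableCount;

    Customer(String name, int tableCount) {
        this.name = name;
        this.tableCount = tableCount;
    }

    String getName() {
        return name;
    }

    int getTableCount() {
        return tableCount;
    }

    void setTableCount(int tableCount) {
        this.tableCount = tableCount;
    }
}

class CustomerTests {
    // Stand-in for JUnit's assertEquals.
    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    // Construction: one property per test.
    static void testConstructorSetsName() {
        Customer customer = new Customer("bob", 1);
        assertEquals("bob", customer.getName());
    }

    static void testConstructorSetsTableCount() {
        Customer customer = new Customer("bob", 1);
        assertEquals(1, customer.getTableCount());
    }

    // Modification: the new value (2) clearly differs from the initial one (1).
    static void testSetTableCountOverridesPreviousValue() {
        Customer customer = new Customer("bob", 1);
        customer.setTableCount(2);
        assertEquals(2, customer.getTableCount());
    }

    public static void main(String[] args) {
        testConstructorSetsName();
        testConstructorSetsTableCount();
        testSetTableCountOverridesPreviousValue();
        System.out.println("all tests passed");
    }
}
```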

Why validate the obvious case?

In the JavaBean and pre-version 3.0 C# approach to properties, an accessor is an explicit block of code. Consequently, it can contain bugs. This is rarely the case, but it does happen from time to time. Usually such bugs are quickly identified and fixed, and subsequently not repeated, but there are nonetheless reasons to validate them.

Remember that even if the property uses the trivial implementation, it remains a kind of method. Even C#'s properties are a kind of field-separated accessor with a signature that remains independent of the internal class data representation. This means that through the magic of refactoring, changes to the underlying structure of a class can continue to expose identical properties even if the origin of said properties has changed. Wouldn't it be nice if we could validate that a refactoring of a class' internal workings continued to provide the same external behavior? I certainly think so.

In C# 3.0, it is now possible to specify a property with a short-hand notation which automatically generates the trivial implementation. This does not invalidate my previous argument. The short-form syntax can be replaced with an elaborate evaluation logic without changing the class' signature. A simple test could help to catch a sudden change in behavior. Remember that tests are a kind of code-level contract. Changes in class behavior are a change in contract that must be re-negotiated by the programmer for all the affected parties including the test code.
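The same argument can be sketched in Java, where accessors are always explicit methods. The two classes below are hypothetical "before" and "after" versions of one class (given different names only so they can sit side by side): the accessor signature stays the same while the internal representation changes, and one identical check guards both.

```java
// Before the refactoring: the accessor returns a stored field.
class StoredFullName {
    private final String fullName;

    StoredFullName(String firstName, String lastName) {
        this.fullName = firstName + " " + lastName;
    }

    String getFullName() {
        return fullName;
    }
}

// After the refactoring: same accessor signature, value computed on demand.
class ComputedFullName {
    private final String firstName;
    private final String lastName;

    ComputedFullName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getFullName() {
        return firstName + " " + lastName;
    }
}

class AccessorContractDemo {
    public static void main(String[] args) {
        String before = new StoredFullName("bob", "smith").getFullName();
        String after = new ComputedFullName("bob", "smith").getFullName();
        // The trivial test is the contract: both versions must agree.
        if (!before.equals("bob smith") || !after.equals("bob smith")) {
            throw new AssertionError("accessor behavior changed");
        }
        System.out.println("contract holds");
    }
}
```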

How far should we go with testing the trivial implementation? Not very far. The tests are just a safeguard against obvious violations, and the number of tests should reflect the complexity of the tested code. Only if the underlying code grows in complexity should the tests grow in number.

What about encapsulation?

Let me underline that this discussion focuses on data objects: objects whose primary function is to store and transport data. Though there are contexts in which they are appropriate, and others in which they can be avoided, they remain the most frequent type of object that I've seen in application code bases. Regardless of your opinion on object modeling, they remain the easiest kind of object to test, and so for a TDD newbie, they are the low-hanging fruit.

Conclusion

Data objects are easy to test. The simplicity of their implementations makes them an ideal starting point for test-driven development newbies. Start by validating object construction, then move on to object modification.

Thursday, November 27, 2008

greetings

Hi all, this is a short message to welcome you to the code reef blog. This blog will hopefully grow from my musings and comments on the developing of software, the software user, and everything in between.

Why "the code reef"? Though I am not a diver, I find the allusion to be an imagination-inspiring one. I am reminded of nature shows I watched as a youngster. Shows where divers would swim through reefs around the world and show us the beautiful schools of fish that inhabited the reef, the complex ecosystem of living creatures, some cohabiting peacefully or symbiotically, others preying on those lower in the food chain. Many potentially fatal dangers lurk hidden in these reefs as well.

I am also reminded that the ecosystem of a reef is a fragile one, easily unbalanced and destroyed by external factors. Pollution, natural disasters, violent weather patterns, and sometimes a simple imbalance in the natural order of things can all contribute to the unfortunate end of the beauty and promise that once was.

In short, I find the comparison fitting - much of what I have said about the ecosystem of reefs, could also be used to describe the industry in which I work, the projects I have been involved in, and the teams and technologies I have worked with.

I love software development most of the time and hate it at others, but I can safely say that I learn something new every day. Hopefully some of the more interesting morsels will find their way into these posts.