A TDD Example

How about a little walkthrough of TDD?

The Problem

We’ll do the Recurring Rainfall Problem:

Design a [function] called rainfall that consumes a list of numbers representing daily rainfall amounts as entered by a user. The list may contain the number -999 indicating the end of the data of interest. Produce the average of the non-negative values in the list up to the first -999 (if it shows up). There may be negative numbers other than -999 in the list.

The Process

We’ll use Uncle Bob’s Three Rules of TDD:

You are not allowed to write any production code unless it is to make a failing unit test pass.
You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.
You are not allowed to write any more production code than is sufficient to pass the one failing unit test.

This means that, yes, we begin by writing a unit test for the function which we have not yet written, because “compilation failures are failures.” We iterate as follows:

while not finished: [RedHat] write a(nother) failing unit test [GreenHat] make it (and all previous tests) pass [BlueHat] refactor

A Walkthrough

We’ll assume you have a Python interpreter and py.test installed. Start (in red hat mode) by writing your first unit test (in rainfall_test.py):

from rainfall import rainfall
def test_empty_dataset_produces_zero():
    assert rainfall([]) == 0

And we run it, expecting failure:

$ py.test -q rainfall_test.py
=============================== ERRORS ================================
__________________ ERROR collecting rainfall_test.py __________________
rainfall_test.py:1: in 
    from rainfall import rainfall
E   ImportError: No module named 'rainfall'
1 error in 0.01 seconds

Good. That was expected. Now we go to green hat mode and make it pass. Let’s create rainfall.py and write, according to the rules, only enough code to make the failing test pass:

def rainfall(amounts):
    return 0

And run the test:

[~/w/code/python]$ py.test -q rainfall_test.py
.
1 passed in 0.00 seconds

Cool. Now here we would look at our code and see if we can refactor to clean it up, but there’s nothing to do here. So we move to the next simplest case, without any negative numbers or sentinels and add (red hat mode) a failing unit test:

from rainfall import rainfall
def test_empty_dataset_produces_zero():
    assert rainfall([]) == 0

def test_produces_average_when_no_negatives_or_sentinels():
    assert rainfall([3]) == 3
    assert rainfall([1, 2, 3, 4, 5, 6]) == 3.5

Run it, expecting failure:

$ py.test -q rainfall_test.py
.F
============================== FAILURES ===============================
________ test_produces_average_when_no_negatives_or_sentinels _________

    def test_produces_average_when_no_negatives_or_sentinels():
>       assert rainfall([3]) == 3
E       assert 0 == 3
E        +  where 0 = rainfall([3])

rainfall_test.py:6: AssertionError
1 failed, 1 passed in 0.01 seconds

Good, now make it pass, which we do by changing our code to implement the average:

def rainfall(amounts):
    return sum(amounts) / len(amounts)

Now run it, and see if it passes:

$ py.test -q rainfall_test.py
F.
============================== FAILURES ===============================
__________________ test_empty_dataset_produces_zero ___________________

    def test_empty_dataset_produces_zero():
>       assert rainfall([]) == 0

rainfall_test.py:3:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

amounts = []

    def rainfall(amounts):
>       return sum(amounts) / len(amounts)
E       ZeroDivisionError: division by zero

rainfall.py:2: ZeroDivisionError
1 failed, 1 passed in 0.01 seconds

Whoa! We made the new test pass, but the old test broke with a division-by-zero error! We are still in green hat mode, because we have to ensure all the tests pass before we can go on.

Let’s fix this:

def rainfall(amounts):
    return sum(amounts) / len(amounts) if amounts else 0

Does it work?

$ py.test -q rainfall_test.py
..
2 passed in 0.00 seconds

Next failing test to write: skipping negatives.

from rainfall import rainfall
def test_empty_dataset_produces_zero():
    assert rainfall([]) == 0

def test_produces_average_when_no_negatives_or_sentinels():
    assert rainfall([3]) == 3
    assert rainfall([1, 2, 3, 4, 5, 6]) == 3.5

def test_negatives_are_skipped():
    assert rainfall([-5, -2, 5, 10, 30, -1]) == 15
    assert rainfall([-1]) == 0
    assert rainfall([3, -5, -5, -100]) == 3
    assert rainfall([10, 0, -100]) == 5

This new test will fail with:

E       assert 6.166666666666667 == 15
E        +  where 6.166666666666667 = rainfall([-5, -2, 5, 10, 30, -1])

So we make it pass:

  def rainfall(amounts):
      amounts = [a for a in amounts if a >= 0]
      return sum(amounts) / len(amounts) if amounts else 0

and it does:

$ py.test -q rainfall_test.py
..
3 passed in 0.00 seconds

Now can we refactor? We probably should; a lot of people don’t like the style of writing to parameters. So let’s do (Blue Hat!) the following:

def rainfall(amounts):
    non_negatives = [a for a in amounts if a >= 0]
    return sum(non_negatives) / len(non_negatives) if non_negatives else 0

This passes (try it!) but we can refactor more. The second line is long and messy, so we break it out and name it:

def rainfall(amounts):
    non_negatives = [a for a in amounts if a >= 0]
    return safe_average(non_negatives)

def safe_average(amounts):
    return sum(amounts) / len(amounts) if amounts else 0

That passes! So, next step is a new failing test. We haven’t done the “stop at -999” so let’s add a test for that:

from rainfall import rainfall
def test_empty_dataset_produces_zero():
    assert rainfall([]) == 0

def test_produces_average_when_no_negatives_or_sentinels():
    assert rainfall([3]) == 3
    assert rainfall([1, 2, 3, 4, 5, 6]) == 3.5

def test_negatives_are_skipped():
    assert rainfall([-5, -2, 5, 10, 30, -1]) == 15
    assert rainfall([-1]) == 0
    assert rainfall([3, -5, -5, -100]) == 3
    assert rainfall([10, 0, -100]) == 5

def test_sentinels_are_respected():
    assert rainfall([-999]) == 0
    assert rainfall([8, -999]) == 8
    assert rainfall([8, 1, -2, -999, 5000]) == 4.5

This fails as expected, with:

rainfall_test.py:18: AssertionError
1 failed, 3 passed in 0.01 seconds

We just need to make this pass (green hat). We don’t have to worry about efficiency.

def rainfall(amounts):
    truncated_amounts = up_to(amounts, -999)
    non_negatives = [a for a in truncated_amounts if a >= 0]
    return safe_average(non_negatives)

def safe_average(amounts):
    return sum(amounts) / len(amounts) if amounts else 0

def up_to(amounts, value):
    return amounts[:amounts.index(value)] if value in amounts else amounts

$ py.test -q rainfall_test.py
....
4 passed in 0.00 seconds

Nice, this passes. Now we have working code (though it is pretty inefficient!), and a complete test suite. We can now put on a permanent blue hat and refactor until we drop. We can even scrap what we have an go completely iterative. Our tests already have the special cases built-in, so we can be confident even of a full rewrite!