`Back To Debugging`_ - `Forward To Documentation`_

.. _Back To Debugging: https://github.com/thehackerwithin/UofCSCBC2012/tree/master/4-Debugging/
.. _Forward To Documentation: https://github.com/thehackerwithin/UofCSCBC2012/tree/master/6-Documentation/
**Presented By Anthony Scopatz**

**Based on materials by Katy Huff, Rachel Slaybaugh, and Anthony Scopatz**

.. image:: https://github.com/thehackerwithin/UofCSCBC2012/raw/scopz/5-Testing/test_prod.jpg
What is testing?
================

Software testing is a process by which one or more expected behaviors and
results from a piece of software are exercised and confirmed. Well chosen
tests will confirm expected code behavior for the extreme boundaries of the
input domains, output ranges, parametric combinations, and other behavioral
edge cases.

Unless you write flawless, bug-free, perfectly accurate, fully precise, and
predictable code every time, you must test your code in order to trust it
enough to answer in the affirmative to at least a few of the following questions:
* Does your code work?
* Does it do what you think it does?
* Does it continue to work after changes are made?
* Does it continue to work after system configurations or libraries are upgraded?
* Does it respond properly for a full range of input parameters?
* What about edge or corner cases?
* What's the limit on that input parameter?
* How will it affect your `publications`_?

.. _publications: http://www.nature.com/news/2010/101013/full/467775a.html
Verification
************
*Verification* is the process of asking, "Have we built the software correctly?"
That is, is the code bug-free, precise, accurate, and repeatable?
Validation
**********
*Validation* is the process of asking, "Have we built the right software?"
That is, is the code designed in such a way as to produce the answers we are
interested in, the data we want, etc.?
Uncertainty Quantification
**************************
*Uncertainty Quantification* is the process of asking, "Given that our algorithm
may not be deterministic, was our execution within acceptable error bounds?" This
is particularly important for anything which uses random numbers, e.g. Monte Carlo
methods.
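To make this concrete, here is a minimal sketch of an uncertainty-aware test. The ``estimate_pi()`` function, the seed, and the tolerance below are our own illustration, not part of the lesson: the point is that a stochastic result is only required to land within error bounds, never to match an exact value.

```python
import random

def estimate_pi(n_samples, seed=42):
    """Estimate pi by sampling points uniformly in the unit square."""
    rng = random.Random(seed)  # seeding makes the run reproducible
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n_samples

def test_estimate_pi():
    # Assert a tolerance, not exact equality: the algorithm is stochastic.
    assert abs(estimate_pi(100000) - 3.141592653589793) < 0.05

test_estimate_pi()
```

Seeding the generator also makes any failure reproducible, which is half the battle when debugging stochastic code.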
Say we have an averaging function:

.. code-block:: python

    def mean(numlist):
        total = sum(numlist)
        length = len(numlist)
        return total / length
Tests could be implemented as runtime exceptions in the function:

.. code-block:: python

    def mean(numlist):
        try:
            total = sum(numlist)
            length = len(numlist)
        except TypeError:
            print("The number list was not a list of numbers.")
            return None
        except Exception:
            print("There was a problem evaluating the number list.")
            return None
        return total / length
Sometimes tests are functions that live alongside the function definitions they are testing:

.. code-block:: python

    def mean(numlist):
        try:
            total = sum(numlist)
            length = len(numlist)
        except TypeError:
            print("The number list was not a list of numbers.")
            return None
        except Exception:
            print("There was a problem evaluating the number list.")
            return None
        return total / length

    def test_mean():
        assert mean([0, 0, 0, 0]) == 0
        assert mean([0, 200]) == 100
        assert mean([0, -200]) == -100
        assert mean([0]) == 0

    def test_floating_mean():
        assert mean([1, 2]) == 1.5
Sometimes they are in an executable that is independent of the main executable:

.. code-block:: python

    # mean.py
    def mean(numlist):
        try:
            total = sum(numlist)
            length = len(numlist)
        except TypeError:
            print("The number list was not a list of numbers.")
            return None
        except Exception:
            print("There was a problem evaluating the number list.")
            return None
        return total / length
Where, in a different file, there exists a test module:

.. code-block:: python

    # test_mean.py
    from mean import mean

    def test_mean():
        assert mean([0, 0, 0, 0]) == 0
        assert mean([0, 200]) == 100
        assert mean([0, -200]) == -100
        assert mean([0]) == 0

    def test_floating_mean():
        assert mean([1, 2]) == 1.5
When should we test?
====================
The three right answers are:

* **Always!**
* **Early!**
* **Often!**
The longer answer is that testing either before or after your software
is written will improve your code, but testing after your program is used for
something important is too late.
If we have a robust set of tests, we can run them before adding something new and after
adding something new. If the tests give the same results (as appropriate), we can have
some assurance that we didn't break anything. The same idea applies to making changes in
your system configuration, updating support codes, etc.
Another important feature of testing is that it helps you remember what all the parts
of your code do. If you are working on a large project over three years and you end up
with 200 classes, it may be hard to remember what the widget class does in detail. If
you have a test that checks all of the widget's functionality, you can look at the test
to remember what it's supposed to do.
In a collaborative coding environment, where many developers contribute to the same code base,
developers should be responsible individually for testing the functions they create and
collectively for testing the code as a whole.

Professionals often test their code, and take pride in test coverage: the percent
of their functions that they feel confident are comprehensively tested.
How are tests written?
======================
The type of tests that are written is determined by the testing framework you adopt.
Don't worry, there are a lot of choices.
**Exceptions:** Exceptions can be thought of as a type of runtime test. They alert
the user to exceptional behavior in the code. Often, exceptions are related to
functions that depend on input that is unknown at compile time. Checks that occur
within the code to handle exceptional behavior resulting from this type of input
are called exceptions.
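A minimal sketch of such a runtime check, reusing the lesson's ``mean()`` example (the exact guard shown here is our own addition):

```python
def mean(numlist):
    # Runtime check: the input cannot be known until the program actually runs.
    if not all(isinstance(x, (int, float)) for x in numlist):
        raise TypeError("The number list was not a list of numbers.")
    return sum(numlist) / len(numlist)

# Callers can then respond to the exceptional behavior explicitly.
try:
    mean([1, 2, "three"])
except TypeError as err:
    print(err)
```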
**Unit Tests:** Unit tests are a type of test which test the fundamental units of a
program's functionality. Often, this is on the class or function level of detail.
However, what constitutes a *code unit* is not formally defined.

To test functions and classes, the interfaces (APIs) - rather than the implementations - should
be tested. Treating the implementation as a black box, we can probe the expected behavior
with boundary cases for the inputs.
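For instance, a black-box unit test suite for a hypothetical ``clamp()`` function (our own example, not from the lesson) probes the interface at and around its boundaries without assuming anything about the implementation:

```python
def clamp(value, low, high):
    """Constrain value to the closed interval [low, high]."""
    return max(low, min(value, high))

def test_clamp_inside():
    assert clamp(5, 0, 10) == 5

def test_clamp_boundaries():
    # Boundary cases: the edges of the valid input domain.
    assert clamp(0, 0, 10) == 0
    assert clamp(10, 0, 10) == 10

def test_clamp_outside():
    # Inputs beyond the boundaries must be pulled back to the edges.
    assert clamp(-1, 0, 10) == 0
    assert clamp(11, 0, 10) == 10
```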
**System Tests:** System-level tests are intended to test the code as a whole. As opposed
to unit tests, system tests ask about the behavior of the program as a whole. This sort of
testing involves comparison with other validated codes, analytical solutions, etc.
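As a sketch of comparing against an analytical solution (the toy integrator below is our own example), a system test can require the whole computation to reproduce a known closed-form answer:

```python
def trapezoid(f, a, b, n):
    """Integrate f over [a, b] with the composite trapezoid rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total

def test_against_analytic_solution():
    # The analytic answer: the integral of x**2 on [0, 1] is exactly 1/3.
    obs = trapezoid(lambda x: x * x, 0.0, 1.0, 1000)
    assert abs(obs - 1.0 / 3.0) < 1e-5

test_against_analytic_solution()
```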
**Regression Tests:** A regression test ensures that new code does not change existing
behavior. If you change the default answer, for example, or add a new question, you'll
need to make sure that missing entries are still found and fixed.
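One common pattern (our own sketch, not the lesson's code) is to capture known-good output from a trusted version of the code and assert that later versions still reproduce it:

```python
def word_counts(text):
    """Count how many times each word appears in a string."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

# Known-good result captured from an earlier, trusted version of the code.
EXPECTED = {"the": 2, "cat": 1, "sat": 1, "on": 1, "mat": 1}

def test_word_counts_regression():
    assert word_counts("The cat sat on the mat") == EXPECTED

test_word_counts_regression()
```

If a later change alters this output, the regression test fails and forces you to decide whether the change was intended.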
**Integration Tests:** Integration tests query the ability of the code to integrate
well with the system configuration and third-party libraries and modules. This type
of test is essential for codes that depend on libraries which might be updated
independently of your code, or when your code might be used by a number of users
who may have various versions of libraries.
**Test Suites:** Putting a series of unit tests into a collection of modules creates
a test suite. Typically the suite as a whole is executed (rather than each test individually)
when verifying that the code base still functions after changes have been made.
Elements of a Test
******************
**Behavior:** The behavior you want to test. For example, you might want to test
the fun() function.
**Expected Result:** This might be a single number, a range of numbers, a new fully defined
object, a system state, an exception, etc. When we run the fun() function, we expect to
generate some fun. If we don't generate any fun, the fun() function should fail its test.
Alternatively, if it does create some fun, the fun() function should pass this test.
The expected result should be known *a priori*. For numerical functions, this result is
ideally determined analytically even if the function being tested isn't.
**Assertions:** Require that some conditional be true. If the conditional is false,
the test fails.
**Fixtures:** Sometimes you have to do some legwork to create the objects that are
necessary to run one or many tests. These objects are called fixtures, as they are not
really part of the tests themselves but rather involve getting the computer into the
appropriate state.

For example, since fun varies a lot between people, the fun() function is a method of
the Person class. In order to check the fun() function, then, we need to create an
appropriate Person object on which to run fun().
**Setup and teardown:** Creating fixtures is often done in a call to a setup function.
Deleting them and other cleanup is done in a teardown function.
**The Big Picture:** Putting all this together, the testing algorithm is often:

.. code-block:: python

    setup()
    test()
    teardown()
But, sometimes it's the case that your tests change the fixtures. If so, it's better
for the setup() and teardown() functions to occur on either side of each test. In
that case, the testing algorithm should be:

.. code-block:: python

    setup()
    test1()
    teardown()

    setup()
    test2()
    teardown()
----------------------------------------------------------
Nose: A Python Testing Framework
================================
The testing framework we'll discuss today is called `nose`_. However, there are several
other testing frameworks available in most languages. Most notable is `JUnit`_
in Java, which can arguably be credited with inventing the testing framework.

.. _nose: http://readthedocs.org/docs/nose/en/latest/
.. _JUnit: http://www.junit.org/
Where do nose tests live?
*************************
Nose tests are files that begin with ``Test-``, ``Test_``, ``test-``, or ``test_``.
Specifically, these satisfy the testMatch regular expression ``[Tt]est[-_]``.
(You can also teach nose to find tests by declaring them in the unittest.TestCase
subclasses that you create in your code. You can also create test functions which
are not unittest.TestCase subclasses if they are named with the configured
testMatch regular expression.)
To write a nose test, we make assertions:

.. code-block:: python

    assert should_be_true()
    assert not should_not_be_true()
Additionally, nose itself defines a number of assert functions which can be used to
test more specific aspects of the code base:

.. code-block:: python

    from nose.tools import *

    assert_equal(a, b)
    assert_almost_equal(a, b)
    assert_true(a)
    assert_false(a)
    assert_raises(exception, func, *args, **kwargs)
    assert_is_instance(a, b)
Moreover, numpy offers similar testing functions for arrays:

.. code-block:: python

    from numpy.testing import *

    assert_array_equal(a, b)
    assert_array_almost_equal(a, b)
Exercise: Writing tests for mean()
**********************************
There are a few tests for the mean() function that we listed in this lesson.
What are some tests that should fail? Add at least three test cases to this set.
Edit the ``test_mean.py`` file which tests the mean() function in ``mean.py``.

*Hint:* Think about what form your input could take and what you should do to handle it.
Also, think about the type of the elements in the list. What should be done if you pass
a list of integers? What if you pass a list of strings?

Run your tests with:

.. code-block:: bash

    nosetests test_mean.py
Test Driven Development
=======================
Test driven development (TDD) is a philosophy whereby the developer creates code by
**writing the tests first**. That is to say, you write the tests *before* writing the
code that is being tested.

This is an iterative process whereby you write a test, then write the minimum amount
of code to make the test pass. If a new feature is needed, another test is written and
the code is expanded to meet this new use case. This continues until the code does
everything it needs to do.

TDD operates on the YAGNI principle (You Ain't Gonna Need It). People who diligently
follow TDD swear by its effectiveness. This development style was put forth most
strongly by `Kent Beck in 2002`_.

.. _Kent Beck in 2002: http://www.amazon.com/Test-Driven-Development-By-Example/dp/0321146530
Say you want to write a fib() function which generates values of the
Fibonacci sequence at given indexes. You would - of course - start
by writing the test, possibly testing a single value:

.. code-block:: python

    from nose.tools import assert_equal

    def test_fib1():
        obs = fib(2)
        exp = 1
        assert_equal(obs, exp)
You would *then* go ahead and write the actual function:

.. code-block:: python

    def fib(n):
        # you snarky so-and-so
        return 1
And that is it, right?! Well, not quite. This implementation fails for
most other values. Adding tests, we see that:

.. code-block:: python

    def test_fib1():
        obs = fib(2)
        exp = 1
        assert_equal(obs, exp)

    def test_fib2():
        obs = fib(0)
        exp = 0
        assert_equal(obs, exp)

        obs = fib(1)
        exp = 1
        assert_equal(obs, exp)
This extra test now requires that we bother to implement at least the initial values:

.. code-block:: python

    def fib(n):
        # a little better now
        if n == 0 or n == 1:
            return n
        else:
            return 1
However, this function still falls over for ``n > 2``. Time for more tests!

.. code-block:: python

    def test_fib1():
        obs = fib(2)
        exp = 1
        assert_equal(obs, exp)

    def test_fib2():
        obs = fib(0)
        exp = 0
        assert_equal(obs, exp)

        obs = fib(1)
        exp = 1
        assert_equal(obs, exp)

    def test_fib3():
        obs = fib(3)
        exp = 2
        assert_equal(obs, exp)

        obs = fib(6)
        exp = 8
        assert_equal(obs, exp)
At this point, we had better go ahead and try to do the right thing...

.. code-block:: python

    def fib(n):
        if n == 0 or n == 1:
            return n
        else:
            return fib(n - 1) + fib(n - 2)
Here it becomes very tempting to take an extended coffee break or possibly a
power lunch. But then you remember those pesky negative numbers and floats.
Perhaps the right thing to do here is to just be undefined:
.. code-block:: python

    def test_fib1():
        obs = fib(2)
        exp = 1
        assert_equal(obs, exp)

    def test_fib2():
        obs = fib(0)
        exp = 0
        assert_equal(obs, exp)

        obs = fib(1)
        exp = 1
        assert_equal(obs, exp)

    def test_fib3():
        obs = fib(3)
        exp = 2
        assert_equal(obs, exp)

        obs = fib(6)
        exp = 8
        assert_equal(obs, exp)

    def test_fib4():
        obs = fib(13.37)
        exp = NotImplemented
        assert_equal(obs, exp)

        obs = fib(-9)
        exp = NotImplemented
        assert_equal(obs, exp)
This means that it is time to add the appropriate case to the function itself:

.. code-block:: python

    def fib(n):
        # sequence and you shall find
        if n < 0 or int(n) != n:
            return NotImplemented
        elif n == 0 or n == 1:
            return n
        else:
            return fib(n - 1) + fib(n - 2)
And thus - finally - we have a robust function together with working tests!
**The Problem:** In 2D or 3D, we have two points (p1 and p2) which define a line segment.
Additionally, there exists experimental data which can be anywhere in the domain.
Find the data point which is closest to the line segment.

In the ``close_line.py`` file there are four different implementations which all
solve this problem. `You can read more about them here.`_ However, there are no tests!
Please write, from scratch, a ``test_close_line.py`` file which tests the
closest_data_to_line() function.

*Hint:* you can use one implementation function to test another. Below is some sample data
to help you get started.
.. image:: https://github.com/thehackerwithin/UofCSCBC2012/raw/scopz/5-Testing/evo_sol1.png
.. code-block:: python

    import numpy as np

    p1 = np.array([0.0, 0.0])
    p2 = np.array([1.0, 1.0])
    data = np.array([[0.3, 0.6], [0.25, 0.5], [1.0, 0.75]])
.. _You can read more about them here.: http://inscight.org/2012/03/31/evolution_of_a_solution/