5-Testing/Readme.rst

   1 `Back To Debugging`_ - `Forward To Documentation`_
   2
   3 .. _Back To Debugging: https://github.com/thehackerwithin/UofCSCBC2012/tree/master/4-Debugging/
   4 .. _Forward To Documentation: https://github.com/thehackerwithin/UofCSCBC2012/tree/master/6-Documentation/
   5
   6 -----------
   7
   8 **Presented By Anthony Scopatz**
   9
  10 **Based on materials by Katy Huff, Rachel Slaybaugh, and Anthony Scopatz**
  11
  12 What is testing?
  13 ================
  14 Software testing is a process by which one or more expected behaviors and
  15 results from a piece of software are exercised and confirmed. Well chosen
  16 tests will confirm expected code behavior for the extreme boundaries of the
  17 input domains, output ranges, parametric combinations, and other behavioral
  18 edge cases.
  19
  20 Why test software?
  21 ==================
  22 Unless you write flawless, bug-free, perfectly accurate, fully precise, and
  23 predictable code every time, you must test your code in order to trust it
  24 enough to answer in the affirmative to at least a few of the following questions:
  25
  26 * Does your code work?
  27 * Always?
  28 * Does it do what you think it does?
  29 * Does it continue to work after changes are made?
  30 * Does it continue to work after system configurations or libraries are upgraded?
  31 * Does it respond properly for a full range of input parameters?
  32 * What about edge or corner cases?
  33 * What's the limit on that input parameter?
  34
  35 Verification
  36 ************
  37 *Verification* is the process of asking, "Have we built the software correctly?"
  38 That is, is the code bug free, precise, accurate, and repeatable?
  39
  40 Validation
  41 **********
  42 *Validation* is the process of asking, "Have we built the right software?"
  43 That is, is the code designed in such a way as to produce the answers we are
  44 interested in, data we want, etc.
  45
  46 Where are tests?
  47 ================
  48 Say we have an averaging function:
  49
  50 .. code-block::
  51
  52     def mean(numlist):
  53         total = sum(numlist)
  54         length = len(numlist)
  55         return total/length
  56
  57 The test could be runtime exceptions in the function.
  58
  59 ::
  60
  61   def mean(numlist):
  62      try :
  63          total = sum(numlist)
  64          length = len(numlist)
  65      except ValueError :
  66          print "The number list was not a list of numbers."
  67      except :
  68          print "There was a problem evaluating the number list."
  69      return total/length
  70
  71
  72 Sometimes they’re alongside the function definitions they’re testing.
  73
  74 ::
  75
  76  def mean(numlist):
  77     try :
  78         total = sum(numlist)
  79         length = len(numlist)
  80     except ValueError :
  81         print "The number list was not a list of numbers."
  82     except :
  83         print "There was a problem evaluating the number list."
  84     return total/length
  85
  86  class TestClass:
  87     def test_mean(self):
  88         assert(mean([0,0,0,0])==0)
  89         assert(mean([0,200])==100)
  90         assert(mean([0,-200]) == -100)
  91         assert(mean([0]) == 0)
  92     def test_floating_mean(self):
  93         assert(mean([1,2])==1.5)
  94
  95 Sometimes they’re in an executable independent of the main executable.
  96
  97
  98 ::
  99
 100  def mean(numlist):
 101     try :
 102         total = sum(numlist)
 103         length = len(numlist)
 104     except ValueError :
 105         print "The number list was not a list of numbers."
 106     except :
 107         print "There was a problem evaluating the number list."
 108     return total/length
 109
 110 Where, in a different file exists a test module:
 111
 112
 113 ::
 114
 115   import mean
 116   class TestClass:
 117       def test_mean(self):
 118           assert(mean([0,0,0,0])==0)
 119           assert(mean([0,200])==100)
 120           assert(mean([0,-200]) == -100)
 121           assert(mean([0]) == 0)
 122       def test_floating_mean(self):
 123           assert(mean([1,2])==1.5)
 124
 125
 126 **When should we test?**
 127
 128 The short answer is all the time. The long answer is that testing either before or after your software is written will improve your code, but testing after your program is used for something important is too late.
 129
 130 If we have a robust set of tests, we can run them before adding something new and after adding something new. If the tests give the same results (as appropriate), we can have some assurance that we didn’t break anything. The same idea applies to making changes in your system configuration, updating support codes, etc.
 131
 132 Another important feature of testing is that it helps you remember what all the parts of your code do. If you are working on a large project over three years and you end up with 200 classes, it may be hard to remember what the widget class does in detail. If you have a test that checks all of the widget’s functionality, you can look at the test to remember what it’s supposed to do.
 133
 134 **Who tests?**
 135 In a collaborative coding environment, where many developers contribute to the same code, developers should be responsible individually for testing the functions they create and collectively for testing the code as a whole.
 136
 137 Professionals invariably test their code, and take pride in test coverage, the percent of their functions that they feel confident are comprehensively tested.
 138
 139 **How does one test?**
 140
 141 The type of tests you’ll write is determined by the testing framework you adopt.
 142
 143 **Types of Tests:**
 144 *Exceptions*
 145 Exceptions can be thought of as type of runttime test. They alert the user to exceptional behavior in the code. Often, exceptions are related to functions that depend on input that is unknown at compile time. Checks that occur within the code to handle exceptional behavior that results from this type of input are called Exceptions.
 146
 147 *Unit Tests*
 148
 149 Unit tests are a type of test which test the fundametal units a program’s functionality. Often, this is on the class or function level of detail.
 150
 151 To test functions and classes, we want to test the interfaces, rather than the implmentation. Treating the implementation as a ‘black box’, we can probe the expected behavior with boundary cases for the inputs.
 152
 153 In the case of our fix_missing function, we need to test the expected behavior by providing lines and files that do and do not have missing entries. We should also test the behavior for empty lines and files as well. These are boundary cases.
 154
 155 *System Tests*
 156
 157 System level tests are intended to test the code as a whole. As opposed to unit tests, system tests ask for the behavior as a whole. This sort of testing involves comparison with other validated codes, analytical solutions, etc.
 158
 159 *Regression Tests*
 160
 161 A regression test ensures that new code does change anything. If you change the default answer, for example, or add a new question, you’ll need to make sure that missing entries are still found and fixed.
 162
 163 *Integration Tests*
 164
 165 Integration tests query the ability of the code to integrate well with the system configuration and third party libraries and modules. This type of test is essential for codes that depend on libraries which might be updated independently of your code or when your code might be used by a number of users who may have various versions of libraries.
 166
 167 **Test Suites**
 168 Putting a series of unit tests into a suite creates, as you might imagine, a test suite.
 169
 170 **Elements of a Test**
 171
 172 **Behavior**
 173
 174 The behavior you want to test. For example, you might want to test the fun() function.
 175
 176 **Expected Result**
 177
 178 This might be a single number, a range of numbers, a new, fully defined object, a system state, an exception, etc.
 179
 180 When we run the fun function, we expect to generate some fun. If we don’t generate any fun, the fun() function should fail its test. Alternatively, if it does create some fun, the fun() function should pass this test.
 181
 182 **Assertions**
 183
 184 Require that some conditional be true. If the conditional is false, the test fails.
 185
 186 **Fixtures**
 187
 188 Sometimes you have to do some legwork to create the objects that are necessary to run one or many tests. These objects are called fixtures.
 189
 190 For example, since fun varies a lot between people, the fun() function is a member function of the Person class. In order to check the fun function, then, we need to create an appropriate Person object on which to run fun().
 191
 192 **Setup and teardown**
 193
 194 Creating fixtures is often done in a call to a setup function. Deleting them and other cleanup is done in a teardown function.
 195
 196 **The Big Picture**
 197 Putting all this together, the testing algorithm is often:
 198
 199 ::
 200
 201   setUp
 202   test
 203   tearDown
 204
 205
 206 But, sometimes it’s the case that your tests change the fixtures. If so, it’s better for the setup and teardown functions to occur on either side of each test. In that case, the testing algorithm should be:
 207
 208 ::
 209
 210   setUp
 211   test1
 212   tearDown
 213   setUp
 214   test2
 215   tearDown
 216   setUp
 217   test3
 218   tearDown
 219
 220 ----------------------------------------------------------
 221 Python Nose
 222 ----------------------------------------------------------
 223
 224 The testing framework we’ll discuss today is called nose, and comes packaged with the enthought python distribution that you’ve installed.
 225
 226 **Where is a nose test?**
 227
 228 Nose tests are files that begin with Test-, Test_, test-, or test_. Specifically, these satisfy the testMatch regular expression [Tt]est[-_]. (You can also teach nose to find tests by declaring them in the unittest.TestCase subclasses chat you create in your code. You can also create test functions which are not unittest.TestCase subclasses if they are named with the configured testMatch regular expression.)
 229
 230 Nose Test Syntax
 231 To write a nose test, we make assertions.
 232
 233 ::
 234
 235     assert (ShouldBeTrue())
 236     assert (not ShouldNotBeTrue())
 237
 238
 239 In addition to assertions, in many test frameworks, there are expectations, etc.
 240
 241 **Add a test to our work**
 242
 243 There are a few tests for the mean function that we listed in this lesson. What are some tests that should fail? Add at least three test cases to this set.
 244
 245 *Hint: think about what form your input could take and what you should do to handle it. Also, think about the type of the elements in the list. What should be done if you pass a list of integers? What if you pass a list of strings?*
 246
 247 **Test Driven Development**
 248
 249 Some people develop code by writing the tests first.
 250
 251 If you write your tests comprehensively enough, the expected behaviors that you define in your tests will be the necessary and sufficient set of behaviors your code must perform. Thus, if you write the tests first and program until the tests pass, you will have written exactly enough code to perform the behavior your want and no more. Furthermore, you will have been forced to write your code in a modular enough way to make testing easy now. This will translate into easier testing well into the future.
 252
 253 --------------------------------------------------------------------
 254 An example
 255 --------------------------------------------------------------------
 256 The overlap method takes two rectangles (red and blue) and computes the degree of overlap between them. Save it in overlap.py. A rectangle is defined as a tuple of tuples: ((x_lo,y_lo),(x_hi),(y_hi))
 257
 258 ::
 259
 260  def overlap(red, blue):
 261     '''Return overlap between two rectangles, or None.'''
 262
 263     ((red_lo_x, red_lo_y), (red_hi_x, red_hi_y)) = red
 264     ((blue_lo_x, blue_lo_y), (blue_hi_x, blue_hi_y)) = blue
 265
 266     if (red_lo_x >= blue_hi_x) or \
 267        (red_hi_x <= blue_lo_x) or \
 268        (red_lo_y >= blue_hi_x) or \
 269        (red_hi_y <= blue_lo_y):
 270         return None
 271
 272     lo_x = max(red_lo_x, blue_lo_x)
 273     lo_y = max(red_lo_y, blue_lo_y)
 274     hi_x = min(red_hi_x, blue_hi_x)
 275     hi_y = min(red_hi_y, blue_hi_y)
 276     return ((lo_x, lo_y), (hi_x, hi_y))
 277
 278
 279 Now let's create a set of tests for this class. Before we do this, let's think about *how* we might test this method. How should it work?
 280
 281
 282 ::
 283
 284  from overlap import overlap
 285
 286  def test_empty_with_empty():
 287     rect = ((0, 0), (0, 0))
 288     assert overlap(rect, rect) == None
 289
 290  def test_empty_with_unit():
 291     empty = ((0, 0), (0, 0))
 292     unit = ((0, 0), (1, 1))
 293     assert overlap(empty, unit) == None
 294
 295  def test_unit_with_unit():
 296     unit = ((0, 0), (1, 1))
 297     assert overlap(unit, unit) == unit
 298
 299  def test_partial_overlap():
 300     red = ((0, 3), (2, 5))
 301     blue = ((1, 0), (2, 4))
 302     assert overlap(red, blue) == ((1, 3), (2, 4))
 303
 304
 305 Run your tests.
 306
 307 ::
 308
 309  [rguy@infolab-33 ~/TestExample]$ nosetests
 310  ...F
 311  ======================================================================
 312  FAIL: test_overlap.test_partial_overlap
 313  ----------------------------------------------------------------------
 314  Traceback (most recent call last):
 315    File "/usr/lib/python2.6/site-packages/nose/case.py", line 183, in runTest
 316      self.test(*self.arg)
 317    File "/afs/ictp.it/home/r/rguy/TestExample/test_overlap.py", line 19, in test_partial_overlap
 318      assert overlap(red, blue) == ((1, 3), (2, 4))
 319  AssertionError
 320
 321  ----------------------------------------------------------------------
 322  Ran 4 tests in 0.005s
 323
 324  FAILED (failures=1)
 325
 326
 327 Oh no! Something failed. The failure was on line in this test:
 328
 329 ::
 330
 331  def test_partial_overlap():
 332    red = ((0, 3), (2, 5))
 333    blue = ((1, 0), (2, 4))
 334    assert overlap(red, blue) == ((1, 3), (2, 4))
 335
 336
 337 Can you spot why it failed? Try to fix the method so all tests pass.