Working on testing lecture for Python

author Greg Wilson <gvwilson@third-bit.com>

Thu, 22 Aug 2013 01:02:22 +0000 (21:02 -0400)

committer W. Trevor King <wking@tremily.us>

Sat, 2 Nov 2013 17:40:51 +0000 (10:40 -0700)
diff --git a/lessons/swc-python/python-5-testing.ipynb b/lessons/swc-python/python-5-testing.ipynb

new file mode 100644 (file)

index 0000000..fb3d8a2
--- /dev/null
+++ b/lessons/swc-python/python-5-testing.ipynb
@@ -0,0 +1,1200 @@
+{
+ "metadata": {
+  "name": ""
+ },
+ "nbformat": 3,
+ "nbformat_minor": 0,
+ "worksheets": [
+  {
+   "cells": [
+    {
+     "cell_type": "heading",
+     "level": 1,
+     "metadata": {},
+     "source": [
+      "Basic Programming Using Python: Testing"
+     ]
+    },
+    {
+     "cell_type": "heading",
+     "level": 2,
+     "metadata": {},
+     "source": [
+      "Objectives"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "* FIXME"
+     ]
+    },
+    {
+     "cell_type": "heading",
+     "level": 2,
+     "metadata": {},
+     "source": [
+      "Setting Expectations"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We created, found, and fixed over half a dozen bugs\n",
+      "in our [previous lesson](python-4-files-lists.ipynb).\n",
+      "How can we be sure that others aren't still lurking in our code?\n",
+      "It's not an idle worry:\n",
+      "every year,\n",
+      "programmers find errors in software that has been in use for years,\n",
+      "and the number of papers that have been retracted\n",
+      "because of computational mistakes\n",
+      "is constantly growing."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The short answer is that it's practically impossible to prove that a program will always do what it's supposed to.\n",
+      "To see why,\n",
+      "consider a function that checks whether a character string contains only 'A', 'C', 'G', and 'T'.\n",
+      "These four tests clearly aren't sufficient:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "assert is_all_bases('A')\n",
+      "assert is_all_bases('C')\n",
+      "assert is_all_bases('G')\n",
+      "assert is_all_bases('T')\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "because this implementation of `is_all_bases` passes them:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "def is_all_bases(bases):\n",
+      "    return True\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Adding these tests isn't enough:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "assert not is_all_bases('X')\n",
+      "assert not is_all_bases('Y')\n",
+      "assert not is_all_bases('Z')\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "because this implementation passes:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "def is_all_bases(bases):\n",
+      "    return bases[0] in 'ACGT'\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We can add yet more tests:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "assert is_all_bases('ACGCGA')\n",
+      "assert not is_all_bases('CGAZ')\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "but no matter how many we have,\n",
+      "we can always write a function that passes them,\n",
+      "but does the wrong thing in other cases."
+     ]
+    },
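+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "For comparison,\n",
+      "here is a minimal sketch of an implementation that does what the description asks.\n",
+      "It is only one possible version,\n",
+      "and it assumes that an empty string should count as valid,\n",
+      "which the description above does not settle:\n",
+      "\n",
+      "```python\n",
+      "def is_all_bases(bases):\n",
+      "    '''Return True if every character in bases is A, C, G, or T.'''\n",
+      "    # Assumption: an empty string passes, because all() of nothing is True.\n",
+      "    return all(base in 'ACGT' for base in bases)\n",
+      "```\n",
+      "\n",
+      "This version passes all nine assertions above,\n",
+      "but passing them is evidence that it is right, not proof."
+     ]
+    },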
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Testing is still worth doing, though:\n",
+      "it's one of those things that doesn't work in theory,\n",
+      "but is surprisingly effective in practice.\n",
+      "If we choose our tests carefully,\n",
+      "we can demonstrate that our software is as likely to be correct as a mathematical proof\n",
+      "or a physical experiment."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "And ensuring that we have the right answer is only one reason to test software.\n",
+      "The other is that it speeds up development\n",
+      "by reducing the amount of re-work we have to do.\n",
+      "Even small programs can be quite complex,\n",
+      "and changing one thing can all too easily break something else.\n",
+      "If we test changes as we make them,\n",
+      "and re-test things we've already done,\n",
+      "we can catch errors while the changes are still fresh in our minds,\n",
+      "which makes fixing them much easier."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "It's important to realize,\n",
+      "though,\n",
+      "that testing itself doesn't make software better.\n",
+      "As Steve McConnell once said,\n",
+      "trying to improve the quality of software by doing more testing\n",
+      "is like trying to lose weight by weighing yourself more often.\n",
+      "Testing just tells us what the quality *is*;\n",
+      "if we want to improve it,\n",
+      "so that we don't have to throw away a week's worth of analysis because of a missing semi-colon,\n",
+      "we have to change our programs,\n",
+      "and change the way we go about writing programs."
+     ]
+    },
+    {
+     "cell_type": "heading",
+     "level": 2,
+     "metadata": {},
+     "source": [
+      "Defensive Programming"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The first step is to use [defensive programming](glossary.html#defensive_programming),\n",
+      "i.e.,\n",
+      "to put assertions in our programs so that they check their own execution as they run.\n",
+      "Programs like the Firefox browser are littered with assertions&mdash;in fact,\n",
+      "10-20% of the code they contain\n",
+      "is there to check that the other 80-90% is working correctly.\n",
+      "Broadly speaking,\n",
+      "assertions fall into three categories:\n",
+      "\n",
+      "- A [precondition](glossary.html#precondition) is something that must be true\n",
+      "  in order for a piece of code to work correctly.\n",
+      "- A [postcondition](glossary.html#postcondition) is something that must be true\n",
+      "  at the end of a piece of code if it worked correctly.\n",
+      "- An [invariant](glossary.html#invariant) is something that is always true\n",
+      "  at a particular point inside a piece of code."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "For example,\n",
+      "suppose we are representing rectangles using a list of four coordinates `[x0, y0, x1, y1]`.\n",
+      "In order to do some calculations,\n",
+      "we need to normalize the rectangle so that it is at the origin\n",
+      "and 1.0 units long on its longest axis.\n",
+      "This function does that:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "def normalize_rectangle(rect):\n",
+      "    x0, y0, x1, y1 = rect\n",
+      "    assert x0 < x1, 'Invalid X coordinates'\n",
+      "    assert y0 < y1, 'Invalid Y coordinates'\n",
+      "\n",
+      "    dx = x1 - x0\n",
+      "    dy = y1 - y0\n",
+      "    if dx > dy:\n",
+      "        scaled = float(dy) / dx\n",
+      "        upper_x, upper_y = 1.0, scaled\n",
+      "    else:\n",
+      "        scaled = float(dx) / dy\n",
+      "        upper_x, upper_y = scaled, 1.0\n",
+      "\n",
+      "    assert 0 < upper_x <= 1.0, 'Calculated upper X coordinate invalid'\n",
+      "    assert 0 < upper_y <= 1.0, 'Calculated upper Y coordinate invalid'\n",
+      "\n",
+      "    return [0, 0, upper_x, upper_y]"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 1
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The first two assertions test that the inputs are valid,\n",
+      "i.e.,\n",
+      "that the upper X and Y coordinates are greater than their lower counterparts.\n",
+      "Notice that the test is greater than,\n",
+      "not greater than or equal to:\n",
+      "this tells us (and the computer) that rectangles aren't allowed to have zero width or height.\n",
+      "The last two assertions check that the upper coordinates of the scaled rectangle are valid:\n",
+      "neither can be zero\n",
+      "(because that would mean the rectangle had zero width or height)\n",
+      "and neither can be greater than 1."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Strictly speaking, the last two assertions are redundant:\n",
+      "if the inputs are correct,\n",
+      "and our calculation is correct,\n",
+      "then the last two conditions should always hold.\n",
+      "But programmers aren't perfect,\n",
+      "and if there *is* a bug in our calculations,\n",
+      "we want the program to complain about it as early as possible."
+     ]
+    },
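+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The rectangle function uses preconditions and postconditions.\n",
+      "The third category listed earlier, the invariant, can be sketched in the same way.\n",
+      "The function `largest` below is a hypothetical example written for illustration;\n",
+      "it checks all three kinds of condition while searching a list:\n",
+      "\n",
+      "```python\n",
+      "def largest(values):\n",
+      "    '''Return the largest item in a non-empty list.'''\n",
+      "    # Precondition: there must be at least one value to look at.\n",
+      "    assert len(values) > 0, 'Need at least one value'\n",
+      "\n",
+      "    result = values[0]\n",
+      "    for i in range(1, len(values)):\n",
+      "        if values[i] > result:\n",
+      "            result = values[i]\n",
+      "        # Invariant: result is at least as large as everything seen so far.\n",
+      "        assert all(result >= v for v in values[:i + 1]), 'Lost track of largest value'\n",
+      "\n",
+      "    # Postcondition: the answer is one of the original values.\n",
+      "    assert result in values, 'Result is not one of the inputs'\n",
+      "    return result\n",
+      "```\n",
+      "\n",
+      "Re-checking the invariant on every pass makes the loop slower,\n",
+      "which is a reasonable price in a teaching example\n",
+      "but might not be in code that handles large lists."
+     ]
+    },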
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "<hr/>\n",
+      "### *Assertions and Bugs*\n",
+      "\n",
+      "<em>\n",
+      "Another rule that good programmers follow is, \"Bugs become assertions.\"\n",
+      "Whenever we fix a bug in a program,\n",
+      "we should add some assertions to the program at that point to catch the bug if it reappears.\n",
+      "After all,\n",
+      "if we made the mistake once,\n",
+      "then we (or someone else) might well make it again.\n",
+      "Few things are as frustrating as\n",
+      "having someone delete several carefully-crafted lines of code that fixed a subtle problem\n",
+      "because they didn't realize what problem those lines were there to fix.\n",
+      "</em>\n",
+      "<hr/>"
+     ]
+    },
+    {
+     "cell_type": "heading",
+     "level": 2,
+     "metadata": {},
+     "source": [
+      "Handling Errors"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Even when programs are careful,\n",
+      "things sometimes go wrong.\n",
+      "Some of these errors have external causes,\n",
+      "like missing or badly-formatted files.\n",
+      "Others are internal,\n",
+      "like bugs in code.\n",
+      "Either way,\n",
+      "it's actually pretty easy to handle errors in sensible ways."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Let's start with a look at how programmers used to do error handling.\n",
+      "Back in the Dark Ages,\n",
+      "programmers wrote functions to return a [status code](glossary.html#status_code)\n",
+      "to indicate whether they had run correctly or not.\n",
+      "This led to programs like this:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "params, status = read_params(param_file)\n",
+      "if status != OK:\n",
+      "    log.error('Failed to read %s', param_file)\n",
+      "    sys.exit(ERROR)\n",
+      "\n",
+      "grid, status = read_grid(grid_file)\n",
+      "if status != OK:\n",
+      "    log.error('Failed to read %s', grid_file)\n",
+      "    sys.exit(ERROR)\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The two function calls are all we really want;\n",
+      "the other six lines are there to check that files were opened and read properly,\n",
+      "and to report errors and exit if not."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "A lot of code is still written this way,\n",
+      "but this coding style makes it hard to see the forest for the trees.\n",
+      "When we're reading a program,\n",
+      "we want to understand what's supposed to happen when everything works,\n",
+      "and only then think about what might happen if something goes wrong.\n",
+      "When the two are interleaved,\n",
+      "both are harder to understand.\n",
+      "The net result is that most programmers don't bother to check the status codes their functions return,\n",
+      "which means that when errors *do* occur,\n",
+      "they're even harder to track down."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Luckily, there's a better way.\n",
+      "Modern languages like Python allow us to use [exceptions](glossary.html#exception) to handle errors.\n",
+      "More specifically,\n",
+      "using exceptions allows us to separate the \"normal\" flow of control\n",
+      "from the \"exceptional\" cases that arise when something goes wrong,\n",
+      "which makes both easier to understand:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "try:\n",
+      "    params = read_params(param_file)\n",
+      "    grid = read_grid(grid_file)\n",
+      "except:\n",
+      "    log.error('Failed to read', filename)\n",
+      "    sys.exit(ERROR)\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "To join the two parts together,\n",
+      "we use the keywords `try` and `except`.\n",
+      "These work together like `if` and `else`:\n",
+      "the statements under the `try` are what should happen if everything works,\n",
+      "while the statements under `except` are what the program should do if something goes wrong."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We have actually seen exceptions before without knowing it,\n",
+      "since by default,\n",
+      "when an exception occurs,\n",
+      "Python prints it out and halts our program.\n",
+      "For example,\n",
+      "trying to open a nonexistent file triggers a type of exception called an `IOError`,\n",
+      "while an out-of-bounds index to a list triggers an `IndexError`:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "open('nonexistent-file.txt', 'r')"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "ename": "IOError",
+       "evalue": "[Errno 2] No such file or directory: 'nonexistent-file.txt'",
+       "output_type": "pyerr",
+       "traceback": [
+        "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mIOError\u001b[0m                                   Traceback (most recent call last)",
+        "\u001b[0;32m<ipython-input-2-58cbde3dd63c>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mopen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'nonexistent-file.txt'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'r'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+        "\u001b[0;31mIOError\u001b[0m: [Errno 2] No such file or directory: 'nonexistent-file.txt'"
+       ]
+      }
+     ],
+     "prompt_number": 2
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "values = [0, 1, 2]\n",
+      "print values[999]"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "ename": "IndexError",
+       "evalue": "list index out of range",
+       "output_type": "pyerr",
+       "traceback": [
+        "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mIndexError\u001b[0m                                Traceback (most recent call last)",
+        "\u001b[0;32m<ipython-input-3-7fed13afc650>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0mvalues\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0;32mprint\u001b[0m \u001b[0mvalues\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m999\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+        "\u001b[0;31mIndexError\u001b[0m: list index out of range"
+       ]
+      }
+     ],
+     "prompt_number": 3
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We can use `try` and `except` to deal with these errors ourselves\n",
+      "if we don't want the program simply to fall over.\n",
+      "Here,\n",
+      "for example,\n",
+      "we put our attempt to open a nonexistent file inside a `try`,\n",
+      "and in the `except`, we print a not-very-helpful error message:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "try:\n",
+      "    reader = open('nonexistent-file.txt', 'r')\n",
+      "except IOError:\n",
+      "    print 'Whoops!'"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "output_type": "stream",
+       "stream": "stdout",
+       "text": [
+        "Whoops!\n"
+       ]
+      }
+     ],
+     "prompt_number": 6
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "When Python executes this code,\n",
+      "it runs the statement inside the `try`.\n",
+      "If that works, it skips over the `except` block without running it.\n",
+      "If an exception occurs inside the `try` block,\n",
+      "though,\n",
+      "Python compares the type of the exception to the type specified by the `except`.\n",
+      "If they match, it executes the code in the `except` block."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "`IOError` is Python's way of reporting several kinds of problems\n",
+      "related to input and output:\n",
+      "not just files that don't exist,\n",
+      "but also things like not having permission to read files,\n",
+      "and so on.\n",
+      "We can put as many lines of code in a `try` block as we want,\n",
+      "just as we can put many statements under an `if`.\n",
+      "We can also handle several different kinds of errors afterward.\n",
+      "For example,\n",
+      "here's some code to calculate the entropy at each point in a grid:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "try:\n",
+      "    params = read_params(param_file)\n",
+      "    grid = read_grid(grid_file)\n",
+      "    entropy = lee_entropy(params, grid)\n",
+      "    write_entropy(entropy_file, entropy)\n",
+      "except IOError:\n",
+      "    log_error_and_exit('IO error')\n",
+      "except ArithmeticError:\n",
+      "    log_error_and_exit('Arithmetic error')\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Python tries to run the four statements inside the `try` as normal.\n",
+      "If an error occurs in any of them,\n",
+      "Python immediately jumps down\n",
+      "and tries to find an `except` whose type matches the type of the error that occurred.\n",
+      "If it's an `IOError`,\n",
+      "Python jumps into the first error handler.\n",
+      "If it's an `ArithmeticError`,\n",
+      "Python jumps into the second handler instead.\n",
+      "It will only execute one of these,\n",
+      "just as it will only execute one branch\n",
+      "of a series of `if`/`elif`/`else` statements."
+     ]
+    },
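+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The matching rule can be tried out on a small scale.\n",
+      "The sketch below uses three hypothetical helper functions, none of which is part of the grid example,\n",
+      "and assumes that no file called `nonexistent-file.txt` exists in the current directory:\n",
+      "\n",
+      "```python\n",
+      "def describe_failure(action):\n",
+      "    '''Run action() and report which handler, if any, dealt with it.'''\n",
+      "    try:\n",
+      "        action()\n",
+      "    except IOError:\n",
+      "        return 'IO error'\n",
+      "    except ArithmeticError:\n",
+      "        return 'Arithmetic error'\n",
+      "    # Only reached if no exception occurred inside the try block.\n",
+      "    return 'no error'\n",
+      "\n",
+      "def open_missing_file():\n",
+      "    '''Trigger an I/O error (assumes the file really is missing).'''\n",
+      "    open('nonexistent-file.txt', 'r')\n",
+      "\n",
+      "def divide_by_zero():\n",
+      "    '''Trigger a ZeroDivisionError, which is a kind of ArithmeticError.'''\n",
+      "    return 1 / 0\n",
+      "```\n",
+      "\n",
+      "Calling `describe_failure(open_missing_file)` takes the first handler,\n",
+      "`describe_failure(divide_by_zero)` takes the second,\n",
+      "and an action that raises some other kind of exception is not caught by either."
+     ]
+    },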
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "This layout has made the code easier to read,\n",
+      "but we've lost something important:\n",
+      "the message printed out by the `IOError` branch doesn't tell us\n",
+      "which file caused the problem.\n",
+      "We can do better if we capture and hang on to the object that Python creates\n",
+      "to record information about the error:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "try:\n",
+      "    params = read_params(param_file)\n",
+      "    grid = read_grid(grid_file)\n",
+      "    entropy = lee_entropy(params, grid)\n",
+      "    write_entropy(entropy_file, entropy)\n",
+      "except IOError as err:\n",
+      "    log_error_and_exit('Cannot read/write ' + err.filename)\n",
+      "except ArithmeticError as err:\n",
+      "    log_error_and_exit(err.message)\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "If something goes wrong in the `try`,\n",
+      "Python creates an exception object,\n",
+      "fills it with information,\n",
+      "and assigns it to the variable `err`.\n",
+      "(There's nothing special about this variable name&mdash;we can use anything we want.)\n",
+      "Exactly what information is recorded depends on what kind of error occurred;\n",
+      "Python's documentation describes the properties of each type of error in detail,\n",
+      "but we can always just print the exception object.\n",
+      "In the case of an I/O error,\n",
+      "we print out the name of the file that caused the problem.\n",
+      "And in the case of an arithmetic error,\n",
+      "printing out the message embedded in the exception object is what Python would have done anyway."
+     ]
+    },
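+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "As a small self-contained sketch of what the exception object holds,\n",
+      "the hypothetical function below tries to open a file\n",
+      "and builds a message from two attributes that Python sets on file-related I/O errors,\n",
+      "`filename` and `strerror`:\n",
+      "\n",
+      "```python\n",
+      "def describe_open_failure(path):\n",
+      "    '''Try to open path for reading; describe any I/O error that occurs.'''\n",
+      "    try:\n",
+      "        reader = open(path, 'r')\n",
+      "    except IOError as err:\n",
+      "        # filename is the path that failed; strerror is the system's explanation.\n",
+      "        return 'Cannot read ' + err.filename + ': ' + err.strerror\n",
+      "    reader.close()\n",
+      "    return 'Opened ' + path\n",
+      "```\n",
+      "\n",
+      "The exact wording of `strerror` comes from the operating system,\n",
+      "so it may differ from one machine to another."
+     ]
+    },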
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "So much for how exceptions work:\n",
+      "how should they be used?\n",
+      "Some programmers use `try` and `except` to give their programs default behaviors.\n",
+      "For example,\n",
+      "if this code can't read the grid file that the user has asked for,\n",
+      "it creates a default grid instead:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "try:\n",
+      "    grid = read_grid(grid_file)\n",
+      "except IOError:\n",
+      "    grid = default_grid()\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Other programmers would explicitly test for the grid file,\n",
+      "and use `if` and `else` for control flow:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "if file_exists(grid_file):\n",
+      "    grid = read_grid(grid_file)\n",
+      "else:\n",
+      "    grid = default_grid()\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "It's mostly a matter of taste,\n",
+      "but we prefer the second style.\n",
+      "As a rule,\n",
+      "exceptions should only be used to handle exceptional cases.\n",
+      "If the program knows how to fall back to a default grid,\n",
+      "that's not an unexpected event.\n",
+      "Using `if` and `else`\n",
+      "instead of `try` and `except`\n",
+      "sends different signals to anyone reading our code,\n",
+      "even if they do the same thing."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Novices often ask another question about exception handling style as well,\n",
+      "but before we address it,\n",
+      "there's something in our example that you might not have noticed.\n",
+      "Exceptions can actually be thrown a long way:\n",
+      "they don't have to be handled immediately.\n",
+      "Take another look at this code:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "try:\n",
+      "    params = read_params(param_file)\n",
+      "    grid = read_grid(grid_file)\n",
+      "    entropy = lee_entropy(params, grid)\n",
+      "    write_entropy(entropy_file, entropy)\n",
+      "except IOError as err:\n",
+      "    log_error_and_exit('Cannot read/write ' + err.filename)\n",
+      "except ArithmeticError as err:\n",
+      "    log_error_and_exit(err.message)\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The four lines in the `try` block are all function calls.\n",
+      "They might catch and handle exceptions themselves,\n",
+      "but if an exception occurs in one of them that *isn't* handled internally,\n",
+      "Python looks in the calling code for a matching `except`.\n",
+      "If it doesn't find one there,\n",
+      "it looks in that function's caller,\n",
+      "and so on.\n",
+      "If we get all the way back to the main program without finding an exception handler,\n",
+      "Python's default behavior is to print an error message like the ones we've been seeing all along."
+     ]
+    },
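+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The search through callers can be seen in a three-level sketch.\n",
+      "All three function names are hypothetical and chosen only for illustration;\n",
+      "the error starts in the lowest function, passes through the middle one untouched,\n",
+      "and is handled at the top:\n",
+      "\n",
+      "```python\n",
+      "def parse_count(text):\n",
+      "    '''Lowest level: int() raises ValueError if text is not a number.'''\n",
+      "    return int(text)\n",
+      "\n",
+      "def total_count(fields):\n",
+      "    '''Middle level: has no try/except, so errors pass straight through.'''\n",
+      "    return sum(parse_count(field) for field in fields)\n",
+      "\n",
+      "def report_total(fields):\n",
+      "    '''Top level: the only place that decides what to do about bad input.'''\n",
+      "    try:\n",
+      "        return 'total: ' + str(total_count(fields))\n",
+      "    except ValueError as err:\n",
+      "        return 'bad input: ' + str(err)\n",
+      "```\n",
+      "\n",
+      "Calling `report_total(['1', '2', '3'])` returns `'total: 6'`,\n",
+      "while a list containing something that is not a number produces a string starting with `'bad input: '`."
+     ]
+    },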
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "This behavior is the origin of the rule \"Throw Low, Catch High.\"\n",
+      "There are many places in our program where an error might occur.\n",
+      "There are only a few, though, where errors can sensibly be handled.\n",
+      "For example,\n",
+      "a linear algebra library doesn't know whether it's being called directly from the Python interpreter,\n",
+      "or whether it's being used as a component in a larger program.\n",
+      "In the latter case,\n",
+      "the library doesn't know if the program that's calling it is being run from the command line or from a GUI.\n",
+      "The library therefore shouldn't try to handle or report errors itself,\n",
+      "because it has no way of knowing what the right way to do this is.\n",
+      "It should instead just raise an exception,\n",
+      "and let its caller figure out how best to handle it."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Finally,\n",
+      "we can raise exceptions ourselves if we want to.\n",
+      "In fact,\n",
+      "we *should* do this,\n",
+      "since it's the standard way in Python to signal that something has gone wrong.\n",
+      "Here,\n",
+      "for example,\n",
+      "is a function that reads a grid and checks its consistency:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "def read_grid(grid_file):\n",
+      "    '''Read grid, checking consistency.'''\n",
+      "\n",
+      "    data = read_raw_data(grid_file)\n",
+      "    if not grid_consistent(data):\n",
+      "        raise Exception('Inconsistent grid: ' + grid_file)\n",
+      "    result = normalize_grid(data)\n",
+      "\n",
+      "    return result\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The `raise` statement creates a new exception with a meaningful error message.\n",
+      "Since `read_grid` itself doesn't contain a `try`/`except` block,\n",
+      "this exception will always be thrown up and out of the function,\n",
+      "to be caught and handled by whoever is calling `read_grid`.\n",
+      "We can define new types of exceptions if we want to.\n",
+      "And we should,\n",
+      "so that errors in our code can be distinguished from errors in other people's code.\n",
+      "However,\n",
+      "this involves classes and objects,\n",
+      "which is outside the scope of these lessons."
+     ]
+    },
+    {
+     "cell_type": "heading",
+     "level": 2,
+     "metadata": {},
+     "source": [
+      "Unit Testing"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Now that we understand how Python manages errors,\n",
+      "we can return to the subject of testing.\n",
+      "The biggest obstacle to doing it isn't actually whether or not it's useful,\n",
+      "but whether or not it's easy to do.\n",
+      "If it isn't,\n",
+      "people will always find excuses to do something else.\n",
+      "It's therefore important to make things as painless as possible.\n",
+      "In particular, it has to be easy for people to:\n",
+      "\n",
+      "- add or change tests,\n",
+      "- understand the tests that have already been written,\n",
+      "- run those tests, and\n",
+      "- understand those tests' results."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Test results must also be reliable to be useful.\n",
+      "If a testing tool says that code is working when it's not,\n",
+      "or reports problems when there actually aren't any,\n",
+      "people will lose faith in it and stop using it."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Let's start with the simplest kind of testing.\n",
+      "A [unit test](glossary.html#unit_test) is\n",
+      "a test that exercises one component, or unit, in a program.\n",
+      "Every unit test has five parts.\n",
+      "The first is the [test fixture](glossary.html#test_fixture),\n",
+      "which is the thing the test is run on:\n",
+      "the inputs to a function,\n",
+      "or the data files to be processed."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The second part is the [test action](glossary.html#test_action),\n",
+      "which is what we do to the fixture.\n",
+      "Ideally,\n",
+      "this just involves calling a function,\n",
+      "but some tests may involve more.\n",
+      "The third part of every unit test is its [expected result](glossary.html#expected_test_result),\n",
+      "which is what we expect the piece of code we're testing to do or return.\n",
+      "If we don't know the expected result,\n",
+      "we can't tell whether the test passed or failed.\n",
+      "As we'll see toward the end of this lesson,\n",
+      "defining fixtures and expected results can be a good way to design software."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The first three parts of the unit test are used over and over again.\n",
+      "The fourth part is the [actual result](glossary.html#actual_test_result),\n",
+      "which is what happens when we run the test on a particular day,\n",
+      "with a particular version of our software.\n",
+      "The fifth and final part is a [test report](glossary.html#test_report)\n",
+      "that tells us whether the test passed,\n",
+      "or whether there's a failure of some kind that needs human attention.\n",
+      "As with the actual result,\n",
+      "this could be different each time we run the test."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "So much for terminology:\n",
+      "what does this all look like in practice?\n",
+      "As an example,\n",
+      "suppose we're testing a function called `rectangle_area`\n",
+      "that returns the area of an `[x0, y0, x1, y1]` rectangle.\n",
+      "We'll start by testing our code directly using `assert`.\n",
+      "Here,\n",
+      "we call the function three times with different arguments,\n",
+      "checking that the right value is returned each time.\n",
+      "(We import `rectangle_area_1` rather than `rectangle_area`\n",
+      "because we're going to use several different versions of this function in this lesson,\n",
+      "and need to give each one a different name.)"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "from rectangle import rectangle_area_1\n",
+      "\n",
+      "assert rectangle_area_1([0, 0, 1, 1]) == 1.0\n",
+      "assert rectangle_area_1([1, 1, 4, 4]) == 9.0\n",
+      "assert rectangle_area_1([0, 1, 4, 7]) == 24.0"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "ename": "AssertionError",
+       "evalue": "",
+       "output_type": "pyerr",
+       "traceback": [
+        "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mAssertionError\u001b[0m                            Traceback (most recent call last)",
+        "\u001b[0;32m<ipython-input-16-47f105cffcf1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      3\u001b[0m \u001b[0;32massert\u001b[0m \u001b[0mrectangle_area_1\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m1.0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      4\u001b[0m \u001b[0;32massert\u001b[0m \u001b[0mrectangle_area_1\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m9.0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0;32massert\u001b[0m \u001b[0mrectangle_area_1\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m7\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m24.0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+        "\u001b[0;31mAssertionError\u001b[0m: "
+       ]
+      }
+     ],
+     "prompt_number": 16
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "This is better than no tests at all,\n",
+      "but look what happens if we change the order of the tests:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "assert rectangle_area_1([0, 1, 4, 7]) == 24.0\n",
+      "assert rectangle_area_1([1, 1, 4, 4]) == 9.0\n",
+      "assert rectangle_area_1([0, 0, 1, 1]) == 1.0"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "ename": "AssertionError",
+       "evalue": "",
+       "output_type": "pyerr",
+       "traceback": [
+        "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mAssertionError\u001b[0m                            Traceback (most recent call last)",
+        "\u001b[0;32m<ipython-input-17-03f0be9f2eb4>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32massert\u001b[0m \u001b[0mrectangle_area_1\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m7\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m24.0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      2\u001b[0m \u001b[0;32massert\u001b[0m \u001b[0mrectangle_area_1\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m4\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m9.0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      3\u001b[0m \u001b[0;32massert\u001b[0m \u001b[0mrectangle_area_1\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m1.0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
+        "\u001b[0;31mAssertionError\u001b[0m: "
+       ]
+      }
+     ],
+     "prompt_number": 17
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Python halts at the first failed assertion,\n",
+      "so the second and third tests aren't run at all.\n",
+      "It would be more helpful if we could get data from all of our tests every time they're run,\n",
+      "since the more information we have,\n",
+      "the faster we're likely to be able to track down bugs.\n",
+      "It would also be helpful to have some kind of summary report:\n",
+      "if our [test suite](glossary.html#test_suite) includes thirty or forty tests\n",
+      "(as it well might for a complex function or library that's widely used),\n",
+      "we'd like to know how many passed or failed."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Here's a different approach.\n",
+      "First, let's put each test in a function with a meaningful name:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "def test_unit_square():\n",
+      "    assert rectangle_area_1([0, 0, 1, 1]) == 1.0\n",
+      "\n",
+      "def test_large_square():\n",
+      "    assert rectangle_area_1([1, 1, 4, 4]) == 9.0\n",
+      "\n",
+      "def test_actual_rectangle():\n",
+      "    assert rectangle_area_1([0, 1, 4, 7]) == 24.0"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 24
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Next,\n",
+      "we'll import a library called `ears`\n",
+      "and ask it to run our tests for us:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "import ears\n",
+      "ears.run()"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "output_type": "stream",
+       "stream": "stdout",
+       "text": [
+        "..f\n",
+        "2 pass, 1 fail, 0 error\n",
+        "----------------------------------------\n",
+        "fail: test_actual_rectangle\n",
+        "Traceback (most recent call last):\n",
+        "  File \"ears.py\", line 43, in run\n",
+        "    test()\n",
+        "  File \"<ipython-input-24-180c0c8d0e69>\", line 8, in test_actual_rectangle\n",
+        "    assert rectangle_area_1([0, 1, 4, 7]) == 24.0\n",
+        "AssertionError\n",
+        "\n"
+       ]
+      }
+     ],
+     "prompt_number": 25
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "`ears.run` looks in the calling program\n",
+      "for functions whose names start with the letters `'test_'`\n",
+      "and runs each one exactly once.\n",
+      "If the function completes without an assertion being triggered,\n",
+      "we count the test as a success.\n",
+      "If an assertion fails,\n",
+      "we count the test as a failure,\n",
+      "and if any other exception occurs,\n",
+      "we count it as an error\n",
+      "(i.e., we assume that the test itself is broken)."
+     ]
+    },
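+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The pass/fail/error rule can be sketched in a few lines.\n",
+      "This is *not* the source of `ears`:\n",
+      "`run_tests` and the three sample tests are hypothetical names made up for illustration,\n",
+      "and the sketch takes a dictionary of names rather than inspecting the calling program.\n",
+      "\n",
+      "```python\n",
+      "def run_tests(namespace):\n",
+      "    # Run every callable whose name starts with 'test_' and count the outcomes.\n",
+      "    counts = {'pass': 0, 'fail': 0, 'error': 0}\n",
+      "    for name in sorted(namespace):\n",
+      "        test = namespace[name]\n",
+      "        if not (name.startswith('test_') and callable(test)):\n",
+      "            continue\n",
+      "        try:\n",
+      "            test()\n",
+      "            counts['pass'] += 1\n",
+      "        except AssertionError:\n",
+      "            counts['fail'] += 1     # the code under test gave the wrong answer\n",
+      "        except Exception:\n",
+      "            counts['error'] += 1    # the test itself is probably broken\n",
+      "    return counts\n",
+      "\n",
+      "def test_passes():\n",
+      "    assert 1 + 1 == 2\n",
+      "\n",
+      "def test_fails():\n",
+      "    assert 1 + 1 == 3\n",
+      "\n",
+      "def test_is_broken():\n",
+      "    return undefined_name    # raises NameError, not AssertionError\n",
+      "\n",
+      "print(run_tests({'test_passes': test_passes,\n",
+      "                 'test_fails': test_fails,\n",
+      "                 'test_is_broken': test_is_broken}))\n",
+      "```\n",
+      "\n",
+      "Because each test runs inside its own `try` block,\n",
+      "one failure no longer prevents the remaining tests from running."
+     ]
+    },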
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "`ears` belongs to a family of tools called [xUnit testing libraries](glossary.html#xUnit).\n",
+      "The name \"xUnit\" comes from the fact that\n",
+      "many of them are imitations of a Java testing library called JUnit.\n",
+      "The [Wikipedia page](http://en.wikipedia.org/wiki/List_of_unit_testing_frameworks) on the subject\n",
+      "lists dozens of similar frameworks in almost as many languages,\n",
+      "all of which have a similar structure:\n",
+      "each test is a single function that follows some naming convention\n",
+      "(e.g., starts with `'test_'`),\n",
+      "and the framework runs them in some order\n",
+      "and reports how many passed, failed, or were broken."
+     ]
+    },
+    {
+     "cell_type": "heading",
+     "level": 2,
+     "metadata": {},
+     "source": [
+      "Test-Driven Development"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Of course,\n",
+      "these libraries can't think of test cases for us.\n",
+      "We still have to decide what to test and how many tests to run.\n",
+      "Our best guide here is economics:\n",
+      "we want the tests that are most likely to give us useful information\n",
+      "that we don't already have.\n",
+      "For example,\n",
+      "if `rectangle_area([0, 0, 1, 1])` works,\n",
+      "there's probably not much point testing `rectangle_area([0, 0, 2, 2])`,\n",
+      "since it's hard to think of a bug that would show up in one case but not in the other."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We should therefore try to choose tests that are as different from each other as possible,\n",
+      "so that we force the code we're testing to execute in all the different ways it can.\n",
+      "Another way of thinking about this is that we should try to find [boundary cases](glossary.html#boundary_case).\n",
+      "If a function works for zero,\n",
+      "one,\n",
+      "and a million values,\n",
+      "it will probably work for eighteen values."
+     ]
+    },
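+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "As a rough illustration of the zero, one, and many pattern,\n",
+      "here are three boundary tests for a hypothetical function `total`\n",
+      "that adds up a list of numbers.\n",
+      "Both the function and the test names are assumptions made for this sketch.\n",
+      "\n",
+      "```python\n",
+      "def total(values):\n",
+      "    # Hypothetical function under test: adds up a list of numbers.\n",
+      "    result = 0\n",
+      "    for v in values:\n",
+      "        result += v\n",
+      "    return result\n",
+      "\n",
+      "def test_no_values():\n",
+      "    assert total([]) == 0\n",
+      "\n",
+      "def test_one_value():\n",
+      "    assert total([5]) == 5\n",
+      "\n",
+      "def test_many_values():\n",
+      "    assert total([1] * 1000000) == 1000000\n",
+      "```\n",
+      "\n",
+      "Each test exercises the loop in a different way:\n",
+      "not at all, exactly once, and a very large number of times."
+     ]
+    },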
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Using boundary values as tests has another advantage:\n",
+      "it can help us design our software.\n",
+      "To see how,\n",
+      "consider this test case for our rectangle area function:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "def test_inverted_rectangle():\n",
+      "    assert rectangle_area([1, 5, 5, 2]) == -12.0\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Is that test correct?\n",
+      "I.e.,\n",
+      "are rectangles with `x1<x0` or `y1<y0` legal,\n",
+      "and do they have negative area?\n",
+      "Or should the test be:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "def test_inverted_rectangle():\n",
+      "    try:\n",
+      "        rectangle_area([1, 5, 5, 2])\n",
+      "    except ValueError:\n",
+      "        pass # rectangle_area failed with the expected kind of exception\n",
+      "    except Exception:\n",
+      "        assert False, 'Function did not raise correct kind of exception for invalid rectangle'\n",
+      "    else:\n",
+      "        assert False, 'Function did not raise exception for invalid rectangle'\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "The logic in this second version may take a moment to work out,\n",
+      "but the idea is straightforward:\n",
+      "we want to check that `rectangle_area` raises a `ValueError` exception\n",
+      "if it's given a rectangle whose upper right corner `(x1, y1)` is below or to the left of its lower left corner `(x0, y0)`."
+     ]
+    },
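+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "If we decide that inverted rectangles are illegal,\n",
+      "one possible implementation that satisfies such a test is sketched below.\n",
+      "The name `rectangle_area_strict` and the wording of the error message are assumptions made for this example;\n",
+      "it is one design choice among several, not the definitive version of the function.\n",
+      "\n",
+      "```python\n",
+      "def rectangle_area_strict(coords):\n",
+      "    # Hypothetical variant: treats inverted rectangles as invalid input.\n",
+      "    x0, y0, x1, y1 = coords\n",
+      "    if x1 < x0 or y1 < y0:\n",
+      "        raise ValueError('inverted rectangle: %s' % (coords,))\n",
+      "    return (x1 - x0) * (y1 - y0)\n",
+      "\n",
+      "def test_inverted_rectangle_is_rejected():\n",
+      "    try:\n",
+      "        rectangle_area_strict([1, 5, 5, 2])\n",
+      "    except ValueError:\n",
+      "        return    # the expected kind of exception\n",
+      "    assert False, 'no ValueError for inverted rectangle'\n",
+      "```\n",
+      "\n",
+      "Any exception other than `ValueError` is left to propagate,\n",
+      "so a test runner would report it as an error rather than a failure."
+     ]
+    },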
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Here's another test case that can help us design our software:"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "```python\n",
+      "def test_zero_width():\n",
+      "    assert rectangle_area([2, 1, 2, 8]) == 0\n",
+      "```"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "We might decide that rectangles with negative areas aren't allowed,\n",
+      "but what about rectangles with zero area,\n",
+      "i.e.,\n",
+      "rectangles that are actually lines?\n",
+      "Any actual implementation of `rectangle_area` will do *something* with one of these;\n",
+      "writing unit tests for boundary cases is a good way to specify exactly what that something is."
+     ]
+    }
+   ],
+   "metadata": {}
+  }
+ ]
+}
+\ No newline at end of file
diff --git a/lessons/swc-python/rectangle.py b/lessons/swc-python/rectangle.py

new file mode 100644 (file)

index 0000000..2229d03
--- /dev/null
+++ b/lessons/swc-python/rectangle.py
@@ -0,0 +1,3 @@
+def rectangle_area_1(coords):
+    x0, y0, x1, y1 = coords
+    return (x1 - x0) * (x1 - y0)