From dc91cc4be0b82140551d2171422df2394fe55b4a Mon Sep 17 00:00:00 2001
From: Jon Speicher
Date: Sat, 27 Jul 2013 20:32:32 -0400
Subject: [PATCH] Add evaluated version of IPython notebook

---
 .../SoftwareEngineering_answers.ipynb         | 3161 +++++++++++++++++
 1 file changed, 3161 insertions(+)
 create mode 100644 python/sw_engineering/SoftwareEngineering_answers.ipynb

diff --git a/python/sw_engineering/SoftwareEngineering_answers.ipynb b/python/sw_engineering/SoftwareEngineering_answers.ipynb
new file mode 100644
index 0000000..85ef5bb
--- /dev/null
+++ b/python/sw_engineering/SoftwareEngineering_answers.ipynb
@@ -0,0 +1,3161 @@
+{
+ "metadata": {
+  "name": "SoftwareEngineering_answers"
+ },
+ "nbformat": 3,
+ "nbformat_minor": 0,
+ "worksheets": [
+  {
+   "cells": [
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "# Software Engineering in Python\n",
+      "\n",
+      "# Outline\n",
+      "\n",
+      "* What is software engineering?\n",
+      "* Goal\n",
+      "* Creating a library of code\n",
+      "* Refactoring\n",
+      "* Testing\n",
+      "* Nose\n",
+      "* Test-driven development\n",
+      "* Exceptions\n",
+      "* Practice"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "***\n",
+      "# What is software engineering?\n",
+      "***\n",
+      "\n",
+      "*\"Software engineering is the application of a systematic, disciplined, quantifiable approach to the design, development, operation, and maintenance of software, and the study of these approaches; that is, the application of engineering to software.\"* - [Wikipedia](https://en.wikipedia.org/wiki/Software_engineering)\n",
+      "\n",
+      "In practical terms, this means solving problems with software in a way that is:\n",
+      "\n",
+      "* Efficient\n",
+      "* Maintainable\n",
+      "* Repeatable\n",
+      "* Scalable\n",
+      "* Provably correct\n",
+      "* Cost-effective\n",
+      "\n",
+      "Today we'll focus on building a small, trusted library of code. We will strive to achieve the bullet points identified above. Although we'll work in IPython notebooks, we'll talk about how these techniques can be extended to standalone Python libraries and scripts executed at the command line."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "***\n",
+      "# Goal\n",
+      "***\n",
+      "\n",
+      "\n",
+      "Recalling the animal sighting data sets we worked with earlier, our goal will be to create a new function that finds the mean number of individuals of a particular animal seen per sighting of that animal. We will design our function so that it works with any properly-formatted data file and allows us to find the mean number of sightings for any animal contained in the file, without requiring us to edit the code to do so. We'd like it to work like this:\n",
+      "\n",
+      "    In [1]: mean_sightings('big_animals.txt', 'Wolverine')\n",
+      "    Out[1]: 11.7\n",
+      "\n",
+      "To achieve our goal, we'll approach the problem in a fashion typical of modern software engineering. Up until now, most of our work has been in the IPython notebooks. As we discussed earlier, however, there are advantages to creating standalone Python modules and command-line scripts. 
These advantages include the ability to execute the programs frequently or in an automated fashion, as well as to centralize commonly-used bits of code for import into multiple programs.\n"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "***\n",
+      "# Creating a library of code\n",
+      "***\n",
+      "\n",
+      "Since the best programmer is a lazy programmer, we'll start by copying some code that we've already written.\n",
+      "\n",
+      "* Open the IPython notebook that we created in the Intro session.\n",
+      "* Find the function `count_wolverines` that you created in Exercise 8.\n",
+      "* Copy the function into a new cell.\n",
+      "* Save the notebook.\n",
+      "\n",
+      "***\n",
+      "**Aside: A standalone module**\n",
+      "\n",
+      "When we talked about a standalone module in the Intro session, we named it `count_animals`, because it counted total animal sightings. In this exercise, although we are working in IPython for convenience, we are notionally developing a reusable module that will at the very least allow us to find the average number of animals seen per sighting. Since our goal in the real world would be to develop a reusable module of code to import into a number of programs, we can't say for sure that this is the **only** thing our module will do now and forever more. Furthermore, changing the module's name later, while not impossible, runs the risk of breaking programs that we've already written to rely on it, or will at least require us to touch each of them to update the name of the imported module. In this case, then, the name `sightings` would seem to be reasonable for our module, because whether we are counting total sightings or computing average sightings, we are still dealing with \"sightings,\" which is the primary information that our data files contain. \n",
+      "***"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## Results\n",
+      "\n",
+      "When you are done, your notebook should look something like this."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "def count_wolverines(filename):\n",
+      "    '''Given a plain text file containing animal sighting data in the form \n",
+      "       date time animal count\n",
+      "       returns the total count of wolverines sighted.'''\n",
+      "    animal_file = open(filename, 'r')\n",
+      "    animal_file_lines = animal_file.readlines()\n",
+      "    animal_file.close()\n",
+      "\n",
+      "    total_count = 0\n",
+      "    for line in animal_file_lines:\n",
+      "        date, time, animal, count_string = line.split()\n",
+      "        if animal == 'Wolverine':\n",
+      "            total_count = total_count + int(count_string)\n",
+      "    return total_count"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 1
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Let's see how it works."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "count_wolverines('big_animals.txt')"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "output_type": "pyout",
+       "prompt_number": 2,
+       "text": [
+        "117"
+       ]
+      }
+     ],
+     "prompt_number": 2
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "***\n",
+      "# Refactoring\n",
+      "***\n",
+      "\n",
+      "Look at `count_wolverines`. You'll notice that there's a lot going on within this function. 
It:\n", + "\n", + "* Opens a file\n", + "* Reads lines from a file\n", + "* Closes the file\n", + "* Parses the string data from the file into usable data types\n", + "* Filters out records specific to a single animal\n", + "* Sums up a specific field contained within that set of filtered records\n", + "\n", + "You can imagine that we might want to do many things with data from the sightings file. For one, we want to average the number of sightings for a particular animal. Perhaps we want to count up the number of unique animals seen in a data set. Perhaps we want to figure out which days of the year have the most elk sightings. It's easy to see how we could modify the `count_wolverines` function above to achieve all of these goals, but if we simply replicated that entire function a half-dozen times, we would be repeating the code that opens the file and reads and splits the lines a half-dozen times, too.\n", + "\n", + "One approach to reducing this duplication is to *decompose* the function above into several separate functions, each with a single, small, well-defined responsibility (remember our list of good function criteria from the Intro session). This is often known as [refactoring](http://en.wikipedia.org/wiki/Code_refactoring). The goal is to rearrange code to make it easier to read, maintain, and reuse while preserving the existing functionality.\n", + "\n", + "We are going to *extract* a new function from `count_wolverines`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise: Read animal sightings from a file\n", + "\n", + "Create a new function called `read_sightings_from_file`. The function should:\n", + "\n", + "* Accept a sightings filename as a parameter\n", + "* Open the file\n", + "* Read the file's lines\n", + "* Split the lines\n", + "* Store each column of each line in a list dedicated to holding that type of data (i.e. dates, times, animals, counts)\n", + "* Return the lists\n", + "\n", + "For example, given the following file:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "cat animals.txt" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "2011-04-22 21:06 Grizzly 36\r\n", + "2011-04-23 14:12 Elk 25\r\n", + "2011-04-23 10:24 Elk 26\r\n", + "2011-04-23 20:08 Wolverine 31\r\n", + "2011-04-23 18:46 Muskox 20\r\n" + ] + } + ], + "prompt_number": 3 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "your function would return:\n", + "\n", + " (['2011-04-22', '2011-04-23', ...], ['21:06', '14:12', ...], ['Grizzly', 'Elk', ...], [36, 25, ...])\n", + "\n", + "Keep in mind that most of this functionality is already implemented in `count_wolverines`, so you can copy liberally from that function." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Results\n", + "\n", + "When you are done, your code should look something like this." 
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def read_sightings_from_file(filename):\n", + " '''Given a plain text file containing animal sighting data in the form\n", + " date time animal count\n", + " returns four lists, each containing the data from one column.'''\n", + " \n", + " animal_file = open(filename, 'r')\n", + " animal_file_lines = animal_file.readlines()\n", + " animal_file.close()\n", + " \n", + " dates = []\n", + " times = []\n", + " animals = []\n", + " counts = []\n", + " \n", + " for line in animal_file_lines:\n", + " date, time, animal, count_string = line.split()\n", + " dates.append(date)\n", + " times.append(time)\n", + " animals.append(animal)\n", + " counts.append(int(count_string))\n", + "\n", + " return dates, times, animals, counts" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 4 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's see how it works." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "cat animals.txt" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "2011-04-22 21:06 Grizzly 36\r\n", + "2011-04-23 14:12 Elk 25\r\n", + "2011-04-23 10:24 Elk 26\r\n", + "2011-04-23 20:08 Wolverine 31\r\n", + "2011-04-23 18:46 Muskox 20\r\n" + ] + } + ], + "prompt_number": 5 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "read_sightings_from_file('animals.txt')" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "pyout", + "prompt_number": 6, + "text": [ + "(['2011-04-22', '2011-04-23', '2011-04-23', '2011-04-23', '2011-04-23'],\n", + " ['21:06', '14:12', '10:24', '20:08', '18:46'],\n", + " ['Grizzly', 'Elk', 'Elk', 'Wolverine', 'Muskox'],\n", + " [36, 25, 26, 31, 20])" + ] + } + ], + "prompt_number": 6 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "**Aside: Updating `count_wolverines`**\n", + "\n", + "Now that we have extracted the `read_sightings_from_file` function, we could remove the duplicated code from `count_wolverines`, which would simplify that function a bit." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def count_wolverines(filename):\n", + " '''Given a plain text file containing animal sighting data in the form \n", + " date time animal count\n", + " returns the total count of wolverines sighted.'''\n", + " \n", + " dates, times, animals, counts = read_sightings_from_file(filename)\n", + " \n", + " total_count = 0\n", + " \n", + " for animal, count in zip(animals, counts):\n", + " if animal == 'Wolverine':\n", + " total_count = total_count + count\n", + " return total_count" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 7 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Does it work?" 
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "count_wolverines('big_animals.txt')" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "pyout", + "prompt_number": 8, + "text": [ + "117" + ] + } + ], + "prompt_number": 8 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "# Testing\n", + "***\n", + "\n", + "We can tell that our new `read_sightings_from_file` function works because we imported it into IPython and ran it on a file with known contents, then examined the result. This is fine for one function, but what if we need to make changes to that function? What if we change a function that is called from many functions? Should we test all the calling functions too?\n", + "\n", + "If our goal is program correctness, the answer to all of these questions is \"yes\", but this will quickly become tedious. If we have dozens of functions, some of which call others, manual testing can be a nightmare, especially when considering boundary or error cases. A common, modern solution to this problem is to write a program to test our program. These programs are often referred to as *unit tests*.\n", + "\n", + "The benefits of unit testing are numerous:\n", + "\n", + "* Program correctness can be verified quickly\n", + "* Test coverage can be more complete\n", + "* Subtle bugs and interdependencies may be exposed by changes to seemingly-unrelated functions\n", + "* A test suite enables \"fearless refactoring\"\n", + "\n", + "What would this look like?" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def test_read_sightings_from_file():\n", + " dates, times, animals, counts = read_sightings_from_file('animals.txt')\n", + " if dates[0] == '2011-04-22':\n", + " print 'Looks good!'\n", + " else:\n", + " print 'Unexpected date!'\n", + " \n", + "test_read_sightings_from_file()" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "Looks good!\n" + ] + } + ], + "prompt_number": 9 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This is the core concept of unit testing: you create code that calls your code and verifies the results. We'll look at a number of conveniences that Python offers to streamline the process as we develop towards our goal of finding the mean number of sightings per animal. We'll start with a pre-canned example and modify it from there." 
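+      ,
+      "\n",
+      "\n",
+      "First, though, it's worth pinning down the one piece of Python that everything else in this section is built on: a bare `assert` statement does nothing when its condition is true, and raises an `AssertionError` (with an optional message) when it is false. A minimal sketch:\n",
+      "\n",
+      "    assert 1 + 1 == 2                    # true condition: passes silently\n",
+      "    assert 1 + 1 == 3, 'bad arithmetic'  # false condition: raises AssertionError: bad arithmetic\n",
+      "\n",
+      "Every test we write from here on is just some setup code followed by one or more of these assertions."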
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def test_read_sightings_from_file():\n", + " expected_dates = ['2011-04-22', '2011-04-23', '2011-04-23', '2011-04-23', '2011-04-23']\n", + " expected_times = ['21:06', '14:12', '10:24', '20:08', '18:46']\n", + " expected_animals = ['Grizzly', 'Elk', 'Elk', 'Wolverine', 'Muskox']\n", + " expected_counts = [36, 25, 26, 31, 20]\n", + " \n", + " dates, times, animals, counts = read_sightings_from_file('animals.txt')\n", + " \n", + " assert dates == expected_dates, 'Dates do not match!'\n", + " assert times == expected_times, 'Times do not match!'\n", + " assert animals == expected_animals, 'Animals do not match!'\n", + " assert counts == expected_counts, 'Counts do not match!'" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 10 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Note**: The act of testing to see if an actual result matches an expected result is so frequently used in unit testing that Python gives us the `assert` keyword to standardize and simplify the process." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "What does it look like if a test passes?" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "test_read_sightings_from_file()" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 11 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "What does it look like if a test fails? " + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def add_two_plus_two():\n", + " return 2 + 3\n", + "\n", + "def test_add_two_plus_two_equals_four():\n", + " assert 4 == add_two_plus_two(), \"2 + 2 didn't equal 4\"\n", + " \n", + "test_add_two_plus_two_equals_four()" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "ename": "AssertionError", + "evalue": "2 + 2 didn't equal 4", + "output_type": "pyerr", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mAssertionError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32massert\u001b[0m \u001b[0;36m4\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0madd_two_plus_two\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"2 + 2 didn't equal 4\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mtest_add_two_plus_two_equals_four\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mtest_add_two_plus_two_equals_four\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mtest_add_two_plus_two_equals_four\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0;32massert\u001b[0m \u001b[0;36m4\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0madd_two_plus_two\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"2 + 2 didn't equal 4\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 7\u001b[0m 
\u001b[0mtest_add_two_plus_two_equals_four\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mAssertionError\u001b[0m: 2 + 2 didn't equal 4" + ] + } + ], + "prompt_number": 12 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see, the failed line is highlighted and the text supplied to the `assert` statement is printed. In addition, IPython provides what is known as a *traceback*. The traceback shows you which function called the function that called the function that called assert, ad infinitum. This can be a very helpful debugging aid." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "# Nose\n", + "***\n", + "\n", + "Running one test by hand is fine, but imagine if we had a large, complex system and many, many test cases per function. How would we run all of our tests? Running them by hand in an IPython interpreter or notebook would defeat the purpose of automation. We could create a \"test runner\" file that called each of our test functions in sequence:\n", + "\n", + " import test_sightings\n", + "\n", + " test_read_sightings_from_file_when_file_does_not_exist()\n", + " test_read_sightings_from_file_when_file_is_empty()\n", + " test_read_sightings_from_file_when_file_has_blank_line()\n", + " ...\n", + " test_count_wolverines_with_no_wolverines()\n", + " test_count_wolverines_with_one_wolverine()\n", + " ...\n", + "\n", + "You can imagine this file growing very large. You can also be sure that at some point, some developer is going to write a test but forget to add it to the \"runner\", thereby running the risk of missing a bug. Fortunately, there is a very popular package that works with Python called `nose`. `nose` \"sniffs out your tests\" and runs them for you. `nose` is installed by default with several Python distributions." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There is a plugin for the IPython notebook that runs `nose` and produces colored cells and tracebacks within the notebooks themselves. The plugin is included in the directory containing this notebook." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "**Aside: Getting the plugin**\n", + "\n", + "If you don't have a clone of the git repo for this workshop, grab the nose plugin now." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**OS X:**" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "!curl -OLk https://github.com/swcarpentry/boot-camps/raw/2013-07-cmu/python/sw_engineering/ipython_nose.py" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Linux/Windows:**" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "!wget https://github.com/swcarpentry/boot-camps/raw/2013-07-cmu/python/sw_engineering/ipython_nose.py" + ], + "language": "python", + "metadata": {}, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's run the plugin." 
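+      ,
+      "\n",
+      "\n",
+      "One note before we do: the plugin discovers tests by naming convention: it will generally collect any function defined in the notebook whose name begins with `test_`, and ignore everything else. A sketch of what is and isn't collected (hypothetical function names):\n",
+      "\n",
+      "    def test_subtraction():    # collected: name begins with test_\n",
+      "        assert 5 - 3 == 2\n",
+      "\n",
+      "    def verify_subtraction():  # ignored: name doesn't begin with test_\n",
+      "        assert 5 - 3 == 2\n",
+      "\n",
+      "That is why both of the test functions we've defined so far will show up in the run below."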
+ ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%load_ext ipython_nose" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 13 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_03b76e6ba0cf4782b455ee8924ef7c48 = $(\"#ipython_nose_03b76e6ba0cf4782b455ee8924ef7c48\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_03b76e6ba0cf4782b455ee8924ef7c48.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_03b76e6ba0cf4782b455ee8924ef7c48.append($(\"F\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_03b76e6ba0cf4782b455ee8924ef7c48;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 1/2 tests passed; 1 failed\n", + "
\n", + " \n", + "
\n", + "
\n", + " failed: __main__.test_add_two_plus_two_equals_four\n", + " [toggle traceback]\n", + "
\n", + "
Traceback (most recent call last):\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n",
+        "    testMethod()\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n",
+        "    self.test(*self.arg)\n",
+        "  File \"<ipython-input-12-2d797333ff4c>\", line 5, in test_add_two_plus_two_equals_four\n",
+        "    assert 4 == add_two_plus_two(), \"2 + 2 didn't equal 4\"\n",
+        "AssertionError: 2 + 2 didn't equal 4\n",
+        "
\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 14, + "text": [ + "1/2 tests passed; 1 failed\n", + "========\n", + "__main__.test_add_two_plus_two_equals_four\n", + "========\n", + "Traceback (most recent call last):\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n", + " testMethod()\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n", + " self.test(*self.arg)\n", + " File \"\", line 5, in test_add_two_plus_two_equals_four\n", + " assert 4 == add_two_plus_two(), \"2 + 2 didn't equal 4\"\n", + "AssertionError: 2 + 2 didn't equal 4\n", + "\n" + ] + } + ], + "prompt_number": 14 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Using `nose` allows you to run all of your tests with one command. It eliminates the need to remember to add tests to a dedicated runner script, and it will clearly highlight which test has failed and point you to a traceback. When working from the command line with standalone Python programs and modules, `nose` provides a command-line runner script called `nosetests`." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "!nosetests" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "stream", + "stream": "stdout", + "text": [ + "\r\n", + "----------------------------------------------------------------------\r\n", + "Ran 0 tests in 0.001s\r\n", + "\r\n", + "OK\r\n" + ] + } + ], + "prompt_number": 15 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Note:** `nose` works by adhering to some reasonably sane naming conventions. If you prefix the names of all files containing tests with `test_`, and prefix the names of all functions containing tests with `test_`, `nose` will generally find your tests. A good rule of thumb is to place tests for a module named `my_module.py` in a file named `test_my_module.py`, and to name tests using plain English-y descriptions of what the test verifies, such as `test_that_two_plus_two_equals_four`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "**Aside: Fixing our failure example**\n", + "\n", + "Let's get that failed test out of the way by *redefining* the function to do fix the bug." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def add_two_plus_two():\n", + " return 2 + 2" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 16 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_795d17e7741242bfb73ce09fd74d433b = $(\"#ipython_nose_795d17e7741242bfb73ce09fd74d433b\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_795d17e7741242bfb73ce09fd74d433b.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_795d17e7741242bfb73ce09fd74d433b.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_795d17e7741242bfb73ce09fd74d433b;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 2/2 tests passed\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 17, + "text": [ + "2/2 tests passed\n" + ] + } + ], + "prompt_number": 17 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "# Test-driven development\n", + "***\n", + "\n", + "We've seen how to refactor code, and we've seen how to write and run tests. We have code to read an animals file and to spit out the data. Next we need code to find the mean of a bunch of numbers, and we'll need code to filter the list of data to pull out only the animals that we want to count. We'll start by writing the `mean` function, and we'll start writing the `mean` function by writing a test.\n", + "\n", + "Why would we write the tests first? *Test-driven development (TDD)* is a fairly recent concept in software engineering. The idea is that you develop a particular piece of functionality in small, iterative chunks: write a test that fails, write some code to make the test pass, write another test that fails, write some more code to make the test pass, etcetera. This is believed to have the following benefits:\n", + "\n", + "* Test coverage is extremely high, because you won't write code unless prompted by a test.\n", + "* Code size is minimized, because you won't be tempted to \"throw in the kitchen sink\" when writing a function that strictly passes tests.\n", + "* Code complexity is reduced, because you can \"fearlessly refactor\" code if you have a battery of tests to prove that you haven't broken anything in the process.\n", + "* The tests serve as \"documentation\" for what a function does and does not do.\n", + "\n", + "The jury is still out on whether TDD is valuable enough to justify the extra ceremony, and some people consider it to be superfluous or trendy. The important thing is that your code is tested, whether the tests came first or not.\n", + "\n", + "Let's give it a try." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def test_mean_of_zero_is_zero():\n", + " assert 0 == mean([0])" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 18 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "**Aside: Test names**\n", + "\n", + "Naming tests well will help you to quickly resolve bugs that may crop up when running your tests. Imagine that you have 100 tests for the `mean` function. When you run your test suite, would you rather see a failure in `test_mean37` or in `test_mean_of_zero_is_zero`?\n", + "***" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_409247d1400c40a89052ce519e1cf21d = $(\"#ipython_nose_409247d1400c40a89052ce519e1cf21d\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_409247d1400c40a89052ce519e1cf21d.append($(\"E\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_409247d1400c40a89052ce519e1cf21d.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_409247d1400c40a89052ce519e1cf21d.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_409247d1400c40a89052ce519e1cf21d;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 2/3 tests passed; 1 failed\n", + "
\n", + " \n", + "
\n", + "
\n", + " failed: __main__.test_mean_of_zero_is_zero\n", + " [toggle traceback]\n", + "
\n", + "
Traceback (most recent call last):\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n",
+        "    testMethod()\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n",
+        "    self.test(*self.arg)\n",
+        "  File \"<ipython-input-18-ee2ab148c1b0>\", line 2, in test_mean_of_zero_is_zero\n",
+        "    assert 0 == mean([0])\n",
+        "NameError: global name 'mean' is not defined\n",
+        "
\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 19, + "text": [ + "2/3 tests passed; 1 failed\n", + "========\n", + "__main__.test_mean_of_zero_is_zero\n", + "========\n", + "Traceback (most recent call last):\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n", + " testMethod()\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n", + " self.test(*self.arg)\n", + " File \"\", line 2, in test_mean_of_zero_is_zero\n", + " assert 0 == mean([0])\n", + "NameError: global name 'mean' is not defined\n", + "\n" + ] + } + ], + "prompt_number": 19 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def mean():\n", + " pass" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 20 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_9e170eabf781485c9228bd845c496af5 = $(\"#ipython_nose_9e170eabf781485c9228bd845c496af5\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_9e170eabf781485c9228bd845c496af5.append($(\"E\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_9e170eabf781485c9228bd845c496af5.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_9e170eabf781485c9228bd845c496af5.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_9e170eabf781485c9228bd845c496af5;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 2/3 tests passed; 1 failed\n", + "
\n", + " \n", + "
\n", + "
\n", + " failed: __main__.test_mean_of_zero_is_zero\n", + " [toggle traceback]\n", + "
\n", + "
Traceback (most recent call last):\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n",
+        "    testMethod()\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n",
+        "    self.test(*self.arg)\n",
+        "  File \"<ipython-input-18-ee2ab148c1b0>\", line 2, in test_mean_of_zero_is_zero\n",
+        "    assert 0 == mean([0])\n",
+        "TypeError: mean() takes no arguments (1 given)\n",
+        "
\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 21, + "text": [ + "2/3 tests passed; 1 failed\n", + "========\n", + "__main__.test_mean_of_zero_is_zero\n", + "========\n", + "Traceback (most recent call last):\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n", + " testMethod()\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n", + " self.test(*self.arg)\n", + " File \"\", line 2, in test_mean_of_zero_is_zero\n", + " assert 0 == mean([0])\n", + "TypeError: mean() takes no arguments (1 given)\n", + "\n" + ] + } + ], + "prompt_number": 21 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def mean(numbers):\n", + " pass" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 22 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_7f6986e6588f4a0093de14d4893743f7 = $(\"#ipython_nose_7f6986e6588f4a0093de14d4893743f7\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_7f6986e6588f4a0093de14d4893743f7.append($(\"F\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_7f6986e6588f4a0093de14d4893743f7.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_7f6986e6588f4a0093de14d4893743f7.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_7f6986e6588f4a0093de14d4893743f7;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 2/3 tests passed; 1 failed\n", + "
\n", + " \n", + "
\n", + "
\n", + " failed: __main__.test_mean_of_zero_is_zero\n", + " [toggle traceback]\n", + "
\n", + "
Traceback (most recent call last):\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n",
+        "    testMethod()\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n",
+        "    self.test(*self.arg)\n",
+        "  File \"<ipython-input-18-ee2ab148c1b0>\", line 2, in test_mean_of_zero_is_zero\n",
+        "    assert 0 == mean([0])\n",
+        "AssertionError\n",
+        "
\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 23, + "text": [ + "2/3 tests passed; 1 failed\n", + "========\n", + "__main__.test_mean_of_zero_is_zero\n", + "========\n", + "Traceback (most recent call last):\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n", + " testMethod()\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n", + " self.test(*self.arg)\n", + " File \"\", line 2, in test_mean_of_zero_is_zero\n", + " assert 0 == mean([0])\n", + "AssertionError\n", + "\n" + ] + } + ], + "prompt_number": 23 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def mean(numbers):\n", + " return 0" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 24 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_cbe1bb26217c42838d93cc8e5a95ca14 = $(\"#ipython_nose_cbe1bb26217c42838d93cc8e5a95ca14\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_cbe1bb26217c42838d93cc8e5a95ca14.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_cbe1bb26217c42838d93cc8e5a95ca14.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_cbe1bb26217c42838d93cc8e5a95ca14.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_cbe1bb26217c42838d93cc8e5a95ca14;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 3/3 tests passed\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 25, + "text": [ + "3/3 tests passed\n" + ] + } + ], + "prompt_number": 25 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When we follow this sequence, you can see how the tests *drive* the development of the `mean` function. You can also see that this can be tedious, especially for newbies." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise: Write the `mean` function and its tests " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Write a function called `mean` that accepts a list of numbers and returns the arithmetic mean. It should work like this:\n", + "\n", + " In [1]: mean([5, 7, 15, 9, 0, 18])\n", + " Out [1]: 9.0\n", + "\n", + "Make sure to write a suite of tests for your implementation - either first or after the fact, whichever seems more logical to you. Some ideas about what to test:\n", + "\n", + "* A list with one number\n", + "* A list of all zeros\n", + "* Integers with an integral mean\n", + "* Integers with a non-integral mean\n", + "* Floating point numbers" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Results\n", + "\n", + "When I first wrote this code, I wound up with something like this:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def mean(numbers):\n", + " '''Returns the mean of the provided list of numbers.'''\n", + " total = 0.0\n", + " for number in numbers:\n", + " total = total + number\n", + " return total/float(len(numbers))" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 26 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "How good was my testing?" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "mean([0])" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "pyout", + "prompt_number": 27, + "text": [ + "0.0" + ] + } + ], + "prompt_number": 27 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "mean([1])" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "pyout", + "prompt_number": 28, + "text": [ + "1.0" + ] + } + ], + "prompt_number": 28 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "mean([1, 2, 3, 4, 5])" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "pyout", + "prompt_number": 29, + "text": [ + "3.0" + ] + } + ], + "prompt_number": 29 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "mean([2.7, 3.8, 7.2])" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "output_type": "pyout", + "prompt_number": 30, + "text": [ + "4.566666666666666" + ] + } + ], + "prompt_number": 30 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "mean([])" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "ename": "ZeroDivisionError", + "evalue": "float division by zero", + "output_type": "pyerr", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mZeroDivisionError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmean\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in 
\u001b[0;36mmean\u001b[0;34m(numbers)\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mnumber\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mnumbers\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mtotal\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtotal\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mnumber\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mtotal\u001b[0m\u001b[0;34m/\u001b[0m\u001b[0mfloat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnumbers\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mZeroDivisionError\u001b[0m: float division by zero" + ] + } + ], + "prompt_number": 31 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "# Exceptions\n", + "***\n", + "\n", + "An empty list has a length of zero. The loop doesn't iterate, so total stays at 0.0, which is what we initialized it to, and the code happily attempts to divide zero by zero, resulting in an error.\n", + "\n", + "`ZeroDivisionError` is called an \"exception\". Our code can raise exceptions as well, and we can even define our own exceptions if we want. While this exception works perfectly fine, it could be a bit more clear, and I obviously missed a test case, which means I missed implementing a piece of beneficial behavior in my `mean` function. So we need to add a test, and we need to update our `mean` function to die more gracefully in the event of an empty list. We'll start with the test." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "import nose.tools\n", + "\n", + "@nose.tools.raises(ValueError)\n", + "def test_mean_of_empty_function_raises_value_error():\n", + " mean([])" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 32 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_bc7dc15dd44f4dc390dce78a58efda99 = $(\"#ipython_nose_bc7dc15dd44f4dc390dce78a58efda99\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_bc7dc15dd44f4dc390dce78a58efda99.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_bc7dc15dd44f4dc390dce78a58efda99.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_bc7dc15dd44f4dc390dce78a58efda99.append($(\"E\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_bc7dc15dd44f4dc390dce78a58efda99.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_bc7dc15dd44f4dc390dce78a58efda99;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 3/4 tests passed; 1 failed\n", + "
\n", + " \n", + "
\n", + "
\n", + " failed: __main__.test_mean_of_empty_function_raises_value_error\n", + " [toggle traceback]\n", + "
\n", + "
Traceback (most recent call last):\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n",
+        "    testMethod()\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n",
+        "    self.test(*self.arg)\n",
+        "  File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/tools/nontrivial.py\", line 60, in newfunc\n",
+        "    func(*arg, **kw)\n",
+        "  File \"<ipython-input-32-bc417f346267>\", line 5, in test_mean_of_empty_function_raises_value_error\n",
+        "    mean([])\n",
+        "  File \"<ipython-input-26-16ccb79ef6d8>\", line 6, in mean\n",
+        "    return total/float(len(numbers))\n",
+        "ZeroDivisionError: float division by zero\n",
+        "
\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 33, + "text": [ + "3/4 tests passed; 1 failed\n", + "========\n", + "__main__.test_mean_of_empty_function_raises_value_error\n", + "========\n", + "Traceback (most recent call last):\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/unittest/case.py\", line 331, in run\n", + " testMethod()\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/case.py\", line 197, in runTest\n", + " self.test(*self.arg)\n", + " File \"/Users/Jon/Applications/anaconda/lib/python2.7/site-packages/nose/tools/nontrivial.py\", line 60, in newfunc\n", + " func(*arg, **kw)\n", + " File \"\", line 5, in test_mean_of_empty_function_raises_value_error\n", + " mean([])\n", + " File \"\", line 6, in mean\n", + " return total/float(len(numbers))\n", + "ZeroDivisionError: float division by zero\n", + "\n" + ] + } + ], + "prompt_number": 33 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def mean(numbers):\n", + " '''Returns the mean of the provided list of numbers.'''\n", + " if len(numbers) == 0:\n", + " raise ValueError, \"Empty list received by mean\"\n", + " \n", + " total = 0.0\n", + " for number in numbers:\n", + " total = total + number\n", + " return total/float(len(numbers))" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 34 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_c304c733e91a4802aeae5aefddcf1c56 = $(\"#ipython_nose_c304c733e91a4802aeae5aefddcf1c56\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_c304c733e91a4802aeae5aefddcf1c56.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_c304c733e91a4802aeae5aefddcf1c56.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_c304c733e91a4802aeae5aefddcf1c56.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_c304c733e91a4802aeae5aefddcf1c56.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_c304c733e91a4802aeae5aefddcf1c56;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 4/4 tests passed\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 35, + "text": [ + "4/4 tests passed\n" + ] + } + ], + "prompt_number": 35 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "mean([])" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "ename": "ValueError", + "evalue": "Empty list received by mean", + "output_type": "pyerr", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmean\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mmean\u001b[0;34m(numbers)\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;34m'''Returns the mean of the provided list of numbers.'''\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnumbers\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mValueError\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m\"Empty list received by mean\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 5\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0mtotal\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m0.0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mValueError\u001b[0m: Empty list received by mean" + ] + } + ], + "prompt_number": 36 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "**Aside: What are exceptions good for?**\n", + "\n", + "In addition to providing nice error messages, exceptions can be used to allow your program to handle unexpected - or *exceptional* - conditions gracefully. Python provides *exception handling*, which allows your code to respond to exceptions if they occur. Consider this example:\n", + "\n", + " while not success:\n", + " numbers = read_numbers_from_keyboard()\n", + " try:\n", + " result = mean(numbers)\n", + " success = True\n", + " except ValueError:\n", + " print 'Please try again'\n", + "\n", + "***" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "# Practice\n", + "***\n", + "\n", + "## Filtering animals\n", + "\n", + "We have a function to read animals from a file and return lists containing the columns, and we have a function that calculates a mean for a list of numbers. What else do we need?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise: Write the `filter_animals` function and its tests\n", + "\n", + "This function should:\n", + "\n", + "* Accept the name of the animal we are interested in catching in the filter\n", + "* Accept four lists containing unfiltered dates, times, animal names, and counts\n", + "* Return four lists containing dates, times, animal names, and counts, pertaining only to the animal that we wanted to catch in the filter\n", + "* Be tested!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Results\n", + "\n", + "When you are done, your code should look something like this." 
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "def filter_animals(filtered_animal, dates, times, animals, counts):\n",
+      "    '''Given a particular species, filter out and return just the data for that species.'''\n",
+      "\n",
+      "    filtered_dates = []\n",
+      "    filtered_times = []\n",
+      "    filtered_animals = []\n",
+      "    filtered_counts = []\n",
+      "    \n",
+      "    for date, time, animal, count in zip(dates, times, animals, counts):\n",
+      "        if animal == filtered_animal:\n",
+      "            filtered_dates.append(date)\n",
+      "            filtered_times.append(time)\n",
+      "            filtered_animals.append(animal)\n",
+      "            filtered_counts.append(count)\n",
+      "    \n",
+      "    return filtered_dates, filtered_times, filtered_animals, filtered_counts"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 37
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Does it work?"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "cat animals.txt"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "output_type": "stream",
+       "stream": "stdout",
+       "text": [
+        "2011-04-22 21:06 Grizzly 36\r\n",
+        "2011-04-23 14:12 Elk 25\r\n",
+        "2011-04-23 10:24 Elk 26\r\n",
+        "2011-04-23 20:08 Wolverine 31\r\n",
+        "2011-04-23 18:46 Muskox 20\r\n"
+       ]
+      }
+     ],
+     "prompt_number": 38
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "dates, times, animals, counts = read_sightings_from_file('animals.txt')\n",
+      "print filter_animals('Elk', dates, times, animals, counts)"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "output_type": "stream",
+       "stream": "stdout",
+       "text": [
+        "(['2011-04-23', '2011-04-23'], ['14:12', '10:24'], ['Elk', 'Elk'], [25, 26])"
+       ]
+      },
+      {
+       "output_type": "stream",
+       "stream": "stdout",
+       "text": [
+        "\n"
+       ]
+      }
+     ],
+     "prompt_number": 39
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "What do your tests look like?"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "def test_filter_finds_one_animal_in_list():\n",
+      "    sample_dates = ['2011-04-23']\n",
+      "    sample_times = ['14:12']\n",
+      "    sample_animals = ['Elk']\n",
+      "    sample_counts = [42]\n",
+      "    \n",
+      "    dates, times, animals, counts = filter_animals('Elk', sample_dates, sample_times, sample_animals, sample_counts)\n",
+      "    \n",
+      "    assert dates == sample_dates\n",
+      "    assert times == sample_times\n",
+      "    assert animals == sample_animals\n",
+      "    assert counts == sample_counts"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 40
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "def test_filter_returns_empty_lists_if_animal_not_in_list():\n",
+      "    sample_dates = ['2011-04-03', '2011-04-04']\n",
+      "    sample_times = ['14:12', '18:32']\n",
+      "    sample_animals = ['Moose', 'Wolverine']\n",
+      "    sample_counts = [18, 6]\n",
+      "    \n",
+      "    dates, times, animals, counts = filter_animals('Elk', sample_dates, sample_times, sample_animals, sample_counts)\n",
+      "    \n",
+      "    assert dates == []\n",
+      "    assert times == []\n",
+      "    assert animals == []\n",
+      "    assert counts == []"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 41
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "%nose"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "html": [
+        "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_a86e5d40c774456c81bce2bba1108385 = $(\"#ipython_nose_a86e5d40c774456c81bce2bba1108385\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_a86e5d40c774456c81bce2bba1108385.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_a86e5d40c774456c81bce2bba1108385.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_a86e5d40c774456c81bce2bba1108385.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_a86e5d40c774456c81bce2bba1108385.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_a86e5d40c774456c81bce2bba1108385.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_a86e5d40c774456c81bce2bba1108385.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_a86e5d40c774456c81bce2bba1108385;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 6/6 tests passed\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 42, + "text": [ + "6/6 tests passed\n" + ] + } + ], + "prompt_number": 42 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "What other test cases did you define?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "***\n", + "**Aside: Test refactoring**\n", + "\n", + "You can see that there is already some reptition just among two test cases. Tests can be refactored to extract common code into helper functions just like production code can. A number of test frameworks exist to aid you with developing clear, succinct test harnesses. Python includes one called `unittest`.\n", + "***" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Mean animals\n", + "\n", + "All that's left is to tie it all together. Fortunately, because we have focused on building small, cohesive, well-factored blocks of code up to this point, writing the final `mean_sightings` function should be straightforward." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Exercise: Write the `mean_sightings` function and its tests\n", + "\n", + "This function should:\n", + "\n", + "* Accept the name of the file from which we are to read animal data\n", + "* Accept the name of the animal we are interested in catching in the filter\n", + "* Return the mean number of animals seen per sighting\n", + "* Be tested!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Results\n", + "\n", + "When you are done, your code should look something like this." + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def mean_sightings(filename, species):\n", + " dates, times, animals, counts = read_sightings_from_file(filename)\n", + " dates, times, animals, counts = filter_animals(species, dates, times, animals, counts)\n", + " return mean(counts)" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 43 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And your tests:" + ] + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "def test_mean_elk_count_in_animals_txt_is_25_5():\n", + " m = mean_sightings('animals.txt', 'Elk')\n", + " assert m == 25.5\n", + "\n", + "def test_mean_grizzly_count_in_animals_txt_is_36():\n", + " m = mean_sightings('animals.txt', 'Grizzly')\n", + " assert m == 36" + ], + "language": "python", + "metadata": {}, + "outputs": [], + "prompt_number": 44 + }, + { + "cell_type": "code", + "collapsed": false, + "input": [ + "%nose" + ], + "language": "python", + "metadata": {}, + "outputs": [ + { + "html": [ + "
" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04 = $(\"#ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04\");" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04.append($(\".\"));" + ], + "output_type": "display_data" + }, + { + "javascript": [ + "delete document.ipython_nose_04edb2aa0a4f43878cd7f1a3a8006c04;" + ], + "output_type": "display_data" + }, + { + "html": [ + " \n", + " \n", + " \n", + " \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + "
\n", + "  \n", + "
\n", + " 8/8 tests passed\n", + "
\n", + " " + ], + "output_type": "pyout", + "prompt_number": 45, + "text": [ + "8/8 tests passed\n" + ] + } + ], + "prompt_number": 45 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can have high confidence in the reliability of this function, because it is *composed* of calls to verified functions." + ] + } + ], + "metadata": {} + } + ] +} \ No newline at end of file -- 2.26.2