Update notebook to be self-contained instead of using standalone external modules
author Jon Speicher <jon.speicher@gmail.com>
Sat, 27 Jul 2013 14:36:24 +0000 (10:36 -0400)
committer W. Trevor King <wking@tremily.us>
Sat, 9 Nov 2013 18:27:51 +0000 (10:27 -0800)
python/sw_engineering/SoftwareEngineering.ipynb

index cd45d07e78f2eda94ffb9f8346fb3d049fbf8d61..a7b3f9f1bb89124a16f313db79af08379002b233 100644
@@ -43,7 +43,7 @@
       "* Provably correct\n",
       "* Cost-effective\n",
       "\n",
-      "Today we'll focus on building a small, trusted library of code. We will strive to achieve the bullet points identified above"
+      "Today we'll focus on building a small, trusted library of code. We will strive to achieve the bullet points identified above. Although we'll work in IPython notebooks, we'll talk about how these techniques can be extended to standalone Python libraries and scripts executed at the command line."
      ]
     },
     {
       "# Goal\n",
       "***\n",
       "\n",
-      "Up until now, most of our work has been in the IPython notebooks. As we discussed earlier, however, there are advantages to creating standalone Python modules and command-line scripts. These advantages include the ability to execute the programs frequently or in an automated fashion, as well as to centralize commonly-used bits of code for import into multiple programs.\n",
       "\n",
-      "Recalling the animal sighting data sets we worked with earlier, our goal will be to create a standalone utility, runnable from the command line, that finds the mean number of a particular animal seen per sighting of that animal. We will design our utility so that it will work with any properly-formatted data file and allow us to find the mean number of sightings for any animal contained within without requiring us to edit the code to do so. We'd like it to work like this:\n",
+      "Recalling the animal sighting data sets we worked with earlier, our goal will be to create a new function that finds the mean number of a particular animal seen per sighting of that animal. We will design our function so that it will work with any properly-formatted data file and allow us to find the mean number of sightings for any animal contained within without requiring us to edit the code to do so. We'd like it to work like this:\n",
       "\n",
-      "    $ ./mean_sighted.py big_animals.txt Wolverine\n",
-      "    11.7\n",
+      "    In [1]: mean_sightings('big_animals.txt', 'Wolverine')\n",
+      "    Out[1]: 11.7\n",
       "\n",
-      "To achieve our goal, we'll approach the problem in a fashion typical of modern software engineering."
+      "To achieve our goal, we'll approach the problem in a fashion typical of modern software engineering. Up until now, most of our work has been in the IPython notebooks. As we discussed earlier, however, there are advantages to creating standalone Python modules and command-line scripts. These advantages include the ability to execute the programs frequently or in an automated fashion, as well as to centralize commonly-used bits of code for import into multiple programs.\n"
      ]
     },
     {
       "\n",
       "Since the best programmer is a lazy programmer, we'll start by copying some code that we've already written.\n",
       "\n",
-      "* Using a text editor, create a file named `sightings.py` in this directory.\n",
-      "* Open the file `count_animals.py` that you created in the Intro session.\n",
-      "* Copy the function `count_wolverines` that you created in Exercise 8 into the `sightings.py` file.\n",
-      "* Save `sightings.py`.\n",
+      "* Open the IPython notebook that we created in the Intro session.\n",
+      "* Find the function `count_wolverines` that you created in Exercise 8.\n",
+      "* Copy the function into a new cell.\n",
+      "* Save the notebook.\n",
       "\n",
       "***\n",
-      "**Aside: Why `sightings.py`?**\n",
+      "**Aside: A standalone module**\n",
       "\n",
-      "In the intro session, we named our module `count_animals`, because it counted animal total sightings. In this exercise, we are developing a reusable module that will at the very least allow us to find the average number of sightings per animal per day. Since our goal is to develop a reusable module of code to import into a number of programs, we can't say for sure that this is the **only** thing our module will do now and forever more. Furthermore, changing the module's name later, while not impossible, runs the risk of breaking programs that we've already written to rely on it, or will at least require us to touch each of them to update the name of the imported module. In this case, then, the name `sightings` seems reasonable for our module because whether we are counting total sightings or computing average sightings, we are still dealing with \"sightings,\" which, by the way, is the primary information that our data files contain. \n",
+      "When we talked about a standalone module in the Intro session, we named it `count_animals`, because it counted animal total sightings. In this exercise, although we are working in IPython for convenience, we are notionally developing a reusable module that will at the very least allow us to find the average number of sightings per animal per day. Since our goal in the real world would be to develop a reusable module of code to import into a number of programs, we can't say for sure that this is the **only** thing our module will do now and forever more. Furthermore, changing the module's name later, while not impossible, runs the risk of breaking programs that we've already written to rely on it, or will at least require us to touch each of them to update the name of the imported module. In this case, then, the name `sightings` would seem to be reasonable for our module because whether we are counting total sightings or computing average sightings, we are still dealing with \"sightings,\" which, by the way, is the primary information that our data files contain. \n",
       "***"
      ]
     },
      "source": [
       "## Results\n",
       "\n",
-      "When you are done, the file `sightings.py` should look something like this (run the cell below to create it if you need help)."
+      "When you are done, your notebook should look something like this."
      ]
     },
     {
      "cell_type": "code",
      "collapsed": false,
      "input": [
-      "%%file sightings.py\n",
       "def count_wolverines(filename):\n",
       "    '''Given a plain text file containing animal sighting data in the form \n",
       "           date time animal count\n",
      ],
      "language": "python",
      "metadata": {},
-     "outputs": [
-      {
-       "output_type": "stream",
-       "stream": "stdout",
-       "text": [
-        "Writing sightings.py\n"
-       ]
-      }
-     ],
+     "outputs": [],
      "prompt_number": 1
     },
     {
      "cell_type": "code",
      "collapsed": false,
      "input": [
-      "import sightings\n",
-      "sightings.count_wolverines('big_animals.txt')"
+      "count_wolverines('big_animals.txt')"
      ],
      "language": "python",
      "metadata": {},
      "source": [
       "## Results\n",
       "\n",
-      "When you are done, the file `sightings.py` should look something like this (run the cell below to create it if you need help)."
+      "When you are done, your code should look something like this."
      ]
     },
     {
      "cell_type": "code",
      "collapsed": false,
      "input": [
-      "%%file sightings.py\n",
       "def read_sightings_from_file(filename):\n",
       "    ''' Given a plain text file containing animal sighting data in the form\n",
       "            date time animal count\n",
       "        animals.append(animal)\n",
       "        counts.append(int(count_string))\n",
       "\n",
-      "    return dates, times, animals, counts\n",
-      "\n",
-      "def count_wolverines(filename):\n",
-      "    '''Given a plain text file containing animal sighting data in the form \n",
-      "           date time animal count\n",
-      "       returns the total count of wolverines sighted.'''\n",
-      "    animal_file = open(filename, 'r')\n",
-      "    animal_file_lines = animal_file.readlines()\n",
-      "    animal_file.close()\n",
-      "\n",
-      "    total_count = 0\n",
-      "    for line in animal_file_lines:\n",
-      "        date, time, animal, count_string = line.split()\n",
-      "        if animal == 'Wolverine':\n",
-      "            total_count = total_count + int(count_string)\n",
-      "    return total_count"
+      "    return dates, times, animals, counts"
      ],
      "language": "python",
      "metadata": {},
-     "outputs": [
-      {
-       "output_type": "stream",
-       "stream": "stdout",
-       "text": [
-        "Overwriting sightings.py\n"
-       ]
-      }
-     ],
+     "outputs": [],
      "prompt_number": 4
     },
     {
      "cell_type": "code",
      "collapsed": false,
      "input": [
-      "reload(sightings)\n",
-      "sightings.read_sightings_from_file('animals.txt')"
+      "read_sightings_from_file('animals.txt')"
      ],
      "language": "python",
      "metadata": {},
      "cell_type": "code",
      "collapsed": false,
      "input": [
-      "%%file sightings.py\n",
-      "def read_sightings_from_file(filename):\n",
-      "    ''' Given a plain text file containing animal sighting data in the form\n",
-      "            date time animal count\n",
-      "        returns four lists, each containing the data from one column.'''\n",
-      "    \n",
-      "    animal_file = open(filename, 'r')\n",
-      "    animal_file_lines = animal_file.readlines()\n",
-      "    animal_file.close()\n",
-      "    \n",
-      "    dates = []\n",
-      "    times = []\n",
-      "    animals = []\n",
-      "    counts = []\n",
-      "    \n",
-      "    for line in animal_file_lines:\n",
-      "        date, time, animal, count_string = line.split()\n",
-      "        dates.append(date)\n",
-      "        times.append(time)\n",
-      "        animals.append(animal)\n",
-      "        counts.append(int(count_string))\n",
-      "        \n",
-      "    return dates, times, animals, counts\n",
-      "\n",
       "def count_wolverines(filename):\n",
       "    '''Given a plain text file containing animal sighting data in the form \n",
       "           date time animal count\n",
      ],
      "language": "python",
      "metadata": {},
-     "outputs": [
-      {
-       "output_type": "stream",
-       "stream": "stdout",
-       "text": [
-        "Overwriting sightings.py\n"
-       ]
-      }
-     ],
+     "outputs": [],
      "prompt_number": 7
     },
     {
      "cell_type": "code",
      "collapsed": false,
      "input": [
-      "reload(sightings)\n",
-      "sightings.count_wolverines('big_animals.txt')"
+      "count_wolverines('big_animals.txt')"
      ],
      "language": "python",
      "metadata": {},
      "collapsed": false,
      "input": [
       "def test_read_sightings_from_file():\n",
-      "    dates, times, animals, counts = sightings.read_sightings_from_file('animals.txt')\n",
+      "    dates, times, animals, counts = read_sightings_from_file('animals.txt')\n",
       "    if dates[0] == '2011-04-22':\n",
       "        print 'Looks good!'\n",
       "    else:\n",
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "This is the core concept of unit testing: you create code that calls your code and verifies the results. We'll look at a number of conveniences that Python offers to streamline the process as we develop towards our goal of finding the mean number of sightings per animal. We'll start with a pre-canned example and modify it from there. Execute the cell below to create a file called `test_sightings.py`"
+      "This is the core concept of unit testing: you create code that calls your code and verifies the results. We'll look at a number of conveniences that Python offers to streamline the process as we develop towards our goal of finding the mean number of sightings per animal. We'll start with a pre-canned example and modify it from there."
      ]
     },
     {
      "cell_type": "code",
      "collapsed": false,
      "input": [
-      "%%file test_sightings.py\n",
-      "import sightings\n",
-      "\n",
       "def test_read_sightings_from_file():\n",
       "    expected_dates = ['2011-04-22', '2011-04-23', '2011-04-23', '2011-04-23', '2011-04-23']\n",
       "    expected_times = ['21:06', '14:12', '10:24', '20:08', '18:46']\n",
       "    expected_animals = ['Grizzly', 'Elk', 'Elk', 'Wolverine', 'Muskox']\n",
       "    expected_counts = [36, 25, 26, 31, 20]\n",
       "    \n",
-      "    dates, times, animals, counts = sightings.read_sightings_from_file('animals.txt')\n",
+      "    dates, times, animals, counts = read_sightings_from_file('animals.txt')\n",
       "    \n",
       "    assert dates == expected_dates, 'Dates do not match!'\n",
       "    assert times == expected_times, 'Times do not match!'\n",
      ],
      "language": "python",
      "metadata": {},
-     "outputs": [
-      {
-       "output_type": "stream",
-       "stream": "stdout",
-       "text": [
-        "Writing test_sightings.py\n"
-       ]
-      }
-     ],
+     "outputs": [],
      "prompt_number": 10
     },
     {
      "cell_type": "code",
      "collapsed": false,
      "input": [
-      "import test_sightings\n",
-      "test_sightings.test_read_sightings_from_file()"
+      "test_read_sightings_from_file()"
      ],
      "language": "python",
      "metadata": {},
       "    test_count_wolverines_with_one_wolverine()\n",
       "    ...\n",
       "\n",
-      "You can imagine this file growing very large. You can also be sure that at some point, some developer is going to write a test but forget to add it to the \"runner\", thereby running the risk of missing a bug. Fortunately, there is a very popular package that works with Python called `nose`. `nose` \"sniffs out your tests\" and runs them for you. `nose` is installed by default with several Python distributions, and its primary interface is a command-line runner script called `nosetests`."
+      "You can imagine this file growing very large. You can also be sure that at some point, some developer is going to write a test but forget to add it to the \"runner\", thereby running the risk of missing a bug. Fortunately, there is a very popular package that works with Python called `nose`. `nose` \"sniffs out your tests\" and runs them for you. `nose` is installed by default with several Python distributions."
      ]
     },
-    {
-     "cell_type": "code",
-     "collapsed": false,
-     "input": [
-      "!nosetests"
-     ],
-     "language": "python",
-     "metadata": {},
-     "outputs": [
-      {
-       "output_type": "stream",
-       "stream": "stdout",
-       "text": [
-        ".........\r\n",
-        "----------------------------------------------------------------------\r\n",
-        "Ran 9 tests in 0.003s\r\n",
-        "\r\n",
-        "OK\r\n"
-       ]
-      }
-     ],
-     "prompt_number": 15
-    },
     {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "There is even a plugin for the IPython notebook that runs `nose` and produces colored cells and tracebacks within the notebooks themselves. The plugin is included in the directory containing this notebook."
+      "There is a plugin for the IPython notebook that runs `nose` and produces colored cells and tracebacks within the notebooks themselves. The plugin is included in the directory containing this notebook."
      ]
     },
     {
      "language": "python",
      "metadata": {},
      "outputs": [],
-     "prompt_number": 16
+     "prompt_number": 13
     },
     {
      "cell_type": "code",
      "outputs": [
       {
        "html": [
-        "<div id=\"ipython_nose_a8474679bbd64af09651e2ad71d36bcc\"></div>"
+        "<div id=\"ipython_nose_bfec549d740f4b038703369ae3f06e87\"></div>"
        ],
        "output_type": "display_data"
       },
       {
        "javascript": [
-        "document.ipython_nose_a8474679bbd64af09651e2ad71d36bcc = $(\"#ipython_nose_a8474679bbd64af09651e2ad71d36bcc\");"
+        "document.ipython_nose_bfec549d740f4b038703369ae3f06e87 = $(\"#ipython_nose_bfec549d740f4b038703369ae3f06e87\");"
        ],
        "output_type": "display_data"
       },
       {
        "javascript": [
-        "document.ipython_nose_a8474679bbd64af09651e2ad71d36bcc.append($(\"<span>.</span>\"));"
+        "document.ipython_nose_bfec549d740f4b038703369ae3f06e87.append($(\"<span>.</span>\"));"
        ],
        "output_type": "display_data"
       },
       {
        "javascript": [
-        "document.ipython_nose_a8474679bbd64af09651e2ad71d36bcc.append($(\"<span>F</span>\"));"
+        "document.ipython_nose_bfec549d740f4b038703369ae3f06e87.append($(\"<span>F</span>\"));"
        ],
        "output_type": "display_data"
       },
       {
        "javascript": [
-        "delete document.ipython_nose_a8474679bbd64af09651e2ad71d36bcc;"
+        "delete document.ipython_nose_bfec549d740f4b038703369ae3f06e87;"
        ],
        "output_type": "display_data"
       },
         "    "
        ],
        "output_type": "pyout",
-       "prompt_number": 17,
+       "prompt_number": 14,
        "text": [
         "1/2 tests passed; 1 failed\n",
         "========\n",
        ]
       }
      ],
-     "prompt_number": 17
+     "prompt_number": 14
     },
     {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "Using `nose` allows you to run all of your tests with one command. It eliminates the need to remember to add tests to a dedicated runner script, and it will clearly highlight which test has failed and point you to a traceback."
+      "Using `nose` allows you to run all of your tests with one command. It eliminates the need to remember to add tests to a dedicated runner script, and it will clearly highlight which test has failed and point you to a traceback. When working from the command line with standalone Python programs and modules, `nose` provides a command-line runner script called `nosetests`."
      ]
     },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "!nosetests"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "output_type": "stream",
+       "stream": "stdout",
+       "text": [
+        "........\r\n",
+        "----------------------------------------------------------------------\r\n",
+        "Ran 8 tests in 0.003s\r\n",
+        "\r\n",
+        "OK\r\n"
+       ]
+      }
+     ],
+     "prompt_number": 15
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
       "**Note:** `nose` works by adhering to some reasonably sane naming conventions. If you prefix the names of all files containing tests with `test_`, and prefix the names of all functions containing tests with `test_`, `nose` will generally find your tests. A good rule of thumb is to place tests for a module named `my_module.py` in a file named `test_my_module.py`, and to name tests using plain English-y descriptions of what the test verifies, such as `test_that_two_plus_two_equals_four`."
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "***\n",
+      "**Aside: Clearing out our failure example**\n",
+      "\n",
+      "Let's get that failed test out of the way by *redefining* the function to do nothing. In Python, this uses the `pass` keyword (but note that `pass` doesn't actually mean \"pass the test\")."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "def test_add_two_plus_two_equals_four():\n",
+      "    pass"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [],
+     "prompt_number": 16
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "%nose"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": [
+      {
+       "html": [
+        "<div id=\"ipython_nose_7affb7b5cb134255ac605e091f8bd433\"></div>"
+       ],
+       "output_type": "display_data"
+      },
+      {
+       "javascript": [
+        "document.ipython_nose_7affb7b5cb134255ac605e091f8bd433 = $(\"#ipython_nose_7affb7b5cb134255ac605e091f8bd433\");"
+       ],
+       "output_type": "display_data"
+      },
+      {
+       "javascript": [
+        "document.ipython_nose_7affb7b5cb134255ac605e091f8bd433.append($(\"<span>.</span>\"));"
+       ],
+       "output_type": "display_data"
+      },
+      {
+       "javascript": [
+        "document.ipython_nose_7affb7b5cb134255ac605e091f8bd433.append($(\"<span>.</span>\"));"
+       ],
+       "output_type": "display_data"
+      },
+      {
+       "javascript": [
+        "delete document.ipython_nose_7affb7b5cb134255ac605e091f8bd433;"
+       ],
+       "output_type": "display_data"
+      },
+      {
+       "html": [
+        "    <style type=\"text/css\">\n",
+        "        span.nosefailedfunc {\n",
+        "            font-family: monospace;\n",
+        "            font-weight: bold;\n",
+        "        }\n",
+        "        div.noseresults {\n",
+        "            width: 100%;\n",
+        "        }\n",
+        "        div.nosebar {\n",
+        "            float: left;\n",
+        "            padding: 1ex 0px 1ex 0px;\n",
+        "        }\n",
+        "        div.nosebar.fail {\n",
+        "            background: #ff3019; /* Old browsers */\n",
+        "            /* FF3.6+ */\n",
+        "            background: -moz-linear-gradient(top, #ff3019 0%, #cf0404 100%);\n",
+        "            /* Chrome,Safari4+ */\n",
+        "            background: -webkit-gradient(linear, left top, left bottom,\n",
+        "                                         color-stop(0%,#ff3019),\n",
+        "                                         color-stop(100%,#cf0404));\n",
+        "            /* Chrome10+,Safari5.1+ */\n",
+        "            background: -webkit-linear-gradient(top, #ff3019 0%,#cf0404 100%);\n",
+        "            /* Opera 11.10+ */\n",
+        "            background: -o-linear-gradient(top, #ff3019 0%,#cf0404 100%);\n",
+        "            /* IE10+ */\n",
+        "            background: -ms-linear-gradient(top, #ff3019 0%,#cf0404 100%);\n",
+        "            /* W3C */\n",
+        "            background: linear-gradient(to bottom, #ff3019 0%,#cf0404 100%);\n",
+        "        }\n",
+        "        div.nosebar.pass {\n",
+        "            background: #52b152;\n",
+        "            background: -moz-linear-gradient(top, #52b152 1%, #008a00 100%);\n",
+        "            background: -webkit-gradient(linear, left top, left bottom,\n",
+        "                                         color-stop(1%,#52b152),\n",
+        "                                         color-stop(100%,#008a00));\n",
+        "            background: -webkit-linear-gradient(top, #52b152 1%,#008a00 100%);\n",
+        "            background: -o-linear-gradient(top, #52b152 1%,#008a00 100%);\n",
+        "            background: -ms-linear-gradient(top, #52b152 1%,#008a00 100%);\n",
+        "            background: linear-gradient(to bottom, #52b152 1%,#008a00 100%);\n",
+        "        }\n",
+        "        div.nosebar.skip {\n",
+        "            background: #f1e767;\n",
+        "            background: -moz-linear-gradient(top, #f1e767 0%, #feb645 100%);\n",
+        "            background: -webkit-gradient(linear, left top, left bottom,\n",
+        "                                         color-stop(0%,#f1e767),\n",
+        "                                         color-stop(100%,#feb645));\n",
+        "            background: -webkit-linear-gradient(top, #f1e767 0%,#feb645 100%);\n",
+        "            background: -o-linear-gradient(top, #f1e767 0%,#feb645 100%);\n",
+        "            background: -ms-linear-gradient(top, #f1e767 0%,#feb645 100%);\n",
+        "            background: linear-gradient(to bottom, #f1e767 0%,#feb645 100%);\n",
+        "        }\n",
+        "        div.nosebar.leftmost {\n",
+        "            border-radius: 4px 0 0 4px;\n",
+        "        }\n",
+        "        div.nosebar.rightmost {\n",
+        "            border-radius: 0 4px 4px 0;\n",
+        "        }\n",
+        "        div.nosefailbanner {\n",
+        "            border-radius: 4px 0 0 4px;\n",
+        "            border-left: 10px solid #cf0404;\n",
+        "            padding: 0.5ex 0em 0.5ex 1em;\n",
+        "            margin-top: 1ex;\n",
+        "            margin-bottom: 0px;\n",
+        "        }\n",
+        "        div.nosefailbanner.expanded {\n",
+        "            border-radius: 4px 4px 0 0;\n",
+        "            border-top: 10px solid #cf0404;\n",
+        "        }\n",
+        "        pre.nosetraceback {\n",
+        "            border-radius: 0 4px 4px 4px;\n",
+        "            border-left: 10px solid #cf0404;\n",
+        "            padding: 1em;\n",
+        "            margin-left: 0px;\n",
+        "            margin-top: 0px;\n",
+        "            display: none;\n",
+        "        }\n",
+        "    </style>\n",
+        "    \n",
+        "    <script>\n",
+        "        setTimeout(function () {\n",
+        "            $('.nosefailtoggle').bind(\n",
+        "                'click',\n",
+        "                function () {\n",
+        "                    $(\n",
+        "                        $(this)\n",
+        "                            .parent().toggleClass('expanded')\n",
+        "                            .parent()\n",
+        "                            .children()\n",
+        "                            .filter('.nosetraceback')\n",
+        "                    ).toggle();\n",
+        "                }\n",
+        "            );},\n",
+        "            0);\n",
+        "    </script>\n",
+        "    \n",
+        "    <div class=\"noseresults\">\n",
+        "      <div class=\"nosebar fail leftmost\" style=\"width: 0%\">\n",
+        "          &nbsp;\n",
+        "      </div>\n",
+        "      <div class=\"nosebar skip\" style=\"width: 0%\">\n",
+        "          &nbsp;\n",
+        "      </div>\n",
+        "      <div class=\"nosebar pass rightmost\" style=\"width: 100%\">\n",
+        "          &nbsp;\n",
+        "      </div>\n",
+        "      2/2 tests passed\n",
+        "    </div>\n",
+        "    "
+       ],
+       "output_type": "pyout",
+       "prompt_number": 17,
+       "text": [
+        "2/2 tests passed\n"
+       ]
+      }
+     ],
+     "prompt_number": 17
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "***"
+     ]
+    },
     {
      "cell_type": "code",
      "collapsed": false,