Preparing material for Version 5 novice release
authorGreg Wilson <gvwilson@third-bit.com>
Fri, 7 Feb 2014 17:05:02 +0000 (12:05 -0500)
committerW. Trevor King <wking@tremily.us>
Tue, 11 Mar 2014 04:55:29 +0000 (21:55 -0700)
W. Trevor King: I removed everything except the sql/novice changes
from the original b66851c [1].

[1]: https://github.com/swcarpentry/bc/commit/b66851c2b765440e5c7ccb23383aff15eec31ff2

14 files changed:
sql/novice/.gitignore [new file with mode: 0644]
sql/novice/01-select.ipynb
sql/novice/02-sort-dup.ipynb
sql/novice/03-filter.ipynb
sql/novice/04-calc.ipynb
sql/novice/05-null.ipynb
sql/novice/06-agg.ipynb
sql/novice/07-join.ipynb
sql/novice/08-create.ipynb
sql/novice/09-prog.ipynb
sql/novice/README.txt
sql/novice/guide.md [deleted file]
sql/novice/index.md
sql/novice/reference.md [deleted file]

diff --git a/sql/novice/.gitignore b/sql/novice/.gitignore
new file mode 100644 (file)
index 0000000..b22cd77
--- /dev/null
@@ -0,0 +1,9 @@
+01-select.md
+02-sort-dup.md
+03-filter.md
+04-calc.md
+05-null.md
+06-agg.md
+07-join.md
+08-create.md
+09-prog.md
index 4455598580c7b871485405e7f64c5798d7d20ad8..85f074a482af1afb5c7abe95699cced60b2449d0 100644 (file)
@@ -9,7 +9,7 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Selecting Data"
       "and these lessons will show how to do that."
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Explain the difference between a table, a record, and a field.\n",
+      "*   Explain the difference between a database and a database manager.\n",
+      "*   Write a query to select all values for specific fields from a single table."
+     ]
+    },
     {
      "cell_type": "heading",
-     "level": 2,
+     "level": 3,
      "metadata": {},
      "source": [
       "A Few Definitions"
       "  <tr> <td>734</td> <td>DR-3</td> <td>1939-01-07</td> </tr>\n",
       "  <tr> <td>735</td> <td>DR-3</td> <td>1930-01-12</td> </tr>\n",
       "  <tr> <td>751</td> <td>DR-3</td> <td>1930-02-26</td> </tr>\n",
-      "  <tr> <td>752</td> <td>DR-3</td> <td>NULL</td> </tr>\n",
+      "  <tr> <td>752</td> <td>DR-3</td> <td bgcolor=\"red\">&nbsp;</td> </tr>\n",
       "  <tr> <td>837</td> <td>MSK-4</td> <td>1932-01-14</td> </tr>\n",
       "  <tr> <td>844</td> <td>DR-1</td> <td>1932-03-22</td> </tr>\n",
       "</table>\n",
       "  <tr> <td>734</td> <td>lake</td> <td>sal</td> <td>0.05</td> </tr>\n",
       "  <tr> <td>734</td> <td>pb</td> <td>temp</td> <td>-21.5</td> </tr>\n",
       "  <tr> <td>735</td> <td>pb</td> <td>rad</td> <td>7.22</td> </tr>\n",
-      "  <tr> <td>735</td> <td>NULL</td> <td>sal</td> <td>0.06</td> </tr>\n",
-      "  <tr> <td>735</td> <td>NULL</td> <td>temp</td> <td>-26.0</td> </tr>\n",
+      "  <tr> <td>735</td> <td bgcolor=\"red\">&nbsp;</td> <td>sal</td> <td>0.06</td> </tr>\n",
+      "  <tr> <td>735</td> <td bgcolor=\"red\">&nbsp;</td> <td>temp</td> <td>-26.0</td> </tr>\n",
       "  <tr> <td>751</td> <td>pb</td> <td>rad</td> <td>4.35</td> </tr>\n",
       "  <tr> <td>751</td> <td>pb</td> <td>temp</td> <td>-18.5</td> </tr>\n",
       "  <tr> <td>751</td> <td>lake</td> <td>sal</td> <td>0.1</td> </tr>\n",
      "metadata": {},
      "source": [
       "Notice that three entries&mdash;one in the `Visited` table,\n",
-      "and two in the `Survey` table&mdash;are shown as `NULL`.\n",
-      "We'll return to these values [later](#s:null).\n",
+      "and two in the `Survey` table&mdash;are shown in red\n",
+      "because they don't contain any actual data:\n",
+      "we'll return to these missing values [later](#s:null).\n",
       "For now,\n",
       "let's write an SQL query that displays scientists' names.\n",
       "We do this using the SQL command `select`,\n",
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023c6050>"
+        "<IPython.core.display.HTML at 0x102353a10>"
        ]
       }
      ],
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023c41d0>"
+        "<IPython.core.display.HTML at 0x102353c90>"
        ]
       }
      ],
-     "prompt_number": 5
+     "prompt_number": 3
     },
     {
      "cell_type": "markdown",
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023c4fd0>"
+        "<IPython.core.display.HTML at 0x102353a10>"
        ]
       }
      ],
-     "prompt_number": 6
+     "prompt_number": 4
     },
     {
      "cell_type": "markdown",
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023c4090>"
+        "<IPython.core.display.HTML at 0x102353b10>"
        ]
       }
      ],
-     "prompt_number": 7
+     "prompt_number": 5
     },
     {
      "cell_type": "markdown",
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023c3b10>"
+        "<IPython.core.display.HTML at 0x102353bd0>"
        ]
       }
      ],
-     "prompt_number": 8
+     "prompt_number": 6
     },
     {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
       "\n",
       "1.  Write a query that selects only site names from the `Site` table.\n",
       "\n",
       "\n",
       "    What style do you find easiest to read, and why?"
      ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
+      "\n",
+      "*   A relational database stores information in tables,\n",
+      "    each of which has a fixed set of columns and a variable number of records.\n",
+      "*   A database manager is a program that manipulates information stored in a database.\n",
+      "*   We write queries in a specialized language called SQL to extract information from databases.\n",
+      "*   SQL is case-insensitive."
+     ]
     }
    ],
    "metadata": {}
index f09cdd02d068ee474b87961e1e9f4fb8ac7ace57..8ffe0dbeab69b6cbaf9486c78c51d8a4cdde1ff7 100644 (file)
@@ -9,12 +9,26 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Sorting and Removing Duplicates"
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Write queries that display results in a particular order.\n",
+      "*   Write queries that eliminate duplicate values from data."
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
@@ -77,7 +91,7 @@
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023cbe10>"
+        "<IPython.core.display.HTML at 0x102358c90>"
        ]
       }
      ],
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023cbf10>"
+        "<IPython.core.display.HTML at 0x102358d90>"
        ]
       }
      ],
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023cbf10>"
+        "<IPython.core.display.HTML at 0x102353c90>"
        ]
       }
      ],
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
       "\n",
       "1.  Write a query that selects distinct dates from the `Site` table."
      ]
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023cbe90>"
+        "<IPython.core.display.HTML at 0x102353b10>"
        ]
       }
      ],
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023cbe10>"
+        "<IPython.core.display.HTML at 0x102353c50>"
        ]
       }
      ],
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023c4090>"
+        "<IPython.core.display.HTML at 0x1023557d0>"
        ]
       }
      ],
        "metadata": {},
        "output_type": "display_data",
        "text": [
-        "<IPython.core.display.HTML at 0x1023c96d0>"
+        "<IPython.core.display.HTML at 0x102353b10>"
        ]
       }
      ],
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
+      "\n",
+      "1.  Write a query that returns the distinct dates in the `Visited` table.\n",
+      "\n",
+      "2.  Write a query that displays the full names of the scientists in the `Person` table, ordered by family name."
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
       "\n",
-      "1.  Write a query that displays scientists' full names, but orders them by surname."
+      "*   The records in a database table are not intrinsically ordered:\n",
+      "    if we want to display them in some order,\n",
+      "    we must specify that explicitly.\n",
+      "*   The values in a database are not guaranteed to be unique:\n",
+      "    if we want to eliminate duplicates,\n",
+      "    we must specify that explicitly as well."
      ]
     }
    ],
index c9a3e95fbef1df9423e2a9f8d052b6ff2fce1c8d..f06d40e586b3a655ea984e3572e5d3a062ad7de3 100644 (file)
@@ -9,12 +9,26 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Filtering"
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Write queries that select records that satisfy user-specified conditions.\n",
+      "*   Explain the order in which the clauses in a query are executed."
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
       "\n",
-      "1.  Suppose we want to select all sites that lie within 30&deg; of the equator.\n",
+      "1.  Suppose we want to select all sites that lie more than 30&deg; from the poles.\n",
       "    Our first query is:\n",
       "\n",
       "    ~~~\n",
-      "    select * from Site where (lat > -30) or (lat < 30);\n",
+      "    select * from Site where (lat > -60) or (lat < 60);\n",
       "    ~~~\n",
       "\n",
       "    Explain why this is wrong,\n",
       "      <tr> <td><code>'a' like '%a'</code></td> <td>True</td> </tr>\n",
       "      <tr> <td><code>'b' like '%a'</code></td> <td>False</td> </tr>\n",
       "      <tr> <td><code>'alpha' like 'a%'</code></td> <td>True</td> </tr>\n",
-      "      <tr> <td><code>'alpha' like 'a%p%'</code> <td>True</td> </tr>\n",
+      "      <tr> <td><code>'alpha' like 'a%p%'</code></td> <td>True</td> </tr>\n",
       "    </table>\n",
       "    The expression `*column-name* not like *pattern*`\n",
       "    inverts the test.\n",
       "    write a query that finds all the records in `Visited`\n",
       "    that *aren't* from sites labelled 'DR-something'."
      ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
+      "\n",
+      "*   Use `where` to filter records according to Boolean conditions.\n",
+      "*   Filtering is done on whole records,\n",
+      "    so conditions can use fields that are not actually displayed."
+     ]
     }
    ],
    "metadata": {}
index 79c64cb58328275ddbe26b6ad39b519ec4d00341..ad8cbf40d4a2e7368501569048d4293d1cebf5d4 100644 (file)
@@ -9,12 +9,25 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Calculating New Values"
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Write queries that calculate new values for each selected record."
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
       "\n",
       "1.  After further reading,\n",
       "    we realize that Valentina Roerich\n",
       "(For this data,\n",
       "the list should contain only \"DR\" and \"MSK\")."
      ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
+      "\n",
+      "*   SQL can perform calculations using the values in a record as part of a query."
+     ]
     }
    ],
    "metadata": {}
index 5173fb3b66b5d60992a418dd153c4ed58bfff0fc..0b0f3c35fb76d20cc6153c5e4bf6dda70ed5f068 100644 (file)
@@ -9,12 +9,27 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Missing Data"
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Explain how databases represent missing information.\n",
+      "*   Explain the three-valued logic databases use when manipulating missing information.\n",
+      "*   Write queries that handle missing information correctly."
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
       "Null values cause headaches wherever they appear.\n",
       "For example,\n",
-      "suppose we want to find the all of salinity measurements\n",
+      "suppose we want to find all the salinity measurements\n",
       "that weren't taken by Dyer.\n",
       "It's natural to write the query like this:"
      ]
      "cell_type": "markdown",
      "metadata": {},
      "source": [
+      "#### Challenges\n",
+      "\n",
       "1.  Write a query that sorts the records in `Visited` by date,\n",
       "    omitting entries for which the date is not known\n",
       "    (i.e., is null).\n",
       "    What does this simplify?\n",
       "    What burdens or risks does it introduce?"
      ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
+      "\n",
+      "*   Databases use `null` to represent missing information.\n",
+      "*   Any arithmetic or Boolean operation involving `null` produces `null` as a result.\n",
+      "*   The only operators that can safely be used with `null` are `is null` and `is not null`."
+     ]
     }
    ],
    "metadata": {}
index 4982a975f6cce759478430f9bffdf39a73d7a7ed..f54e01f3249893cae0076a8a5077af1496b0c0f2 100644 (file)
@@ -9,12 +9,28 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Aggregation"
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Define \"aggregation\" and give examples of its use.\n",
+      "*   Write queries that compute aggregated values.\n",
+      "*   Trace the execution of a query that performs aggregation.\n",
+      "*   Explain how missing data is handled during aggregation."
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
       "\n",
       "1.  How many temperature readings did Frank Pabodie record,\n",
       "    and what was their average value?\n",
       "    We write the query:\n",
       "\n",
       "    ~~~\n",
-      "    select reading-avg(reading) from Survey where quant='rad';\n",
+      "    select reading - avg(reading) from Survey where quant='rad';\n",
       "    ~~~\n",
       "\n",
       "    What does this actually produce, and why?\n",
       "\n",
       "    Can you find a way to order the list by surname?"
      ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
+      "\n",
+      "*   An aggregation function combines many values to produce a single new value.\n",
+      "*   Aggregation functions ignore `null` values.\n",
+      "*   Aggregation happens after filtering."
+     ]
     }
    ],
    "metadata": {}
index a5afa8f0410cef6036584f2ff6c8daef8c95f75c..93761c51a1821a27219ee179f622e1642408ff25 100644 (file)
@@ -9,12 +9,29 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Combining Data"
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Explain the operation of a query that joins two tables.\n",
+      "*   Explain how to restrict the output of a query containing a join to only include meaningful combinations of values.\n",
+      "*   Write queries that join tables on equal keys.\n",
+      "*   Explain what primary and foreign keys are, and why they are useful.\n",
+      "*   Explain what atomic values are, and why database fields should only contain atomic values."
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
     },
     {
      "cell_type": "heading",
-     "level": 2,
+     "level": 3,
      "metadata": {},
      "source": [
       "Data Hygiene"
       "we have to change multiple records in the database.\n",
       "What's worse,\n",
       "we may have to guess which records to change,\n",
-      "since some other sites may actually have been visited on that date.\n",
+      "since other sites may also have been visited on that date.\n",
       "\n",
       "The fourth rule is that the units for every value should be stored explicitly.\n",
       "Our database doesn't do this,\n",
       "and that's a problem:\n",
       "Roerich's salinity measurements are several orders of magnitude larger than anyone else's,\n",
       "but we don't know if that means she was using parts per million instead of parts per thousand,\n",
-      "or whether there actually was a saline anomaly at that site in 1932."
+      "or whether there actually was a saline anomaly at that site in 1932.\n",
+      "\n",
+      "Stepping back,\n",
+      "data and the tools used to store it have a symbiotic relationship:\n",
+      "we use tables and joins because it's efficient,\n",
+      "provided our data is organized a certain way,\n",
+      "but organize our data that way because we have tools to manipulate it efficiently\n",
+      "if it's in a certain form.\n",
+      "As anthropologists say,\n",
+      "the tool shapes the hand that shapes the tool."
      ]
     },
     {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
       "\n",
       "1.  Write a query that lists all radiation readings from the DR-1 site.\n",
       "\n",
       "    on Site.lat<-49.0 and Site.name=Visited.site and Visited.dated>='1932-00-00';\n",
       "    ~~~"
      ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
+      "\n",
+      "*   Every fact should be represented in a database exactly once.\n",
+      "*   A join produces all combinations of records from one table with records from another.\n",
+      "*   A primary key is a field (or set of fields) whose values uniquely identify the records in a table.\n",
+      "*   A foreign key is a field (or set of fields) in one table whose values are a primary key in another table.\n",
+      "*   We can eliminate meaningless combinations of records by matching primary keys and foreign keys between tables.\n",
+      "*   Keys should be atomic values to make joins simpler and more efficient."
+     ]
     }
    ],
    "metadata": {}
index c695563abd3f134f5f9fabba1af853dd34b45154..94da1e94996aa71e7e21155408e61a8cf51d6475 100644 (file)
@@ -9,12 +9,26 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Creating and Modifying Data"
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Write queries that creates tables.\n",
+      "*   Write queries to insert, modify, and delete records."
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
       "We can also insert values into one table directly from another:\n",
       "\n",
       "~~~\n",
-      "create table JustLatLong(lat text, long TEXT);\n",
+      "create table JustLatLong(lat text, long text);\n",
       "insert into JustLatLong select lat, long from site;\n",
       "~~~\n",
       "\n",
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
       "\n",
       "1.  Write an SQL statement to replace all uses of `null`\n",
       "    in `Survey.person`\n",
       "    which reads a file created by `.dump` and restores the database.\n",
       "    A colleague of yours thinks that storing dump files (which are text) in version control\n",
       "    is a good way to track and manage changes to the database.\n",
-      "    What are the pros and cons of this approach?"
+      "    What are the pros and cons of this approach?\n",
+      "    (Hint: records aren't stored in any particular order.)"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
+      "\n",
+      "*   Database tables are created using queries that specify their names and the names and properties of their fields.\n",
+      "*   Records can be inserted, updated, or deleted using queries.\n",
+      "*   It is simpler and safer to modify data when every record has a unique primary key."
      ]
     }
    ],
index 6105f8c62b0f51b014eb8d87bd5e49e7149eaf03..63461bae5271a9292a030ba84c68049a309a9ecc 100644 (file)
@@ -9,12 +9,27 @@
    "cells": [
     {
      "cell_type": "heading",
-     "level": 1,
+     "level": 2,
      "metadata": {},
      "source": [
       "Programming with Databases"
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "objectives"
+      ]
+     },
+     "source": [
+      "#### Objectives\n",
+      "\n",
+      "*   Write short programs that execute SQL queries.\n",
+      "*   Trace the execution of a program that contains an SQL query.\n",
+      "*   Explain why most database applications are written in a general-purpose language rather than in SQL."
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "### Challenges\n",
+      "#### Challenges\n",
       "\n",
       "1.  Write a Python program that creates a new database\n",
       "    in a file called `original.db`\n",
       "    filtering values in the query,\n",
       "    or reading everything into memory and filtering in Python?"
      ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {
+      "cell_tags": [
+       "keypoints"
+      ]
+     },
+     "source": [
+      "#### Key Points\n",
+      "\n",
+      "*   We usually write database applications in a general-purpose language, and embed SQL queries in it.\n",
+      "*   To connect to a database, a program must use a library specific to that database manager.\n",
+      "*   A program may open one or more connections to a single database, and have one or more cursors active in each.\n",
+      "*   Programs can read query results in batches or all at once."
+     ]
     }
    ],
    "metadata": {}
index 1a67353ed7f9a8106b4fc8c641a83391c5c8b29e..58dd4fb88a4edf1b584403d4b54a740844be3f99 100644 (file)
@@ -1 +1,8 @@
-Note: run 'sqlite3 survey.db < gen-survey-database.sql' to re-create survey database before loading notebooks.
+# Resources
+
+*   `gen-survey-database.sql`: re-generate survey database used in examples.
+*   `sqlitemagic.py`: IPython Notebook plugin to support SQLite interaction.
+
+# Notes
+
+* Run 'sqlite3 survey.db < gen-survey-database.sql' to re-create survey database before loading notebooks.
diff --git a/sql/novice/guide.md b/sql/novice/guide.md
deleted file mode 100644 (file)
index 22fb503..0000000
+++ /dev/null
@@ -1,59 +0,0 @@
----
-layout: lesson
-root: ../..
-title: Instructor's Guide
-level: novice
----
-Relational databases are not as widely used in science as in business,
-but they are still a common way to store large data sets with complex structure.
-Even when the data itself isn't in a database,
-the metadata could be:
-for example,
-meteorological data might be stored in files on disk,
-but data about when and where observations were made,
-data ranges,
-and so on could be in a database
-to make it easier for scientists to find what they want to.
-
-Teaching Notes
---------------
-
-*   The first few sections (up to "Missing Data") usually go very quickly.
-    The pace usually slows down a bit when null values are discussed
-    mostly because learners have a lot of details to keep straight by this point.
-    Things *really* slow down during the discussion of joins,
-    but this is the key idea in the whole lesson:
-    important ideas like primary keys and referential integrity
-    only make sense once learners have seen how they're used in joins.
-    It's worth going over things a couple of times if necessary (with lots of examples).
-
-*   The sections on creating and modifying data,
-    and programming with databases,
-    can be dropped if time is short.
-    Of the two,
-    people seem to care most about how to add data (which only takes a few minutes to demonstrate).
-
-
-*   Overall,
-    this material takes three hours to present assuming that a short exercise is done with each topic.
-
-*   Simple calculations are actually easier to do in a spreadsheet;
-    the advantages of using a database become clear as soon as filtering and joins are needed.
-    Instructors may therefore want to show a spreadsheet with the information from the four database tables
-    consolidated into a single sheet,
-    and demonstrate what's needed in both systems to answer questions like,
-    "What was the average radiation reading in 1931?"
-
-*   Some learners may have heard that NoSQL databases
-    (i.e., ones that don't use the relational model)
-    are the next big thing,
-    and ask why we're not teaching those.
-    The answers are:
-    1.  Relational databases are far more widely used than NoSQL databases.
-    2.  We have far more experience with relational databases than with any other kind,
-        so we have a better idea of what to teach and how to teach it.
-    3.  NoSQL databases are as different from each other as they are from relational databases.
-        Until a leader emerges, it isn't clear *which* NoSQL database we should teach.
-
-*   Run `sqlite3 survey.db < gen-survey-database.sql`
-    to re-create survey database before loading notebooks.
index 1f316761df2b871d387b3dc5fd8219b15c68d762..aded47f2cfc9ddbdb24818913d53293b983cc0e3 100644 (file)
@@ -21,8 +21,8 @@ knowing how they work will help us understand why so many of the systems we use
 behave the way we do,
 and why they insist on structuring data in certain ways.
 
-Topics
-------
+<div class="toc" markdown="1">
+
 1.  [Selecting Data](01-select.html)
 2.  [Sorting and Removing Duplicates](02-sort-dup.html)
 3.  [Filtering](03-filter.html)
@@ -33,12 +33,4 @@ Topics
 8.  [Creating and Modifying Data](08-create.html)
 9.  [Programming with Databases](09-prog.html)
 
-See Also
---------
-*   [Instructor's Guide](guide.html)
-*   [Reference](reference.html)
-
-Resources
----------
-*   `gen-survey-database.sql`: re-generate survey database used in examples.
-*   `sqlitemagic.py`: IPython Notebook plugin to support SQLite interaction.
+</div>
diff --git a/sql/novice/reference.md b/sql/novice/reference.md
deleted file mode 100644 (file)
index ea860f3..0000000
+++ /dev/null
@@ -1,122 +0,0 @@
----
-layout: lesson
-root: ../..
-title: SQL Reference
-level: novice
----
-Basic Queries
--------------
-
-Select one or more columns from a table:
-
-    SELECT column_name_1, column_name_2 FROM table_name;
-
-Select all columns from a table:
-
-    SELECT * FROM table_name;
-
-Get only unique results in a query:
-
-    SELECT DISTINCT column_name FROM table_name;
-
-Perform calculations in a query:
-
-    SELECT column_name_1, ROUND(column_name_2 / 1000.0) FROM table_name;
-
-Sort results in ascending order:
-
-    SELECT * FROM table_name ORDER BY column_name_1;
-
-Sort results in ascending and descending order:
-
-    SELECT * FROM table_name ORDER BY column_name_1 ASC, column_name_2 DESC;
-
-Filtering
----------
-
-Select only data meeting a condition:
-
-    SELECT * FROM table_name WHERE column_name > 0;
-
-Select only data meeting a combination of conditions:
-
-    SELECT * FROM table_name WHERE (column_name_1 >= 1000) AND (column_name_2 = 'A' OR column_name_2 = 'B');
-
-Missing Data
-------------
-
-Use `NULL` to represent missing data.
-
-`NULL` is not 0, false, or any other specific value.
-Operations involving `NULL` produce `NULL`, so `1+NULL`, `2>NULL`, and `3=NULL` are all `NULL`.
-
-Test whether a value is null:
-
-    SELECT * FROM table_name WHERE column_name IS NULL;
-
-Test whether a value is not null:
-
-    SELECT * FROM table_name WHERE column_name IS NOT NULL;
-
-Grouping and Aggregation
-------------------------
-
-Combine values using aggregation functions:
-
-    SELECT SUM(column_name_1) FROM table_name;
-
-Combine data into groups and calculate combined values in groups:
-
-    SELECT column_name_1, SUM(column_name_2), COUNT(*) FROM table_name GROUP BY column_name_1;
-
-Joins
------
-
-Join data from two tables:
-
-    SELECT * FROM table_name_1 JOIN table_name_2 ON table_name_1.column_name = table_name_2.column_name;
-
-Writing Queries
----------------
-
-SQL commands must be combined in the following order:
-`SELECT`, `FROM`, `JOIN`, `ON`, `WHERE`, `GROUP BY`, `ORDER BY`.
-
-Creating Tables
----------------
-
-Create tables by specifying column names and types.
-Include primary and foreign key relationships and other constraints.
-
-    CREATE TABLE survey(
-        taken   INTEGER NOT NULL,
-        person  TEXT,
-        quant   REAL NOT NULL,
-        PRIMARY KEY(taken, quant),
-        FOREIGN KEY(person) REFERENCES person(ident)
-    );
-
-Programming
------------
-
-Execute queries in a general-purpose programming language by:
-
-*   loading the appropriate library
-*   creating a connection
-*   creating a cursor
-*   repeatedly:
-    *   execute a query
-    *   fetch some or all results
-*   disposing of the cursor
-*   closing the connection
-
-Python example:
-
-    import sqlite3
-    connection = sqlite3.connect("database_name")
-    cursor = connection.cursor()
-    cursor.execute("...query...")
-    for r in cursor.fetchall():
-        ...process result r...
-    cursor.close()
-    connection.close()