1 {% extends "_base.html" %}
3 {% block file_metadata %}
4 <meta name="title" content="Version Control With Subversion" />
5 {% endblock file_metadata %}
9 <li><a href="#s:basics">Basic Use</a></li>
10 <li><a href="#s:merge">Merging Conflicts</a></li>
11 <li><a href="#s:rollback">Recovering Old Versions</a></li>
12 <li><a href="#s:setup">Setting up a Repository</a></li>
13 <li><a href="#s:provenance">Provenance</a></li>
14 <li><a href="#s:summary">Summing Up</a></li>
18 Suppose that Wolfman and Dracula have been hired by Universal Monsters Inc.
19 to figure out where the company should put its next secret lair.
20 They want to be able to work on the plans at the same time,
21 but they have run into problems doing this in the past.
23 each one will spend a lot of time waiting for the other to finish.
25 if they work on their own copies and email changes back and forth
26 they know that things will be lost, overwritten, or duplicated.
30 The right solution is to use a <a href="glossary.html#version-control-system">version control system</a>
32 Version control is better than mailing files back and forth because:
38 It's hard (but not impossible) to accidentally overlook or overwrite someone's changes,
39 because the version control system highlights them automatically.
43 There are no arguments about whose copy is the most up to date.
47 Nothing that is committed to version control is ever lost.
48 This means it can be used like the "undo" feature in an editor,
49 and since all old versions of files are saved
50 it's always possible to go back in time to see exactly who wrote what on a particular day,
51 or what version of a program was used to generate a particular set of results.
57 Version control systems do have one important shortcoming.
58 While it is easy for them to find, display, and merge differences in text files,
59 images, MP3s, PDFs, or Microsoft Word or Excel files aren't stored as text—they
60 use specialized binary data formats.
61 Most version control systems don't know how to deal with these formats,
62 so all they can say is, "These files differ."
63 The rest is up to you.
67 Even with this limitation,
68 version control is one of the most important concepts in this book.
69 The rest of this chapter will explore how to use Subversion,
70 a popular open source version control system.
73 <section id="s:basics">
77 <div class="understand" id="u:basics">
80 <li>Where version control stores information.</li>
81 <li>How to check out a working copy of a repository.</li>
82 <li>How to view the history of changes to a project.</li>
83 <li>Why working copies of different projects should not overlap.</li>
84 <li>How to add files to a project.</li>
85 <li>How to submit changes made locally to a project's master copy.</li>
86 <li>How to update a working copy to get changes made to the master.</li>
87 <li>How to check the status of a working copy.</li>
92 A version control system keeps the master copy of a file
93 in a <a href="glossary.html#repository">repository</a>
94 located on a <a href="glossary.html#server">server</a>—a computer
95 that is never used directly by people,
96 but only by their programs
97 (<a href="#f:repository">Figure XXX</a>).
No one ever edits the master copy directly.
100 Wolfman and Dracula each have a <a href="glossary.html#working-copy">working copy</a>
101 on their own computer.
102 This lets them make whatever changes they want whenever they want.
105 <figure id="f:repository">
106 <img src="svn/repository.png" alt="A Version Control Repository" />
As soon as Wolfman is ready to share his changes,
111 he <a href="glossary.html#commit">commits</a> his work to the repository
112 (<a href="#f:workflow">Figure XXX</a>).
113 Dracula can then <a href="glossary.html#update">update</a> his working copy to get those changes.
115 when Dracula finishes working on something,
116 he can commit and then Wolfman can update.
119 <figure id="f:workflow">
120 <img src="svn/workflow.png" alt="Version Control Workflow" />
124 But what if Dracula and Wolfman make changes to the same part of their working copies?
125 Old-fashioned version control systems prevented this from happening
126 by <a href="glossary.html#lock">locking</a> the master copy
127 whenever someone was working on it.
128 This <a href="glossary.html#pessimistic-concurrency">pessimistic</a> strategy
129 guaranteed that a second person (or monster)
130 could never make changes to the same file at the same time,
131 but it also meant that people had to take turns.
135 Most of today's version control systems use
136 an <a href="glossary.html#optimistic-concurrency">optimistic</a> strategy instead.
137 Nothing is ever locked—everyone is always allowed to edit their working copy.
This means that people can make changes to the same part of the same file,
139 but that's actually fairly uncommon in a well-run project,
140 and when it <em>does</em> happen,
141 the version control system helps people reconcile their changes.
146 if Wolfman and Dracula are making changes at the same time,
147 and Wolfman commits first,
148 his changes are simply copied to the repository
149 (<a href="#f:merge_first_commit">Figure XXX</a>):
152 <figure id="f:merge_first_commit">
153 <img src="svn/merge_first_commit.png" alt="Wolfman Commits First" />
157 If Dracula now tries to commit something that would overwrite Wolfman's changes
158 the version control system stops him
159 and points out the <a href="glossary.html#conflict">conflict</a>
160 (<a href="#f:merge_second_commit">Figure XXX</a>):
163 <figure id="f:merge_second_commit">
164 <img src="svn/merge_second_commit.png" alt="Dracula Has a Conflict" />
168 Dracula must <a href="glossary.html#resolve">resolve</a> that conflict
169 before the version control system will allow him to commit his work.
170 He can accept what Wolfman did,
171 replace it with what he has done,
172 or write something new that combines the two—that's up to him.
173 Once he has fixed things, he can go ahead and commit.
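<p>
The sequence above (first commit accepted, second commit refused until the conflict is resolved)
can be mimicked in a few lines of Python.
This is only a toy model under stated assumptions:
the <code>ToyRepository</code> class, its <code>commit</code> method,
and the file name <code>plans.txt</code> are invented for illustration
and do not reflect how Subversion is actually implemented.
</p>

```python
class Conflict(Exception):
    """Raised when a commit would overwrite changes the committer has not seen."""


class ToyRepository:
    """In-memory stand-in for a repository (hypothetical, not Subversion's design)."""

    def __init__(self):
        self.revision = 0
        self.files = {}          # path -> current contents
        self.last_changed = {}   # path -> revision that last changed it

    def commit(self, base_revision, changes):
        """Store `changes` (path -> contents) that were made on top of `base_revision`."""
        stale = [path for path in changes
                 if self.last_changed.get(path, 0) > base_revision]
        if stale:
            # Nothing is stored: the committer must update and resolve first.
            raise Conflict(f"out of date: {sorted(stale)}")
        self.revision += 1
        for path, contents in changes.items():
            self.files[path] = contents
            self.last_changed[path] = self.revision
        return self.revision


repo = ToyRepository()
base = repo.commit(0, {"plans.txt": "Lair location: undecided\n"})

# Wolfman and Dracula both start from the same revision; Wolfman commits first.
wolfman = repo.commit(base, {"plans.txt": "Lair location: Mons Olympus\n"})

try:
    repo.commit(base, {"plans.txt": "Lair location: Antarctica\n"})
    dracula_blocked = False
except Conflict:
    dracula_blocked = True

# After updating to Wolfman's revision and reconciling, Dracula can commit.
dracula = repo.commit(
    wolfman, {"plans.txt": "Lair location: Mons Olympus or Antarctica\n"}
)
print(wolfman, dracula_blocked, dracula)
```

<p>
The point of the sketch is that a refused commit leaves the repository untouched:
Dracula's work stays in his own copy until he has seen Wolfman's change and decided how to combine the two.
</p>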
177 Let's start by looking at the basic workflow we use
178 when working with a version control system.
179 To keep things simple,
180 we'll assume that the Mummy has already put some notes in a version control repository
181 on the <code>universal.software-carpentry.org</code> server.
182 The full URL for this repository is <code>https://universal.software-carpentry.org/monsters</code>.
183 Every repository has an address like this that uniquely identifies the location of the master copy.
188 In order to get a working copy on his computer,
189 Dracula has to <a href="glossary.html#check-out">check out</a> a copy of the repository.
190 He only has to do this once per project:
191 once he has a working copy,
192 he can update it over and over again to get other people's work:
196 <h3>There's More Than One Way To Do It</h3>
199 We will drive Subversion from the command line in our examples,
200 but if you prefer using a GUI,
201 there are many for you to choose from:
207 <a href="http://tortoisesvn.net/">TortoiseSVN</a>
208 is integrated into the Windows desktop,
209 so there's no separate GUI as such.
213 <a href="http://rapidsvn.tigris.org/">RapidSVN</a> is free,
214 and runs on many platforms,
215 but some users report difficulties installing it.
219 Syntevo's <a href="http://www.syntevo.com/smartsvn/index.html">SmartSVN</a> isn't free,
220 but it costs less than most textbooks,
221 and is more stable (and has a friendlier interface) than RapidSVN.
229 While in his home directory,
230 Dracula types the command:
234 $ <span class="in">svn checkout https://universal.software-carpentry.org/monsters</span>
238 This creates a new directory called <code>monsters</code>
239 and fills it with a copy of the repository's contents
240 (<a href="#f:example_repo">Figure XXX</a>).
244 <span class="out">A monsters/jupiter
246 A monsters/mars/mons-olympus.txt
247 A monsters/mars/cydonia.txt
249 A monsters/earth/himalayas.txt
250 A monsters/earth/antarctica.txt
251 A monsters/earth/carlsbad.txt
252 Checked out revision 6.</span>
255 <figure id="f:example_repo">
256 <img src="svn/example_repo.png" alt="Example Repository" />
260 Dracula can then go into this directory
261 and use regular shell commands to view the files:
265 $ <span class="in">cd monsters</span>
266 $ <span class="in">ls</span>
267 <span class="out">earth jupiter mars</span>
268 $ <span class="in">ls *</span>
269 <span class="out">earth:
270 antarctica.txt carlsbad.txt himalayas.txt
275 cydonia.txt mons-olympus.txt</span>
280 <h3>Don't Let the Working Copies Overlap</h3>
It's very important that the working copies of different projects do not overlap;
285 we should never try to check out one project inside a working copy of another project.
The reason is that Subversion stores information about
287 the current state of a working copy
288 in special sub-directories called <code>.svn</code>:
292 $ <span class="in">pwd</span>
293 <span class="out">/home/vlad/monsters</span>
294 $ <span class="in">ls -a</span>
295 <span class="out">. .. .svn earth jupiter mars</span>
296 $ <span class="in">ls -F .svn</span>
297 <span class="out">entries prop-base/ props/ text-base/ tmp/</span>
301 If two working copies overlap,
302 the files in the <code>.svn</code> directories for one repository
303 will be clobbered by the other repository's <code>.svn</code> files,
304 and Subversion will become hopelessly confused.
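<p>
One way to guard against overlapping working copies is to check,
before running <code>svn checkout</code>,
whether the target directory already sits inside one.
The Python sketch below walks up the directory tree looking for a <code>.svn</code> directory.
The function name <code>enclosing_working_copy</code> and the scratch directory layout are assumptions made for this example;
the check also covers Subversion 1.7 and later,
which keep a single <code>.svn</code> directory at the top of the working copy.
</p>

```python
from pathlib import Path
import tempfile


def enclosing_working_copy(directory):
    """Return the nearest directory at or above `directory` holding a .svn folder, or None."""
    directory = Path(directory).resolve()
    for candidate in [directory, *directory.parents]:
        if (candidate / ".svn").is_dir():
            return candidate
    return None


# Demonstration on a throwaway directory tree (no Subversion needed).
with tempfile.TemporaryDirectory() as scratch:
    root = Path(scratch).resolve()
    (root / "monsters" / ".svn").mkdir(parents=True)
    (root / "monsters" / "earth").mkdir()
    (root / "elsewhere").mkdir()

    inside = enclosing_working_copy(root / "monsters" / "earth")
    outside = enclosing_working_copy(root / "elsewhere")
    inside_is_monsters = inside == root / "monsters"

print(inside_is_monsters, outside)
```

<p>
If the function returns a directory, a new checkout should go somewhere else.
</p>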
310 Dracula can find out more about the history of the project
311 using Subversion's <code>log</code> command:
315 $ <span class="in">svn log</span>
316 <span class="out">------------------------------------------------------------------------
317 r6 | mummy | 2010-07-26 09:21:10 -0400 (Mon, 26 Jul 2010) | 1 line
319 Damn the budget---the Jovian moons would be a _perfect_ place for a lair.
320 ------------------------------------------------------------------------
321 r5 | mummy | 2010-07-26 09:19:39 -0400 (Mon, 26 Jul 2010) | 1 line
323 The budget might not even stretch to a deep-sea lair... :-(
324 ------------------------------------------------------------------------
325 r4 | mummy | 2010-07-26 09:17:46 -0400 (Mon, 26 Jul 2010) | 1 line
327 Budget cuts may force us to reconsider Earth as a base.
328 ------------------------------------------------------------------------
329 r3 | mummy | 2010-07-26 09:14:14 -0400 (Mon, 26 Jul 2010) | 1 line
331 Converting to wiki-formatted text.
332 ------------------------------------------------------------------------
333 r2 | mummy | 2010-07-26 09:11:55 -0400 (Mon, 26 Jul 2010) | 1 line
335 Hide near the face in Cydonia, perhaps?
336 ------------------------------------------------------------------------
337 r1 | mummy | 2010-07-26 09:08:23 -0400 (Mon, 26 Jul 2010) | 1 line
339 Thoughts on Mons Olympus (probably too obvious)
340 ------------------------------------------------------------------------</span>
344 Subversion displays a summary of all the changes made to the project so far.
345 This list includes the
346 <a href="glossary.html#revision-number">revision number</a>,
347 the name of the person who made the change,
348 the date the change was made,
349 and whatever comment the user provided when the change was submitted.
351 the <code>monsters</code> project is currently at revision 6,
352 and all changes so far have been made by the Mummy.
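<p>
The four fields in each header line of the log are separated by vertical bars,
so they are easy to pull apart in a script.
The following Python sketch parses one header line copied from the output above;
the function name <code>parse_log_header</code> and the dictionary keys are choices made for this example.
For anything more serious, <code>svn log --xml</code> produces output intended for programs to read.
</p>

```python
from datetime import datetime


def parse_log_header(line):
    """Split one `svn log` header line into its four fields."""
    revision, author, stamp, size = [field.strip() for field in line.split("|")]
    # The timestamp is followed by a human-readable copy in parentheses; drop it.
    stamp = stamp.split("(")[0].strip()
    return {
        "revision": int(revision.lstrip("r")),
        "author": author,
        "when": datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S %z"),
        "message_lines": int(size.split()[0]),
    }


entry = parse_log_header(
    "r6 | mummy | 2010-07-26 09:21:10 -0400 (Mon, 26 Jul 2010) | 1 line"
)
print(entry["revision"], entry["author"], entry["when"].isoformat())
```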
356 Notice how detailed the comments on the updates are.
357 Good comments are as important in version control as they are in coding.
358 Without them, it can be very difficult to figure out who did what, when, and why.
359 We can use comments like "Changed things" and "Fixed it" if we want,
360 or even no comments at all,
361 but we'll only be making more work for our future selves.
365 Another thing to notice is that the revision number applies to the whole repository,
366 not to a particular file.
367 When we talk about "version 61" we mean
368 "the state of all files and directories at that point."
369 Older version control systems like CVS gave each file a new version number when it was updated,
370 which meant that version 38 of one file could correspond in time to version 17 of another
371 (<a href="#f:version_numbering">Figure XXX</a>).
372 Experience shows that
373 global version numbers that apply to everything in the repository
374 are easier to manage than
375 per-file version numbers,
376 so that's what Subversion uses.
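<p>
The difference between the two numbering schemes can be seen by counting the same history both ways.
The Python sketch below is illustrative only:
the function <code>number_changes</code> and the four-commit history are made up for this example
and are not taken from the <code>monsters</code> repository.
</p>

```python
def number_changes(commits):
    """Label a history under both schemes.

    `commits` is a list of lists of file names changed together.
    Returns (global_revisions, per_file_versions): the first holds one
    repository-wide number per commit, the second counts how many times
    each individual file has changed.
    """
    global_revisions = []
    per_file_versions = {}
    for number, changed in enumerate(commits, start=1):
        global_revisions.append(number)
        for name in changed:
            per_file_versions[name] = per_file_versions.get(name, 0) + 1
    return global_revisions, per_file_versions


# Hypothetical history: each inner list is one commit.
history = [
    ["mars/mons-olympus.txt"],
    ["mars/cydonia.txt"],
    ["mars/mons-olympus.txt", "mars/cydonia.txt"],
    ["earth/himalayas.txt"],
]
revisions, versions = number_changes(history)
print(revisions[-1])   # a single number describes the whole repository
print(versions)        # per-file counters drift apart from one another
```

<p>
With a global number, "revision 3" names one snapshot of every file;
with per-file counters, we would have to record a separate number for each file to describe the same snapshot.
</p>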
379 <figure id="f:version_numbering">
380 <img src="svn/version_numbering.png" alt="Version Numbering in CVS and Subversion" />
384 A couple of cubicles away,
385 Wolfman also runs <code>svn checkout</code>
386 to get a working copy of the repository.
387 He also gets version 6,
388 so the files on his machine are the same as the files on Dracula's.
389 While he is looking through the files,
390 Dracula decides to add some information to the repository about Jupiter's moons.
391 Using his favorite editor,
392 he creates a file in the <code>jupiter</code> directory called <code>moons.txt</code>,
393 and fills it with information about Io, Europa, Ganymede, and Callisto:
396 <pre src="svn/moons_initial.txt">
397 Name Orbital Radius Orbital Period Mass Radius
398 Io 421.6 1.769138 893.2 1821.6
399 Europa 670.9 3.551181 480.0 1560.8
400 Ganymede 1070.4 7.154553 1481.9 2631.2
401 Calisto 1882.7 16.689018 1075.9 2410.3
405 After double-checking his data,
406 he wants to commit the file to the repository so that everyone else on the project can see it.
407 The first step is to add the file to his working copy using <code>svn add</code>:
411 $ <span class="in">svn add jupiter/moons.txt</span>
412 <span class="out">A jupiter/moons.txt</span>
416 Adding a file is not the same as creating it—he has already done that.
418 the <code>svn add</code> command tells Subversion to add the file to
419 the list of things it's supposed to manage.
421 particularly in programming projects,
422 to have backup files or intermediate files in a directory
423 that aren't worth storing in the repository.
424 This is why version control requires us to explicitly tell it which files are to be managed.
428 Once he has told Subversion to add the file,
429 Dracula can go ahead and commit his changes to the repository.
430 He uses the <code>-m</code> flag to provide a one-line message explaining what he's doing;
432 Subversion would open his default editor
433 so that he could type in something longer.
437 $ <span class="in">svn commit -m "Some basic facts about the Galilean moons of Jupiter." jupiter/moons.txt</span>
438 <span class="out">Adding jupiter/moons.txt
439 Transmitting file data .
440 Committed revision 7.</span>
444 When Dracula runs this command,
445 Subversion establishes a connection to the server,
446 copies over his changes,
447 and updates the revision number from 6 to 7
448 (<a href="#f:updated_repo">Figure XXX</a>).
450 this version number applies to the <em>whole</em> repository,
451 not just to files that have changed.
454 <figure id="f:updated_repo">
455 <img src="svn/updated_repo.png" alt="Updated Repository" />
458 <p id="a:define-head">
460 Wolfman uses <code>svn update</code> to update his working copy.
461 It tells him that a new file has been added
462 and brings his working copy up to date with version 7 of the repository,
463 because this is now the most recent revision
464 (also called the <a href="glossary.html#head">head</a>).
465 <code>svn update</code> updates an existing working copy,
466 rather than checking out a new one.
467 While <code>svn checkout</code> is usually only run once per project per machine,
468 <code>svn update</code> may be run many times a day.
472 Looking in the new file <code>jupiter/moons.txt</code>,
473 Wolfman notices that Dracula has misspelled "Callisto"
(it is supposed to have two L's).
475 Wolfman edits that line of the file:
478 <pre src="svn/moons_spelling.txt">
479 Name Orbital Radius Orbital Period Mass Radius
480 Io 421.6 1.769138 893.2 1821.6
481 Europa 670.9 3.551181 480.0 1560.8
482 Ganymede 1070.4 7.154553 1481.9 2631.2
483 <span class="highlight">Callisto 1882.7 16.689018 1075.9 2410.3</span>
487 He also adds a line about Amalthea,
488 which he thinks might be a good site for a secret lair despite its small size:
491 <pre src="svn/moons_amalthea.txt">
492 Name Orbital Radius Orbital Period Mass Radius
493 <span class="highlight">Amalthea 181.4 0.498179 0.075 125.0</span>
494 Io 421.6 1.769138 893.2 1821.6
495 Europa 670.9 3.551181 480.0 1560.8
496 Ganymede 1070.4 7.154553 1481.9 2631.2
497 Callisto 1882.7 16.689018 1075.9 2410.3
501 uses the <code>svn status</code> command to check that he hasn't accidentally changed anything else:
505 $ <span class="in">svn status</span>
506 <span class="out">M jupiter/moons.txt</span>
510 and then runs <code>svn commit</code>.
Since he hasn't used the <code>-m</code> flag to provide a message on the command line,
512 Subversion launches his default editor and shows him:
517 --This line, and those below, will be ignored--
523 He changes this to be
527 1. Fixed typo in moon's name: 'Calisto' -> 'Callisto'.
528 2. Added information about Amalthea.
529 --This line, and those below, will be ignored--
535 When he saves this temporary file and exits the editor,
536 Subversion commits his changes:
540 <span class="out">Sending jupiter/moons.txt
541 Transmitting file data .
542 Committed revision 8.</span>
546 Note that since Wolfman didn't specify a particular file to commit,
547 Subversion commits <em>all</em> of his changes.
548 This is why he ran the <code>svn status</code> command first.
551 <div class="box" id="b:basics:transaction">
553 <h3>Working With Multiple Files</h3>
556 Our example only includes one file,
557 but version control can work on any number of files at once.
559 if Wolfman noticed that a dozen data files had the same incorrect header,
560 he could change it in all 12 files,
561 then commit all those changes at once.
562 This is actually the best way to work:
563 every logical change to the project should be a single commit,
564 and every commit should include everything involved in one logical change.
571 when Dracula rises from his coffin to start work,
572 the first thing he wants to do is get Wolfman's changes.
573 Before updating his working copy with <code>svn update</code>,
575 he wants to see the differences between what he has
576 and what he <em>will</em> have if he updates.
578 Dracula uses <code>svn diff</code>.
579 When run without arguments,
580 it compares what's in his working copy to what he started with,
581 and shows no differences:
585 $ <span class="in">svn diff</span>
590 To compare his working copy to the master,
591 Dracula uses <code>svn diff -r HEAD</code>.
592 The <code>-r</code> flag is used to specify a revision,
593 while <code>HEAD</code> means
594 "<a href="#a:define-head">the latest version of the master</a>".
598 $ <span class="in">svn diff -r HEAD</span>
599 <span class="out">--- moons.txt(revision 8)
600 +++ moons.txt(working copy)
602 Name Orbital Radius Orbital Period Mass Radius
603 +Amalthea 181.4 0.498179 0.075 125.0
604 Io 421.6 1.769138 893.2 1821.6
605 Europa 670.9 3.551181 480.0 1560.8
606 Ganymede 1070.4 7.154553 1481.9 2631.2
607 -Calisto 1882.7 16.689018 1075.9 2410.3
608 +Callisto 1882.7 16.689018 1075.9 2410.3
613 After looking over the changes,
614 Dracula goes ahead and does the update.
618 <h3>Reading a Diff</h3>
621 The output of <code>diff</code> isn't particularly user-friendly,
622 but actually isn't that hard to figure out.
--- moons.txt(revision 8)
+++ moons.txt(working copy)
signal that '-' will be used to show content from revision 8
633 and '+' to show content from the user's working copy.
634 The next line, with the '@' markers,
635 indicates where lines were inserted or removed.
636 This isn't really intended for human consumption:
637 a variety of other software tools will use this information.
641 The most important parts of what follows are the lines marked with '+' and '-',
642 which show insertions and deletions respectively.
644 we can see that the line for Amalthea was inserted,
645 and that the line for Callisto was changed
646 (which is indicated by an add and a delete right next to one another).
647 Many editors and other tools can display diffs like this in a two-column display,
648 highlighting changes.
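<p>
To get a feel for the format,
it can help to generate a small diff ourselves.
The Python sketch below uses the standard library's <code>difflib</code> module,
not Subversion's own diff code,
to compare the two versions of <code>moons.txt</code> from this section;
the variable names and the labels passed as <code>fromfile</code> and <code>tofile</code> are choices made for this example.
</p>

```python
import difflib

before = [
    "Name Orbital Radius Orbital Period Mass Radius",
    "Io 421.6 1.769138 893.2 1821.6",
    "Europa 670.9 3.551181 480.0 1560.8",
    "Ganymede 1070.4 7.154553 1481.9 2631.2",
    "Calisto 1882.7 16.689018 1075.9 2410.3",
]
after = [
    "Name Orbital Radius Orbital Period Mass Radius",
    "Amalthea 181.4 0.498179 0.075 125.0",
    "Io 421.6 1.769138 893.2 1821.6",
    "Europa 670.9 3.551181 480.0 1560.8",
    "Ganymede 1070.4 7.154553 1481.9 2631.2",
    "Callisto 1882.7 16.689018 1075.9 2410.3",
]

diff_lines = list(difflib.unified_diff(
    before, after,
    fromfile="moons.txt (before)", tofile="moons.txt (after)",
    lineterm="",
))

# Lines starting with a single '-' or '+' are deletions and insertions;
# the '---' and '+++' lines are the two-line header naming the files.
removed = [line[1:] for line in diff_lines
           if line.startswith("-") and not line.startswith("---")]
added = [line[1:] for line in diff_lines
         if line.startswith("+") and not line.startswith("+++")]

print("\n".join(diff_lines))
```

<p>
Here <code>removed</code> holds the misspelled line
and <code>added</code> holds both the new Amalthea line and the corrected Callisto line,
which is the "delete next to an add" pattern described above.
</p>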
654 This is a very common workflow,
655 and is the basic heartbeat of most developers' days.
663 Check to see if there are changes in the repository to download.
667 Update our working copy with those changes.
675 Commit our changes to the repository so that other people can get them.
681 It's worth noticing here how important Wolfman's comments about his changes were.
682 It's hard to see the difference between "Calisto" with one 'L' and "Callisto" with two,
683 even if the line containing the difference has been highlighted.
684 Without Wolfman's comments,
685 Dracula might have wasted time wondering what the difference was.
690 Wolfman should probably have committed his two changes separately,
691 since there's no logical connection between
692 fixing a typo in Callisto's name
693 and adding information about Amalthea to the same file.
694 Just as a function or program should do one job and one job only,
695 a single commit to version control should have a single logical purpose so that it's easier to find,
697 and if necessary undo later on.
700 <div class="keypoints" id="k:basics">
703 <li>Version control is a better way to manage shared files than email or shared folders.</li>
704 <li>The master copy is stored in a repository.</li>
<li>Nobody ever edits the master copy directly: instead, each person edits a local working copy.</li>
706 <li>People share changes by committing them to the master or updating their local copy from the master.</li>
707 <li idea="paranoia">The version control system prevents people from overwriting each other's work by forcing them to merge concurrent changes before committing.</li>
708 <li idea="perf">It also keeps a complete history of changes made to the master so that old versions can be recovered reliably.</li>
709 <li>Version control systems work best with text files, but can also handle binary files such as images and Word documents.</li>
710 <li>Every repository is identified by a URL.</li>
711 <li>Working copies of different repositories may not overlap.</li>
<li>Each change to the master copy is identified by a unique revision number.</li>
713 <li>Revisions identify snapshots of the entire repository, not changes to individual files.</li>
714 <li idea="perf">Each change should be commented to make the history more readable.</li>
715 <li>Commits are transactions: either all changes are successfully committed, or none are.</li>
716 <li>The basic workflow for version control is update-change-commit.</li>
717 <li><code>svn add <em>things</em></code> tells Subversion to start managing particular files or directories.</li>
718 <li><code>svn checkout <em>url</em></code> checks out a working copy of a repository.</li>
719 <li><code>svn commit -m "<em>message</em>" <em>things</em></code> sends changes to the repository.</li>
720 <li><code>svn diff</code> compares the current state of a working copy to the state after the most recent update.</li>
721 <li><code>svn diff -r HEAD</code> compares the current state of a working copy to the state of the master copy.</li>
<li><code>svn log</code> shows the history of a working copy.</li>
723 <li><code>svn status</code> shows the status of a working copy.</li>
724 <li><code>svn update</code> updates a working copy from the repository.</li>
730 <section id="s:merge">
732 <h2>Merging Conflicts</h2>
734 <div class="understand" id="u:merge">
737 <li>What a conflict in an update is.</li>
738 <li>How to resolve conflicts when updating.</li>
743 Dracula and Wolfman have both synchronized their working copies of <code>monsters</code>
744 with version 8 of the repository.
745 Dracula now edits his copy to change Amalthea's radius
746 from a single number to a triple to reflect its irregular shape:
749 <pre src="svn/moons_dracula_triple.txt">
750 Name Orbital Radius Orbital Period Mass Radius
751 <span class="highlight">Amalthea 181.4 0.498179 0.075 131 x 73 x 67</span>
752 Io 421.6 1.769138 893.2 1821.6
753 Europa 670.9 3.551181 480.0 1560.8
754 Ganymede 1070.4 7.154553 1481.9 2631.2
755 Callisto 1882.7 16.689018 1075.9 2410.3
759 He then commits his work,
760 creating revision 9 of the repository
761 (<a href="#f:after_dracula_commits">Figure XXX</a>).
764 <figure id="f:after_dracula_commits">
765 <img src="svn/after_dracula_commits.png" alt="After Dracula Commits" />
769 But while he is doing this,
770 Wolfman is editing <em>his</em> copy
771 to add information about two other minor moons,
775 <pre src="svn/moons_wolfman_extras.txt">
776 Name Orbital Radius Orbital Period Mass Radius
Amalthea 181.4 0.498179 0.075 125.0
778 Io 421.6 1.769138 893.2 1821.6
779 Europa 670.9 3.551181 480.0 1560.8
780 Ganymede 1070.4 7.154553 1481.9 2631.2
781 Callisto 1882.7 16.689018 1075.9 2410.3
782 <span class="highlight">Himalia 11460 250.5662 0.095 85.0
783 Elara 11740 259.6528 0.008 40.0</span>
787 When Wolfman tries to commit his changes to the repository,
788 Subversion won't let him:
792 $ <span class="in">svn commit -m "Added data for Himalia, Elara"</span>
793 <span class="out">Sending jupiter/moons.txt
794 svn: Commit failed (details follow):
795 svn: File or directory 'moons.txt' is out of date; try updating
796 svn: resource out of date; try updating</span>
801 Wolfman's changes were based on revision 8,
802 but the repository is now at revision 9,
803 and the file that Wolfman is trying to overwrite
804 is different in the later revision.
806 one of version control's main jobs is to make sure that
807 people don't trample on each other's work.)
808 Wolfman has to update his working copy to get Dracula's changes before he can commit.
810 Dracula edited a line that Wolfman didn't change,
811 so Subversion can merge the differences automatically.
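<p>
Whether two sets of edits can be merged automatically depends on whether they touch the same region of the common starting point.
The Python sketch below checks this with the standard library's <code>difflib</code> module.
It is a deliberately conservative simplification, not Subversion's merge algorithm:
the function names <code>changed_regions</code> and <code>edits_collide</code> are invented here,
edits to adjacent lines are treated as colliding,
and the <code>other_edit</code> data is hypothetical.
</p>

```python
import difflib


def changed_regions(base, edited):
    """Return (start, end) ranges of `base` lines that `edited` replaced, deleted, or inserted at."""
    matcher = difflib.SequenceMatcher(a=base, b=edited, autojunk=False)
    return [(i1, i2) for tag, i1, i2, _, _ in matcher.get_opcodes()
            if tag != "equal"]


def edits_collide(base, mine, theirs):
    """True if the two sets of edits touch the same or adjacent lines of `base`."""
    return any(
        m_start <= t_end and t_start <= m_end
        for m_start, m_end in changed_regions(base, mine)
        for t_start, t_end in changed_regions(base, theirs)
    )


revision_8 = [
    "Name Orbital Radius Orbital Period Mass Radius",
    "Amalthea 181.4 0.498179 0.075 125.0",
    "Io 421.6 1.769138 893.2 1821.6",
    "Europa 670.9 3.551181 480.0 1560.8",
    "Ganymede 1070.4 7.154553 1481.9 2631.2",
    "Callisto 1882.7 16.689018 1075.9 2410.3",
]

# Dracula rewrites the Amalthea line; Wolfman appends two moons at the end.
dracula = list(revision_8)
dracula[1] = "Amalthea 181.4 0.498179 0.075 131 x 73 x 67"
wolfman = revision_8 + [
    "Himalia 11460 250.5662 0.095 85.0",
    "Elara 11740 259.6528 0.008 40.0",
]

# Hypothetical third edit that also rewrites the Amalthea line.
other_edit = list(revision_8)
other_edit[1] = "Amalthea 181.4 0.498179 0.075 83.5"

print(edits_collide(revision_8, dracula, wolfman))      # separate regions
print(edits_collide(revision_8, dracula, other_edit))   # same line
```

<p>
The first check reports no collision, which is the situation in which an automatic merge is possible;
the second reports a collision, which is the kind of change a person has to sort out.
</p>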
815 This does <em>not</em> mean that Wolfman's changes have been committed to the repository:
816 Subversion only does that when it's ordered to.
817 Wolfman's changes are still in his working copy,
818 and <em>only</em> in his working copy.
819 But since Wolfman's version of the file now includes
the line that Dracula changed,
821 Wolfman can go ahead and commit them as usual to create revision 10.
825 Wolfman's working copy is now in sync with the master,
826 but Dracula's is one behind at revision 9.
828 they independently decide to add measurement units
829 to the columns in <code>moons.txt</code>.
830 Wolfman is quicker off the mark this time;
831 he adds a line to the file:
834 <pre src="svn/moons_wolfman_units.txt">
835 Name Orbital Radius Orbital Period Mass Radius
836 <span class="highlight"> (10**3 km) (days) (10**20 kg) (km)</span>
837 Amalthea 181.4 0.498179 0.075 131 x 73 x 67
838 Io 421.6 1.769138 893.2 1821.6
839 Europa 670.9 3.551181 480.0 1560.8
840 Ganymede 1070.4 7.154553 1481.9 2631.2
841 Callisto 1882.7 16.689018 1075.9 2410.3
842 Himalia 11460 250.5662 0.095 85.0
843 Elara 11740 259.6528 0.008 40.0
847 and commits it to create revision 11.
848 While he is doing this,
850 Dracula inserts a different line at the top of the file:
853 <pre src="svn/moons_dracula_units.txt">
854 Name Orbital Radius Orbital Period Mass Radius
855 <span class="highlight"> * 10^3 km * days * 10^20 kg * km</span>
856 Amalthea 181.4 0.498179 0.075 131 x 73 x 67
857 Io 421.6 1.769138 893.2 1821.6
858 Europa 670.9 3.551181 480.0 1560.8
859 Ganymede 1070.4 7.154553 1481.9 2631.2
860 Callisto 1882.7 16.689018 1075.9 2410.3
861 Himalia 11460 250.5662 0.095 85.0
862 Elara 11740 259.6528 0.008 40.0
867 when Dracula tries to commit,
868 Subversion tells him he can't.
when Dracula updates his working copy,
871 he doesn't just get the line Wolfman added to create revision 11.
872 There is an actual conflict in the file,
873 so Subversion asks Dracula what he wants to do:
876 <pre src="svn/moons_dracula_conflict.txt">
877 $ <span class="in">svn update</span>
878 <span class="out">Conflict discovered in 'jupiter/moons.txt'.
879 Select: (p) postpone, (df) diff-full, (e) edit,
880 (mc) mine-conflict, (tc) theirs-conflict,
881 (s) show all options:</span>
Dracula chooses <code>p</code> for "postpone",
886 which tells Subversion that he'll deal with the problem later.
887 Once the update is finished,
888 he opens <code>moons.txt</code> in his editor and sees:
892 Name Orbital Radius Orbital Period Mass
+&lt;&lt;&lt;&lt;&lt;&lt;&lt; .mine
+ * 10^3 km * days * 10^20 kg
+=======
+ (10**3 km) (days) (10**20 kg)
+&gt;&gt;&gt;&gt;&gt;&gt;&gt; .r11
Amalthea 181.4 0.498179 0.075
899 Io 421.6 1.769138 893.2
900 Europa 670.9 3.551181 480.0
901 Ganymede 1070.4 7.154553 1481.9
902 Callisto 1882.7 16.689018 1075.9
907 Subversion has inserted
908 <a href="glossary.html#conflict-marker">conflict markers</a>
909 in <code>moons.txt</code>
910 wherever there is a conflict.
The line <code>&lt;&lt;&lt;&lt;&lt;&lt;&lt; .mine</code> shows the start of the conflict,
and is followed by the lines from the local copy of the file.
The separator <code>=======</code> is then
followed by the lines from the repository's file that are in conflict with that section,
while <code>&gt;&gt;&gt;&gt;&gt;&gt;&gt; .r11</code> marks the end of the conflict.
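<p>
Because the markers follow a fixed pattern,
a short script can find them and separate the two sides of each conflict.
The Python sketch below does this for the marker layout described above;
the function name <code>split_conflicts</code> and the sample lines are choices made for this example,
and the code assumes markers are well formed and not nested.
</p>

```python
def split_conflicts(lines):
    """Yield (mine, theirs) line lists for each marked conflict in `lines`."""
    mine, theirs, state = [], [], "outside"
    for line in lines:
        if line.startswith("<<<<<<< "):
            # Start of a conflict: what follows comes from the local copy.
            mine, theirs, state = [], [], "mine"
        elif line.startswith("=======") and state == "mine":
            # Separator: what follows comes from the repository.
            state = "theirs"
        elif line.startswith(">>>>>>> ") and state == "theirs":
            yield mine, theirs
            state = "outside"
        elif state == "mine":
            mine.append(line)
        elif state == "theirs":
            theirs.append(line)


conflicted = [
    "Name Orbital Radius Orbital Period Mass Radius",
    "<<<<<<< .mine",
    "  * 10^3 km  * days  * 10^20 kg  * km",
    "=======",
    "  (10**3 km)  (days)  (10**20 kg)  (km)",
    ">>>>>>> .r11",
    "Amalthea 181.4 0.498179 0.075 131 x 73 x 67",
]

conflicts = list(split_conflicts(conflicted))
for mine, theirs in conflicts:
    print("local copy:", mine)
    print("repository:", theirs)
```

<p>
A check like this can also be used before committing to make sure no markers have been left in a file by accident.
</p>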
919 Before he can commit,
920 Dracula has to edit his copy of the file to get rid of those markers.
924 <pre src="svn/moons_dracula_resolved.txt">
925 Name Orbital Radius Orbital Period Mass Radius
926 <span class="highlight"> (10^3 km) (days) (10^20 kg) (km)</span>
927 Amalthea 181.4 0.498179 0.075 131 x 73 x 67
928 Io 421.6 1.769138 893.2 1821.6
929 Europa 670.9 3.551181 480.0 1560.8
930 Ganymede 1070.4 7.154553 1481.9 2631.2
931 Callisto 1882.7 16.689018 1075.9 2410.3
932 Himalia 11460 250.5662 0.095 85.0
933 Elara 11740 259.6528 0.008 40.0
937 then uses the <code>svn resolved</code> command to tell Subversion that
938 he has fixed the problem.
939 Subversion will now let him commit to create revision 12.
944 <h3>Auxiliary Files</h3>
947 When Dracula did his update and Subversion detected the conflict in <code>moons.txt</code>,
948 it created three temporary files to help Dracula resolve it.
949 The first is called <code>moons.txt.r9</code>;
950 it is the file as it was in Dracula's local copy
951 before he started making changes,
952 i.e., the common ancestor for his work
953 and whatever he is in conflict with.
957 The second file is <code>moons.txt.r11</code>.
958 This is the most up-to-date revision from the repository—the
959 file as it is including Wolfman's changes.
960 The third temporary file, <code>moons.txt.mine</code>,
961 is the file as it was in Dracula's working copy before he did the Subversion update.
965 Subversion creates these auxiliary files primarily
966 to help people merge conflicts in binary files.
It wouldn't make sense to insert <code>&lt;&lt;&lt;&lt;&lt;&lt;&lt;</code>
and <code>&gt;&gt;&gt;&gt;&gt;&gt;&gt;</code> characters into an image file
969 (it would almost certainly result in a corrupted image).
970 The <code>svn resolved</code> command deletes these three extra files
971 as well as telling Subversion that the conflict has been taken care of.
977 Some power users prefer to work with interpolated conflict markers directly,
978 but for the rest of us,
979 there are several tools for displaying differences and helping to merge them,
980 including <a href="http://diffuse.sourceforge.net/">Diffuse</a> and <a href="http://winmerge.org/">WinMerge</a>.
981 If Dracula launches Diffuse,
982 it displays his file,
983 the common base that he and Wolfman were working from,
984 and Wolfman's file in a three-pane view
985 (<a href="#f:diff_viewer">Figure XXX</a>):
988 <figure id="f:diff_viewer">
989 <img src="svn/diff_viewer.png" alt="A Difference Viewer" />
993 Dracula can use the buttons to merge changes from either of the edited versions
994 into the common ancestor,
995 or edit the central pane directly.
When he has finished, he uses <code>svn resolved</code> and <code>svn commit</code>
999 to create revision 12 of the repository.
1003 In this case, the conflict was small and easy to fix.
1004 However, if two or more people on a team are repeatedly creating conflicts for one another,
1005 it's usually a signal of deeper communication problems:
1006 either they aren't talking as often as they should, or their responsibilities overlap.
Either way, the version control system can help the team find and fix these issues
1009 so that it will be more productive in future.
1014 <h3>Working With Multiple Files</h3>
1017 As mentioned <a href="#a:transaction">earlier</a>,
1018 every logical change to a project should result in a single commit,
1019 and every commit should represent one logical change.
This is especially true when resolving conflicts:
the work done to reconcile one person's changes with another's is often complicated,
so it should be a single entry in the project's history,
with other, later changes coming after it.
1028 <div class="keypoints" id="k:merge">
1031 <li>Conflicts must be resolved before a commit can be completed.</li>
1032 <li>Subversion puts markers in text files to show regions of conflict.</li>
1033 <li>For each conflicted file, Subversion creates auxiliary files containing the common parent, the master version, and the local version.</li>
<li><code>svn resolved <em>files</em></code> tells Subversion that conflicts have been resolved.</li>
1040 <section id="s:rollback">
1042 <h2>Recovering Old Versions</h2>
1044 <div class="understand" id="u:rollback">
1045 <h3>Understand:</h3>
1047 <li>How to undo changes to a working copy.</li>
1048 <li>How to recover old versions of files.</li>
1049 <li>What a branch is.</li>
1054 Now that we have seen how to merge files and resolve conflicts,
1055 we can look at how to use version control as an "infinite undo".
1056 Suppose that when Wolfman starts work late one night,
1057 his copy of <code>monsters</code> is in sync with the head at revision 12.
1058 He decides to edit the file <code>moons.txt</code>;
1059 unfortunately, he forgot that there was a full moon,
1060 so his changes don't make a lot of sense:
1063 <pre src="svn/poetry.txt">
1064 Just one moon can make me growl
1065 Four would make me want to howl
1070 When he's back in human form the next day,
1071 he wants to undo his changes.
1072 Without version control, his choices would be grim:
1073 he could try to edit them back into their original state by hand
1074 (which for some reason hardly ever seems to work),
1075 or ask his colleagues to send him their copies of the files
1076 (which is almost as embarrassing as chasing the neighbor's cat when in wolf form).
1080 Since he's using Subversion, though,
1081 and hasn't committed his work to the repository,
1082 all he has to do is <a href="glossary.html#revert">revert</a> his local changes.
1083 <code>svn revert</code> simply throws away local changes to files
1084 and puts things back the way they were before those changes were made.
This is a purely local operation:
since Subversion keeps an unmodified copy of each file's checked-out revision inside the working copy,
Wolfman doesn't need to be connected to the network to do this.
Wolfman uses <code>svn diff</code> <em>without</em> the <code>-r HEAD</code> flag
to take a look at the differences between his file
and the copy he last got from the repository.
1095 Since he doesn't want to keep his changes,
1096 his next command is <code>svn revert moons.txt</code>.
1100 $ <span class="in">cd jupiter</span>
1101 $ <span class="in">svn revert moons.txt</span>
<span class="out">Reverted 'moons.txt'</span>
1106 What if someone <em>has</em> committed their changes,
1107 but still wants to undo them?
Suppose Dracula decides that the numbers in <code>moons.txt</code> would look better with commas.
1110 He edits the file to put them in:
1113 <pre src="svn/moons_commas.txt">
Name            Orbital Radius  Orbital Period  Mass            Radius
                (10^3 km)       (days)          (10^20 kg)      (km)
Amalthea        181.4           0.498179        0.075           131 x 73 x 67
Io              421.6           1.769138        893.2           1<span class="highlight">,</span>821.6
Europa          670.9           3.551181        480.0           1<span class="highlight">,</span>560.8
Ganymede        1<span class="highlight">,</span>070.4         7.154553        1<span class="highlight">,</span>481.9         2<span class="highlight">,</span>631.2
Callisto        1<span class="highlight">,</span>882.7         16.689018       1<span class="highlight">,</span>075.9         2<span class="highlight">,</span>410.3
Himalia         11<span class="highlight">,</span>460          250.5662        0.095           85.0
Elara           11<span class="highlight">,</span>740          259.6528        0.008           40.0
1125 <p class="continue">
1126 then commits his changes to create revision 13.
1127 A little while later,
1128 the Mummy sees the change and orders Dracula to put things back the way they were.
1129 What should Dracula do?
1133 We can draw the sequence of events leading up to revision 13
as shown in <a href="#f:before_undoing">Figure XXX</a>:
1137 <figure id="f:before_undoing">
1138 <img src="svn/before_undoing.png" alt="Before Undoing" />
1141 <p class="continue">
Dracula wants to erase revision 13 from the repository,
but he can't actually do that:
once a change is in the repository,
it stays there.
What he can do instead is merge the old revision with the current revision
to create a new revision
(<a href="#f:merging_history">Figure XXX</a>).
1151 <figure id="f:merging_history">
1152 <img src="svn/merging_history.png" alt="Merging History" />
1155 <p class="continue">
1156 This is exactly like merging changes made by two different people;
1157 the only difference is that the "other person" is his past self.
1162 Dracula must merge revision 12 (the one before his change)
1163 with revision 13 (the current head revision)
1164 using <code>svn merge</code>:
1168 $ <span class="in">svn merge -r HEAD:12 moons.txt</span>
1169 <span class="out">-- Reverse-merging r13 into 'moons.txt'
1173 <p class="continue">
The <code>-r</code> flag specifies the range of revisions to merge:
to undo the changes from revision 12 to revision 13,
he uses either <code>13:12</code> or <code>HEAD:12</code>,
putting the newer revision first.
This is called a <a href="glossary.html#reverse-merge">reverse</a> merge
because he's going backward in time.
1183 After he runs this command,
1184 he must run <code>svn commit</code> to save the changes to the repository.
1185 This creates a new revision, number 14,
1186 rather than erasing revision 13.
The changes he made to create revision 13 are still there
1189 if he can ever convince the Mummy that numbers should have commas.
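<p>
The whole sequence can be tried safely in a throwaway repository.
The sketch below assumes that <code>svn</code> and <code>svnadmin</code> are installed;
the function name, the directory layout, and the one-line data file are invented for illustration,
and the revisions are numbered 1 to 3 rather than 12 to 14.
</p>

```shell
# Sketch: undo a committed change with a reverse merge.
# Usage: undo_last_commit_demo /some/empty/absolute/directory
undo_last_commit_demo() (
    dir=$1
    svnadmin create "$dir/repo" || exit 1
    svn checkout -q "file://$dir/repo" "$dir/wc" || exit 1
    cd "$dir/wc" || exit 1

    echo "Io 1821.6" > moons.txt
    svn add -q moons.txt || exit 1
    svn commit -q -m "Adding moons.txt" || exit 1        # creates revision 1

    echo "Io 1,821.6" > moons.txt
    svn commit -q -m "Adding commas" || exit 1           # creates revision 2

    svn update -q || exit 1                              # make sure we are at HEAD
    svn merge -q -r HEAD:1 moons.txt || exit 1           # reverse-merge revision 2
    svn commit -q -m "Removing commas again" || exit 1   # creates revision 3
)
```

<p>
Running <code>svn log</code> in the working copy afterward lists all three revisions:
the commit that added the commas is still part of the history,
even though the file no longer contains them.
</p>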
1193 Merging can be used to recover older revisions of files,
1194 not just the most recent,
1195 and to recover many files or directories at a time.
1196 The most frequent use, though,
1197 is to manage parallel streams of development in large projects.
1198 This is outside the scope of this chapter,
1199 but the basic idea is simple.
1203 Suppose that Universal Monsters has just released a new program for designing secret lairs.
1204 Dracula and Wolfman are supposed to start adding a few features
1205 that had to be left out of the first release because time ran short.
1207 Frankenstein and the Mummy are doing technical support:
1208 their job is to fix any bugs that users find.
1209 All sorts of things could go wrong if both teams tried to work on the same code at the same time.
For example, if Frankenstein fixed a bug and sent a new copy of the program to a user in Greenland,
1212 it would be all too easy for him to accidentally include
1213 the half-completed shark tank control feature that Wolfman was working on.
1217 The usual way to handle this situation is
1218 to create a <a href="glossary.html#branch">branch</a>
1219 in the repository for each major sub-project
1220 (<a href="#f:branch_merge">Figure XXX</a>).
1221 While Wolfman and Dracula work on
1222 the <a href="glossary.html#main-line">main line</a>,
1223 Frankenstein and the Mummy create a branch,
1224 which is just another copy of the repository's files and directories
1225 that is also under version control.
1226 They can work in their branch without disturbing Wolfman and Dracula and vice versa:
1229 <figure id="f:branch_merge">
1230 <img src="svn/branch_merge.png" alt="Branching and Merging" />
1234 Branches in version control repositories are often described as "parallel universes".
1235 Each branch starts off as a clone of the project at some moment in time
1236 (typically each time the software is released,
1237 or whenever work starts on a major new feature).
1238 Changes made to a branch only affect that branch,
1239 just as changes made to the files in one directory don't affect files in other directories.
The branch and the main line are both stored in the same repository,
1242 so their revision numbers are always in step.
1246 If someone decides that a bug fix in one branch should also be made in another,
1247 all they have to do is merge the files in question.
1248 This is exactly like merging an old version of a file with the current one,
1249 but instead of going backward in time,
1250 the change is brought sideways from one branch to another.
1254 Branching helps projects scale up by letting sub-teams work independently,
1255 but too many branches can cause as many problems as they solve.
1256 Karl Fogel's excellent book
1257 <a href="bib.html#fogel-producing-oss"><cite>Producing Open Source Software</cite></a>,
1258 and Laura Wingerd and Christopher Seiwald's paper
1259 "<a href="bib.html#wingerd-seiwald-scm">High-level Best Practices in Software Configuration Management</a>",
1260 talk about branches in much more detail.
1261 Projects usually don't need to do this until they have a dozen or more developers,
1262 or until several versions of their software are in simultaneous use,
1263 but using branches is a key part of switching from software carpentry to software engineering.
1266 <div class="keypoints" id="k:rollback">
1269 <li>Old versions of files can be recovered by merging their old state with their current state.</li>
1270 <li>Recovering an old version of a file does not erase the intervening changes.</li>
1271 <li>Use branches to support parallel independent development.</li>
1272 <li><code>svn merge</code> merges two revisions of a file.</li>
1273 <li><code>svn revert</code> undoes local changes to files.</li>
1279 <section id="s:setup">
1281 <h2>Setting up a Repository</h2>
1283 <div class="understand" id="u:setup">
1284 <h3>Understand:</h3>
1286 <li>How to create a repository.</li>
1291 It is finally time to see how to create a repository.
We will keep the master copy of our work in a repository
on a server that we can access from other machines on the internet.
That master copy consists of files and directories that no one ever edits directly.
1296 Instead, a copy of Subversion running on that machine
1297 manages updates for us and watches for conflicts.
1298 Our working copy is a mirror image of the master sitting on our computer.
1299 When our Subversion client needs to communicate with the master,
1300 it exchanges data with the copy of Subversion running on the server.
1303 <figure id="f:repo_four_things">
1304 <img src="svn/repo_four_things.png" alt="What's Needed for a Repository" />
To make this work, we need four things
1309 (<a href="#f:repo_four_things">Figure XXX</a>):
1315 The repository itself.
1316 It's not enough to create an empty directory and start filling it with files:
1317 Subversion needs to create a lot of other structure
1318 in order to keep track of old revisions, who made what changes, and so on.
1322 The full URL of the repository.
1323 This includes the URL of the server
1324 and the path to the repository on that machine.
(The second part is needed because a single server can
host many repositories.)
1331 Permission to read or write the master copy.
1332 Many open source projects give the whole world permission to read from their repository,
1333 but very few allow strangers to write to it:
1334 there are just too many possibilities for abuse.
1335 Somehow, we have to set up a password or something like it
1336 so that users can prove who they are.
1340 A working copy of the repository on our computer.
1341 Once the first three things are in place,
1342 this just means running the <code>checkout</code> command.
1348 To keep things simple,
1349 we will start by creating a repository on the machine that we're working on.
1350 This won't let us share our work with other people,
1351 but it <em>will</em> allow us to save the history of our work as we go along.
1355 The command to create a repository is <code>svnadmin create</code>,
1356 followed by the path to the repository.
1357 If we want to create a repository called <code>lair_repo</code>
1358 directly under our home directory,
1359 we just <code>cd</code> to get home
1360 and run <code>svnadmin create lair_repo</code>.
1361 This command creates a directory called <code>lair_repo</code> to hold our repository,
1362 and fills it with various files that Subversion uses
1363 to keep track of the project's history:
1367 $ <span class="in">cd</span>
1368 $ <span class="in">svnadmin create lair_repo</span>
1369 $ <span class="in">ls -F lair_repo</span>
1370 <span class="out">README.txt conf/ db/ format hooks/ locks/</span>
1373 <p class="continue">
1374 We should <em>never</em> edit anything in this repository directly.
1375 Doing so probably won't shred our sanity and leave us gibbering in mindless horror,
1376 but it will almost certainly make the repository unusable.
1380 To get a working copy of this repository,
1381 we use Subversion's <code>checkout</code> command.
1382 If our home directory is <code>/users/mummy</code>,
1383 then the full path to the repository we just created is <code>/users/mummy/lair_repo</code>,
so we run <code>svn checkout file:///users/mummy/lair_repo lair_working</code>.
In this command,
the second argument,
<code>lair_working</code>,
specifies where the working copy is to be put.
The first argument is the URL of our repository,
and it has two parts.
<code>/users/mummy/lair_repo</code> is the path to the repository directory.
1395 <code>file://</code> specifies the <a href="glossary.html#protocol">protocol</a>
1396 that Subversion will use to communicate with the repository—in this case,
1397 it says that the repository is part of the local machine's filesystem.
1398 Notice that the protocol ends in two slashes,
1399 while the absolute path to the repository starts with a slash,
1400 making three in total.
1401 A very common mistake is to type only two, since that's what web URLs normally have.
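<p>
One way to avoid miscounting slashes is to let the shell build the URL from the repository's directory.
This is only a sketch:
<code>repo_url</code> is a hypothetical helper, not part of Subversion.
</p>

```shell
# Hypothetical helper: print the file:// URL for a local repository directory.
repo_url() {
    # Resolve the argument to an absolute path, which always starts with "/",
    # so that "file://" plus the path gives three slashes in total.
    abs=$(cd "$1" && pwd) || return 1
    echo "file://$abs"
}
```

<p>
With this helper defined, <code>svn checkout "$(repo_url lair_repo)" lair_working</code>
checks out the repository in the current directory without typing the URL by hand.
</p>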
When we're doing a checkout,
it is <em>very</em> important that we provide the second argument,
which specifies the name of the directory we want the working copy to be put in.
If we don't,
Subversion will try to use the name of the repository,
<code>lair_repo</code>,
as the name of the working copy.
Since we're in the directory that contains the repository,
this means that Subversion will try to overwrite the repository with a working copy.
Once again,
there isn't much risk of our sanity being torn to shreds,
but this could ruin our repository.
1420 To avoid this problem,
1421 most people create a sub-directory in their account called something like <code>repos</code>,
1422 and then create their repositories in that.
For example, we could create our repository in <code>/users/mummy/repos/lair</code>,
1425 then check out a working copy as <code>/users/mummy/lair</code>.
1426 This practice makes both names easier to read.
1430 The obvious next steps are
1431 to put our repository on a server,
1432 rather than on our personal machine,
1433 and to give other people access to the repository we have just created
1434 so that they can work with us.
We'll discuss the first in <a href="web.html#s:svn">a later chapter</a>,
but the second really does require things that we are not going to cover in this course.
1438 If you want to do this, you can:
1444 ask your system administrator to set it up for you;
1448 use an open source hosting service like <a href="http://www.sf.net">SourceForge</a>,
1449 <a href="http://code.google.com">Google Code</a>,
1450 <a href="https://github.com/">GitHub</a>,
1451 or <a href="https://bitbucket.org/">BitBucket</a>; or
1455 spend a few dollars a month on a commercial hosting service like <a href="http://dreamhost.com">DreamHost</a>
1456 that provides web-based GUIs for creating and managing repositories.
1462 If you choose the second or third option,
1463 please check with whoever handles intellectual property at your institution
1464 to make sure that putting your work on a commercially-operated machine
1465 that is probably in some other legal jurisdiction
1466 isn't going to cause trouble.
1467 Many people assume that it's "just OK",
1468 while others act as if not having asked will be an acceptable defence later on.
Neither is true…
1473 <div class="keypoints" id="k:setup">
1476 <li>Repositories can be hosted locally, on local (departmental) servers, on hosting services, or on their owners' own domains.</li>
1477 <li><code>svnadmin create <em>name</em></code> creates a new repository.</li>
1483 <section id="s:provenance">
1487 <div class="understand" id="u:provenance">
1488 <h3>Understand:</h3>
1490 <li>What data provenance is.</li>
1491 <li>How to embed version numbers and other information in files managed by version control.</li>
1492 <li>How to record version information about a program in its output.</li>
In art,
the <a href="glossary.html#provenance">provenance</a> of a work
is the history of who owned it, when, and where.
In science,
it's the record of how a particular result came to be:
what raw data was processed by what version of what program to create which intermediate files,
what was used to turn those files into which figures of which papers,
and so on.
1508 One of the central ideas of this course is that
we can automatically track the provenance of scientific data.
For example, suppose we have a text file <code>combustion.dat</code> in a Subversion repository.
1512 Run the following two commands:
1516 $ svn propset svn:keywords Revision combustion.dat
1517 $ svn commit -m "Turning on the 'Revision' keyword" combustion.dat
1521 Now open the file in an editor
1522 and add the following line somewhere near the top:
1530 The '#' sign isn't important:
1531 it's just what <code>.dat</code> files use to show comments.
The <code>$Revision:$</code> string, however,
means something special to Subversion.
1535 Save the file, and commit the change:
1539 $ svn commit -m "Inserting the 'Revision' keyword" combustion.dat
1543 When we open the file again,
1544 we'll see that Subversion has changed that line to something like:
1551 <p class="continue">
1552 i.e., Subversion has inserted the version number
1553 after the colon and before the closing <code>$</code>.
1557 Here's what just happened.
1558 First, Subversion allows you to set
1559 <a href="glossary.html#property-subversion">properties</a>
for files and directories.
1561 These properties aren't in the files or directories themselves,
1562 but live in Subversion's database.
1563 One of those properties,
1564 <code>svn:keywords</code>,
1565 tells Subversion to look in files that are being changed
1566 for strings of the form <code>$propertyname: …$</code>,
1567 where <code>propertyname</code> is a string like <code>Revision</code> or <code>Author</code>.
1568 (About half a dozen such strings are supported.)
1572 If it sees such a string,
1573 Subversion rewrites it as the commit is taking place to replace <code>…</code>
1574 with the current version number,
1575 the name of the person making the change,
1576 or whatever else the property's name tells it to do.
1577 You only have to add the string to the file once;
1579 Subversion updates it for you every time the file changes.
1583 Putting the version number in the file this way can be pretty handy.
1584 If you copy the file to another machine,
1586 it carries its version number with it,
1587 so you can tell which version you have even if it's outside version control.
1588 We'll see some more useful things we can do with this information in
1589 <a href="python.html">the next chapter</a>.
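<p>
Once the keyword has been filled in,
the version number can be read back out of any copy of the file with ordinary text tools.
The sketch below assumes the expanded form is <code>$Revision: N $</code> with a numeric <code>N</code>;
<code>revision_of</code> is a hypothetical helper name.
</p>

```shell
# Hypothetical helper: print the number inside the first expanded
# $Revision: N $ keyword found in a file (prints nothing if there is none).
revision_of() {
    sed -n 's/.*\$Revision: *\([0-9][0-9]*\) *\$.*/\1/p' "$1" | head -n 1
}
```

<p>
A file whose keyword has not been expanded yet (it still reads <code>$Revision:$</code>)
produces no output, which is a quick way to spot copies that were never committed with the property set.
</p>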
1594 <h3>When <em>Not</em> to Use Version Control</h3>
1597 Despite the rapidly decreasing cost of storage,
1598 it is still possible to run out of disk space.
People can easily go through 2 TB/month if they're not careful.
Version control tools usually store revisions in terms of lines,
so with binary data files
they end up essentially storing every revision separately.
That isn't bad in itself
(it's what we'd be doing anyway),
but it means version control isn't doing what it is good at,
and the repository can get very large very quickly.
1608 Another concern is that if very old data will no longer be used,
1609 it can be nice to archive or delete old data files.
1610 This is not possible if our data is version controlled:
1611 information can only be added to a repository,
1612 so it can only ever increase in size.
1618 We can use this trick with shell scripts too,
1619 or with almost any other kind of program.
1620 Going back to Nelle Nemo's data processing from the previous chapter,
1622 suppose she writes a shell script that uses <code>gooclean</code>
1623 to tidy up data files.
1624 Her first version looks like this:
1630 gooclean -b 0 100 < $filename > cleaned-$filename
1634 <p class="continue">
1635 i.e., it runs <code>gooclean</code> with bounding values of 0 and 100
1636 for each specified file,
1637 putting the result in a temporary file with a well-defined name.
1638 Assuming that '#' is the comment character for those kinds of data files,
1639 she could instead write:
<span class="highlight">echo '# gooclean $Revision: 901$ -b 0 100' > cleaned-$filename</span>
1646 gooclean -b 0 100 < $filename <span class="highlight">>></span> cleaned-$filename
1651 The first change puts a line in the output file
1652 that describes how that file was created.
1653 The second change is to use <code>>></code> instead of <code>></code>
1654 to redirect <code>gooclean</code>'s output to the file.
1655 <code>>></code> means "append to":
1656 instead of overwriting whatever is in the file,
1657 it adds more content to it.
1658 This ensures that the first line of the file is the provenance record,
1659 with the actual output of <code>gooclean</code> after it.
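<p>
Put together, the idea looks something like the sketch below.
It makes two assumptions that are not part of Nelle's script:
<code>clean</code> is a stand-in that simply copies its input, so the example runs without <code>gooclean</code>,
and the loop is wrapped in a function called <code>record_and_clean</code>.
The header is written in single quotes so that the shell does not try to expand <code>$Revision</code> as a variable,
and it goes to the output file, never to the input.
</p>

```shell
# Stand-in for "gooclean -b 0 100" so that this sketch is runnable anywhere.
clean() {
    cat
}

# Write a provenance header, then append the cleaned data after it.
record_and_clean() {
    for filename in "$@"; do
        # ">" starts the output file afresh with the provenance record...
        echo '# gooclean $Revision: 901$ -b 0 100' > "cleaned-$filename"
        # ...and ">>" appends the data, leaving that first line in place.
        clean < "$filename" >> "cleaned-$filename"
    done
}
```

<p>
Because the header is rewritten with <code>&gt;</code> on every run,
re-running the script replaces the old output instead of stacking a second copy on top of it.
</p>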
1662 <div class="keypoints" id="k:provenance">
1665 <li><code>$Keyword:$</code> in a file can be filled in with a property value each time the file is committed.</li>
1666 <li idea="paranoia">Put version numbers in programs' output to establish provenance for data.</li>
1667 <li><code>svn propset svn:keywords <em>property</em> <em>files</em></code> tells Subversion to start filling in property values.</li>
1673 <section id="s:summary">
1678 Correlation does not imply causality,
1679 but there is a very strong correlation between
1680 using version control
1681 and doing good computational science.
1682 There's an equally strong correlation
1683 between <em>not</em> using it and wasting effort,
1684 so today (the middle of 2012),
1685 I will not review a paper if the software used in it
1686 is not under version control.
1687 Its authors' work might be interesting,
1688 but without the kind of record-keeping that version control provides,
1689 there's no way to know exactly what they did and when.
1690 Just as importantly,
1691 if someone doesn't know enough about computing to use version control,
1692 the odds are good that they don't know enough
1693 to do the programming right either.
1697 {% endblock content %}