A few bits about the RCS backends

[[!toc ]]

## Terminology

"web-edit" means that a page is edited by using the web (CGI) interface
as opposed to using an editor and the RCS interface.


## [[svn]]

Subversion was the first RCS to be supported by ikiwiki.

### How does it work internally?

Master repository M.

RCS commits from the outside are installed into M.

There is a working copy of M (a checkout of M): W.

HTML is generated from W.  rcs_update() will update from M to W.

CGI operates on W.  rcs_commit() will commit from W to M.

For all the gory details of how ikiwiki handles this behind the scenes,
see [[commit-internals]].

You browse and web-edit the wiki on W.

W "belongs" to ikiwiki and should not be edited directly.


## [darcs](http://darcs.net/) (not yet included)

Support for using darcs as a backend is being worked on by [Thomas
Schwinge](mailto:tschwinge@gnu.org), although development is currently on hold.
There is a patch in [[todo/darcs]].

### How will it work internally?

"Master" repository R1.

RCS commits from the outside are installed into R1.

HTML is generated from R1.  HTML is automatically generated (by using a
"post-hook") each time a new change is installed into R1.  It follows
that rcs_update() is not needed.

There is a working copy of R1: R2.

CGI operates on R2.  rcs_commit() will push from R2 to R1.

You browse the wiki on R1 and web-edit it on R2.  This means for example
that R2 needs to be updated from R1 if you are going to web-edit a page,
as the user might otherwise be irritated...

How do changes get from R1 to R2?  Currently only internally in
rcs\_commit().  Is rcs\_prepedit() suitable?

It follows that the HTML rendering and the CGI handling can be completely
separated parts in ikiwiki.

What repository should [[RecentChanges]] and History work on?  R1?

#### Rationale for doing it differently than in the Subversion case

darcs is a distributed RCS, which means that every checkout of a
repository is equal to the repository it was checked out from.  There is
no forced hierarchy.

R1 is nevertheless called the master repository.  It's used for
collecting all the changes and publishing them: on the one hand via the
rendered HTML and on the other via the standard darcs RCS interface.

R2, the repository the CGI operates on, is just a checkout of R1 and
doesn't really differ from the other checkouts that people will branch
off from R1.

(To be continued.)

#### Another possible approach

Here's what I (tuomov) think would be a “cleaner” approach:

 1. Upon starting to edit, Ikiwiki gets a copy of the page, and `darcs changes --context`.
     This context _and_ the present version of the page are stored as the “version” of the
     page in a hidden control of the HTML.
     Thus the HTML includes all that is needed to generate a patch with respect to the state of the
     repository at the time the edit was started. This is of course all that darcs needs.
 2. Once the user is done with editing, _Ikiwiki generates a patch bundle_ for darcs.
     This should be easy with existing `Text::Diff` or somesuch modules, as the Web edits
     only concern single files. The reason why the old version of the page is stored in
     the HTML (possibly compressed) is so that the diff can be generated.
 3. Now this patch bundle is applied with `darcs apply`, or sent by email for moderation…
     there are many possibilities.
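
As a rough illustration of step 2 only: because a web edit touches a single file, the stored old version and the submitted text are enough to produce a diff. The sketch below uses plain `diff -u`; the file names are made up, and a unified diff is not a darcs patch bundle (turning it into one would also need the stored context, which is not shown here).

```shell
# Sketch only: a plain unified diff stands in for the darcs patch bundle.
set -e
work=$(mktemp -d)
cd "$work"

# page.old: the page text stored in the hidden form control when editing began.
printf '1\n2\n3\n' > page.old
# page.new: what the user submitted.
printf '1\n2ME\n3\n' > page.new

# diff exits with status 1 when the files differ, which is expected here;
# any other non-zero status is treated as a real error.
diff -u page.old page.new > page.patch || [ $? -eq 1 ]

cat page.patch
```

The point is only that no repository access is needed to compute the change itself; applying it is a separate step.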

This approach avoids some of the problems of concurrent edits that the previous one may have,
although there may be conflicts, which may or may not propagate to the displayed web page.
(Unfortunately there is not an option to `darcs apply` to generate some sort of ‘conflict resolution
bundle’.) Also, only one repository is needed, as it is never directly modified
by Ikiwiki.

This approach might be applicable to other distributed VCSs as well, although they're not as oriented
towards transmitting changes with standalone patch bundles (often by email) as darcs is.

> The mercurial plugin seems to just use one repo and edit it directly - is
> there some reason that's okay there but not for darcs? I agree with tuomov
> that having just the one repo would be preferable; the point of a dvcs is
> that there's no difference between one repo and another. I've got a
> darcs.pm based on mercurial.pm, that's almost usable... --bma

>> IMHO it comes down to whatever works well for a given RCS. Seems like
>> the darcs approach _could_ be done with most any distributed system, but
>> it might be overkill for some (or all?) While there is the incomplete darcs
>> plugin in [[todo/darcs]], if you submit one that's complete, I will
>> probably accept it into ikiwiki.. --[[Joey]]

>>> I'd like to help make a robust darcs (2) backend. I also think ikiwiki should use
>>> exactly one darcs repo. I think we can simplify and say conflicting web
>>> edits are not allowed, like most current wiki engines. I don't see that
>>> saving (so much) context in the html is necessary, then.
>>> bma, I would like to see your code. --[[Simon_Michael]]
>>> PS ah, there it is. Let's continue on the [[todo/darcs]] page.


## [[Git]]

Regarding the Git support, Recai says:

I have been testing it for the past few days and it seems satisfactory.  I
haven't observed any race condition regarding the concurrent blog commits
and it handles merge conflicts gracefully as far as I can see.

(After about a year, git support is nearly as solid as subversion support --[[Joey]])

As you may notice from the patch size, Git support is not so trivial to
implement (for me, at least). It has some drawbacks (especially wrt merge
which was the hard part).  Git has no functionality similar to
'svn merge -rOLD:NEW FILE' (please see the relevant comment in `_merge_past`
for more details), so I had to invent an ugly hack just for the purpose.

> I was looking at this, and WRT the problem of uncommitted local changes,
> it seems to me you could just git-stash them now that git-stash exists.
> I think it didn't when you first added the git support.. --[[Joey]]


>> Yes, git-stash did not exist before.  What about something like below?  It
>> seems to work (I haven't given much thought to the specific implementation
>> details).  --[[roktas]]

>>          # create test files
>>          cd /tmp
>>          seq 6 >page
>>          cat page
>>          1
>>          2
>>          3
>>          4
>>          5
>>          6
>>          sed -e 's/2/2ME/' page >page.me # my changes
>>          cat page.me
>>          1
>>          2ME
>>          3
>>          4
>>          5
>>          6
>>          sed -e 's/5/5SOMEONE/' page >page.someone # someone's changes
>>          cat page.someone
>>          1
>>          2
>>          3
>>          4
>>          5SOMEONE
>>          6
>>
>>          # create a test repository
>>          mkdir t
>>          cd t
>>          cp ../page .
>>          git init
>>          git add .
>>          git commit -m init
>>
>>          # save the current HEAD
>>          ME=$(git rev-list HEAD -- page)
>>          $EDITOR page # assume that I'm starting to edit page via web
>>
>>          # simulates someone's concurrent commit
>>          cp ../page.someone page
>>          git commit -m someone -- page
>>
>>          # My editing session ended, the resulting content is in page.me
>>          cp ../page.me page
>>          cat page
>>          1
>>          2ME
>>          3
>>          4
>>          5
>>          6
>>
>>          # let's start to save my uncommitted changes
>>          git stash clear
>>          git stash save "changes by me"
>>          # we've reached a clean state
>>          cat page
>>          1
>>          2
>>          3
>>          4
>>          5SOMEONE
>>          6
>>
>>          # roll-back to the $ME state
>>          git reset --soft $ME
>>          # now, the file is marked as modified
>>          git stash save "changes by someone"
>>
>>          # now, we're at the $ME state
>>          cat page
>>          1
>>          2
>>          3
>>          4
>>          5
>>          6
>>          git stash list
>>          stash@{0}: On master: changes by someone
>>          stash@{1}: On master: changes by me
>>
>>          # first apply my changes
>>          git stash apply stash@{1}
>>          cat page
>>          1
>>          2ME
>>          3
>>          4
>>          5
>>          6
>>          # ... and commit
>>          git commit -m me -- page
>>
>>          # apply someone's changes
>>          git stash apply stash@{0}
>>          cat page
>>          1
>>          2ME
>>          3
>>          4
>>          5SOMEONE
>>          6
>>          # ... and commit
>>          git commit -m me+someone -- page

By design, the Git backend uses a "master-clone" repository pair approach in contrast
to the single repository approach (here, _clone_ may be considered as the working
copy of a fictitious web user).  Even though a single repository implementation is
possible, it somewhat increases the code complexity of the backend (I couldn't figure
out a uniform method which doesn't depend on the preferred repository model, yet).
By exploiting the fact that the master repo and _web user_'s repo (`srcdir`) are both
on the same local machine, I suggest creating the latter with the "`git clone -l -s`"
command to save disk space.

Note that, as a rule of thumb, you should always put the rcs wrapper (`post-update`)
into the master repository (`.git/hooks/`).
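
A minimal sketch of creating such a clone, assuming a master repository that already has a commit. The directory names `master` and `srcdir`, the page name and the commit identity are placeholders chosen for the example, not anything ikiwiki requires.

```shell
# Sketch only: directory names and the commit identity are placeholders.
set -e
base=$(mktemp -d)
cd "$base"

# A master repository with one commit, so there is something to clone.
git init -q master
cd master
echo 'hello' > index.mdwn
git add index.mdwn
git -c user.name='Demo' -c user.email='demo@example.invalid' commit -q -m init
cd ..

# -l: clone from the local filesystem; -s: share the master's object store
# instead of copying it.
git clone -q -l -s master srcdir

# The clone records where the shared objects live.
cat srcdir/.git/objects/info/alternates
```

With `-s`, the clone borrows objects from the master through the alternates file rather than keeping its own copies, which is where the disk space saving comes from.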

Here is how a web edit works with ikiwiki and git:

* ikiwiki cgi modifies the page source in the clone
* git-commit in the clone
* git push origin master, pushes the commit from the clone to the master repo
* the master repo's post-update hook notices this update, and runs ikiwiki
* ikiwiki notices the modified page source, and compiles it

Here is how a commit from a remote repository works:

* git-commit in the remote repository
* git-push, pushes the commit to the master repo on the server
* the master repo's post-update hook notices this update, and runs ikiwiki
* ikiwiki notices the modified page source, and compiles it
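
The web edit flow above can be sketched with plain git commands. This assumes a bare master repository (an assumption made here so that the push is accepted without further configuration), and the hook below only appends to a log file where a real setup would run the ikiwiki wrapper. All paths and names are placeholders.

```shell
# Sketch only: the bare master and the logging hook are assumptions; the
# hook merely records that it ran instead of invoking the ikiwiki wrapper.
set -e
base=$(mktemp -d)
cd "$base"

git init -q --bare master.git
mkdir -p master.git/hooks

# Stand-in for the ikiwiki wrapper installed as the post-update hook.
# post-update receives the names of the updated refs as arguments.
cat > master.git/hooks/post-update <<EOF
#!/bin/sh
echo "refs updated: \$*" >> "$base/hook.log"
EOF
chmod +x master.git/hooks/post-update

# The clone plays the role of srcdir, which the CGI edits.
# (git warns that the cloned repository is empty; that is expected.)
git clone -q master.git srcdir
cd srcdir
echo 'edited via the web' > index.mdwn
git add index.mdwn
git -c user.name='Web User' -c user.email='web@example.invalid' commit -q -m 'web commit'

# Push the current branch to the master; this triggers post-update there.
git push -q origin HEAD
cd ..

cat hook.log
```

A commit pushed from a remote repository would reach the same hook in the same way; only the origin of the push differs.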

## [[Mercurial]]

The Mercurial backend is still in an early phase, so it may not be mature
enough, but it should be simple to understand and use.

As Mercurial is a distributed RCS, it lacks the distinction between
repository and working copy (every wc is a repo).

This means that the Mercurial backend uses the repository directly as the
working copy (the master M and the working copy W described in the svn
example are the same thing).

You only need to specify 'srcdir' (the repository M) and 'destdir' (where
the HTML will be generated).

Master repository M.

RCS commits from the outside are installed into M.

M is directly used as working copy (M is also W).

HTML is generated from the working copy in M. rcs_update() will update
to the last committed revision in M (the same as 'hg update').
If you use an 'update' hook you can automatically generate the HTML
in the destination directory each time 'hg update' is called.

CGI operates on M. rcs_commit() will commit directly in M.

If you have any questions or suggestions about the Mercurial backend,
please refer to [Emanuele](http://nerd.ocracy.org/em/).

## [[tla]]

## rcs

There is a patch that needs a bit of work linked to from [[todo/rcs]].

## [[Monotone]]

In normal use, monotone has a local database as well as a workspace/working copy.
In ikiwiki terms, the local database takes the role of the master repository, and
the srcdir is the workspace.  As all monotone workspaces point to a default
database, there is no need to tell ikiwiki explicitly about the "master" database.  It
will know.

The backend currently supports normal committing and getting the history of the page.
To understand the parallel commit approach, you need to understand monotone's
approach to conflicts:

Monotone allows multiple micro-branches in the database.  There is a command,
`mtn merge`, that takes the heads of all these branches and merges them back together
(turning the tree of branches into a DAG).  Conflicts in monotone (at time of writing)
need to be resolved interactively during this merge process.
It is important to note that having multiple heads is not an error condition in a
monotone database.  This condition will occur in normal use.  In this case
'update' will choose a head if it can, or complain and tell the user to merge.

The monotone ikiwiki plugin borrows some ideas from the svn ikiwiki plugin.
On prepedit() we record the revision that this change is based on (I'll refer to this as the prepedit revision).  When the web user
saves the page, we check if that is still the current revision.  If it is, then we commit.
If it isn't then we check to see if there were any changes by anyone else to the file
we're editing while we've been editing (a diff between the prepedit revision and the current rev).
If there were no changes to the file we're editing then we commit as normal.
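
The check described above might be sketched as follows. Monotone is not used here: git stands in for it purely so that the example can be run, and `decide` and `commit` are hypothetical helpers, not functions of the monotone plugin.

```shell
# Sketch only: git stands in for monotone, and decide() is a hypothetical
# helper written for this example.
set -e
base=$(mktemp -d)
cd "$base"
git init -q repo
cd repo
commit() { git -c user.name='Demo' -c user.email='demo@example.invalid' commit -q "$@"; }

printf 'one\n' > page
printf 'x\n' > other
git add page other
commit -m init

# Revision recorded when the web user starts editing (the prepedit revision).
prepedit=$(git rev-parse HEAD)

# Meanwhile someone else commits a change to a different file.
printf 'y\n' > other
commit -m 'unrelated change' -- other

# Decide what to do when the web user saves file $1, given prepedit revision $2.
decide() {
    current=$(git rev-parse HEAD)
    if [ "$2" = "$current" ]; then
        echo 'commit: nothing happened since prepedit'
    elif git diff --quiet "$2" "$current" -- "$1"; then
        # The repository moved on, but this file is unchanged.
        echo 'commit: other files changed, this one did not'
    else
        echo 'merge needed: parallel change to this file'
    fi
}

result=$(decide page "$prepedit")
echo "$result"
```

Only the third branch leads to the harder case of parallel changes to the same file.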

It is only if there have been parallel changes to the file we're trying to commit that
things get hairy.  In this case the current approach is to
commit the web changes as a branch from the prepedit revision.  This
will leave the repository with multiple heads.  At this point, all data is saved.
The system then tries to merge the heads with a merger that will fail if it cannot
resolve the conflict.  If the merge succeeds then everything is ok.

If that merge fails then there are conflicts.  In this case, the current code calls
merge again with a merger that inserts conflict markers.  It commits this new
revision with conflict markers to the repository.  It then returns the text to the
user for cleanup.  This is less neat than it could be, in that a conflict-marked
revision gets committed to the repository.
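
To show what a conflict-marked result looks like, here is a small sketch that uses `git merge-file` as a stand-in for monotone's mergers (an assumption made for illustration; the plugin itself calls monotone). The file names are made up.

```shell
# Sketch only: git merge-file stands in for monotone's mergers.
set -e
work=$(mktemp -d)
cd "$work"

printf '1\n2\n3\n' > base        # prepedit revision of the page
printf '1\n2WEB\n3\n' > web      # the web user's edit
printf '1\n2OTHER\n3\n' > other  # a parallel commit touching the same line

# -p writes the merge result to stdout and leaves the input files alone.
# The exit status is non-zero when conflicts remain (or on error).
if git merge-file -p web base other > merged; then
    echo 'clean merge'
else
    echo 'conflict: merged text contains markers'
fi

cat merged
```

Here both edits change line 2, so the merged text carries `<<<<<<<` and `>>>>>>>` markers around the two versions; text in this form is what would be handed back to the user for cleanup.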

## [[bzr]]