src/sawsim/methods.tex

   1 \section{Methods}
   2 \label{sec:sawsim:methods}
   3
   4 % simulation overview
   5 In simulating the mechanical unfolding process, a force curve is
   6 generated by calculating the amount of cantilever bending as the
   7 substrate surface moves away from the tip.  The cantilever bending is
   8 obtained by balancing the tension in the protein polymer and the
   9 Hookean force of the bent cantilever.  The unfolding probability of
  10 the protein molecules in the polymer is then calculated for that
  11 tension, and whether an unfolding event occurs is determined according
  12 to a Monte Carlo method.  The simulation was implemented in
  13 \citetalias{kernighan88}, and there are a number of
  14 \citetalias{python} modules to facilitate running several simulations
  15 in parallel\footnote{
  16   Source code available at \url{http://blog.tremily.us/posts/sawsim/}.
  17 }.
  18
  19 In the following sections, we'll discuss models used to determine the
  20 tension of a chain composed of several types of ``domains'' (e.g.~one
  21 cantilever, three folded I27 domains, and seven unfolded I27 domains)
  22 (\cref{sec:sawsim:tension}).  We'll also work through a number of
  23 models for calculating the probability that a domain will transition
  24 from one state (e.g.~folded I27) to another (e.g.~unfolded I27)
  25 (\cref{sec:sawsim:rate}).
  26
  27 \subsection{Modeling polymer tension}
  28 \label{sec:sawsim:tension}
  29
  30 % introduce domains and groups.
  31 The fundamental abstraction of the simulation is the ``domain'', which
  32 represents a discrete chunk of the flexible chain between the
  33 substrate and the cantilever holder (\cref{fig:sawsim:domain-chain}).
  34 Each of these domains is assigned a particular state; for example, the
  35 domain representing the cantilever is assigned to the ``cantilever''
  36 state, and the domains representing protein molecules are assigned to
  37 either the ``folded'' or the ``unfolded'' state.  When balancing the
  38 tension along the chain, we assume that the spatial order of domains
  39 along the chain is irrelevant\citep{li00}, so the domains can be
  40 rearranged and grouped by state (\cref{fig:sawsim:domain-states}).  To
  41 determine the tension in the chain and the amount of cantilever
  42 bending when $N$ states are populated, a system of $N+1$ equations
  43 with $N+1$ unknowns must be solved
  44 \begin{align}
  45   F_i(x_i) &= F_t        \label{eq:sawsim:tension-balance} \\
  46   \sum_i x_i &= x_t \;,  \label{eq:sawsim:x-total}
  47 \end{align}
  48 where $F$ are tensions, $x$ are extensions, and the subscripts $i$ and
  49 $t$ represent a particular state group and the total chain
  50 respectively (\cref{fig:unfolding-schematic}).  $F(x_t)$ may be
  51 computed from this system of equations using any multi-dimensional
  52 root-finding algorithm.
  53
  54 \begin{figure}
  55   \begin{center}
  56     \subfloat[][]{\label{fig:sawsim:domain-chain}
  57       \begin{tikzpicture}[->,node distance=2cm,font=\footnotesize]
  58         \tikzstyle{every state}=[fill,draw=red!50,very thick,fill=red!20]
  59         \node[state] (A)              {domain 1};
  60         \node[state] (B) [below of=A] {domain 2};
  61         \node[state] (C) [below of=B] {.~.~.};
  62         \node[state] (D) [below of=C] {domain $N$};
  63         \node        (S) [below of=D] {Surface};
  64         \node        (E) [above of=A] {};
  65
  66         \path[-] (A) edge (B)
  67           (B) edge node [right] {Tension} (C)
  68           (C) edge (D)
  69           (D) edge (S);
  70         \path[->,green] (A) edge node [right,black] {Extension} (E);
  71       \end{tikzpicture}}
  72     \hspace{.25in}%
  73     \subfloat[][]{\label{fig:sawsim:domain-states}
  74       \begin{tikzpicture}[->,node distance=2.5cm,shorten <=1pt,shorten >=1pt,font=\footnotesize]
  75         \tikzstyle{every state}=[fill,draw=blue!50,very thick,fill=blue!20]
  76         \node[state] (A)              {cantilever};
  77         \node[state] (C) [below of=A] {transition};
  78         \node[state] (B) [left of=C]  {folded};
  79         \node[state] (D) [right of=C] {unfolded};
  80
  81         \path (B) edge [bend left] node [above] {$k_1$} (C)
  82               (C) edge [bend left] node [below] {$k_1'$} (B)
  83                   edge [bend left] node [above] {$k_2$} (D)
  84               (D) edge [bend left] node [below] {$k_2'$} (C);
  85       \end{tikzpicture}}
  86     \caption{\protect\subref{fig:sawsim:domain-chain} Extending a
  87       chain of domains.  One end of the chain is fixed, while the
  88       other is extended at a constant speed.  The domains are coupled
  89       with rigid linkers, so the domains themselves must stretch to
  90       accomodate the extension.  Compare with
  91       \cref{fig:unfolding-schematic}.
  92       \protect\subref{fig:sawsim:domain-states} Each domain exists in
  93       a discrete state.  At each timestep, it may transition into
  94       another state following a user-defined state matrix such as this
  95       one, showing a metastable transition state and an explicit
  96       ``cantilever'' domain.\label{fig:sawsim:domains}}
  97   \end{center}
  98 \end{figure}
  99
 100 \subsubsection{Hooke's law}
 101 \label{sec:sawsim:tension:hooke}
 102
 103 % introduce particular models, and mention parameter aggregation
 104 Inside this framework, we chose a particular extension model
 105 $F_i(x_i)$ for each domain state.  Cantilever elasticity is described
 106 by Hooke's law, which gives
 107 \index{Hooke's law}
 108 \begin{equation}
 109   F = \kappa_c x_c \;, \label{eq:sawsim:hooke}
 110 \end{equation}
 111 where $\kappa_c$ is the bending spring constant and $x_c$ is the
 112 deflection of the cantilever (\cref{fig:unfolding-schematic}).
 113
 114 \subsubsection{Wormlike chains}
 115 \label{sec:sawsim:tension:wlc}
 116
 117 \index{Wormlike chains}
 118 Unfolded domains are modeled as wormlike chains
 119 (WLCs)\citep{rief97a,carrion-vazquez00}, which treat the unfolded
 120 polymer as an elastic rod of persistence length $p$ and contour length
 121 $L$ (\cref{fig:wlc}).  The relationship between tension $F$ and
 122 extension (end-to-end distance) $x$ is given by Bustamante's
 123 interpolation formula\citep{marko95,bustamante94}.
 124 \nomenclature{WLC}{Wormlike chain, an entropic spring model}
 125 \nomenclature{$p$}{Persistence length of a wormlike chain
 126   (\cref{eq:sawsim:wlc})).}
 127 \nomenclature{$L$}{Contour length in a polymer tension model
 128   (\cref{eq:sawsim:wlc,eq:sawsim:fjc})}
 129 \begin{equation}
 130   F_\text{WLC}(x,p,L) = \frac{k_B T}{p}
 131       \p[{  \frac{1}{4}\p({ \frac{1}{(1-x/L)^2} - 1 })
 132             + \frac{x}{L}  }] \;,
 133       \label{eq:sawsim:wlc}
 134 \end{equation}
 135 where $p$ is the persistence length.  This interpolation formula is
 136 accurate to within 7\% of the exact $F_\text{WLC}$ for
 137 $F_\text{WLC}\approx k_B T/p_u$\citep{marko95}.  Because most unfolded
 138 proteins studied have persistence lengths on the order of the size of
 139 an amino acid
 140 ($p_u\approx3.8\U{\AA}$\citep{rief97a,carrion-vazquez99b,carrion-vazquez00}),
 141 this characteristic force works out to be around $11\U{pN}$.  Most
 142 proteins studied using force spectroscopy have unfolding forces in the
 143 hundreds of piconewtons, by which point the interpolation formula is
 144 in it's more accurate high-extension regime.
 145 \nomenclature{\AA}{{\AA}ngstr{\"o}m, a unit of length.
 146   $1\U{\AA}=1\E{-10}\U{m}$}
 147
 148 For chain with $N_u$ unfolded domains sharing a persistence length
 149 $p_u$ and per-domain contour lengths $L_{u1}$, the tension of the WLC
 150 is determine by summing the contour lengths
 151 \begin{equation}
 152   F(x, p_u, L_u, N_u) = F_\text{WLC}(x, p_u, N_u L_{u1}) \;.
 153   \label{eq:sawsim:multi-wlc}
 154 \end{equation}
 155
 156 \begin{figure}
 157   \begin{center}
 158     \subfloat[][]{\asyinclude{figures/schematic/wlc-model}%
 159       \label{fig:wlc-model}}
 160     \hspace{.25in}%
 161     \subfloat[][]{\asyinclude{figures/schematic/wlc-extension}%
 162       \label{fig:wlc-extension}}
 163     \caption{\protect\subref{fig:wlc-model} The wormlike chain models
 164       a polymer as an elastic rod with persistence length $p$ and
 165       contour length $L$.
 166       \protect\subref{fig:wlc-extension} Force vs.~extension for a WLC
 167       using Bustamante's interpolation formula.\label{fig:wlc}}
 168   \end{center}
 169 \end{figure}
 170
 171 \subsubsection{Folded domains}
 172 \label{sec:sawsim:tension:folded}
 173
 174 Short chains of folded proteins, however, are not easily described by
 175 polymer models.  Several studies have used WLC and FJC models to fit
 176 the elastic properties of the modular protein
 177 titin\citep{granzier97,linke98a},
 178 % TODO: check it really is folded domains \& bulk titin
 179 but native titin contains hundreds of folded and unfolded domains.
 180 For the short protein polymers common in mechanical unfolding
 181 experiments (\cref{sec:polymer-synthesis}), the cantilever dominates
 182 the elasticity of the polymer-cantilever system before any protein
 183 molecules unfold.  After the first unfolding event occurs, the
 184 unfolded portion of the chain is already longer and softer than the
 185 sum of all the remaining folded domains, and dominates the elasticity
 186 of the whole chain.  Therefore, the details of the tension model
 187 chosen for the folded domains has negligible effect on the unfolding
 188 forces (\cref{eq:sawsim:x-total}), which was also suggested by
 189 \citet{staple08}.  Force curves simulated using different models to
 190 describe the folded domains yielded almost identical unfolding force
 191 distributions (data not shown).% TODO: show data
 192
 193 As an alternative to modeling the folded domains explicitly or
 194 ignoring them completely, another approach is to subtract the
 195 end-to-end length of the folded protein from the contour length of the
 196 unfolded protein to create an effective contour length for the
 197 unfolding\citep{carrion-vazquez99b}.  This effectively models the
 198 folded domains as WLCs with the same persistence length as the
 199 unfolded domains.
 200
 201 %The chain of $N_f$ folded domains is modeled as a string, free to
 202 %assume any extension up to some fixed contour length $L_f=N_fL_{f1}$
 203 %\begin{equation}
 204 %  F = \begin{cases}
 205 %         0 & \text{if $x_f<L_f$} \;, \\
 206 %         \infty & \text{if $x_f>L_f$} \;,
 207 %      \end{cases}  \label{eq:sawsim:piston}
 208 %\end{equation}
 209 %where $L_{f1}$ is the separation of the two linking points of a folded
 210 %domain, and $x_f$ is the end-to-end length of the chain of folded
 211 %domains.  In this model, any non-zero tension will fully extend these
 212 %folded domains.  Because the range of possible extensions for folded
 213 %domains is so short, the contribution of the folded domains to the
 214 %elastic behavior of the polymer-cantilever system is relatively
 215 %insignificant.
 216
 217 \subsubsection{Other models}
 218 \label{sec:sawsim:tension:other}
 219
 220 \index{FJC}
 221 \index{Kuhn length}
 222 The unfolded polypeptide chain has been shown to follow the WLC model
 223 quite well\citep{rief97a} (\cref{sec:sawsim:tension:wlc}), though
 224 other polymer models have been tried.  One alternative is the
 225 freely-jointed chain
 226 (FJC)\citep{kellermayer97,linke98a,janshoff00,verdier70}, which models
 227 the polymer as a series of $N$ rigid links, each of length $l$ (the
 228 Kuhn length), which are free to rotate about their joints
 229 (\cref{fig:fjc}).
 230 \index{Langevin function}
 231 \begin{equation}
 232   F_\text{FJC}(x,l,L) = \frac{k_B T}{l} \Langevin^{-1}\p({\frac{x}{L}}) \;,
 233   \label{eq:sawsim:fjc}
 234 \end{equation}
 235 where $L=Nl$ is the total length of the chain, and
 236 $\Langevin(\alpha)\equiv\coth{\alpha}-\frac{1}{\alpha}$ is the
 237 Langevin function\citep{hatfield99}.
 238 %
 239 \nomenclature{FJC}{Freely-jointed chain, an entropic spring model
 240   (\cref{eq:sawsim:fjc}).}
 241 \nomenclature{$\Langevin$}{The Langevin function,
 242   $\Langevin(\alpha)\equiv\coth{\alpha}-\frac{1}{\alpha}$}
 243 \nomenclature{$\coth$}{Hyperbolic cotangent,
 244   \begin{equation}
 245     \coth(x) = \frac{\exp{x} + \exp{-x}}{\exp{x} - \exp{-x}} \;.
 246   \end{equation}
 247 }
 248 \nomenclature{$l$}{Kuhn length in the freely-jointed chain
 249   (\cref{fig:fjc-model,eq:sawsim:fjc}).}
 250
 251 \begin{figure}
 252   \begin{center}
 253     \subfloat[][]{\asyinclude{figures/schematic/fjc-model}%
 254       \label{fig:fjc-model}}
 255     \hspace{.25in}%
 256     \subfloat[][]{\asyinclude{figures/schematic/fjc-extension}%
 257       \label{fig:fjc-extension}}
 258     \caption{\protect\subref{fig:fjc-model} The freely-jointed chain
 259       models the polymer as a series of $N$ rigid links, each of
 260       length $l$, which are free to rotate about their joints.  Each
 261       polymer state is a random walk, and the density of states for a
 262       given end-to-end distance is determined by the number of random
 263       walks that have such an end-to-end distance.
 264       \protect\subref{fig:fjc-extension} Force vs.~extension for a
 265       hundred-segment FJC.  The WLC extension curve (with $p=l$) is
 266       shown as a dashed line for comparison.\label{fig:fjc}}
 267   \end{center}
 268 \end{figure}
 269
 270 More exotic models such as elastic WLCs\citep{janshoff00,puchner08},
 271 elastic FJCs\citep{fisher99a,janshoff00}, and freely rotating
 272 chains\citep{puchner08} (FRCs) have also been used to model DNA and
 273 polysaccharides, but are rarely used to model the relatively short and
 274 inextensible synthetic proteins used in force spectroscopy.
 275 \nomenclature{FRC}{Freely-rotating chain (like the FJC, except that
 276   the bond angles are fixed.  The torsional angles are not
 277   restricted)}
 278
 279
 280 \subsubsection{Assumptions}
 281 \label{sec:sawsim:tension:assumptions}
 282
 283 % address assumptions & caveats
 284 In the simulation, the protein polymer is assumed to be stretched in
 285 the direction perpendicular to the substrate surface, which is a good
 286 approximation in most experimental situations, because the unfolded
 287 length of a protein molecule is much larger than that of the folded
 288 form.  Therefore, after one molecule unfolds, the polymer becomes much
 289 longer and the angle between the polymer and the surface approaches
 290 $90$ degrees\citep{carrion-vazquez00}.
 291
 292 The joints between domain groups are also assumed to lie along a line
 293 between the surface tether point and the position of the tip
 294 (\cref{eq:sawsim:x-total} is scalar, not vector, addition).  The
 295 effects of this assumption are also minimized due to greater length of
 296 the unfolded domain compared with the other domains (folded proteins
 297 and cantilever deflection).  For example, a $0.050\U{N/m}$ cantilever
 298 under $200\U{pN}$ of tension will bend $x_c=F/\kappa_c=4\U{nm}$.  The
 299 entire end-to-end length of folded domains such as I27 are also around
 300 $5\U{nm}$ (\cref{fig:I27}).  A single unfolded I27, with its 89 amino
 301 acids\citep{improta96}, should have an unfolded contour length of
 302 $89\U{aa}\cdot0.38\U{nm}=33.8\U{nm}$, equivalent to a cantilever and
 303 five folded domains.
 304
 305 \subsubsection{Velocity-clamp example}
 306 \label{sec:sawsim:velocity-clamp}
 307
 308 % introduce constant velocity and walk through explicit example pull
 309 Consider an experiment pulling a polymer with $N$ identical protein
 310 domains at a constant speed.  At the start of an experiment, the
 311 chain is unstretched ($x_t=0$), which means all the domains are
 312 unstretched, the cantilever is undeflected, and the tip is in contact
 313 with the surface.  There is one domain in the cantilever state, $N$ in
 314 the folded state, and none in the unfolded state.  As the surface
 315 moves away from the tip at a constant speed $v$, the chain becomes
 316 more extended (\cref{fig:unfolding-schematic}), such that
 317 \begin{equation}
 318   x_t = \sum_i x_i = vt  \label{eq:sawsim:const-v} \;.
 319 \end{equation}
 320 The simulation assumes that the pulling takes discrete steps in space
 321 and treats $x_t$ as constant over the duration of one time step
 322 $\Delta t$.  Because of the adaptive time steps discussed in
 323 \cref{sec:sawsim:timesteps}, the space steps $\Delta x_t = v\Delta t$
 324 may have different sizes, but each step will be ``small''.  At each
 325 step, the total extension is calculated using
 326 \cref{eq:sawsim:const-v}, and the tension $F(x_t=vt)$ is determined by
 327 numerically solving \cref{eq:sawsim:tension-balance,eq:sawsim:x-total}
 328 using the models \cref{eq:sawsim:hooke,eq:sawsim:wlc}
 329 %,eq:sawsim:piston}
 330 for known values of the parameters in the various states $(N_u, N_f,
 331 v, \kappa_c,
 332 % L_{f1},
 333 L_{u1}, p_u)$.  When one of the molecules in the
 334 polymer unfolds (\cref{sec:sawsim:rate}), there will be
 335 one domain in the unfolded state and $N-1$ in the folded state.  In
 336 the next step, a newly balanced tension between the cantilever and the
 337 polymer is determined by solving for $F(x_t)$ as discussed above, but
 338 with the total extension $x_t$ incremented by $v\Delta t$ and the new
 339 unfolded contour length $L_{u1}$ and folded contour length
 340 $(N-1)L_{f1}$.  The sudden lengthening of the polymer chain results in
 341 a corresponding abrupt drop in the force, leading to the formation of
 342 one sawtooth in the force curve.  As the pulling continues and more
 343 domains unfold, force curves with a series of sawteeth are generated
 344 (\cref{fig:sawsim:sim-sawtooth}).
 345 %
 346 \nomenclature{$v$}{Cantilever retraction speed in velocity-clamp
 347   unfolding experiments.}
 348
 349 \subsubsection{Equlibration time scales}
 350 \label{sec:sawsim:timescales}
 351
 352 The tension calculation assumes an equilibrated chain, so
 353 consideration must be given to the chain's relaxation time, which
 354 should be short compared to the loading time scale.  The relaxation
 355 time for a WLC\index{WLC!relaxation time} is given by
 356 \begin{equation}
 357  \tau \approx \eta \frac{k_BT p}{F^2}
 358 %   < \eta \frac{k_BT p}{(k_BTx/pL)^2}
 359 %     Note:  <  because  F > k_BTx/pL
 360 %   < \frac{\eta p^3 L^2}{k_BT x^2}
 361 %   < \frac{\eta p^2 L}{k_BT} % for x/L > \sqrt{p/L}
 362    \;, \label{eq:sawsim:tau-wlc}
 363 \end{equation}
 364 where $\eta$ is the dynamic viscosity, $F$ is the tension, and $p$ is
 365 the persistence length\citep{evans99}.  For forces greater than
 366 $1\U{pN}$, with $\eta_\text{water}/k_BT=2.45\E{-10}\U{s/nm$^3$}$,
 367 $\tau<2\U{ns}$ for the protein polymer used in the simulation.
 368 %                  eta/(k_BT)        * (k_B     *T  )**2 * p      / F**2
 369 % python -c 'print(2.45e-10*(1e9)**3 * (1.38e-23*300)**2 * 0.4e-9 / (1e-12)**2)'
 370 %                  s/m**3            * (J/K     *K  )**2 * m      / N**2         = s
 371 % 1.68...e-09
 372 Therefore, the polymer chain is equilibrated almost instantaneously
 373 within a time step, which is on the order of tens of microseconds.
 374 The relaxation time of the cantilever can be determined by measuring
 375 the cantilever deflection induced by liquid motion and fitting the
 376 time dependence of the deflection to an exponential
 377 function\citep{jones05}.  For a $200\U{$\mu$m}$ rectangular cantilever
 378 with a bending spring constant of $20\U{pN/nm}$, the measured
 379 relaxation time in water is $\sim50\U{$\mu$/s}$ (data not shown).
 380 % TODO: show data
 381 This relatively large relaxation time constant makes the cantilever
 382 act as a low-pass filter and also causes a lag in the force
 383 measurement.
 384 %
 385 \nomenclature{$\eta$}{Dynamic viscocity (\cref{eq:sawsim:tau-wlc}).}
 386
 387 \subsection{Unfolding protein molecules by force}
 388 \label{sec:sawsim:rate}
 389
 390 In the previous section, we discussed methods for calculating the
 391 tension of a chain composed of several domains in series
 392 (\cref{fig:sawsim:domain-chain}).  Those methods allow us to calculate
 393 the tension of the chain for any given extension.  We use that tension
 394 to calculate transition rates between states
 395 (\cref{fig:sawsim:domain-states}).  In this section, we'll introduce
 396 the Bell model for unfolding (\cref{sec:sawsim:rate:bell}) and mention
 397 a few more exotic models.  We'll wrap up by pointing out some of the
 398 approximations and assumptions we make when we use these simple models
 399 (\cref{sec:sawsim:rate:assumptions}).
 400
 401 \subsubsection{Bell model}
 402 \label{sec:sawsim:rate:bell}
 403
 404 \index{Bell model}
 405 % introduce Bell, probability calculations, and MC comparison
 406 According to the theory developed by \citet{bell78} and extended by
 407 \citet{evans99}, an external stretching force $F$ increases the
 408 unfolding rate constant of a protein molecule\footnote{
 409   Also in \xref{hummer03}{equation}{1}, the first paragraphs of
 410   \citet{dudko06} and \citet{dudko07}, and many other SMFS articles.
 411 }
 412 \index{Bell model}
 413 \begin{equation}
 414   k_u = k_{u0} \exp{\frac{F\Delta x_u}{k_B T}} \;, \label{eq:sawsim:bell}
 415 \end{equation}
 416 where $k_{u0}$ is the unfolding rate in the absence of an external
 417 force, and $\Delta x_u$ is the distance between the native state and
 418 the transition state along the pulling direction.
 419 %
 420 \nomenclature{$\exp{x}$}{Exponential function,
 421   \begin{equation}
 422     \exp{x} = \sum_{n=0}^{\infty} \frac{x^n}{n"!}
 423       = 1 + x + \frac{x^2}{2"!} + \ldots \;.
 424   \end{equation}
 425 }
 426 \nomenclature{$e$}{Euler's number, $e=2.718\ldots$.}
 427
 428 \begin{figure}
 429   \asyinclude{figures/schematic/landscape-bell}
 430   \caption{Energy landscape schematic for Bell model unfolding
 431     (\cref{eq:sawsim:bell}), which models folded domains as two-state
 432     systems parameterized by an unforced unfolding rate $k_{u0}$ and a
 433     distance $\Delta x$ between the folded and transition
 434     states.\label{fig:bell-landscape}}
 435 \end{figure}
 436
 437 \subsubsection{Monte Carlo transitions}
 438 \label{sec:sawsim:monte-carlo}
 439
 440 We can use the Bell model (or other models, see
 441 \cref{sec:sawsim:rate:other}) to calculate the unfolding rate $k_u$ at
 442 a given force for a single domain.  The probability for that single
 443 protein domain to unfold under applied force is
 444 \begin{equation}
 445   P_1 = k_u \Delta t \;,  \label{eq:sawsim:prob-one}
 446 \end{equation}
 447 where $\Delta t$ is the time duration for each pulling step, over
 448 which $F$ is constant.  This expression is accurate for $P_1 \ll 1$.
 449 From the binomial distribution, the probability of at least one of a
 450 group of $N_f$ identical domains to unfold in a given time step is
 451 \begin{equation}
 452   P = 1 - (1-P_1)^{N_f} \approx N_fP_1 \;,  \label{eq:sawsim:prob-n}
 453 \end{equation}
 454 where the approximation is valid when $N_fP_1 \ll 1$.
 455 %
 456 \nomenclature{$k$}{Rate constant for general state transitions
 457   (inverse seconds)}
 458 \nomenclature{$k_u$}{Unfolding rate constant}
 459 \nomenclature{$k_{u0}$}{Unforced unfolding rate constant}
 460 \nomenclature{$\Delta x_u$}{Distance between a domain's native state
 461   and the transition state along the pulling direction.}
 462 \nomenclature{$P$}{Probability for at least one domain unfolding in a
 463   given time step (\cref{eq:sawsim:prob-n}).}
 464
 465 \begin{figure}
 466   \asyinclude{figures/schematic/monte-carlo}
 467   \caption{Once the unfolding probability has been caculated, we need
 468     to determine whether or not a domain should unfold.  We do this by
 469     generating a random number, and comparing that number to the
 470     unfolding probability $P$.  The random number determines which of
 471     the possible paths we should follow for the current simulation.
 472     Such ``statistical sampling'' is the hallmark of the Monte Carlo
 473     approach\citep{metropolis87}.  This cartoon translates the idea
 474     into the more familiar doors (possible paths) and dice (random
 475     numbers).%
 476     \label{fig:monte-carlo}}
 477 \end{figure}
 478
 479 To determine if an unfolding event occurs in a particular time step,
 480 the probability calculated using \cref{eq:sawsim:prob-n} is compared
 481 with a randomly generated number uniformly distributed between $0$ and
 482 $1$ (\cref{fig:monte-carlo}).  If $P$ is bigger than the random
 483 number, a domain unfolds, changing the population of each tension
 484 state, and a new balance between the polymer and the cantilever is
 485 determined.  If no unfolding event occurs the pulling continues and
 486 the unfolding probability is calculated again in the next step at a
 487 higher force.  When all the molecules in the polymer have unfolded,
 488 the pulling continues until a pre-determined force level is reached,
 489 where the polymer is assumed to detach from one of the tethering
 490 surfaces.  The cantilever deflection becomes zero after this point.
 491
 492 \subsubsection{Other models}
 493 \label{sec:sawsim:rate:other}
 494
 495 Although the Bell model (\cref{eq:sawsim:bell}) is the most widely
 496 used unfolding model due to its simplicity and its applicability to
 497 various biopolymers\citep{rief98}, other theoretical models have been
 498 proposed to interpret mechanical unfolding data.  For example,
 499 \citet{walton08} uses a stiffness-corrected Bell model.
 500 \citet{schlierf06} used the mechanical unfolding data of the protein
 501 ddFLN4 to demonstrate that Kramers' diffusion model (in the
 502 spatial-diffusion-limited case, a.k.a. the Smoluchowski
 503 limit)\citep{kramers40,hanggi90,evans97,shillcock98,vanKampen07} fit
 504 the measured unfolding force data better than the Bell model for
 505 proteins with broad free energy barriers.
 506 \index{Kramers' model}
 507 \index{Diffusion coefficient}
 508 \index{Free energy}
 509 \index{Unfolding coordinate}
 510 \begin{equation}
 511   \frac{1}{k_u}
 512     = \frac{1}{D}
 513       \integral{-\infty}{\infty}{x}{%
 514         \exp{\frac{U_F(x)}{k_B T}}
 515         \integral{-\infty}{x}{x'}{%
 516           \exp{\frac{-U_F(x')}{k_B T}}}} \;,
 517   \label{eq:kramers}
 518 \end{equation}
 519 where $D$ is the diffusion coefficient and $U_F(x)$ is the free energy
 520 along the unfolding cordinate $x$ (\cref{fig:landscape:kramers}).
 521 \nomenclature{$D$}{Diffusion coefficient (square meters per second)}
 522 \nomenclature{$U_F(x)$}{Protein free energy along the unfolding
 523   coordinate $x$ (joules)}
 524
 525 \begin{figure}
 526   \begin{center}
 527     \subfloat[][]{\asyinclude{figures/schematic/landscape}%
 528       \label{fig:landscape}}
 529     % \hspace{.25in}%
 530     \subfloat[][]{\asyinclude{figures/schematic/kramers-integrand}%
 531       \label{fig:kramers:integrand}}
 532     \caption{\protect\subref{fig:landscape} Energy landscape schematic
 533       for Kramers integration (compare with
 534       \cref{fig:bell-landscape}).
 535       \protect\subref{fig:kramers:integrand} A map of the magnitude of
 536       Kramers' integrand, with black lines tracing the integration
 537       region.  The bulk of the contribution to the integral comes from
 538       the bump in the upper left, with $x$ near the boundary and $x'$
 539       near the folded state.  This is why you can calculate a close
 540       approximation to this integral by restricting the integration to
 541       $x_\text{min}$ and $x_\text{max}$, located a few $k_B T$ beyond
 542       the folded and transition states respectively.  The restricted
 543       integral is much easier to calculate numerically than one bound
 544       by $\pm\infty$.
 545       (\cref{eq:kramers}).\label{fig:landscape:kramers}}
 546   \end{center}
 547 \end{figure}
 548
 549 When you are simulating the double integral form of Kramers' model,
 550 you obviously need to parameterize $U_F(x)$ somehow.  There has not
 551 been much research done in this direction, but \citet{schlierf06} used
 552 cubic splines with 15 variable knots.  \citet{shillcock98} used a
 553 cubic free energy with variable coefficients.  The amount of
 554 information you can extract from fitting such a model to your data is
 555 limitless, but you run the risk of over-specifying if you add too many
 556 parameters when you're fitting noisy data.
 557
 558 There are alternative formulations of Kramers' model besides the full
 559 double integral approach.  You can use a Gaussian steepest-descent
 560 approximation (a.k.a. stationary phase method or saddle-point method)
 561 to reduce the integral to a formula that only depends on the free
 562 energy landscape via the curvature $\npderiv{2}{x}{U_F}$ evaluated at
 563 the folded state and transition state\citep{hanggi90}.  This approach
 564 makes sense for sufficiently sharp folded and transition states, where
 565 these two measurements will capture the shape of the large-integrand
 566 region (\cref{fig:kramers:integrand}).  The steepest-descent
 567 formulation has less to say about the underlying energy landscape, but
 568 it may be more robust in the face of noisy data.
 569
 570 How to choose which unfolding model to use?  For proteins with
 571 relatively narrow folded and transition states, the Bell model
 572 provides a good approximation, and it is the model used by the vast
 573 majority of earlier work in the field.  I will use the Bell model in
 574 my analysis of ion-dependent unfolding (\cref{sec:salt}), but
 575 analyzing my unfolding data with a different transition rate model is
 576 just a matter of changing some command line options and rerunning the
 577 \sawsim\ simulations.
 578
 579 \subsubsection{Assumptions}
 580 \label{sec:sawsim:rate:assumptions}
 581
 582 The interactions between different parts of the polymer and between
 583 the chain and the surface (except at the tethering points) are often
 584 ignored.  This is usually reasonable since these interactions should
 585 not make substantial contributions to the force curve at the force
 586 levels of interest, where the polymer is in a relatively extended
 587 conformation.  However, \citet{li00} showed that while the unfolding
 588 properties of I27 are not effected by I28 flankers, I28 \emph{is}
 589 stabilized by neighboring I27.  The unforced Bell model unfolding rate
 590 for I28 in (I28)\textsubscript{8} was $2.8\E{-5}\U{s$^{-1}$}$, while
 591 in (I27-I28)\textsubscript{4} it dropped to
 592 $2.5\E{-6}\U{s$^{-1}$}$\citep{li00}.
 593
 594 \subsection{Choosing the simulation time steps}
 595 \label{sec:sawsim:timesteps}
 596
 597 The demands on the time step vary throughout a simulated pull due to
 598 the non-linear elasticity of the polymer.  Within a specified time
 599 duration (or pulling distance), the force change is small at low force
 600 levels and large at high force levels.  To be efficient, the
 601 simulation algorithm adapts the time step to keep the time steps large
 602 where large time steps have little effect, while shrinking the time
 603 step where smaller steps are necessary.
 604
 605 Within each time step, the total chain extension $x_t$ is treated as a
 606 constant and a force balance is reached very quickly among the various
 607 domains (\cref{sec:sawsim:timescales}).  This force is used to
 608 determine the unfolding probability
 609 (\cref{eq:sawsim:prob-one,eq:sawsim:prob-n}), which determines the
 610 domain state populations in the next time step.  Therefore, the chain
 611 tension must not change appreciably over the course of the time step
 612 ($\Delta F < 1\U{pN}$), and the unfolding probability is only
 613 calculated once for the entire step.  The time step must also be short
 614 enough that the probability of unfolding in a single time step is low
 615 ($P<10^{-3}$).  Besides ensuring that the approximations made in
 616 \cref{eq:sawsim:prob-one,eq:sawsim:prob-n} are valid, this restriction
 617 makes time steps which should have multiple unfoldings in a single
 618 time step highly unlikely.  Experimentally measured unfolding are
 619 temporally separated, because the unfolding transition is
 620 characterized by multiple, Markovian attempts over a large energy
 621 barrier, where the probability of crossing the barrier in a single
 622 attempt is very low.  A successful attempt quickly extends the chain
 623 contour length, reducing the tension, dramatically reducing the
 624 likelihood of a second escape in that time step.  The time step used
 625 is recalculated for each step so that both of these criteria are
 626 satisfied.