src/sawsim/methods.tex

   1 \section{Methods}
   2 \label{sec:sawsim:methods}
   3
   4 % simulation overview
   5 In simulating the mechanical unfolding process, a force curve is
   6 generated by calculating the amount of cantilever bending as the
   7 substrate surface moves away from the tip.  The cantilever bending is
   8 obtained by balancing the tension in the protein polymer and the
   9 Hookean force of the bent cantilever.  The unfolding probability of
  10 the protein molecules in the polymer is then calculated for that
  11 tension, and whether an unfolding event occurs is determined according
  12 to a Monte Carlo method.  The simulation was implemented in
  13 C\footnote{Source code available at
  14   \url{http://www.physics.drexel.edu/~wking/sawsim/}}.
  15
  16 \subsection{Generating force curves}
  17 \label{sec:sawsim:methods-tension}
  18
  19 % introduce domains and groups.
  20 The fundamental abstraction of the simulation is the ``domain'', which
  21 represents a discrete chunk of the flexible chain between the
  22 substrate and the cantilever holder.  Each of these domains is
  23 assigned a particular state; for example, the domain representing the
  24 cantilever is assigned to the ``cantilever'' state, and the domains
  25 representing protein molecules are assigned to either the ``folded''
  26 or the ``unfolded'' state.  When balancing the tension along the
  27 chain, we assume that the spatial order of domains along the chain is
  28 irrelevant\citep{li00}, and therefore, the domains can be rearranged
  29 and grouped by state.  To determine the tension in the chain and the
  30 amount of cantilever bending when $n$ states are populated, a system
  31 of $n+1$ equations with $n+1$ unknowns must be solved
  32 \begin{align}
  33   F_i(x_i) &= F_t        \label{eq:sawsim:tension-balance} \\
  34   \sum_i x_i &= x_t \;,  \label{eq:sawsim:x-total}
  35 \end{align}
  36 where $F$ are tensions, $x$ are extensions, and the subscripts $i$ and
  37 $t$ represent a particular state group and the total chain
  38 respectively (\cref{fig:unfolding-schematic}).  From this $F(x_t)$ may be
  39 computed using any multi-dimensional root-finding algorithm.
  40
  41 % introduce particular models, and mention parameter aggregation
  42 Inside this framework, we chose a particular extension model
  43 $F_i(x_i)$ for each domain state.  Cantilever elasticity is described
  44 by Hooke's law, which gives
  45 \begin{equation}
  46   F = \kappa_c x_c \;, \label{eq:sawsim:hooke}
  47 \end{equation}
  48 where $\kappa_c$ is the bending spring constant and $x_c$ is the
  49 deflection of the cantilever (\cref{fig:unfolding-schematic}).
  50 Unfolded domains are modeled as WLCs (\cref{sec:tension:wlc}).
  51
  52 The chain of $N_f$ folded domains is modeled as a string, free to
  53 assume any extension up to some fixed contour length $L_f=N_fL_{f1}$
  54 \begin{equation}
  55   F = \begin{cases}
  56          0 & \text{if $x_f<L_f$} \;, \\
  57          \infty & \text{if $x_f>L_f$} \;,
  58       \end{cases}  \label{eq:sawsim:piston}
  59 \end{equation}
  60 where $L_{f1}$ is the separation of the two linking points of a folded
  61 domain, and $x_f$ is the end-to-end length of the chain of folded
  62 domains.  In this model, any non-zero tension will fully extend these
  63 folded domains.  As discussed in \cref{sec:tension:folded}, the
  64 contribution of the folded domains to the elastic behavior of the
  65 polymer-cantilever system is relatively insignificant.
  66
  67 % address assumptions & caveats
  68 In the simulation, the protein polymer is assumed to be stretched in
  69 the direction perpendicular to the substrate surface, which is a good
  70 approximation in most experimental situations, because the unfolded
  71 length of a protein molecule is much larger than that of the folded
  72 form.  Therefore, after one molecule unfolds, the polymer becomes much
  73 longer and the angle between the polymer and the surface approaches
  74 $90$ degrees\citep{carrion-vazquez00}.  The joints between domain
  75 groups are assumed to lie along a line between the surface tether
  76 point and the position of the tip (\cref{eq:sawsim:x-total}).  The
  77 effects of this assumption are also minimized due to greater length of
  78 the unfolded domain.  Finally, the interactions between different
  79 parts of the polymer and between the chain and the surface (except at
  80 the tethering points) are not considered.  This is reasonable since
  81 these interactions should not make substantial contributions to the
  82 force curve at the force levels of interest, where the polymer is in a
  83 relatively extended conformation.
  84
  85 % introduce constant velocity and walk through explicit example pull
  86 Consider an experiment of pulling a polymer with $N$ identical protein
  87 molecules at a constant speed.  At the start of an experiment, the
  88 chain is unstretched ($x_t=0$), which means all the domains are
  89 unstretched, the cantilever is undeflected, and the tip is in contact
  90 with the surface.  There is one domain in the cantilever state, $N$ in
  91 the folded state, and none in the unfolded state.  As the surface
  92 moves away from the tip at a constant speed $v$, the chain becomes
  93 more extended (\cref{fig:unfolding-schematic}), such that
  94 \begin{equation}
  95   x_t = \sum_i x_i = vt  \label{eq:sawsim:const-v} \;.
  96 \end{equation}
  97 The simulation assumes that the pulling takes discrete steps in space
  98 and treats $x_t$ as constant over the duration of one time step
  99 $\Delta t$.  Because of the adaptive time steps discussed in
 100 \cref{sec:sawsim:methods-timesteps}, the space steps $\Delta x_t = v\Delta t$
 101 may have different sizes, but each step will be ``small''.  At each
 102 step, the total extension is calculated using \cref{eq:sawsim:const-v}, and
 103 the tension $F(x_t=vt)$ is determined by numerically solving
 104 \cref{eq:sawsim:tension-balance,eq:sawsim:x-total} using the models
 105 \cref{eq:sawsim:hooke,eq:sawsim:wlc,eq:sawsim:piston} for known values of the parameters in
 106 the various states $(N_u, N_f, v, \kappa_c, L_{f1}, L_{u1}, p_u)$.
 107 When one of the molecules in the polymer unfolds
 108 (\cref{sec:sawsim:methods-unfolding}), there will be one domain in the
 109 unfolded state and $N-1$ in the folded state.  In the next step, a
 110 newly balanced tension between the cantilever and the polymer is
 111 determined by solving for $F(x_t)$ as discussed above, but with the
 112 total extension $x_t$ incremented by $v\Delta t$ and the new unfolded
 113 contour length $L_{u1}$ and folded contour length $(N-1)L_{f1}$.  The
 114 sudden lengthening of the polymer chain results in a corresponding
 115 abrupt drop in the force, leading to the formation of one sawtooth in
 116 the force curve.  As the pulling continues and more domains unfold,
 117 force curves with a series of sawteeth are generated
 118 (\cref{fig:sawsim:sim-sawtooth}).
 119
 120 The tension calculation assumes an equilibrated chain, so
 121 consideration must be given to the chain's relaxation time, which
 122 should be short compared to the loading timescale.  The relaxation
 123 time for a WLC\index{WLC!relaxation time} is given by
 124 \begin{equation}
 125  \tau \approx \eta \frac{k_BT p}{F^2}
 126 %  < \eta \frac{k_BT p}{(k_BTx/pL)^2} =
 127 %     Note:  <  because  F > k_BTx/pL
 128 %   < \frac{\eta p^3 L^2}{k_BT x^2}
 129 %   < \frac{\eta p^2 L}{k_BT} % for x/L > \sqrt{p/L}
 130    \;, \label{eq:sawsim:tau-wlc}
 131 \end{equation}
 132 where $\eta$ is the dynamic viscosity, $F$ is the tension, and $p$ is
 133 the persistence length\citep{evans99}.  For forces greater than
 134 $1\U{pN}$, with $\eta_\text{water}/k_BT=2.45\E{-10}\U{s/nm$^3$}$,
 135 $\tau<2\U{ns}$ for the protein polymer used in the simulation.
 136 % python -c 'print 2.45e-10*(1e9)**3 * (1.38e-23*300)**2 * 0.4e-9 / (1e-12)**2'
 137 % 1.68...e-09
 138 Therefore, the polymer chain is equilibrated almost instantaneously
 139 within a time step, which is on the order of tens of microseconds.
 140 The relaxation time of the cantilever can be determined by measuring
 141 the cantilever deflection induced by liquid motion and fitting the
 142 time dependence of the deflection to an exponential
 143 function\citep{jones05}.  For a $200\U{$\mu$m}$ rectangular cantilever
 144 with a bending spring constant of $20\U{pN/nm}$, the measured
 145 relaxation time in water is $\sim50\U{$\mu$/s}$ (data not shown.
 146 TODO: show data).  This relatively large relaxation time constant
 147 makes the cantilever act as a low-pass filter and also causes a lag in
 148 the force measurement.
 149
 150 \subsection{Unfolding protein molecules by force}
 151 \label{sec:sawsim:methods-unfolding}
 152
 153 % introduce Bell, probability calculations, and MC comparison
 154 According to the theory developed by \citet{bell78} and extended by
 155 \citet{evans99}, an external stretching force $F$ increases the
 156 unfolding rate constant of a protein molecule
 157 \begin{equation}
 158   k_u = k_{u0} \exp\p({\frac{F\Delta x_u}{k_B T}}) \;, \label{eq:sawsim:bell}
 159 \end{equation}
 160 where $k_{u0}$ is the unfolding rate in the absence of an external
 161 force, and $\Delta x_u$ is the distance between the native state and
 162 the transition state along the pulling direction.  The probability for
 163 a protein molecule to unfold under an applied force is
 164 \begin{equation}
 165   P_1 = k_u \Delta t \;,  \label{eq:sawsim:prob-one}
 166 \end{equation}
 167 where $\Delta t$ is the time duration for each pulling step, over
 168 which $F$ is constant.  This expression is accurate for $P_1 \ll 1$.
 169 From the binomial distribution, the probability of at least one of a
 170 group of $N_f$ identical domains to unfold in a given time step is
 171 \begin{equation}
 172   P = 1 - (1-P_1)^{N_f} \approx N_fP_1 \;,  \label{eq:sawsim:prob-n}
 173 \end{equation}
 174 where the approximation is valid when $N_fP_1 \ll 1$.
 175
 176 \begin{figure}
 177   \asyinclude{figures/schematic/monte-carlo}
 178   \caption{Once the unfolding probability has been caculated, we need
 179     to determine whether or not a domain should unfold.  We do this by
 180     generating a random number, and comparing that number to the
 181     unfolding probability $P$.  The random number determines which of
 182     the possible paths we should follow for the current simulation.
 183     Such ``statistical sampling'' is the hallmark of the Monte Carlo
 184     approach\citep{metropolis87}.  This cartoon translates the idea
 185     into the more familiar doors (possible paths) and dice (random
 186     numbers).%
 187     \label{fig:monte-carlo}}
 188 \end{figure}
 189
 190 To determine if an unfolding event occurs in a particular time step,
 191 the probability calculated using \cref{eq:sawsim:prob-n} is compared
 192 with a randomly generated number uniformly distributed between $0$ and
 193 $1$ (\cref{fig:monte-carlo}).  If $P$ is bigger than the random
 194 number, a domain unfolds, changing the population of each tension
 195 state, and a new balance between the polymer and the cantilever is
 196 determined.  If no unfolding event occurs the pulling continues and
 197 the unfolding probability is calculated again in the next step at a
 198 higher force.  When all the molecules in the polymer have unfolded,
 199 the pulling continues until a pre-determined force level is reached,
 200 where the polymer is assumed to detach from one of the tethering
 201 surfaces.  The cantilever deflection becomes zero after this point.
 202
 203 % address unfolding models
 204 Although the Bell model (\cref{eq:sawsim:bell}) is the most widely used
 205 unfolding model due to its simplicity and its applicability to various
 206 biopolymers\citep{rief98}, other theoretical models have been proposed
 207 to interpret mechanical unfolding data.  For example,
 208 \citet{schlierf06} used the mechanical unfolding data of the protein
 209 ddFLN4 to demonstrate that Kramers' diffusion model fit the measured
 210 unfolding force data better than the Bell model for proteins with
 211 broad free energy barriers.  For proteins with relatively narrow
 212 folded and transition states, the Bell model provides a good
 213 approximation.
 214
 215 \subsection{Choosing the simulation time steps}
 216 \label{sec:sawsim:methods-timesteps}
 217
 218 The demands on the time step vary throughout a simulated pull due to
 219 the non-linear elasticity of the polymer.  Within a specified time
 220 duration (or pulling distance), the force change is small at low force
 221 levels and large at high force levels.  To be efficient, the
 222 simulation algorithm adapts the time step to keep the time steps large
 223 where large time steps have little effect, while shrinking the time
 224 step where smaller steps are necessary.
 225
 226 Within each time step, the total chain extension $x_t$ is treated as a
 227 constant and a force balance is reached very quickly among the various
 228 domains (see \cref{sec:sawsim:methods-tension} for equilibration timescales).
 229 This force is used to determine the unfolding probability
 230 (\cref{eq:sawsim:prob-one,eq:sawsim:prob-n}), which determines the domain state
 231 populations in the next time step.  Therefore, the chain tension must
 232 not change appreciably over the course of the time step ($\Delta F <
 233 1\U{pN}$), and the unfolding probability is only calculated once for
 234 the entire step.  The time step must also be short enough that the
 235 probability of unfolding in a single time step is low ($P<10^{-3}$).
 236 Besides ensuring that the approximations made in
 237 \cref{eq:sawsim:prob-one,eq:sawsim:prob-n} are valid, this restriction makes
 238 time steps which should have multiple unfoldings in a single time step
 239 highly unlikely.  Experimentally measured unfolding are temporally
 240 separated, because the unfolding transition is characterized by
 241 multiple, Markovian attempts over a large energy barrier, where the
 242 probability of crossing the barrier in a single attempt is very low.
 243 A successful attempt quickly extends the chain contour length,
 244 reducing the tension, dramatically reducing the likelihood of a second
 245 escape in that time step.  The time step used is recalculated for each
 246 step so that both of these criteria are satisfied.