From: W. Trevor King Date: Wed, 17 Feb 2010 23:53:33 +0000 (-0500) Subject: Broke sawsim chapter into sections and added local_words X-Git-Tag: v1.0~459 X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=0bd11cdd6ee25ba830b0878815f072d9dfe53dd5;p=thesis.git Broke sawsim chapter into sections and added local_words --- diff --git a/tex/src/Makefile b/tex/src/Makefile index dc4028b..e99154c 100644 --- a/tex/src/Makefile +++ b/tex/src/Makefile @@ -6,6 +6,8 @@ all : @for i in $(SUBDIRS); do \ echo "make all in $$i..."; \ (cd $$i; $(MAKE) $(MFLAGS) all); done + @for file in $(shell find . -name '*.tex'); do \ + cat local_words >> $$file; done clean : @for i in $(SUBDIRS); do \ diff --git a/tex/src/introduction/main.tex b/tex/src/introduction/main.tex index f4ec3a1..927a76c 100644 --- a/tex/src/introduction/main.tex +++ b/tex/src/introduction/main.tex @@ -1 +1,134 @@ \chapter{Introduction} +\label{sec:intro} + +% why single molecule approach? +In biological systems the most important molecules, such as proteins, +nucleic acids, and polysaccharides, are all polymers. Understanding +the properties and functions of these polymeric molecules is crucial +in elucidating the molecular mechanisms of structures and processes in +cells. The large size of these molecules imposes certain limitations +on the information attainable from bulk measurements, because the +macromolecules in a population can have diverse conformations and +react differently to external stimuli. The individualized, and +sometimes rare, behaviors of macromolecules can have important +implications for their functions inside the cell. Recently developed +single molecule techniques, in which these macromolecules are studied +one at a time, have provided important and complementary information +about the functional mechanisms of several biological +systems\citep{bustamante08}. + +% why AFM & what an AFM is +Single molecule techniques for the study of biological macromolecules +include optical measurements, \ie, single molecule fluorescence +microscopy and spectroscopy, and mechanical manipulations of +individual macromolecules, \ie, force microscopy and spectroscopy +using atomic force microscopes (AFM), laser tweezers\citep{forde02}, +magnetic tweezers\citep{smith92}, or biomembrane force +probes\citep{merkel99}. Of these mechanical manipulation methods, AFM +is the most widely used due to the availability of user-friendly +commercial instruments. AFM has been employed on several types of +biological macromolecules, such as mechanically unfolding +proteins\citep{carrion-vazquez99b} and forcing structural transitions +in DNA\citep{rief99} and polysaccharides\citep{rief97a}. An AFM uses +a sharp tip integrated at the end of a cantilever to interact with the +sample. Cantilever bending is measured by a laser reflected off the +cantilever and incident on a position sensitive photodetector. When +the bending force constant of the cantilever is known\citep{levy02}, +the force applied to the sample can be calculated. The forces that +can be applied and measured with an AFM range from tens of piconewtons +to hundreds of nanonewtons. The investigation of the unfolding and +refolding processes of individual protein molecules by the AFM is +feasible because many globular proteins unfold under external forces +in this range. Since elucidating the mechanism of protein folding is +currently one of the most important problems in biological sciences, +the potential of the AFM for revealing significant and unique +information about protein folding has stimulated much effort in both +experimental and theoretical research. + +% AFM unfolding procedure +In a mechanical unfolding experiment, a protein polymer is tethered +between two surfaces: a flat substrate and an AFM tip. The polymer is +stretched by increasing the separation between the two surfaces +(\cref{fig:schematic}). The most common mode is the constant speed +experiment in which the substrate surface is moved away from the tip +at a uniform rate. The tethering surfaces, \ie, the AFM tip and the +substrate, have much larger radii of curvature than the dimensions of +single domain globular proteins that are normally used for folding +studies. This causes difficulties in manipulating individual protein +molecules because nonspecific interactions between the AFM tip and the +substrate may be stronger than the forces required to unfold the +protein when the surfaces are a few nanometers apart. To circumvent +these difficulties, globular protein molecules are linked into +polymers, which are then used in the AFM +studies\citep{carrion-vazquez99b,chyan04,carrion-vazquez03}. When +such a polymer is pulled from its ends, each protein molecule feels +the externally applied force, which increases the probability of +unfolding by reducing the free energy barrier between the native and +unfolded states. The unfolding of one molecule in the polymer causes +a sudden lengthening of the polymer chain, which reduces the force on +each protein molecule and prevents another unfolding event from +occurring immediately. The force versus extension relationship, or +\emph{force curve}, shows a typical sawtooth pattern +(\cref{fig:expt-sawtooth}), where each peak corresponds to the +unfolding of a single protein in the polymer. Therefore, the +individual unfolding events are separated from each other in space and +time, facilitating single molecule studies. + +% AFM unfolding analysis, what we'll do. +Much theoretical and computational work has been done in order to +extract information about the structural, kinetic, and energetic +properties of the protein molecules from the experimental data of +force-induced protein unfolding measurements. Steered molecular +dynamics simulations\citep{lu98}, as well as calculations and +simulations using lattice\citep{lu99} and off-lattice +models\citep{klimov00,li01}, have provided insights into structural +and energetic changes during force-induced protein unfolding. +However, these simulations often involve time scales that are orders +of magnitude smaller than those of the experiments, and the parameters +used in the calculations are often neither experimentally controllable +nor measurable. As a result, a Monte Carlo simulation approach based +on a simple two-state kinetic model for the protein is usually used to +analyze data from mechanical unfolding experiments. A comparison of +the force curves measured experimentally and those generated from +simulations can yield the unfolding rate constant of the protein in +the absence of force as well as the distance from the native state to +the transition state along the pulling direction. The Monte Carlo +simulation method has been used since the first report of mechanical +unfolding experiment using AFM\citep{rief97b}, +%,rief97a,rief98,carrion-vazquez99b,best02,zinober02,jollymore09}, +however, a comprehensive description and discussion of the simulation +procedures and the intricacies involved has not been reported. In +this paper, we provide a detailed description of the simulation +procedure, including the theories, approximations, and assumptions +involved. We also explain the procedure for extracting kinetic +properties of the protein from experimental data and introduce a +quantitative measure of fit quality between simulation and +experimental results. In addition, the effects of various +experimental parameters on force curve appearance are demonstrated, +and the errors associated with different methods of data pooling are +discussed. We believe that these results will be useful in +experimental design, artifact identification, and data analysis for +single molecule mechanical unfolding experiments. + +\begin{figure} +\begin{center} +\subfloat[][]{\includegraphics{sawsim/schematic}\label{fig:schematic}} +%\hspace{.25in}% +\subfloat[][]{\includegraphics{sawsim/expt-sawtooth}\label{fig:expt-sawtooth}} +\caption{(a) Schematic of the experimental setup for mechanical + unfolding of proteins using an AFM (not to scale). An experiment + starts with the tip in contact with the substrate surface, which is + then moved away from the tip at a constant speed. $x_t$ is the + distance traveled by the substrate, $x_c$ is the cantilever + deflection, $x_u$ is the extension of the unfolded polymer, and + $x_f=x_{f1}+x_{f2}$ is the extension of the folded polymer. (b) An + experimental force curve from stretching a ubiquitin polymer with + the rising parts of the peaks fitted to the WLC + model\citep{chyan04}. The pulling speed used was $1\U{$\mu$m/s}$. + The irregular features at the beginning of the curve are due to + nonspecific interactions between the tip and the substrate surface, + and the last high force peak is caused by the detachment of the + polymer from the tip or the substrate surface. Note that the + abscissa is the extension of the protein chain $x_t-x_c$.} +\end{center} +\end{figure} diff --git a/tex/src/local_words b/tex/src/local_words new file mode 100644 index 0000000..f7b26b9 --- /dev/null +++ b/tex/src/local_words @@ -0,0 +1,25 @@ +%% Ispell local dictionary: + +%% Science words +% LocalWords: polysaccharides polypeptide biomembrane biopolymers photodetector +% LocalWords: piconewtons nanonewtons nanometers vt timesteps timestep +% LocalWords: ubiquitin titin sawteeth octamer octameric +% LocalWords: unstretched undeflected unfoldings underdamped +% LocalWords: rebalances nonspecific equilibrated + +%% Abbreviations: +% LocalWords: Tel AFM afm WLC wlc FJC fjc PACS MSC pN nm MC ddFLN + +%% Names and related +% LocalWords: Hookean hooke Kramers kramers Mascheroni Kullback Leibler +% LocalWords: Markovian msu Meihong Su gyang Guoliang CSIRO Drexel +% LocalWords: bustamante forde smith merkel carrion vazquez rief levy fernandez +% LocalWords: chyan lu klimov li zinober jollymore marko sims evans schlierf +% LocalWords: janshoff granzier linke verdier dicola lin + +%% LaTeX and related +% LocalWords: documentclass elsarticle pt tnoteref tnotetext fnref fntext +% LocalWords: corref cortext ead form url cor TODO + +%% Reference abbreviations +% LocalWords: sec fig eq expt sim dep const prob hist hists diff --git a/tex/src/sawsim/conclusions.tex b/tex/src/sawsim/conclusions.tex new file mode 100644 index 0000000..60fe984 --- /dev/null +++ b/tex/src/sawsim/conclusions.tex @@ -0,0 +1,15 @@ +\section{Conclusions} +\label{sec:sawsim:conclusions} + +We have described the method of performing Monte Carlo simulations +based on a simple two-state model for the mechanical unfolding of +protein molecules and discussed the complications involved in the +simulation procedure. In addition to the extraction of kinetic +properties of the protein from mechanical unfolding data, such +simulations can help to elucidate the effects of various experimental +parameters on the appearance of force curves and to estimate the +errors associated with data pooling. To date, the force-induced +unfolding approach has been used to investigate several different +types of proteins. As the technique is used to study a wider range of +proteins, this simple simulation method will be useful for data +analysis, experimental design, and artifact identification. diff --git a/tex/src/sawsim/discussion.tex b/tex/src/sawsim/discussion.tex new file mode 100644 index 0000000..6e65fd6 --- /dev/null +++ b/tex/src/sawsim/discussion.tex @@ -0,0 +1,355 @@ +\section{Results and Discussion} +\label{sec:sawsim:results} + +\subsection{Force curves generated by simulation} +\label{sec:sawsim:results-force_curves} + +\Cref{fig:sawsim:sim-sawtooth} shows three simulated force curves from +pulling a polymer composed of eight identical protein molecules using +parameters from typical experimental settings. The order of the peaks +in the force curves reflects the temporal sequence of the unfolding +events instead of the positions of the protein molecules in the +polymer\cite{li00}. As observed experimentally +(\cref{fig:expt-sawtooth}), the forces at which identical protein +molecules unfold fluctuate, revealing the stochastic nature of protein +unfolding since no instrumental noise is included in the simulation. +\Cref{fig:sawsim:sim-hist} shows the distribution of the unfolding forces, +\ie, the highest force in each peak (except the last peak in a force +curve), from a total of $400$ force curves ($3200$ force values). The +unfolding forces have an average of $281\U{pN}$ with a standard +deviation of $25\U{pN}$. +% DONE: discuss noise? no. + +\begin{figure} +\vspace{-1in} +\begin{center} +\subfloat[][]{\includegraphics{sawsim/sim-sawtooth}\label{fig:sawsim:sim-sawtooth}% +}\\ +\subfloat[][]{\includegraphics{sawsim/sim-hist}\label{fig:sawsim:sim-hist}% +} +\caption{(a) Three simulated force curves from pulling a polymer of + eight identical protein molecules. The simulation was carried out + using the parameters: pulling speed $v=1\U{$\mu$m/s}$, cantilever + spring constant $\kappa_c=50\U{pN/nm}$, temperature $T=300\U{K}$, + persistence length of unfolded proteins $p_u=0.40\U{nm}$, $\Delta + x_u=0.225\U{nm}$, and $k_{u0}=5\E{-5}\U{s$^{-1}$}$. The contour + length between the two linking point on a protein molecule is + $L_{f1}=3.7\U{nm}$ in the folded form and $L_{u1}=28.1\U{nm}$ in the + unfolded form. These parameters are those of ubiquitin molecules + connected through the N-C termini\citep{chyan04,carrion-vazquez03}. + Detachment from the tip or substrate is assumed to occur at a force + of $400\U{pN}$. In experiments, detachments have been observed to + occur at a variety of forces. (b) The distribution of the unfolding + forces from $400$ simulated force curves ($3200$ data points) such + as that shown in (a). The frequency is normalized by the total + number of points, \ie, the height of each bin is equal to the number + of data points in that bin divided by the total number of data + points ($3200$, for this histogram).\label{fig:sawsim:sim-all}} +\end{center} +\end{figure} + +\subsection{Dependence of the unfolding force on the unfolding order +and polymer length} +\label{sec:sawsim:results-scaffold} + +Analysis of the mechanical unfolding data is complicated by the +dependence of the average unfolding force on the unfolding order due +to the serial linkage of the molecules. Under an external stretching +force $F$, the probability of some domain unfolding in a polymer with +$N_f$ folded domains is $N_fP_1$ (\cref{eq:sawsim:prob-n}), which is higher +than the unfolding probability for a single molecule $P_1$. +Consequently, the average unfolding force is lower for the earlier +unfolding events when $N_f$ is larger, and the force should increase +as more and more molecules become unfolded. However, there is a +competing factor that opposes this trend. As the protein molecules +unfold, the chain becomes softer and the force loading rate becomes +lower when the pulling speed is constant, leading to a decrease in the +unfolding force. The dependence of the average unfolding force on the +unfolding order is the result of these two opposing effects. +\Cref{fig:sawsim:order-dep} shows the dependence of the average unfolding +force on the unfolding force peak order (the temporal order of +unfolding events) for four polymers with $4$, $8$, $12$, and $16$ +identical protein molecules. The effect of polymer chain softening +dominates the initial unfolding events, and the average unfolding +force decreases as more molecules unfold. After several molecules +have unfolded, the softening for each additional unfolding event +becomes less significant, the change in unfolding probability becomes +dominant, and the unfolding force increases upon each subsequent +unfolding event\citep{zinober02}. + +We validate this explanation by calculating the unfolding force +probability distribution's depending on the two competing factors. +The rate of unfolding events with respect to force is +\begin{align} + r_{uF} &= -\deriv{F}{N_f} + = -\frac{\dd N_f/\dd t}{\dd F/\dd t} + = \frac{N_f k_u}{\kappa v} \\ + &= \frac{N_f k_{u0}}{\kappa v}\exp\p({\frac{F\Delta x_u}{k_B T}}) + = \frac{1}{\rho}\exp\p({\frac{F-\alpha}{\rho}}) \;, +\end{align} +where $N_f$ is the number of folded domain, +$\kappa=(1/\kappa_c+N_u/\kappa_\text{WLC})^{-1}$ is the spring +constant of the cantilever-polymer system ($\kappa_\text{WLC}$ is the +effective spring constant of one unfolded domain, assumed constant for +a particular polymer/cantilever combination), $\kappa v$ is the force +loading rate, and $k_u$ is the unfolding rate constant +(\cref{eq:sawsim:bell}). In the last expression, $\rho\equiv k_BT/\Delta +x_u$, and $\alpha\equiv-\rho\ln(N_fk_{u0}\rho/\kappa v)$. The event +probability density for events with an exponentially increasing +likelihood function follows the Gumbel (minimum) probability +density\cite{NIST:gumbel}, with $\rho$ and $\alpha$ being the scale +and location parameters, respectively +\begin{equation} + \mathcal{P}(F) = \frac{1}{\rho} \exp\p[{\frac{F-\alpha}{\rho} + -\exp\p({\frac{F-\alpha}{\rho}}) + }] \;. \label{eq:sawsim:gumbel} +\end{equation} +The distribution has a mean $\avg{F}=\alpha-\gamma_e\rho$ and a +variance $\sigma^2=\pi^2\rho^2/6$, where $\gamma_e=0.577\ldots$ is the +Euler-Mascheroni constant. Therefore, the unfolding force +distribution has a variance $\sigma^2 = (\pi k_BT/\Delta x_u)^2/6$, +and and average +\begin{equation} + \avg{F(i)} = \frac{k_BT}{\Delta x_u} + \p[{-\ln\p({\frac{N_fk_{u0}k_BT}{\kappa v\Delta x_u}}) + -\gamma_e}] \;, \label{eq:sawsim:order-dep} +\end{equation} +where $N_f$ and $\kappa$ depend on the domain index $i=N_u$. Curves based +on this formula fit the simulated data remarkably well considering the +effective WLC stiffness $\kappa_\text{WLC}$ is the only fitted +parameter, and that the actual WLC stiffness is not constant, as we +have assumed here, but a non-linear function of $F$. + +From \cref{fig:sawsim:order-dep}, it can be seen that the proper way to +process data from mechanical unfolding experiments is to group the +curves according to the length of the polymer and to perform +statistical analysis separately for peaks with the same unfolding +order. However, in most experiments, the tethering of the polymer to +the AFM tip is by nonspecific adsorption; as a result, the polymers +being stretched between the tip and the substrate have various +lengths. In addition, the interactions between the tip and the +surface often cause irregular features in the beginning of the force +curve (\cref{fig:expt-sawtooth}), making the identification of the +first peak uncertain. Furthermore, it is often difficult to acquire a +large amount of data in single molecules experiments. These +difficulties make the aforementioned data analysis approach unfeasible +for many mechanical unfolding experiments. As a result, the values of +all force peaks from polymers of different lengths are often pooled +together for statistical analysis. To assess the errors caused by +such pooling, simulation data were analyzed using different pooling +methods and the results were compared. \Cref{fig:sawsim:sim-hist} shows +that, for a polymer with eight protein molecules, the average +unfolding force is $281\U{pN}$ with a standard deviation of $25\U{pN}$ +when all data is pooled. If only the first peaks in the force curves +are analyzed, the average force is $279\U{pN}$ with a standard +deviation of $22\U{pN}$. While for the fourth and eighth peaks, the +average force are $275\U{pN}$ and $300\U{pN}$, respectively, and the +standard deviations are $23\U{pN}$ and $25\U{pN}$, respectively. As +expected from the Gumbel distribution, the width of the unfolding +force distribution (insets in \cref{fig:sawsim:order-dep}) is only weakly +effected by unfolding order, but the average unfolding force can be +quite different for the same protein because of the differences in +unfolding order and polymer length. + +\begin{figure} +\begin{center} +\includegraphics{sawsim/order-dep} +\caption{The dependence of the unfolding force on the temporal + unfolding order for four polymers with $4$, $8$, $12$, and $16$ + molecules of identical proteins. Each point in the figure is the + average of $400$ data points. The first point in each curve + represents the average of only the first peak in each of the $400$ + simulated force curves, the second point represents the average of + only the second peak, and so on. The solid lines are fits of + \cref{eq:sawsim:order-dep} to the simulated data, with best fit + $\kappa_\text{WLC}=203$, $207$, $161$, and $157\U{pN/nm}$, + respectively, for lengths $4$ through $16$. The insets show the + force distributions of the first, fourth, and eighth peaks, left to + right, for the polymer with eight protein molecules. The parameters + used for generating the data were the same as those used for + \cref{fig:sawsim:sim-sawtooth}, except the polymer length, and the + histograms in the insets were normalized in the same way as in + \cref{fig:sawsim:sim-hist}.\label{fig:sawsim:order-dep}} +\end{center} +\end{figure} + +\subsection{The effect of polymer inhomogeneity} +\label{sec:sawsim:results-folded-tension} + +The unfolded polypeptide chain has been shown to follow the WLC model +quite well, though other polymer models, such as the Freely-Jointed +Chain (FJC)\citep{verdier70}, can be used to fit the force-extension +relationship\citep{janshoff00}. A short chain of folded proteins, +however, cannot be described well by polymer models. Several studies +have used WLC and FJC models to fit the elastic properties of the +modular protein titin\citep{granzier97,linke98a}, +% TODO: check it really is folded domains \& bulk titin +but native titin contains hundreds of folded and unfolded domains +domains. For the short protein polymers common in mechanical +unfolding experiments, the cantilever dominates the elasticity of the +polymer-cantilever system before any protein molecules unfold. After +the first unfolding event occurs, the unfolded portion of the chain is +already longer and softer than the sum of all the remaining folded +domains, and dominates the elasticity of the whole chain. Therefore, +the details of the tension model chosen for the folded domains has +negligible effect on the unfolding forces, which was also suggested by +\citet{staple08}. Force curves simulated using different models to +describe the folded domains yielded almost identical unfolding force +distributions (data not shown). + +\subsection{The effect of cantilever force constant} +\label{sec:sawsim:cantilever} + +In mechanical unfolding experiments, the ability to observe the +unfolding of a single protein molecule depends on the tension drop +after an unfolding event such that another molecule does not unfold +immediately. The magnitude of this drop is determined by many +factors, including the magnitude of the unfolding force, the contour +and persistence lengths of the protein polymer, the contour length +increase from unfolding, and the stiffness (force constant) of the +cantilever. Among these, the effect of the cantilever force constant +is particularly interesting because cantilevers with a wide range of +force constants are available. In addition different single molecule +manipulation techniques, such as the AFM and laser tweezers, differ +mainly in the range of the spring constants of their force +transducers. \Cref{fig:sawsim:kappa-sawteeth} shows the simulated force +curves from pulling an octamer of protein molecules using cantilevers +with different force constants, while other parameters are identical. +For this model protein, the appearance of the force curve does not +change much until the force constant of the cantilever reaches a +certain value ($\kappa_c=50\U{pN/nm}$). When $\kappa_c$ is lower than +this value, the individual unfolding events become less identifiable. +In order to observe individual unfolding events, the cantilever needs +to have a force constant high enough so that the bending at the +maximum force is small in comparison with the contour length increment +from the unfolding of a single molecule. \Cref{fig:sawsim:kappa-sawteeth} +also shows that the back side of the force peaks becomes more tilted +as the cantilever becomes softer. This is due to the fact that the +extension (end-to-end distance) of the protein polymer has a large +sudden increase as the tension rebalances after an unfolding event. + +It should also be mentioned that the contour length increment from +each unfolding event is not equal to the distance between adjacent +peaks in the force curve because the chain is never fully stretched. +This contour length increase can only be obtained by fitting the curve +to WLC or other polymer models (\cref{fig:expt-sawtooth}). + +\begin{figure} +\begin{center} +\includegraphics{sawsim/kappa-sawteeth} +\caption{Simulated force curves obtained from pulling a polymer with + eight protein molecules using cantilevers with different force + constants $\kappa_c$. Parameters used in generating these curves + are the same as those used in \cref{fig:sawsim:sim-all}, except the + cantilever force constant.\label{fig:sawsim:kappa-sawteeth}} +\end{center} +\end{figure} + +\subsection{Determination of $\Delta x_u$ and $k_{u0}$} +\label{sec:sawsim:results-fitting} + +The zero-force unfolding rate $k_{u0}$ and the distance $\Delta x_u$ +from the native state to the transition state are the two kinetic +parameters obtainable for mechanical unfolding experiments by matching +the simulated data with measured results. \cref{fig:sawsim:v-dep} shows the +dependence of the unfolding force on the pulling speed for different +values of $k_{u0}$ and $\Delta x_u$. As expected, the unfolding force +increases linearly with the pulling speed in the linear-log +plot\citep{evans99}. While the magnitude of the unfolding forces is +affected by both $k_{u0}$ and $\Delta x_u$, the slope of speed +dependence is primarily determined by $\Delta x_u$. +\Cref{fig:sawsim:width-v-dep} shows that the width of the unfolding force +distribution is very sensitive to $\Delta x_u$, as expected from the +Gumbel distribution discussed in \cref{sec:sawsim:results-scaffold}. To +obtain the values of $k_{u0}$ and $\Delta x_u$ for the protein, the +pulling speed dependence and the distribution of the unfolding forces +from simulation, such as those shown in \cref{fig:sawsim:v-dep} and the +insets of \cref{fig:sawsim:width-v-dep}, are compared with the experimentally +measured results. The values of $k_{u0}$ and $\Delta x_u$ that +provide the best match are designated as the parameters describing the +protein under study. Since $k_{u0}$ and $\Delta x_u$ affect the +unfolding forces differently, the values of both parameters can be +determined simultaneously. The data used in plotting +\cref{fig:sawsim:all-v-dep} includes all force peaks from the simulated force +curves because most experimental data is analyzed that way. + +In most published literature, determination of the values of $k_{u0}$ +and $\Delta x_u$ was mostly done by carrying out simulations using a +handful of possible unfolding parameters and selected the best fit by +eye% +%\citep{us,CV1999} +. This approach does not allow estimation of uncertainties in the +fitting parameters, as shown by \citet{best02}. A more rigorous +approach involves quantifying the quality of fit between the +experimental and simulated force distributions, allowing the use of a +numerical minimization algorithm to pick the best fit parameters. We +use the Jensen-Shannon divergence\citep{sims09,lin91}, a measure of +the similarity between two probability distributions. +\begin{equation} + D_\text{JS}(p_e, p_s) + = D_\text{KL}(p_e, p_m) + D_\text{KL}(p_s, p_m) \;, \label{eq:sawsim:D_JS} +\end{equation} +where $p_e(i)$ and $p_s(i)$ are the the values of the $i^\text{th}$ +bin in the experimental and simulated unfolding force histograms, +respectively. $D_\text{KL}$ is the Kullback-Leibler divergence +\begin{equation} + D_\text{KL}(p_p,p_q) + = \sum_i p_p(i) \log_2\p({\frac{p_p(i)}{p_q(i)}}) \;, \label{eq:sawsim:D_KL} +\end{equation} +where the sum is over all unfolding force histogram bins. $p_m$ is +the symmetrized probability distribution +\begin{equation} + p_m(i) \equiv [p_e(i)+p_s(i)]/2 \;. +\end{equation} +% DONE: Mention inter-histogram normalization? no. +% For experiments carried out over a series of pulling velocities, we +% simply sum residuals computed for each velocity, although it would +% also be reasonable to weight this sum according to the number of +% experimental unfolding events recorded for each velocity. +% DONE: mention $D_\text{JS}$ features to explains selection over $\chi^2$? no. +\Cref{fig:sawsim:fit-space} shows the Jensen-Shannon divergence calculated +using \cref{eq:sawsim:D_JS} between an experimental data set and simulation +results obtained using a range of values of $k_{u0}$ and $\Delta x_u$. +There is an order of magnitude range of $k_{u0}$ that produce +reasonable fits to experimental data (\cref{fig:sawsim:fit-space}), which is +consistent with the results \citet{best02} obtained using a chi-square +test. The values of $k_{u0}$ and $\Delta x_u$ can be determined to +higher precision by using both the pulling speed dependent data and +the unfolding force distribution, as well as any relevant information +about the protein from other sources. + +\begin{figure} +\begin{center} +\subfloat[][]{\includegraphics{sawsim/v-dep}\label{fig:sawsim:v-dep}% +} \\ +\subfloat[][]{\includegraphics{sawsim/v-dep-sd}\label{fig:sawsim:width-v-dep}% +} +\caption{(a) The dependence of the unfolding forces on the pulling + speed for three different model protein molecules characterized by + the parameters $k_{u0}$ and $\Delta x_u$. The polymer length is + eight molecules, and each symbol is the average of $3200$ data + points. (b) The dependence of standard deviation of the unfolding + force distribution on the pulling speed for the simulation data + shown in (a), using the same symbols. The insets show the force + distribution histograms for the three proteins at the pulling speed + of $1\U{$\mu$m/s}$. The left, middle and right histograms are for + the proteins represented by the top, middle, and bottom lines in (a), + respectively.\label{fig:sawsim:all-v-dep}} +\end{center} +\end{figure} + +\begin{figure} +\begin{center} +\includegraphics{sawsim/fit-space} +\caption{Fit quality between an experimental data set and simulated + data sets obtained using various values of unfolding rate parameters + $k_{u0}$ and $\Delta x_u$. The experimental data are from octameric + ubiquitin pulled at $1\U{$\mu$m/s}$\citep{chyan04}, and the other + model parameters are the same as those in \cref{fig:sawsim:sim-all}. The + best fit parameters are $\Delta x_u=0.17\U{nm}$ and + $k_{u0}=1.2\E{-2}\U{s$^{-1}$}$. The simulation histograms were + built from $400$ pulls at for each parameter pair. The color scale + shown on the right is $log_{10}(D_\text{JS})$. +\label{fig:sawsim:fit-space}} +\end{center} +\end{figure} diff --git a/tex/src/sawsim/introduction.tex b/tex/src/sawsim/introduction.tex new file mode 100644 index 0000000..3634aa5 --- /dev/null +++ b/tex/src/sawsim/introduction.tex @@ -0,0 +1,4 @@ +\section{Introduction} +\label{sec:sawsim:introduction} + +TODO diff --git a/tex/src/sawsim/main.tex b/tex/src/sawsim/main.tex index 28d36a8..4773eb8 100644 --- a/tex/src/sawsim/main.tex +++ b/tex/src/sawsim/main.tex @@ -1,775 +1,7 @@ \chapter{Monte Carlo mechanical unfolding simulation} +\label{sec:sawsim} -\section{Introduction} -\label{sec:sawsim:intro} - -% why single molecule approach? -In biological systems the most important molecules, such as proteins, -nucleic acids, and polysaccharides, are all polymers. Understanding -the properties and functions of these polymeric molecules is crucial -in elucidating the molecular mechanisms of structures and processes in -cells. The large size of these molecules imposes certain limitations -on the information attainable from bulk measurements, because the -macromolecules in a population can have diverse conformations and -react differently to external stimuli. The individualized, and -sometimes rare, behaviors of macromolecules can have important -implications for their functions inside the cell. Recently developed -single molecule techniques, in which these macromolecules are studied -one at a time, have provided important and complementary information -about the functional mechanisms of several biological -systems\citep{bustamante08}. - -% why AFM & what an AFM is -Single molecule techniques for the study of biological macromolecules -include optical measurements, \ie, single molecule fluorescence -microscopy and spectroscopy, and mechanical manipulations of -individual macromolecules, \ie, force microscopy and spectroscopy -using atomic force microscopes (AFM), laser tweezers\citep{forde02}, -magnetic tweezers\citep{smith92}, or biomembrane force -probes\citep{merkel99}. Of these mechanical manipulation methods, AFM -is the most widely used due to the availability of user-friendly -commercial instruments. AFM has been employed on several types of -biological macromolecules, such as mechanically unfolding -proteins\citep{carrion-vazquez99b} and forcing structural transitions -in DNA\citep{rief99} and polysaccharides\citep{rief97a}. An AFM uses -a sharp tip integrated at the end of a cantilever to interact with the -sample. Cantilever bending is measured by a laser reflected off the -cantilever and incident on a position sensitive photodetector. When -the bending force constant of the cantilever is known\citep{levy02}, -the force applied to the sample can be calculated. The forces that -can be applied and measured with an AFM range from tens of piconewtons -to hundreds of nanonewtons. The investigation of the unfolding and -refolding processes of individual protein molecules by the AFM is -feasible because many globular proteins unfold under external forces -in this range. Since elucidating the mechanism of protein folding is -currently one of the most important problems in biological sciences, -the potential of the AFM for revealing significant and unique -information about protein folding has stimulated much effort in both -experimental and theoretical research. - -% AFM unfolding procedure -In a mechanical unfolding experiment, a protein polymer is tethered -between two surfaces: a flat substrate and an AFM tip. The polymer is -stretched by increasing the separation between the two surfaces -(\cref{fig:sawsim:schematic}). The most common mode is the constant speed -experiment in which the substrate surface is moved away from the tip -at a uniform rate. The tethering surfaces, \ie, the AFM tip and the -substrate, have much larger radii of curvature than the dimensions of -single domain globular proteins that are normally used for folding -studies. This causes difficulties in manipulating individual protein -molecules because nonspecific interactions between the AFM tip and the -substrate may be stronger than the forces required to unfold the -protein when the surfaces are a few nanometers apart. To circumvent -these difficulties, globular protein molecules are linked into -polymers, which are then used in the AFM -studies\citep{carrion-vazquez99b,chyan04,carrion-vazquez03}. When -such a polymer is pulled from its ends, each protein molecule feels -the externally applied force, which increases the probability of -unfolding by reducing the free energy barrier between the native and -unfolded states. The unfolding of one molecule in the polymer causes -a sudden lengthening of the polymer chain, which reduces the force on -each protein molecule and prevents another unfolding event from -occurring immediately. The force versus extension relationship, or -\emph{force curve}, shows a typical sawtooth pattern -(\cref{fig:sawsim:expt-sawtooth}), where each peak corresponds to the -unfolding of a single protein in the polymer. Therefore, the -individual unfolding events are separated from each other in space and -time, facilitating single molecule studies. - -% AFM unfolding analysis, what we'll do. -Much theoretical and computational work has been done in order to -extract information about the structural, kinetic, and energetic -properties of the protein molecules from the experimental data of -force-induced protein unfolding measurements. Steered molecular -dynamics simulations\citep{lu98}, as well as calculations and -simulations using lattice\citep{lu99} and off-lattice -models\citep{klimov00,li01}, have provided insights into structural -and energetic changes during force-induced protein unfolding. -However, these simulations often involve time scales that are orders -of magnitude smaller than those of the experiments, and the parameters -used in the calculations are often neither experimentally controllable -nor measurable. As a result, a Monte Carlo simulation approach based -on a simple two-state kinetic model for the protein is usually used to -analyze data from mechanical unfolding experiments. A comparison of -the force curves measured experimentally and those generated from -simulations can yield the unfolding rate constant of the protein in -the absence of force as well as the distance from the native state to -the transition state along the pulling direction. The Monte Carlo -simulation method has been used since the first report of mechanical -unfolding experiment using AFM\citep{rief97b}, -%,rief97a,rief98,carrion-vazquez99b,best02,zinober02,jollymore09}, -however, a comprehensive description and discussion of the simulation -procedures and the intricacies involved has not been reported. In -this paper, we provide a detailed description of the simulation -procedure, including the theories, approximations, and assumptions -involved. We also explain the procedure for extracting kinetic -properties of the protein from experimental data and introduce a -quantitative measure of fit quality between simulation and -experimental results. In addition, the effects of various -experimental parameters on force curve appearance are demonstrated, -and the errors associated with different methods of data pooling are -discussed. We believe that these results will be useful in -experimental design, artifact identification, and data analysis for -single molecule mechanical unfolding experiments. - -\begin{figure} -\begin{center} -\subfloat[][]{\includegraphics{sawsim/schematic}\label{fig:sawsim:schematic}} -%\hspace{.25in}% -\subfloat[][]{\includegraphics{sawsim/expt-sawtooth}\label{fig:sawsim:expt-sawtooth}} -\caption{(a) Schematic of the experimental setup for mechanical - unfolding of proteins using an AFM (not to scale). An experiment - starts with the tip in contact with the substrate surface, which is - then moved away from the tip at a constant speed. $x_t$ is the - distance traveled by the substrate, $x_c$ is the cantilever - deflection, $x_u$ is the extension of the unfolded polymer, and - $x_f=x_{f1}+x_{f2}$ is the extension of the folded polymer. (b) An - experimental force curve from stretching a ubiquitin polymer with - the rising parts of the peaks fitted to the WLC - model\citep{chyan04}. The pulling speed used was $1\U{$\mu$m/s}$. - The irregular features at the beginning of the curve are due to - nonspecific interactions between the tip and the substrate surface, - and the last high force peak is caused by the detachment of the - polymer from the tip or the substrate surface. Note that the - abscissa is the extension of the protein chain $x_t-x_c$.} -\end{center} -\end{figure} - -\section{Methods} -\label{sec:sawsim:methods} - -% simulation overview -In simulating the mechanical unfolding process, a force curve is -generated by calculating the amount of the cantilever bending as the -substrate surface moves away from the tip. The cantilever bending is -obtained by balancing the tension in the protein polymer and the -Hookean force of the bent cantilever. The unfolding probability of -the protein molecules in the polymer is then calculated for that -tension, and whether an unfolding event occurs is determined according -to a Monte Carlo method. The simulation was implemented in -C\footnote{Source code available at - \url{http://www.physics.drexel.edu/~wking/sawsim/}}. - -\subsection{Generating force curves} -\label{sec:sawsim:methods-tension} - -% introduce domains and groups. -The fundamental abstraction of the simulation is the ``domain'', which -represents a discrete chunk of the flexible chain between the -substrate and the cantilever holder. Each of these domains is -assigned a particular state; for example, the domain representing the -cantilever is assigned to the ``cantilever'' state, and the domains -representing protein molecules are assigned to either the ``folded'' -or the ``unfolded'' state. When balancing the tension along the -chain, we assume that the spatial order of domains along the chain is -irrelevant\citep{li00}, and therefore, the domains can be rearranged -and grouped by state. To determine the tension in the chain and the -amount of cantilever bending when $n$ states are populated, a system -of $n+1$ equations with $n+1$ unknowns must be solved -\begin{align} - F_i(x_i) &= F_t \label{eq:sawsim:tension-balance} \\ - \sum_i x_i &= x_t \;, \label{eq:sawsim:x-total} -\end{align} -where $F$ are tensions, $x$ are extensions, and the subscripts $i$ and -$t$ represent a particular state group and the total chain -respectively (\cref{fig:sawsim:schematic}). From this $F(x_t)$ may be -computed using any multi-dimensional root-finding algorithm. - -% introduce particular models, and mention parameter aggregation -Inside this framework, we choose a particular extension model -$F_i(x_i)$ for each domain state. Cantilever elasticity is described -by Hooke's law, which gives -\begin{equation} - F = \kappa_c x_c \;, \label{eq:sawsim:hooke} -\end{equation} -where $\kappa_c$ is the bending spring constant and $x_c$ is the -deflection of the cantilever (\cref{fig:sawsim:schematic}). Unfolded domains -are modeled as a Worm-Like Chain (WLC)\citep{marko95,bustamante94}, in -which the tension $F$ is related to its extension (end-to-end -distance) $x_u$ by -\begin{equation} - F = \frac{k_B T}{p_u} - \p[{ \frac{1}{4}\p({ \frac{1}{(1-x_u/L_u)^2} - 1 }) - + \frac{x_u}{L_u} }] \;, - \label{eq:sawsim:wlc} -\end{equation} -where $p_u$ is the persistence length and $L_u=N_uL_{u1}$ is the total -contour length of the unfolded chain. The chain of $N_f$ folded -domains is modeled as a string free to assume any extension up to some -fixed contour length $L_f=N_fL_{f1}$ -\begin{equation} - F = \begin{cases} - 0 & \text{if $x_fL_f$} \;, - \end{cases} \label{eq:sawsim:piston} -\end{equation} -where $L_{f1}$ is the separation of the two linking points of a folded -domain, and $x_f$ is the end-to-end length of the chain of folded -domains. In this model, any non-zero tension will fully extend these -folded domains. As discussed in \cref{sec:sawsim:results-folded-tension}, -the contribution of the folded domains to the elastic behavior of the -polymer-cantilever system is relatively insignificant. - -% address assumptions & caveats -In the simulation, the protein polymer is assumed to be stretched in -the direction perpendicular to the surface, which is a good -approximation in most experimental situations, because the unfolded -length of a protein molecule is much larger than that of the folded -form. Therefore, after one molecule unfolds, the polymer becomes much -longer and the angle between the polymer and the surface approaches -$90$ degrees\citep{carrion-vazquez00}. The joints between domain -groups are assumed to lie along a line between the surface tether -point and the position of the tip (\cref{eq:sawsim:x-total}). The effects of -this assumption are also minimized due to greater length of the -unfolded domain. Finally, the interactions between different parts of -the polymer and between the chain and the surface (except at the -tethering points) are not considered. This is reasonable since these -interactions should not make substantial contributions to the force -curve at the force levels of interest, where the polymer is in a -relatively extended conformation. - -% introduce constant velocity and walk through explicit example pull -Consider an experiment of pulling a polymer with $N$ identical protein -molecules at a constant speed. At the start of an experiment, the -chain is unstretched ($x_t=0$), which means all the domains are -unstretched, the cantilever is undeflected, and the tip is in contact -with the surface. There is one domain in the cantilever state, $N$ in -the folded state, and none in the unfolded state. As the surface -moves away from the tip at a constant speed $v$, the chain becomes -more extended (\cref{fig:sawsim:schematic}), such that -\begin{equation} - x_t = \sum_i x_i = vt \label{eq:sawsim:const-v} \;. -\end{equation} -The simulation assumes that the pulling takes discrete steps in space -and treats $x_t$ as constant over the duration of one time step -$\Delta t$. Because of the adaptive time steps discussed in -\cref{sec:sawsim:methods-timesteps}, the space steps $\Delta x_t = v\Delta t$ -may have different sizes, but each step will be ``small''. At each -step, the total extension is calculated using \cref{eq:sawsim:const-v}, and -the tension $F(x_t=vt)$ is determined by numerically solving -\cref{eq:sawsim:tension-balance,eq:sawsim:x-total} using the models -\cref{eq:sawsim:hooke,eq:sawsim:wlc,eq:sawsim:piston} for known values of the parameters in -the various states $(N_u, N_f, v, \kappa_c, L_{f1}, L_{u1}, p_u)$. -When one of the molecules in the polymer unfolds -(\cref{sec:sawsim:methods-unfolding}), there will be one domain in the -unfolded state and $N-1$ in the folded state. In the next step, a -newly balanced tension between the cantilever and the polymer is -determined by solving for $F(x_t)$ as discussed above, but with the -total extension $x_t$ incremented by $v\Delta t$ and the new unfolded -contour length $L_{u1}$ and folded contour length $(N-1)L_{f1}$. The -sudden lengthening of the polymer chain results in a corresponding -abrupt drop in the force, leading to the formation of one sawtooth in -the force curve. As the pulling continues and more domains unfold, -force curves with a series of sawteeth are generated -(\cref{fig:sawsim:sim-sawtooth}). - -The tension calculation assumes an equilibrated chain, so -consideration must be given to the chain's relaxation time, which -should be short compared to the loading timescale. The relaxation -time for a WLC is given by -\begin{equation} - \tau \approx \eta \frac{k_BT p}{F^2} -% < \eta \frac{k_BT p}{(k_BTx/pL)^2} = -% Note: < because F > k_BTx/pL -% < \frac{\eta p^3 L^2}{k_BT x^2} -% < \frac{\eta p^2 L}{k_BT} % for x/L > \sqrt{p/L} - \;, \label{eq:sawsim:tau-wlc} -\end{equation} -where $\eta$ is the dynamic viscosity, $F$ is the tension, and $p$ is -the persistence length\citep{evans99}. For forces greater than -$1\U{pN}$, with $\eta_\text{water}/k_BT=2.45\E{-10}\U{s/nm$^3$}$, -$\tau<2\U{ns}$ for the protein polymer used in the simulation. -% python -c 'print 2.45e-10*(1e9)**3 * (1.38e-23*300)**2 * 0.4e-9 / (1e-12)**2' -% 1.68...e-09 -Therefore, the polymer chain is equilibrated almost instantaneously -within a time step, which is on the order of tens of microseconds. -The relaxation time of the cantilever can be determined by measuring -the cantilever deflection induced by liquid motion and fitting the -time dependence of the deflection to an exponential -function\cite{jones05}. For a $200\U{$\mu$m}$ rectangular cantilever -with a bending spring constant of $20\U{pN/nm}$, the measured -relaxation time in water is $\sim50\U{$\mu$/s}$ (data not shown). -This relatively large relaxation time constant makes the cantilever -act as a low-pass filter and also causes a lag in the force -measurement. - -\subsection{Unfolding protein molecules by force} -\label{sec:sawsim:methods-unfolding} - -% introduce Bell, probability calculations, and MC comparison -According to the theory developed by \citet{bell78} and extended by -\citet{evans99}, an external stretching force $F$ increases the -unfolding rate constant of a protein molecule -\begin{equation} - k_u = k_{u0} \exp\p({\frac{F\Delta x_u}{k_B T}}) \;, \label{eq:sawsim:bell} -\end{equation} -where $k_{u0}$ is the unfolding rate in the absence of an external -force, and $\Delta x_u$ is the distance between the native state and -the transition state along the pulling direction. The probability for -a protein molecule to unfold under an applied force is -\begin{equation} - P_1 = k_u \Delta t \;, \label{eq:sawsim:prob-one} -\end{equation} -where $\Delta t$ is the time duration for each pulling step, over -which $F$ is constant. This expression is accurate for $P_1 \ll 1$. -From the binomial distribution, the probability of at least one of a -group of $N_f$ identical domains to unfold in a given time step is -\begin{equation} - P = 1 - (1-P_1)^{N_f} \approx N_fP_1 \;, \label{eq:sawsim:prob-n} -\end{equation} -where the approximation is valid when $N_fP_1 \ll 1$. To determine if -an unfolding event occurs in a particular time step, the probability -calculated using \cref{eq:sawsim:prob-n} is compared with a randomly -generated number uniformly distributed between $0$ and $1$. If $P$ -is bigger than the random number, a domain unfolds, changing the -population of each tension state, and a new balance between the -polymer and the cantilever is determined. If no unfolding event -occurs the pulling continues and the unfolding probability is -calculated again in the next step at a higher force. When all the -molecules in the polymer have unfolded, the pulling continues until a -pre-determined force level is reached, where the polymer is assumed to -detach from one of the tethering surfaces. The cantilever deflection -becomes zero after this point. - -% address unfolding models -Although the Bell model (\cref{eq:sawsim:bell}) is the most widely used -unfolding model due to its simplicity and its applicability to various -biopolymers\citep{rief98}, other theoretical models have been proposed -to interpret mechanical unfolding data. For example, -\citet{schlierf06} used the mechanical unfolding data of the protein -ddFLN4 to demonstrate that Kramers' diffusion model fit the measured -unfolding force data better than the Bell model for proteins with -broad free energy barriers. For proteins with relatively narrow -folded and transition states, the Bell model provides a good -approximation. - -\subsection{Choosing the simulation time steps} -\label{sec:sawsim:methods-timesteps} - -The demands on the time step vary throughout a simulated pull due to -the non-linear elasticity of the polymer. Within a specified time -duration (or pulling distance), the force change is small at low force -levels and large at high force levels. To be efficient, the -simulation algorithm adapts the time step to keep the time steps large -where large time steps have little effect, while shrinking the time -step where smaller steps are necessary. - -Within each time step, the total chain extension $x_t$ is treated as a -constant and a force balance is reached very quickly among the various -domains (see \cref{sec:sawsim:methods-tension} for equilibration timescales). -This force is used to determine the unfolding probability -(\cref{eq:sawsim:prob-one,eq:sawsim:prob-n}), which determines the domain state -populations in the next time step. Therefore, the chain tension must -not change appreciably over the course of the time step ($\Delta F < -1\U{pN}$), and the unfolding probability is only calculated once for -the entire step. The time step must also be short enough that the -probability of unfolding in a single time step is low ($P<10^{-3}$). -Besides ensuring that the approximations made in -\cref{eq:sawsim:prob-one,eq:sawsim:prob-n} are valid, this restriction makes -time steps which should have multiple unfoldings in a single time step -highly unlikely. Experimentally measured unfolding are temporally -supered, because the unfolding transition is characterized by -multiple, Markovian attempts over a large energy barrier, where the -probability of crossing the barrier in a single attempt is very low. -A successful attempt quickly extends the chain contour length, -reducing the tension, dramatically reducing the likelihood of a second -escape in that time step. The time step used is recalculated for each -step so that both of these criteria are satisfied. - -\section{Results and Discussion} -\label{sec:sawsim:results} - -\subsection{Force curves generated by simulation} -\label{sec:sawsim:results-force_curves} - -\Cref{fig:sawsim:sim-sawtooth} shows three simulated force curves from -pulling a polymer composed of eight identical protein molecules using -parameters from typical experimental settings. The order of the peaks -in the force curves reflects the temporal sequence of the unfolding -events instead of the positions of the protein molecules in the -polymer\cite{li00}. As observed experimentally -(\cref{fig:sawsim:expt-sawtooth}), the forces at which identical protein -molecules unfold fluctuate, revealing the stochastic nature of protein -unfolding since no instrumental noise is included in the simulation. -\Cref{fig:sawsim:sim-hist} shows the distribution of the unfolding forces, -\ie, the highest force in each peak (except the last peak in a force -curve), from a total of $400$ force curves ($3200$ force values). The -unfolding forces have an average of $281\U{pN}$ with a standard -deviation of $25\U{pN}$. -% DONE: discuss noise? no. - -\begin{figure} -\vspace{-1in} -\begin{center} -\subfloat[][]{\includegraphics{sawsim/sim-sawtooth}\label{fig:sawsim:sim-sawtooth}% -}\\ -\subfloat[][]{\includegraphics{sawsim/sim-hist}\label{fig:sawsim:sim-hist}% -} -\caption{(a) Three simulated force curves from pulling a polymer of - eight identical protein molecules. The simulation was carried out - using the parameters: pulling speed $v=1\U{$\mu$m/s}$, cantilever - spring constant $\kappa_c=50\U{pN/nm}$, temperature $T=300\U{K}$, - persistence length of unfolded proteins $p_u=0.40\U{nm}$, $\Delta - x_u=0.225\U{nm}$, and $k_{u0}=5\E{-5}\U{s$^{-1}$}$. The contour - length between the two linking point on a protein molecule is - $L_{f1}=3.7\U{nm}$ in the folded form and $L_{u1}=28.1\U{nm}$ in the - unfolded form. These parameters are those of ubiquitin molecules - connected through the N-C termini\citep{chyan04,carrion-vazquez03}. - Detachment from the tip or substrate is assumed to occur at a force - of $400\U{pN}$. In experiments, detachments have been observed to - occur at a variety of forces. (b) The distribution of the unfolding - forces from $400$ simulated force curves ($3200$ data points) such - as that shown in (a). The frequency is normalized by the total - number of points, \ie, the height of each bin is equal to the number - of data points in that bin divided by the total number of data - points ($3200$, for this histogram).\label{fig:sawsim:sim-all}} -\end{center} -\end{figure} - -\subsection{Dependence of the unfolding force on the unfolding order -and polymer length} -\label{sec:sawsim:results-scaffold} - -Analysis of the mechanical unfolding data is complicated by the -dependence of the average unfolding force on the unfolding order due -to the serial linkage of the molecules. Under an external stretching -force $F$, the probability of some domain unfolding in a polymer with -$N_f$ folded domains is $N_fP_1$ (\cref{eq:sawsim:prob-n}), which is higher -than the unfolding probability for a single molecule $P_1$. -Consequently, the average unfolding force is lower for the earlier -unfolding events when $N_f$ is larger, and the force should increase -as more and more molecules become unfolded. However, there is a -competing factor that opposes this trend. As the protein molecules -unfold, the chain becomes softer and the force loading rate becomes -lower when the pulling speed is constant, leading to a decrease in the -unfolding force. The dependence of the average unfolding force on the -unfolding order is the result of these two opposing effects. -\Cref{fig:sawsim:order-dep} shows the dependence of the average unfolding -force on the unfolding force peak order (the temporal order of -unfolding events) for four polymers with $4$, $8$, $12$, and $16$ -identical protein molecules. The effect of polymer chain softening -dominates the initial unfolding events, and the average unfolding -force decreases as more molecules unfold. After several molecules -have unfolded, the softening for each additional unfolding event -becomes less significant, the change in unfolding probability becomes -dominant, and the unfolding force increases upon each subsequent -unfolding event\citep{zinober02}. - -We validate this explanation by calculating the unfolding force -probability distribution's depending on the two competing factors. -The rate of unfolding events with respect to force is -\begin{align} - r_{uF} &= -\deriv{F}{N_f} - = -\frac{\dd N_f/\dd t}{\dd F/\dd t} - = \frac{N_f k_u}{\kappa v} \\ - &= \frac{N_f k_{u0}}{\kappa v}\exp\p({\frac{F\Delta x_u}{k_B T}}) - = \frac{1}{\rho}\exp\p({\frac{F-\alpha}{\rho}}) \;, -\end{align} -where $N_f$ is the number of folded domain, -$\kappa=(1/\kappa_c+N_u/\kappa_\text{WLC})^{-1}$ is the spring -constant of the cantilever-polymer system ($\kappa_\text{WLC}$ is the -effective spring constant of one unfolded domain, assumed constant for -a particular polymer/cantilever combination), $\kappa v$ is the force -loading rate, and $k_u$ is the unfolding rate constant -(\cref{eq:sawsim:bell}). In the last expression, $\rho\equiv k_BT/\Delta -x_u$, and $\alpha\equiv-\rho\ln(N_fk_{u0}\rho/\kappa v)$. The event -probability density for events with an exponentially increasing -likelihood function follows the Gumbel (minimum) probability -density\cite{NIST:gumbel}, with $\rho$ and $\alpha$ being the scale -and location parameters, respectively -\begin{equation} - \mathcal{P}(F) = \frac{1}{\rho} \exp\p[{\frac{F-\alpha}{\rho} - -\exp\p({\frac{F-\alpha}{\rho}}) - }] \;. \label{eq:sawsim:gumbel} -\end{equation} -The distribution has a mean $\avg{F}=\alpha-\gamma_e\rho$ and a -variance $\sigma^2=\pi^2\rho^2/6$, where $\gamma_e=0.577\ldots$ is the -Euler-Mascheroni constant. Therefore, the unfolding force -distribution has a variance $\sigma^2 = (\pi k_BT/\Delta x_u)^2/6$, -and and average -\begin{equation} - \avg{F(i)} = \frac{k_BT}{\Delta x_u} - \p[{-\ln\p({\frac{N_fk_{u0}k_BT}{\kappa v\Delta x_u}}) - -\gamma_e}] \;, \label{eq:sawsim:order-dep} -\end{equation} -where $N_f$ and $\kappa$ depend on the domain index $i=N_u$. Curves based -on this formula fit the simulated data remarkably well considering the -effective WLC stiffness $\kappa_\text{WLC}$ is the only fitted -parameter, and that the actual WLC stiffness is not constant, as we -have assumed here, but a non-linear function of $F$. - -From \cref{fig:sawsim:order-dep}, it can be seen that the proper way to -process data from mechanical unfolding experiments is to group the -curves according to the length of the polymer and to perform -statistical analysis separately for peaks with the same unfolding -order. However, in most experiments, the tethering of the polymer to -the AFM tip is by nonspecific adsorption; as a result, the polymers -being stretched between the tip and the substrate have various -lengths. In addition, the interactions between the tip and the -surface often cause irregular features in the beginning of the force -curve (\cref{fig:sawsim:expt-sawtooth}), making the identification of the -first peak uncertain. Furthermore, it is often difficult to acquire a -large amount of data in single molecules experiments. These -difficulties make the aforementioned data analysis approach unfeasible -for many mechanical unfolding experiments. As a result, the values of -all force peaks from polymers of different lengths are often pooled -together for statistical analysis. To assess the errors caused by -such pooling, simulation data were analyzed using different pooling -methods and the results were compared. \Cref{fig:sawsim:sim-hist} shows -that, for a polymer with eight protein molecules, the average -unfolding force is $281\U{pN}$ with a standard deviation of $25\U{pN}$ -when all data is pooled. If only the first peaks in the force curves -are analyzed, the average force is $279\U{pN}$ with a standard -deviation of $22\U{pN}$. While for the fourth and eighth peaks, the -average force are $275\U{pN}$ and $300\U{pN}$, respectively, and the -standard deviations are $23\U{pN}$ and $25\U{pN}$, respectively. As -expected from the Gumbel distribution, the width of the unfolding -force distribution (insets in \cref{fig:sawsim:order-dep}) is only weakly -effected by unfolding order, but the average unfolding force can be -quite different for the same protein because of the differences in -unfolding order and polymer length. - -\begin{figure} -\begin{center} -\includegraphics{sawsim/order-dep} -\caption{The dependence of the unfolding force on the temporal - unfolding order for four polymers with $4$, $8$, $12$, and $16$ - molecules of identical proteins. Each point in the figure is the - average of $400$ data points. The first point in each curve - represents the average of only the first peak in each of the $400$ - simulated force curves, the second point represents the average of - only the second peak, and so on. The solid lines are fits of - \cref{eq:sawsim:order-dep} to the simulated data, with best fit - $\kappa_\text{WLC}=203$, $207$, $161$, and $157\U{pN/nm}$, - respectively, for lengths $4$ through $16$. The insets show the - force distributions of the first, fourth, and eighth peaks, left to - right, for the polymer with eight protein molecules. The parameters - used for generating the data were the same as those used for - \cref{fig:sawsim:sim-sawtooth}, except the polymer length, and the - histograms in the insets were normalized in the same way as in - \cref{fig:sawsim:sim-hist}.\label{fig:sawsim:order-dep}} -\end{center} -\end{figure} - -\subsection{The effect of polymer inhomogeneity} -\label{sec:sawsim:results-folded-tension} - -The unfolded polypeptide chain has been shown to follow the WLC model -quite well, though other polymer models, such as the Freely-Jointed -Chain (FJC)\citep{verdier70}, can be used to fit the force-extension -relationship\citep{janshoff00}. A short chain of folded proteins, -however, cannot be described well by polymer models. Several studies -have used WLC and FJC models to fit the elastic properties of the -modular protein titin\citep{granzier97,linke98a}, -% TODO: check it really is folded domains \& bulk titin -but native titin contains hundreds of folded and unfolded domains -domains. For the short protein polymers common in mechanical -unfolding experiments, the cantilever dominates the elasticity of the -polymer-cantilever system before any protein molecules unfold. After -the first unfolding event occurs, the unfolded portion of the chain is -already longer and softer than the sum of all the remaining folded -domains, and dominates the elasticity of the whole chain. Therefore, -the details of the tension model chosen for the folded domains has -negligible effect on the unfolding forces, which was also suggested by -\citet{staple08}. Force curves simulated using different models to -describe the folded domains yielded almost identical unfolding force -distributions (data not shown). - -\subsection{The effect of cantilever force constant} -\label{sec:sawsim:cantilever} - -In mechanical unfolding experiments, the ability to observe the -unfolding of a single protein molecule depends on the tension drop -after an unfolding event such that another molecule does not unfold -immediately. The magnitude of this drop is determined by many -factors, including the magnitude of the unfolding force, the contour -and persistence lengths of the protein polymer, the contour length -increase from unfolding, and the stiffness (force constant) of the -cantilever. Among these, the effect of the cantilever force constant -is particularly interesting because cantilevers with a wide range of -force constants are available. In addition different single molecule -manipulation techniques, such as the AFM and laser tweezers, differ -mainly in the range of the spring constants of their force -transducers. \Cref{fig:sawsim:kappa-sawteeth} shows the simulated force -curves from pulling an octamer of protein molecules using cantilevers -with different force constants, while other parameters are identical. -For this model protein, the appearance of the force curve does not -change much until the force constant of the cantilever reaches a -certain value ($\kappa_c=50\U{pN/nm}$). When $\kappa_c$ is lower than -this value, the individual unfolding events become less identifiable. -In order to observe individual unfolding events, the cantilever needs -to have a force constant high enough so that the bending at the -maximum force is small in comparison with the contour length increment -from the unfolding of a single molecule. \Cref{fig:sawsim:kappa-sawteeth} -also shows that the back side of the force peaks becomes more tilted -as the cantilever becomes softer. This is due to the fact that the -extension (end-to-end distance) of the protein polymer has a large -sudden increase as the tension rebalances after an unfolding event. - -It should also be mentioned that the contour length increment from -each unfolding event is not equal to the distance between adjacent -peaks in the force curve because the chain is never fully stretched. -This contour length increase can only be obtained by fitting the curve -to WLC or other polymer models (\cref{fig:sawsim:expt-sawtooth}). - -\begin{figure} -\begin{center} -\includegraphics{sawsim/kappa-sawteeth} -\caption{Simulated force curves obtained from pulling a polymer with - eight protein molecules using cantilevers with different force - constants $\kappa_c$. Parameters used in generating these curves - are the same as those used in \cref{fig:sawsim:sim-all}, except the - cantilever force constant.\label{fig:sawsim:kappa-sawteeth}} -\end{center} -\end{figure} - -\subsection{Determination of $\Delta x_u$ and $k_{u0}$} -\label{sec:sawsim:results-fitting} - -The zero-force unfolding rate $k_{u0}$ and the distance $\Delta x_u$ -from the native state to the transition state are the two kinetic -parameters obtainable for mechanical unfolding experiments by matching -the simulated data with measured results. \cref{fig:sawsim:v-dep} shows the -dependence of the unfolding force on the pulling speed for different -values of $k_{u0}$ and $\Delta x_u$. As expected, the unfolding force -increases linearly with the pulling speed in the linear-log -plot\citep{evans99}. While the magnitude of the unfolding forces is -affected by both $k_{u0}$ and $\Delta x_u$, the slope of speed -dependence is primarily determined by $\Delta x_u$. -\Cref{fig:sawsim:width-v-dep} shows that the width of the unfolding force -distribution is very sensitive to $\Delta x_u$, as expected from the -Gumbel distribution discussed in \cref{sec:sawsim:results-scaffold}. To -obtain the values of $k_{u0}$ and $\Delta x_u$ for the protein, the -pulling speed dependence and the distribution of the unfolding forces -from simulation, such as those shown in \cref{fig:sawsim:v-dep} and the -insets of \cref{fig:sawsim:width-v-dep}, are compared with the experimentally -measured results. The values of $k_{u0}$ and $\Delta x_u$ that -provide the best match are designated as the parameters describing the -protein under study. Since $k_{u0}$ and $\Delta x_u$ affect the -unfolding forces differently, the values of both parameters can be -determined simultaneously. The data used in plotting -\cref{fig:sawsim:all-v-dep} includes all force peaks from the simulated force -curves because most experimental data is analyzed that way. - -In most published literature, determination of the values of $k_{u0}$ -and $\Delta x_u$ was mostly done by carrying out simulations using a -handful of possible unfolding parameters and selected the best fit by -eye% -%\citep{us,CV1999} -. This approach does not allow estimation of uncertainties in the -fitting parameters, as shown by \citet{best02}. A more rigorous -approach involves quantifying the quality of fit between the -experimental and simulated force distributions, allowing the use of a -numerical minimization algorithm to pick the best fit parameters. We -use the Jensen-Shannon divergence\citep{sims09,lin91}, a measure of -the similarity between two probability distributions. -\begin{equation} - D_\text{JS}(p_e, p_s) - = D_\text{KL}(p_e, p_m) + D_\text{KL}(p_s, p_m) \;, \label{eq:sawsim:D_JS} -\end{equation} -where $p_e(i)$ and $p_s(i)$ are the the values of the $i^\text{th}$ -bin in the experimental and simulated unfolding force histograms, -respectively. $D_\text{KL}$ is the Kullback-Leibler divergence -\begin{equation} - D_\text{KL}(p_p,p_q) - = \sum_i p_p(i) \log_2\p({\frac{p_p(i)}{p_q(i)}}) \;, \label{eq:sawsim:D_KL} -\end{equation} -where the sum is over all unfolding force histogram bins. $p_m$ is -the symmetrized probability distribution -\begin{equation} - p_m(i) \equiv [p_e(i)+p_s(i)]/2 \;. -\end{equation} -% DONE: Mention inter-histogram normalization? no. -% For experiments carried out over a series of pulling velocities, we -% simply sum residuals computed for each velocity, although it would -% also be reasonable to weight this sum according to the number of -% experimental unfolding events recorded for each velocity. -% DONE: mention $D_\text{JS}$ features to explains selection over $\chi^2$? no. -\Cref{fig:sawsim:fit-space} shows the Jensen-Shannon divergence calculated -using \cref{eq:sawsim:D_JS} between an experimental data set and simulation -results obtained using a range of values of $k_{u0}$ and $\Delta x_u$. -There is an order of magnitude range of $k_{u0}$ that produce -reasonable fits to experimental data (\cref{fig:sawsim:fit-space}), which is -consistent with the results \citet{best02} obtained using a chi-square -test. The values of $k_{u0}$ and $\Delta x_u$ can be determined to -higher precision by using both the pulling speed dependent data and -the unfolding force distribution, as well as any relevant information -about the protein from other sources. - -\begin{figure} -\begin{center} -\subfloat[][]{\includegraphics{sawsim/v-dep}\label{fig:sawsim:v-dep}% -} \\ -\subfloat[][]{\includegraphics{sawsim/v-dep-sd}\label{fig:sawsim:width-v-dep}% -} -\caption{(a) The dependence of the unfolding forces on the pulling - speed for three different model protein molecules characterized by - the parameters $k_{u0}$ and $\Delta x_u$. The polymer length is - eight molecules, and each symbol is the average of $3200$ data - points. (b) The dependence of standard deviation of the unfolding - force distribution on the pulling speed for the simulation data - shown in (a), using the same symbols. The insets show the force - distribution histograms for the three proteins at the pulling speed - of $1\U{$\mu$m/s}$. The left, middle and right histograms are for - the proteins represented by the top, middle, and bottom lines in (a), - respectively.\label{fig:sawsim:all-v-dep}} -\end{center} -\end{figure} - -\begin{figure} -\begin{center} -\includegraphics{sawsim/fit-space} -\caption{Fit quality between an experimental data set and simulated - data sets obtained using various values of unfolding rate parameters - $k_{u0}$ and $\Delta x_u$. The experimental data are from octameric - ubiquitin pulled at $1\U{$\mu$m/s}$\citep{chyan04}, and the other - model parameters are the same as those in \cref{fig:sawsim:sim-all}. The - best fit parameters are $\Delta x_u=0.17\U{nm}$ and - $k_{u0}=1.2\E{-2}\U{s$^{-1}$}$. The simulation histograms were - built from $400$ pulls at for each parameter pair. The color scale - shown on the right is $log_{10}(D_\text{JS})$. -\label{fig:sawsim:fit-space}} -\end{center} -\end{figure} - -\section{Conclusions} -\label{sec:sawsim:conclusions} - -We have described the method of performing Monte Carlo simulations -based on a simple two-state model for the mechanical unfolding of -protein molecules and discussed the complications involved in the -simulation procedure. In addition to the extraction of kinetic -properties of the protein from mechanical unfolding data, such -simulations can help to elucidate the effects of various experimental -parameters on the appearance of force curves and to estimate the -errors associated with data pooling. To date, the force-induced -unfolding approach has been used to investigate several different -types of proteins. As the technique is used to study a wider range of -proteins, this simple simulation method will be useful for data -analysis, experimental design, and artifact identification. - -%% Ispell local dictionary: - -%% Science words -% LocalWords: polysaccharides polypeptide biomembrane biopolymers photodetector -% LocalWords: piconewtons nanonewtons nanometers vt timesteps timestep -% LocalWords: ubiquitin titin sawteeth octamer octameric -% LocalWords: unstretched undeflected unfoldings underdamped -% LocalWords: rebalances nonspecific equilibrated - -%% Abbreviations: -% LocalWords: Tel AFM afm WLC wlc FJC fjc PACS MSC pN nm MC ddFLN - -%% Names and related -% LocalWords: Hookean hooke Kramers kramers Mascheroni Kullback Leibler -% LocalWords: Markovian msu Meihong Su gyang Guoliang CSIRO Drexel -% LocalWords: bustamante forde smith merkel carrion vazquez rief levy fernandez -% LocalWords: chyan lu klimov li zinober jollymore marko sims evans schlierf -% LocalWords: janshoff granzier linke verdier dicola lin - -%% LaTeX and related -% LocalWords: documentclass elsarticle pt tnoteref tnotetext fnref fntext -% LocalWords: corref cortext ead form url cor TODO - -%% Reference abbreviations -% LocalWords: sec fig eq expt sim dep const prob hist hists +\input{sawsim/introduction} +\input{sawsim/methods} +\input{sawsim/discussion} +\input{sawsim/conclusions} diff --git a/tex/src/sawsim/methods.tex b/tex/src/sawsim/methods.tex new file mode 100644 index 0000000..ad0243f --- /dev/null +++ b/tex/src/sawsim/methods.tex @@ -0,0 +1,240 @@ +\section{Methods} +\label{sec:sawsim:methods} + +% simulation overview +In simulating the mechanical unfolding process, a force curve is +generated by calculating the amount of the cantilever bending as the +substrate surface moves away from the tip. The cantilever bending is +obtained by balancing the tension in the protein polymer and the +Hookean force of the bent cantilever. The unfolding probability of +the protein molecules in the polymer is then calculated for that +tension, and whether an unfolding event occurs is determined according +to a Monte Carlo method. The simulation was implemented in +C\footnote{Source code available at + \url{http://www.physics.drexel.edu/~wking/sawsim/}}. + +\subsection{Generating force curves} +\label{sec:sawsim:methods-tension} + +% introduce domains and groups. +The fundamental abstraction of the simulation is the ``domain'', which +represents a discrete chunk of the flexible chain between the +substrate and the cantilever holder. Each of these domains is +assigned a particular state; for example, the domain representing the +cantilever is assigned to the ``cantilever'' state, and the domains +representing protein molecules are assigned to either the ``folded'' +or the ``unfolded'' state. When balancing the tension along the +chain, we assume that the spatial order of domains along the chain is +irrelevant\citep{li00}, and therefore, the domains can be rearranged +and grouped by state. To determine the tension in the chain and the +amount of cantilever bending when $n$ states are populated, a system +of $n+1$ equations with $n+1$ unknowns must be solved +\begin{align} + F_i(x_i) &= F_t \label{eq:sawsim:tension-balance} \\ + \sum_i x_i &= x_t \;, \label{eq:sawsim:x-total} +\end{align} +where $F$ are tensions, $x$ are extensions, and the subscripts $i$ and +$t$ represent a particular state group and the total chain +respectively (\cref{fig:schematic}). From this $F(x_t)$ may be +computed using any multi-dimensional root-finding algorithm. + +% introduce particular models, and mention parameter aggregation +Inside this framework, we choose a particular extension model +$F_i(x_i)$ for each domain state. Cantilever elasticity is described +by Hooke's law, which gives +\begin{equation} + F = \kappa_c x_c \;, \label{eq:sawsim:hooke} +\end{equation} +where $\kappa_c$ is the bending spring constant and $x_c$ is the +deflection of the cantilever (\cref{fig:schematic}). Unfolded domains +are modeled as a Worm-Like Chain (WLC)\citep{marko95,bustamante94}, in +which the tension $F$ is related to its extension (end-to-end +distance) $x_u$ by +\begin{equation} + F = \frac{k_B T}{p_u} + \p[{ \frac{1}{4}\p({ \frac{1}{(1-x_u/L_u)^2} - 1 }) + + \frac{x_u}{L_u} }] \;, + \label{eq:sawsim:wlc} +\end{equation} +where $p_u$ is the persistence length and $L_u=N_uL_{u1}$ is the total +contour length of the unfolded chain. The chain of $N_f$ folded +domains is modeled as a string free to assume any extension up to some +fixed contour length $L_f=N_fL_{f1}$ +\begin{equation} + F = \begin{cases} + 0 & \text{if $x_fL_f$} \;, + \end{cases} \label{eq:sawsim:piston} +\end{equation} +where $L_{f1}$ is the separation of the two linking points of a folded +domain, and $x_f$ is the end-to-end length of the chain of folded +domains. In this model, any non-zero tension will fully extend these +folded domains. As discussed in \cref{sec:sawsim:results-folded-tension}, +the contribution of the folded domains to the elastic behavior of the +polymer-cantilever system is relatively insignificant. + +% address assumptions & caveats +In the simulation, the protein polymer is assumed to be stretched in +the direction perpendicular to the surface, which is a good +approximation in most experimental situations, because the unfolded +length of a protein molecule is much larger than that of the folded +form. Therefore, after one molecule unfolds, the polymer becomes much +longer and the angle between the polymer and the surface approaches +$90$ degrees\citep{carrion-vazquez00}. The joints between domain +groups are assumed to lie along a line between the surface tether +point and the position of the tip (\cref{eq:sawsim:x-total}). The effects of +this assumption are also minimized due to greater length of the +unfolded domain. Finally, the interactions between different parts of +the polymer and between the chain and the surface (except at the +tethering points) are not considered. This is reasonable since these +interactions should not make substantial contributions to the force +curve at the force levels of interest, where the polymer is in a +relatively extended conformation. + +% introduce constant velocity and walk through explicit example pull +Consider an experiment of pulling a polymer with $N$ identical protein +molecules at a constant speed. At the start of an experiment, the +chain is unstretched ($x_t=0$), which means all the domains are +unstretched, the cantilever is undeflected, and the tip is in contact +with the surface. There is one domain in the cantilever state, $N$ in +the folded state, and none in the unfolded state. As the surface +moves away from the tip at a constant speed $v$, the chain becomes +more extended (\cref{fig:schematic}), such that +\begin{equation} + x_t = \sum_i x_i = vt \label{eq:sawsim:const-v} \;. +\end{equation} +The simulation assumes that the pulling takes discrete steps in space +and treats $x_t$ as constant over the duration of one time step +$\Delta t$. Because of the adaptive time steps discussed in +\cref{sec:sawsim:methods-timesteps}, the space steps $\Delta x_t = v\Delta t$ +may have different sizes, but each step will be ``small''. At each +step, the total extension is calculated using \cref{eq:sawsim:const-v}, and +the tension $F(x_t=vt)$ is determined by numerically solving +\cref{eq:sawsim:tension-balance,eq:sawsim:x-total} using the models +\cref{eq:sawsim:hooke,eq:sawsim:wlc,eq:sawsim:piston} for known values of the parameters in +the various states $(N_u, N_f, v, \kappa_c, L_{f1}, L_{u1}, p_u)$. +When one of the molecules in the polymer unfolds +(\cref{sec:sawsim:methods-unfolding}), there will be one domain in the +unfolded state and $N-1$ in the folded state. In the next step, a +newly balanced tension between the cantilever and the polymer is +determined by solving for $F(x_t)$ as discussed above, but with the +total extension $x_t$ incremented by $v\Delta t$ and the new unfolded +contour length $L_{u1}$ and folded contour length $(N-1)L_{f1}$. The +sudden lengthening of the polymer chain results in a corresponding +abrupt drop in the force, leading to the formation of one sawtooth in +the force curve. As the pulling continues and more domains unfold, +force curves with a series of sawteeth are generated +(\cref{fig:sawsim:sim-sawtooth}). + +The tension calculation assumes an equilibrated chain, so +consideration must be given to the chain's relaxation time, which +should be short compared to the loading timescale. The relaxation +time for a WLC is given by +\begin{equation} + \tau \approx \eta \frac{k_BT p}{F^2} +% < \eta \frac{k_BT p}{(k_BTx/pL)^2} = +% Note: < because F > k_BTx/pL +% < \frac{\eta p^3 L^2}{k_BT x^2} +% < \frac{\eta p^2 L}{k_BT} % for x/L > \sqrt{p/L} + \;, \label{eq:sawsim:tau-wlc} +\end{equation} +where $\eta$ is the dynamic viscosity, $F$ is the tension, and $p$ is +the persistence length\citep{evans99}. For forces greater than +$1\U{pN}$, with $\eta_\text{water}/k_BT=2.45\E{-10}\U{s/nm$^3$}$, +$\tau<2\U{ns}$ for the protein polymer used in the simulation. +% python -c 'print 2.45e-10*(1e9)**3 * (1.38e-23*300)**2 * 0.4e-9 / (1e-12)**2' +% 1.68...e-09 +Therefore, the polymer chain is equilibrated almost instantaneously +within a time step, which is on the order of tens of microseconds. +The relaxation time of the cantilever can be determined by measuring +the cantilever deflection induced by liquid motion and fitting the +time dependence of the deflection to an exponential +function\cite{jones05}. For a $200\U{$\mu$m}$ rectangular cantilever +with a bending spring constant of $20\U{pN/nm}$, the measured +relaxation time in water is $\sim50\U{$\mu$/s}$ (data not shown). +This relatively large relaxation time constant makes the cantilever +act as a low-pass filter and also causes a lag in the force +measurement. + +\subsection{Unfolding protein molecules by force} +\label{sec:sawsim:methods-unfolding} + +% introduce Bell, probability calculations, and MC comparison +According to the theory developed by \citet{bell78} and extended by +\citet{evans99}, an external stretching force $F$ increases the +unfolding rate constant of a protein molecule +\begin{equation} + k_u = k_{u0} \exp\p({\frac{F\Delta x_u}{k_B T}}) \;, \label{eq:sawsim:bell} +\end{equation} +where $k_{u0}$ is the unfolding rate in the absence of an external +force, and $\Delta x_u$ is the distance between the native state and +the transition state along the pulling direction. The probability for +a protein molecule to unfold under an applied force is +\begin{equation} + P_1 = k_u \Delta t \;, \label{eq:sawsim:prob-one} +\end{equation} +where $\Delta t$ is the time duration for each pulling step, over +which $F$ is constant. This expression is accurate for $P_1 \ll 1$. +From the binomial distribution, the probability of at least one of a +group of $N_f$ identical domains to unfold in a given time step is +\begin{equation} + P = 1 - (1-P_1)^{N_f} \approx N_fP_1 \;, \label{eq:sawsim:prob-n} +\end{equation} +where the approximation is valid when $N_fP_1 \ll 1$. To determine if +an unfolding event occurs in a particular time step, the probability +calculated using \cref{eq:sawsim:prob-n} is compared with a randomly +generated number uniformly distributed between $0$ and $1$. If $P$ +is bigger than the random number, a domain unfolds, changing the +population of each tension state, and a new balance between the +polymer and the cantilever is determined. If no unfolding event +occurs the pulling continues and the unfolding probability is +calculated again in the next step at a higher force. When all the +molecules in the polymer have unfolded, the pulling continues until a +pre-determined force level is reached, where the polymer is assumed to +detach from one of the tethering surfaces. The cantilever deflection +becomes zero after this point. + +% address unfolding models +Although the Bell model (\cref{eq:sawsim:bell}) is the most widely used +unfolding model due to its simplicity and its applicability to various +biopolymers\citep{rief98}, other theoretical models have been proposed +to interpret mechanical unfolding data. For example, +\citet{schlierf06} used the mechanical unfolding data of the protein +ddFLN4 to demonstrate that Kramers' diffusion model fit the measured +unfolding force data better than the Bell model for proteins with +broad free energy barriers. For proteins with relatively narrow +folded and transition states, the Bell model provides a good +approximation. + +\subsection{Choosing the simulation time steps} +\label{sec:sawsim:methods-timesteps} + +The demands on the time step vary throughout a simulated pull due to +the non-linear elasticity of the polymer. Within a specified time +duration (or pulling distance), the force change is small at low force +levels and large at high force levels. To be efficient, the +simulation algorithm adapts the time step to keep the time steps large +where large time steps have little effect, while shrinking the time +step where smaller steps are necessary. + +Within each time step, the total chain extension $x_t$ is treated as a +constant and a force balance is reached very quickly among the various +domains (see \cref{sec:sawsim:methods-tension} for equilibration timescales). +This force is used to determine the unfolding probability +(\cref{eq:sawsim:prob-one,eq:sawsim:prob-n}), which determines the domain state +populations in the next time step. Therefore, the chain tension must +not change appreciably over the course of the time step ($\Delta F < +1\U{pN}$), and the unfolding probability is only calculated once for +the entire step. The time step must also be short enough that the +probability of unfolding in a single time step is low ($P<10^{-3}$). +Besides ensuring that the approximations made in +\cref{eq:sawsim:prob-one,eq:sawsim:prob-n} are valid, this restriction makes +time steps which should have multiple unfoldings in a single time step +highly unlikely. Experimentally measured unfolding are temporally +supered, because the unfolding transition is characterized by +multiple, Markovian attempts over a large energy barrier, where the +probability of crossing the barrier in a single attempt is very low. +A successful attempt quickly extends the chain contour length, +reducing the tension, dramatically reducing the likelihood of a second +escape in that time step. The time step used is recalculated for each +step so that both of these criteria are satisfied.