Starting on version 0.6, with generalized discrete reactions.

[sawsim.git] / src / sawsim.nw
diff --git a/src/sawsim.nw b/src/sawsim.nw

index 1253c54fb99cde87c2527b3f08abae159dd3964a..0d3ae8ec11a90e1eeb5be4496c41525d96e24fe0 100644 (file)
--- a/src/sawsim.nw
+++ b/src/sawsim.nw
@@ -11,10 +11,15 @@
  \usepackage[breaklinks,colorlinks]{hyperref} % for internal links, \phantomsection
  % breaklinks breaks long links
  % colorlinks colors link text instead of boxing it
-\usepackage{amsmath}  % for \text{}, \xrightarrow[sub]{super}
+\usepackage{amsmath} % for \text{}, \xrightarrow[sub]{super}
  \usepackage{amsthm}  % for theorem and proof environments
  \newtheorem{theorem}{Theorem}
  
+\usepackage{subfig} % Support side-by-side figures
+\usepackage{pgf}    % Fancy graphics
+\usepackage{tikz}   % A nice, inline PGF frontend
+\usetikzlibrary{automata} % Graph-theory library
+
  \usepackage{doc}    % BibTeX
  \usepackage[super,sort&compress]{natbib} % fancy citation extensions
  % super selects citations in superscript mode
@@ -72,6 +77,9 @@
    \today \\
  \end{centering}
  \bigskip
+
+\tableofcontents
+
  %%%%%%%%%%%%%%%%%%%%%%%%%%%% End LaTeX boilerplate %%%%%%%%%%%%%%%%%%%%%%%%%%%%
  
  \section{Introduction}
@@ -136,15 +144,19 @@ line control of tension groups.
  Version 0.5 added the stiffness environmental parameter and it's
  associated unfolding models.
  
+Version 0.6 generalized from two state unfolding to arbitrary
+$N$-state rate reactions.  This allows simulations of
+unfolding-refolding, metastable intermediates, etc.
+
  <<version definition>>=
-#define VERSION "0.5"
+#define VERSION "0.6"
  @
  
  \subsection{License}
  
  <<license>>=
   sawsim - program for simulating single molecule mechanical unfolding.
- Copyright (C) 2008, William Trevor King
+ Copyright (C) 2008-2009, William Trevor King
  
   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License as
@@ -175,9 +187,57 @@ associated unfolding models.
  
  \section{Main}
  
-The unfolding system is stored as a chain of \emph{domains}.  Domains
-can be folded, globular protein domains, unfolded protein linkers, AFM
-cantilevers, or other stretchable system components.
+The unfolding system is stored as a chain of \emph{domains} (Figure
+\ref{fig.domain_chain}).  Domains can be folded, globular protein
+domains, unfolded protein linkers, AFM cantilevers, or other
+stretchable system components.  The system is modeled as graph of
+possible domain states with transitions between them (Figure
+\ref{fig.domain_states}).
+
+\begin{figure}
+\begin{center}
+\subfloat[][]{\label{fig.domain_chain}
+\begin{tikzpicture}[->,node distance=2cm,font=\footnotesize]
+  \tikzstyle{every state}=[fill,draw=red!50,very thick,fill=red!20]
+  \node[state] (A)              {domain 1};
+  \node[state] (B) [below of=A] {domain 2};
+  \node[state] (C) [below of=B] {.~.~.};
+  \node[state] (D) [below of=C] {domain $N$};
+  \node        (S) [below of=D] {Surface};
+  \node        (E) [above of=A] {};
+
+  \path[-] (A) edge (B)
+           (B) edge node [right] {Tension} (C)
+           (C) edge (D)
+           (D) edge (S);
+  \path[->,green] (A) edge node [right,black] {Extension} (E);
+\end{tikzpicture}
+}
+\qquad
+\subfloat[][]{\label{fig.domain_states}
+\begin{tikzpicture}[->,node distance=2.5cm,shorten <=1pt,shorten >=1pt,font=\footnotesize]
+  \tikzstyle{every state}=[fill,draw=blue!50,very thick,fill=blue!20]
+  \node[state] (A)              {cantilever};
+  \node[state] (C) [below of=A] {transition};
+  \node[state] (B) [left of=C]  {folded};
+  \node[state] (D) [right of=C] {unfolded};
+
+  \path (B) edge [bend left] node [above] {$k_1$} (C)
+        (C) edge [bend left] node [below] {$k_1'$} (B)
+            edge [bend left] node [above] {$k_2$} (D)
+        (D) edge [bend left] node [below] {$k_2'$} (C);
+\end{tikzpicture}
+}
+\caption{\subref{fig.domain_chain} Extending a chain of domains.  One end
+  of the chain is fixed, while the other is extended at a constant
+  speed.  The domains are coupled with rigid linkers, so the domains
+  themselves must stretch to accomodate the extension.
+  \subref{fig.domain_states} Each domain exists in a discrete state.  At
+  each timestep, it may transition into another state following a
+  user-defined state matrix such as this one, showing a metastable
+  transition state and an explicit ``cantilever'' domain.}
+\end{center}
+\end{figure}
  
  Each domain contributes to the total tension, which depends on the
  chain extension.  The particular model for the tension as a function
@@ -192,9 +252,9 @@ following models have been implemented:
  \end{packed_item}
  
  The tension model and parameters used for a particular domain depend
-on whether the domain is folded or unfolded.  The transition rate from
-the folded state to the unfolded state is also handled generally with
-function pointers, with implementations for
+on whether the domain's current state.  The transition rates between
+states are also handled generally with function pointers, with
+implementations for
  \begin{packed_item}
   \item  Null (Appendix \ref{sec.null_k}),
   \item  Bell's model (Appendix \ref{sec.bell}),
@@ -205,10 +265,11 @@ function pointers, with implementations for
  
  The unfolding of the chain domains is modeled in two stages.  First
  the tension of the chain is calculated.  Then the tension (assumed to
-be constant over the length of the chain), is applied to each folded
-domain, and used to calculate the probability of that subdomain
-unfolding in the next timestep $dt$.  Then the time is stepped
-forward, and the process is repeated.
+be constant over the length of the chain, see Section
+\ref{sec.timescales}), is applied to each folded domain, and
+used to calculate the probability of that subdomain unfolding in the
+next timestep $dt$.  Then the time is stepped forward, and the process
+is repeated.
  
  <<main program>>=
  int main(int argc, char **argv)
@@ -513,16 +574,6 @@ int domain_unfolds(double F, double dt, environment_t *env, domain_t *domain)
  }
  @ [[happens]] is a random decision making function defined in Appendix \ref{app.utils}.
  
-Only one domain can unfold in each timestep, because the timescale of
-a domain unfolding $dt_u$ is assumed to be less than the simulation
-timestep $dt$, so a domain will completely unfold in a single
-timestep.  We adapt our timesteps to keep the probability of a single
-domain unfolding low, so the probability of two domains unfolding in
-the same timestep is negligible.  This approach breaks down as the
-adaptive timestep scheme approaches $dt \sim dt_u$, but $dt_u \sim
-1\U{$\mu$s}$ for Ig27-like proteins \citep{klimov00}, so this
-shouldn't be a problem.  To reassure yourself, you can ask the
-simulator to print the smallest timestep that was used with TODO.
  <<randomly unfold domains>>=
  int random_unfoldings(list_t *domain_list, double tension,
                       double dt, environment_t *env,
@@ -543,18 +594,52 @@ int random_unfoldings(list_t *domain_list, double tension,
  }
  @
  
+\subsection{Timescales}
+\label{sec.timescales}
+
+The simulation assumes that chain equilibration is instantaneous
+relative to the simulation timestep $dt$, so that tension is uniform
+along the chain.  The quality of this assumption depends on your
+particular chain.  For example, a damped spring thermalizes on a
+timescale of order $\tau = 1/\gamma$ where the damping ratio $\gamma
+\equiv \omega_0(-\zeta \pm \sqrt{\zeta^2-1}$, the natural angular
+frequency $omega_0 \equiv \sqrt{k/m}$, $\zeta \equiv b/2\sqrt{km}$,
+and $b$ sets the drag $F_b = -bv$.  For our cantilevers $\tau$ is
+on the order of milliseconds, which is longer than a timestep.
+However, the approximation is still reasonable, because there is
+rarely unfolding during the cantilever return.  You could, of course,
+take cantilever, WLC, etc.\ response times into effect, but that
+would significantly complicate a simulation that seems to work
+well enough without bothering :p.  For WLC and FJC relaxation times,
+see the Appendix of \citet{evans99} and \citet{kroy07}.
+
+Besides assuming our timestep is much greater than equilibration
+times, we also force it to be short enough so that only one domain may
+unfold in a given timestep.  For an unfolding timescale of $dt_u =
+1/k$ we adapt our timesteps to keep $dt \ll dt_u$, so the probability
+of two domains unfolding in the same timestep is negligible.  This
+approach breaks down as the adaptive timestep scheme approaches $dt
+\sim dt_u$, but $dt_u \sim 1\U{$\mu$s}$ for Ig27-like proteins
+\citep{klimov00}, so this shouldn't be a problem.  To reassure
+yourself, you can ask the simulator to print the smallest timestep
+that was used with TODO.
+
  \subsection{Adaptive timesteps}
  \label{sec.adaptive_dt}
  
-We'd like to pick $dt$ so the probability of unfolding in the next
-timestep is small.  If we don't adapt our timestep, we risk breaking
-our approximation that $F(x' \in [x, x+v \cdot dt]) = F(x)$ or that
-only one domain unfolds in a given timestep.  Because $F(x)$ is
-monotonically increasing, excessively large timesteps will lead to
-erroneously large unfolding forces.  The simulation would have been
-accurate for sufficiently small fixed timesteps, but adaptive
-timesteps will allow us to move through low tension regions in fewer
-steps, leading to a more efficient simulation.
+We'd like to pick $dt$ so the probability of changing state
+(e.g. unfolding) in the next timestep is small.  If we don't adapt our
+timestep, we also risk breaking our approximation that $F(x' \in [x,
+  x+v \cdot dt]) = F(x)$ or that only one domain changes state in a
+given timestep.  The simulation would have been accurate for
+sufficiently small fixed timesteps, but adaptive timesteps will allow
+us to move through low tension regions in fewer steps, leading to a
+more efficient simulation.
+
+Consider the two state folded $\rightarrow$ unfolded transition.
+Because $F(x)$ is monotonically increasing between unfoldings,
+excessively large timesteps will lead to erroneously small unfolding
+rates (an thus, higher average unfolding force).
  
  The actual adaptive timestep implementation is not particularly
  interesting, since we are only required to reduce $dt$ to somewhere