From: W. Trevor King
Date: Sat, 18 May 2013 03:06:41 +0000 (-0400)
Subject: sawsim/discussion.tex: Add D_JS, D_KL, p_m, ... to the nomenclature
X-Git-Tag: v1.0~172
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=8fac547fbe2e624fccd4173242cf664b445811a1;p=thesis.git

sawsim/discussion.tex: Add D_JS, D_KL, p_m, ... to the nomenclature

Thanks Mom!

Also:

* Add chi-squared and D_chi-squared to the nomenclature
* Fix hyphen -> en-dash in Kullback--Leibler
* Label eq:sawsim:p_m (so it can be referenced from the index)
* Fix UTF-8 chi -> \chi
* Remove a dangling close-paren from eq:sawsim:X2
* Cite NIST for the chi-square comparison
* Break Gumbel and chi-squared citations into @inbook
* Add editors for the new NIST:ESH book entry
---

diff --git a/src/root.bib b/src/root.bib
index fce2aff..19e6e3c 100644
--- a/src/root.bib
+++ b/src/root.bib
@@ -219,6 +219,7 @@
 @string{MCoyne = "Coyne, M."}
 @string{DCraig = "Craig, David"}
 @string{ACravchik = "Cravchik, A."}
+@string{CCroarkin = "Croarkin, Carroll"}
 @string{VCroquette = "Croquette, Vincent"}
 @string{YCui = "Cui, Y."}
 @string{COSB = "Current Opinion in Structural Biology"}
@@ -1001,6 +1002,7 @@
 @string{NNTint = "Tint, N. N."}
 @string{BTiribilli = "Tiribilli, Bruno"}
 @string{TTlusty = "Tlusty, Tsvi"}
+@string{PTobias = "Tobias, Paul"}
 @string{JTocaHerrera = "Toca-Herrera, Jose L."}
 @string{CATovey = "Tovey, Craig A."}
 @string{AToyoda = "Toyoda, A."}
@@ -1153,18 +1155,38 @@
 @string{PGdeGennes = "de Gennes, P. G."}
 @string{PJdeJong = "de Jong, P. J."}
 @string{NGvanKampen = "van Kampen, N.G."}
-@string{NISTSEMATECH = "{NIST/SEMATECH}"}
+@string{NIST:SEMATECH = "{NIST/SEMATECH}"}
 @string{EDCola = "{\uppercase{d}}i Cola, Emanuela"}
 
-@misc { NIST:gumbel,
-  author = NISTSEMATECH,
-  key = "NIST:gumbel",
-  title = "e-Handbook of Statistical Methods: Extreme Value Type {I}
-    Distribution",
-  year = 2009,
-  month = oct,
-  day = 9,
-  url = "http://www.itl.nist.gov/div898/handbook/eda/section3/eda366g.htm"
+@inbook{ NIST:chi-square,
+  crossref = {NIST:ESH},
+  chapter = {1.3.5.15: Chi-Square Goodness-of-Fit Test},
+  year = 2013,
+  month = may,
+  day = 15,
+  url = {http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm},
+}
+
+@inbook{ NIST:gumbel,
+  crossref = {NIST:ESH},
+  chapter = {1.3.6.6.16: Extreme Value Type {I} Distribution},
+  year = 2009,
+  month = oct,
+  day = 9,
+  url = {http://www.itl.nist.gov/div898/handbook/eda/section3/eda366g.htm},
+}
+
+@book{ NIST:ESH,
+  editor = CCroarkin #" and "# PTobias,
+  author = NIST:SEMATECH,
+  title = {e-Handbook of Statistical Methods},
+  year = 2013,
+  month = may,
+  publisher = NIST:SEMATECH,
+  address = {Boulder, Colorado},
+  url = {http://www.itl.nist.gov/div898/handbook/},
+  note = {This manual was developed from seed material produced by
+    Mary Natrella.},
 }
 
 @misc{ wikipedia:gumbel,
diff --git a/src/sawsim/discussion.tex b/src/sawsim/discussion.tex
index 2a4b6e1..4e8ecd9 100644
--- a/src/sawsim/discussion.tex
+++ b/src/sawsim/discussion.tex
@@ -402,7 +402,7 @@ the similarity between two probability distributions.
 \end{equation}
 where $p_e(i)$ and $p_s(i)$ are the the values of the $i^\text{th}$
 bin in the experimental and simulated unfolding force histograms,
-respectively.  $D_\text{KL}$ is the Kullback-Leibler divergence
+respectively.  $D_\text{KL}$ is the Kullback--Leibler divergence
 \begin{equation}
   D_\text{KL}(p_p,p_q) = \sum_i p_p(i) \log_2\p({\frac{p_p(i)}{p_q(i)}}) \;,
   \label{eq:sawsim:D_KL}
@@ -410,8 +410,16 @@ respectively.  $D_\text{KL}$ is the Kullback-Leibler divergence
 where the sum is over all unfolding force histogram bins.  $p_m$ is
 the symmetrized probability distribution
 \begin{equation}
-  p_m(i) \equiv [p_e(i)+p_s(i)]/2 \;.
+  p_m(i) \equiv [p_e(i)+p_s(i)]/2 \;. \label{eq:sawsim:p_m}
 \end{equation}
+%
+\nomenclature{$D_\text{JS}$}{The Jensen--Shannon divergence
+  (\cref{eq:sawsim:D_JS}).}
+\nomenclature{$D_\text{KL}$}{The Kullback--Leibler divergence
+  (\cref{eq:sawsim:D_KL}).}
+\nomenclature{$p_m(i)$}{The symmetrized probability distribution used
+  in calculating the Jensen--Shannon divergence
+  (\cref{eq:sawsim:D_JS,eq:sawsim:p_m}).}
 % DONE: Mention inter-histogram normalization?  no.
 % For experiments carried out over a series of pulling velocities, we
 % simply sum residuals computed for each velocity, although it would
@@ -421,11 +429,15 @@ the symmetrized probability distribution
 The major advantage of the Jensen--Shannon divergence is that
 $D_\text{JS}$ is bounded ($0\le D_\text{JS}\le 1$) regardless of the
 experimental and simulated histograms.  For comparison, Pearson's
-$\chi^2$ test,
+$\chi^2$ test\citep{NIST:chi-square},
 \begin{equation}
-  D_{χ^2} = \sum_i \frac{(p_e(i)-p_s(i))^2}{p_s(i)}) \;, \label{eq:sawsim:X2}
+  D_{\chi^2} = \sum_i \frac{(p_e(i)-p_s(i))^2}{p_s(i)} \;,
+  \label{eq:sawsim:X2}
 \end{equation}
 is infinite if there is a bin for which $p_e(i)>0$ but $p_s(i)=0$.
+%
+\nomenclature{$\chi^2$}{The chi-squared distribution.}
+\nomenclature{$D_{\chi^2}$}{Pearson's $\chi^2$ test (\cref{eq:sawsim:X2}).}
 
 \Cref{fig:sawsim:fit-space} shows the Jensen--Shannon divergence
 calculated using \cref{eq:sawsim:D_JS} between an experimental data
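
For reference, a minimal numerical sketch (not part of this patch) of the
two residual measures the discussion.tex hunks touch.  It assumes NumPy,
assumes the usual symmetrized combination D_JS = [D_KL(p_e, p_m) +
D_KL(p_s, p_m)]/2 for eq:sawsim:D_JS (that equation sits outside the hunks
shown), and the function names and histograms are made up for illustration:

# Illustrative only -- not from the thesis.  Assumes NumPy and the
# standard symmetrized Jensen--Shannon definition for eq:sawsim:D_JS.
import numpy as np

def d_kl(p, q):
    # Kullback--Leibler divergence in bits (eq:sawsim:D_KL).
    # Bins with p(i) == 0 contribute nothing (0 * log 0 -> 0).
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

def d_js(p_e, p_s):
    # Jensen--Shannon divergence (eq:sawsim:D_JS), bounded in [0, 1],
    # built on the symmetrized distribution p_m (eq:sawsim:p_m).
    p_m = (p_e + p_s) / 2
    return (d_kl(p_e, p_m) + d_kl(p_s, p_m)) / 2

def d_chi2(p_e, p_s):
    # Pearson's chi-squared statistic (eq:sawsim:X2); diverges when
    # p_s(i) == 0 in a bin where p_e(i) > 0.
    with np.errstate(divide='ignore'):
        return np.sum((p_e - p_s) ** 2 / p_s)

p_e = np.array([0.1, 0.4, 0.3, 0.2])  # hypothetical experimental histogram
p_s = np.array([0.2, 0.5, 0.3, 0.0])  # hypothetical simulation: empty last bin

print(d_js(p_e, p_s))    # ~0.12: finite and within [0, 1]
print(d_chi2(p_e, p_s))  # inf: p_s == 0 in a bin where p_e > 0

The empty final bin in p_s sends D_chi2 to infinity while D_JS stays
within [0, 1] -- the boundedness advantage the surrounding text describes.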