From: W. Trevor King
Date: Sun, 5 May 2013 19:55:58 +0000 (-0400)
Subject: calibcant/discussion.tex: Fill in section on data archival
X-Git-Tag: v1.0~256
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=c5044fd20cea36da02a81609f2ebf40043e266be;p=thesis.git

calibcant/discussion.tex: Fill in section on data archival
---

diff --git a/src/calibcant/conclusions.tex b/src/calibcant/conclusions.tex
index dd2460a..3f29e1b 100644
--- a/src/calibcant/conclusions.tex
+++ b/src/calibcant/conclusions.tex
@@ -10,5 +10,5 @@ by the notice of calibration experts\citep{hutter93-erratum} or
 incorrect formul\ae\ are used during the fitting
 (\cref{sec:calibcant:lorentzian}). By centralizing calibration
 procedures in an open package, \calibcant\ should both reduce the
-effort needed to calibrate AFM cantilevers and improve the quality of
-the calibration.
+effort needed to calibrate AFM cantilevers, improve the quality of the
+calibration, and ease data sharing and archival.
diff --git a/src/calibcant/discussion.tex b/src/calibcant/discussion.tex
index 721edbd..97e0ef6 100644
--- a/src/calibcant/discussion.tex
+++ b/src/calibcant/discussion.tex
@@ -160,4 +160,31 @@ the calculated $\kappa$.
 \subsection{Archiving experimental data}
 \label{sec:calibcant:discussion:data}
 
-TODO
+Scientific data is not thrown away after analysis. Organizations may
+have standards for archival, and many journals require supporting data
+to be available on request for a number of years after
+publication\citep{TODO}. Both the raw data and the experimental
+parameters used to collect it need to be preserved, but managing this
+manually is tedious and error-prone. Lab notebooks rarely contain
+\emph{all} of the parameters used to collect and analyze a particular
+calibration. Data collected with \calibcant\ is saved in
+\citetalias{hdf5} with the full configuration
+(\cref{sec:pyafm:h5config}), bundling all of the information together
+in a single file.
+
+One minor drawback to this approach is that configuration information
+(which is not likely to change often) is duplicated between
+calibration runs. While this uses some extra disk space, the overhead
+is small. The full calibration datafile weighs in at $3.4\U{MB}$,
+while the calibration section alone is just $37\U{kB}$ (1\% of the
+total).
+
+Besides the benefit of having a self-contained file, HDF5 provides
+efficient support for large arrays of typed data (such as the unsigned
+16-bit values from our DACs and ADCs), which is not possible with many
+other open file formats. The HDF libraries are supported by the
+non-profit HDF Group with a 20-year development
+history\citep{hdf-group} and many users\citep{hdf-users}. This
+suggests that HDF will be around for the long haul, and if it is
+eventually phased out, that there will be a number of well-funded
+organizations interested in developing migration plans.
diff --git a/src/pyafm/main.bib b/src/pyafm/main.bib
index 8312c77..95971ea 100644
--- a/src/pyafm/main.bib
+++ b/src/pyafm/main.bib
@@ -115,6 +115,20 @@
   note = {Version 1.8.10},
 }
 
+@misc{ hdf-group,
+  author = HDFG,
+  title = {HDF Group History},
+  year = 2013,
+  url = {http://www.hdfgroup.org/about/history.html},
+}
+
+@misc{ hdf-users,
+  author = HDFG,
+  title = {Who uses HDF?},
+  year = 2013,
+  url = {http://www.hdfgroup.org/users.html},
+}
+
 @misc{ yaml,
   title = {{YAML} Ain't Markup Language ({YAML\texttrademark})
     Version 1.2},