The data handling and storage practices of the Climatic Research Unit in the UK revealed many flaws in how data were preserved. The shortcomings in scientific data storage that came to light shocked the scientific community. The computer code was not properly maintained, and the data were spread over innumerable files in difficult formats. Old punch card data could not be interpreted, and standard operating procedures for data sharing were nonexistent. This led many scientists to question whether climatologists were involved in some kind of conspiracy.
Researchers responded in varied ways. A common view was that data sharing needed to be more responsive. The resulting chaos, the inappropriate data handling and the loss of scientific data were nothing new to many critics of science. The true challenge in handling scientific data lies in making, recording, preserving and sharing it. In fact, this is not a single challenge but an accumulation of small yet significant challenges at each level, each requiring some compromise.
Research findings are the sum total of observations made both in the lab and in the wider world. Many events in the world are transient and can be observed only for a short span of time. Animal interactions and photons from a supernova passing the Earth are just two examples. The best scientific way to deal with such momentary events is to record them and preserve the data.
Many branches of science deal with the preservation of data, whether short term or long term. Paleontologists "preserve everything" since fossils are rare and hence precious. However, judgment calls become inevitable when space is limited.
In other cases, it is better to record the data and discard the sample. This applies to most biological and chemical samples, which do not retain their essential features at their original values. For example, chemicals degrade and radioisotopes decay over time. Reanalyzing such samples makes little sense because the measurable features have been lost.
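How quickly a radioactive sample loses its measurable signal follows from its half-life. The short Python sketch below illustrates that arithmetic; the function name and the example isotope are assumptions chosen for illustration and do not come from any lab's actual procedure.

```python
def fraction_remaining(elapsed, half_life):
    """Return the fraction of a radioisotope left after `elapsed` time.

    `elapsed` and `half_life` must be in the same unit (days, years, ...).
    Uses the standard decay law: N(t) / N0 = 0.5 ** (t / half_life).
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if elapsed < 0:
        raise ValueError("elapsed must not be negative")
    return 0.5 ** (elapsed / half_life)


# Illustrative assumption: a tracer with a half-life of about 14.3 days
# (roughly that of phosphorus-32), stored for 60 days.
if __name__ == "__main__":
    left = fraction_remaining(60, 14.3)
    print(f"Fraction of original activity remaining: {left:.3f}")
```

With these assumed numbers only about 5% of the original activity remains after two months, which is why recording the measurement matters more than keeping such a sample.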
Whether to store everything and discard it after a fixed time, or to check sample quality after a specific period before discarding, is left to the researcher. Some researchers simply discard samples once the needed data have been extracted from them. Making such decisions should be a regular practice in research labs.
For stable samples, the decision to discard is tougher. Scientific materials include generated data, innovative procedures and publications. It is not essential to save all of them. But failure to preserve the data can lead to unnecessary controversies over research bias.
Lastly, it can never be known in advance how a new technology might create interest and value in old samples. One of the best examples is Miller's experiment. Miller preserved his samples in 1953. A later reanalysis of those samples revealed the presence of 22 different amino acids, the basic structural units of proteins, which was a more complex mixture than the one Miller had described.
Even the best data preservation policies, however, cannot guarantee the permanence of data. During the massive Northeast blackout of 2003, many labs in New York found it difficult to find alternate power sources to preserve their samples, especially the heat-sensitive ones. Researchers encounter many such situations in their careers.
Sometimes the obstacle to preservation is a lack of money. The cost of freezers, liquid nitrogen storage or maintaining a mouse line is high enough to make researchers think twice about what is necessary before committing to purchases. As science funding has begun to decline, decisions about preservation are getting tougher, forcing scientists to compromise on services as well as reagents.
In an ideal scenario, science would preserve everything. In reality, the decision is often a mix of practicality and scientific knowledge that limits the data to be preserved. Accidents reduce the amount of data further. These issues should be addressed at the data generation step itself.