You need to decide which data will be archived and which will be shared in a data repository. The Digital Curation Centre (DCC) offers a series of questions to help you decide which data to share.
For financial and ecological reasons, it is not possible to store all research data.
The criteria for retaining or destroying data depend largely on current practice in the field of digital archiving. Choosing what to keep and what to dispose of or delete always involves a subjective judgement, as nobody knows exactly what information will be wanted in the future. Data is retained so that it can be shared. This is part of a global vision of enriching research on a wider scale, for the benefit of the entire scientific community [2].

Data to keep [4] | Data to destroy [2] |
---|---|
Relevance to the mission | Poor-quality data (bad or junk data) |
Legal compliance: to restrict access if necessary, or to make data public if required by funders | Data that cannot be used by others |
Long-term scientific or historical value | Data that is easily reproducible |
Uniqueness of the dataset | Data without good metadata |
Potential for re-use: in relation to intellectual property and ethical issues | Older data that is not used and has no obvious cultural or historical value |
Non-replicability | Pilot, test or intermediate data |
Evaluate costs | Proprietary data |
Complete documentation | Sensitive or confidential data |
Level, type, format of data | |


Final deposit | Possible deposit | Not required | No deposit |
---|---|---|---|
Raw data obtained from analysis of physical samples | Intermediate versions of analyses or code, if they are potentially useful to other people or have been used in publications or theses | Incomplete version | Any data that contains personally identifying information about human subjects |
Observation data that cannot be regenerated | | Dataset already available online | |
Original datasets | | | |
Non-original datasets not readily available online | | | |
Codebook | | | |
Original software code | | | |

In addition to the legal aspects, the length of time researchers are asked to keep their data varies greatly depending on data retention policies. Durations may also vary according to the purpose of retention. For example, data with historical value will have to be kept longer than purely administrative data, which can be destroyed once the legal time limit has expired.