LibGuides: Research data management: Choosing a data repository

Research data in US repositories or servers: potential data loss

The HES-SO Research and Innovation department (20.05.2025) warns of the potential risks of deletion of research data archived on data repositories funded by US institutions (e.g. OSF) or hosted on servers located in the USA. If your data are at risk, we recommend that you make a backup to guarantee access in the event of deletion, and seek an alternative data repository.

Please contact the research support network for more information or for help with the identification of an alternative data repository.

Recommendations

Share your data in a FAIR, non-commercial data repository that ensures long-term preservation (e.g. SWISSUbase, OLOS, Zenodo)
Choose a repository that automatically assigns DOIs (Digital Object Identifiers) or allows them to be added
Choose a repository that offers Creative Commons licenses (CC0 or CC BY)

Data repositories

Data repositories (DRs) are platforms for storing, managing and preserving datasets over the long term. A repository contains datasets and their descriptions (metadata), which can then be retrieved and reused by humans or machines. The size of a dataset is measured in kilo-, mega-, giga- or terabytes.

There are several types of repositories:

disciplinary
institutional
generalist (multidisciplinary)
publisher-specific
project-specific

To help you choose the most suitable data repository for your needs that also complies with the FAIR principles, here is a list of 12 criteria to consider.

Does the repository offer to assign a unique, permanent identifier to the dataset?

A unique and persistent identifier, or PID (Persistent Identifier), makes it possible to reference, cite and provide a stable link to an online file⁴. A unique, persistent hypertext link is created, so the resource can be retrieved at any time, even if the URL of the page changes.

There are several types of PID. The best known for identifying datasets or journal articles is the DOI (Digital Object Identifier).
- Example of a DOI attached to a resource available on the Zenodo data repository:

The persistent identifier is integrated into the bibliographic reference. This means that if the dataset is reused, it can easily be retrieved from the citation.
- Example: Valérie Gadrat, Yvette Lafosse, Claire Sowinski, & Coralie Wysoczynski (2019, April). GopenDoRe game: the cooperative game for acquiring good practices in managing and sharing research data (Version 2). Zenodo. http://doi.org/10.5281/zenodo.2657316
DOIs are generally assigned when your data is deposited in a data repository. You can also obtain a DOI through the HES-SO Center for Scientific Information (CISO) by sending an e-mail to open(at)hes-so.ch
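As an illustration of how a DOI becomes a stable link, the short Python sketch below normalizes a DOI copied from a citation into its `https://doi.org/` form. The function name `doi_to_url` and the pattern used to recognize a DOI are assumptions made for this example; the pattern is a common heuristic, not an official validator.

```python
import re

# Heuristic for the usual DOI shape: "10.", a 4-9 digit registrant prefix,
# a slash, then a non-empty suffix. This is an assumption for illustration,
# not a complete validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(value: str) -> str:
    """Return the https://doi.org/ link for a DOI written in a common form."""
    doi = value.strip()
    # Drop a resolver address or a "doi:" label copied along with the DOI.
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi,
                 flags=re.IGNORECASE)
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognisable DOI: {value!r}")
    return f"https://doi.org/{doi}"


# The DOI from the citation example above.
print(doi_to_url("http://doi.org/10.5281/zenodo.2657316"))
# prints: https://doi.org/10.5281/zenodo.2657316
```

Whatever page the repository currently uses for the record, the `https://doi.org/` link keeps pointing to it, which is why the DOI rather than the page URL belongs in the citation.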

Is the license under which the data will be made available clearly stated, or can the user upload or select a license?

Some repositories impose a specific license, while others allow you to choose your preferred license.
When data is linked to a scientific article, it is necessary to check whether the journal applies specific conditions to the data distribution license. To do this, consult the journal's instructions intended for authors.
Creative Commons licenses were created in 2002 for the distribution of digital content such as text, images and films, but they can also be used for distribution on paper. They make it easy for authors to indicate the rights they want to retain and the rights they want to waive, so that others can reuse their work⁵.
- Four clauses can be combined to suit your needs: Attribution (BY), NonCommercial (NC), NoDerivatives (ND) and ShareAlike (SA).
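To make the combination rules concrete, here is a minimal Python sketch that builds a license code from the optional clauses. The four Creative Commons clauses are Attribution (BY), NonCommercial (NC), NoDerivatives (ND) and ShareAlike (SA); the function name `cc_license` and its parameters are hypothetical and serve only this illustration.

```python
def cc_license(non_commercial: bool = False,
               no_derivatives: bool = False,
               share_alike: bool = False) -> str:
    """Build a Creative Commons license code from the optional clauses.

    Attribution (BY) is part of each of the six standard licenses.
    CC0 is a separate public-domain dedication and is not covered here.
    """
    if no_derivatives and share_alike:
        # ShareAlike sets conditions on adaptations, while NoDerivatives
        # does not allow adaptations to be shared, so the two exclude
        # each other.
        raise ValueError("ND and SA cannot be combined")
    parts = ["BY"]
    if non_commercial:
        parts.append("NC")
    if no_derivatives:
        parts.append("ND")
    if share_alike:
        parts.append("SA")
    return "CC " + "-".join(parts)


print(cc_license())                                       # prints: CC BY
print(cc_license(non_commercial=True, share_alike=True))  # prints: CC BY-NC-SA
```

Because ND and SA exclude each other, the clauses yield six licenses in total: CC BY, CC BY-SA, CC BY-NC, CC BY-ND, CC BY-NC-SA and CC BY-NC-ND.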

Can I upload and/or add metadata?

Metadata are elements used to describe, in a standardized and structured way, the purpose, origin, temporal characteristics, geographical location, authorship, conditions of access and conditions of use of a resource, such as a dataset². The more information is provided about the dataset, the easier it is to find and understand.

There are various description schemas, including Dublin Core, the Data Documentation Initiative (DDI), the Metadata Encoding and Transmission Standard (METS) and the DataCite Metadata Schema.

Are citations and metadata always publicly accessible?

Keeping metadata accessible even when the data cannot be shared or is no longer available helps to meet the FAIR Findable and Accessible criteria: for example, metadata on authors, institutions and associated publications remain useful even if the data is missing. In addition, this solution reduces storage costs.

Does the data repository provide a submission form requesting that intrinsic metadata comply with a specific format?

To be machine-readable, metadata (descriptive, administrative and structural) must conform to standard schemas such as DataCite.
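As a rough illustration of what conforming to a schema means in practice, the Python sketch below checks a record against the six mandatory properties of the DataCite Metadata Schema (identifier, creator, title, publisher, publication year, resource type). The dictionary keys and the function `missing_properties` are simplified assumptions for this example, not DataCite's actual XML or JSON field names. The record reuses the citation example given earlier in this guide, which does not state a resource type.

```python
# Simplified stand-ins for the six mandatory DataCite properties.
# The key names are assumptions made for this sketch.
MANDATORY = ("identifier", "creators", "title", "publisher",
             "publication_year", "resource_type")


def missing_properties(record: dict) -> list[str]:
    """List the mandatory properties that are absent or empty in a record."""
    return [key for key in MANDATORY if not record.get(key)]


# Built from the GopenDoRe citation; no resource type is given there.
record = {
    "identifier": "10.5281/zenodo.2657316",
    "creators": ["Valérie Gadrat", "Yvette Lafosse",
                 "Claire Sowinski", "Coralie Wysoczynski"],
    "title": "GopenDoRe game: the cooperative game for acquiring good "
             "practices in managing and sharing research data",
    "publisher": "Zenodo",
    "publication_year": 2019,
}

print(missing_properties(record))  # prints: ['resource_type']
```

A repository's submission form plays the same role as this check: it refuses or flags a deposit until every required field is filled in, so that the resulting metadata can be harvested by machines.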

Does the data repository have a long-term preservation plan for archived data?

The preservation plan describes the actions the repository has put in place to ensure its long-term management and funding, as well as the preservation of data. "The more the repository moves towards preservation, the more solid its financing, resource management and governance must be"⁷.
The COAR framework of best practices for repositories proposes 4 characteristics (2 essential, 2 recommended) for the preservation objective¹:
- "The repository (or the organization that manages the repository) has a long-term plan for repository management and financing
- The repository has a policy defining the length of time resources will be managed, and provides documentation on preservation practices
- The repository has a documented approach to preservation, adopting recognized preservation practices
- The agreement between the depositor and the repository provides for all actions necessary to ensure preservation responsibilities - for example, rights to copy, transform and store resources"

Is the data repository adapted to the type of data you are going to deposit?

Does the repository accept all file and data formats?
Is there a size limit?
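These questions can be checked before depositing with a few lines of Python. The sketch below compares a planned deposit with a repository's constraints. The 50 GB limit, the accepted extensions and the file names are hypothetical placeholders, not the policy of any real repository; sizes use decimal units (1 kB = 1000 bytes).

```python
from pathlib import PurePath

UNITS = ("B", "kB", "MB", "GB", "TB")


def human_size(n_bytes: int) -> str:
    """Format a byte count with decimal units (1 kB = 1000 B)."""
    size = float(n_bytes)
    for unit in UNITS:
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


def check_deposit(files: dict[str, int], max_total_bytes: int,
                  accepted: set[str]) -> list[str]:
    """Return the reasons why a deposit would not fit the repository."""
    problems = []
    for name in files:
        ext = PurePath(name).suffix.lower()
        if ext not in accepted:
            problems.append(f"{name}: format {ext or '(none)'} not accepted")
    total = sum(files.values())
    if total > max_total_bytes:
        problems.append(f"total size {human_size(total)} exceeds limit "
                        f"{human_size(max_total_bytes)}")
    return problems


# Hypothetical deposit: file name -> size in bytes.
planned = {"survey.csv": 2_000_000, "scans.psd": 60_000_000_000}
for problem in check_deposit(planned, 50_000_000_000, {".csv", ".txt"}):
    print(problem)
# prints: scans.psd: format .psd not accepted
# prints: total size 60.0 GB exceeds limit 50.0 GB
```

The actual limits and accepted formats must be taken from the documentation of the repository you are considering.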

Does it meet the recommendations of your project sponsor, your institution, or the journal in which you publish?

SNSF: research data should be freely accessible
- Requirement to include a data management plan (DMP) when submitting an application for most funding instruments
- Open-access archiving of data produced during research on an open data repository is mandatory, as long as no legal, ethical, copyright or other clauses stand in the way.
- The SNSF provides funding of up to CHF 10,000 for the preparation and validation of data. Downloading is not taken into account.
Other European countries

Is it recognized in your discipline and by the scientific community?

Some repositories, still few in number, hold CoreTrustSeal certification, which attests that a data infrastructure is sustainable and trustworthy.
The SNSF offers a list of data repositories that meet its ORD requirements and are frequently used by the community:
- Generalist: Zenodo, SWISSUbase, OLOS, GitHub...
- Disciplinary: DaSCH, ENA, PlaTec...
The list of repositories recommended by Scientific Data (Nature)

Is it free? If not, is the data deposit fee reasonable and acceptable?

Does it offer data access options adapted to your needs?

When depositing your data in a data repository, you can choose between different access conditions⁸:

Closed data	A general description of the data is published, but access to the data itself is not possible. E.g., the dataset contains non-anonymized or pseudonymized sensitive data.
Data on request	A record of the data is published, and information is provided on how to request access to the data. The request is generally sent to the researchers and approved by them. E.g., the dataset contains data with a high re-identification risk and can only be shared under certain conditions.
Data under embargo	A record of the data is published, but the data cannot be accessed until the end of a specified time period. After that period, the data becomes openly accessible. E.g., the researcher wants to file a patent before publishing the data. The associated metadata must nevertheless be accessible, to indicate that the data exists while preserving its protection. Some data repositories do not offer the "embargo" option.
Open data	The data is freely available and can be accessed by anyone.

These options concern only the data, not the metadata: in all cases, metadata remains publicly accessible.
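The four access conditions and the rule that metadata always stays public can be summarized in a small Python model. This is a sketch under stated assumptions: the names `Access`, `data_is_downloadable` and `metadata_is_public` are hypothetical, and `embargo_end` is taken to be the last day of the embargo.

```python
from datetime import date
from enum import Enum
from typing import Optional


class Access(Enum):
    CLOSED = "closed data"
    ON_REQUEST = "data on request"
    EMBARGO = "data under embargo"
    OPEN = "open data"


def data_is_downloadable(access: Access, today: date,
                         embargo_end: Optional[date] = None) -> bool:
    """Tell whether anyone can download the data without asking."""
    if access is Access.OPEN:
        return True
    if access is Access.EMBARGO:
        if embargo_end is None:
            raise ValueError("an embargoed dataset needs an end date")
        # The data opens up once the embargo period has ended.
        return today > embargo_end
    # Closed data is never downloadable; data on request needs approval.
    return False


def metadata_is_public(access: Access) -> bool:
    """Metadata stays public whatever the access condition."""
    return True


end = date(2026, 6, 30)  # hypothetical embargo end date
print(data_is_downloadable(Access.EMBARGO, date(2026, 1, 1), end))  # prints: False
print(data_is_downloadable(Access.EMBARGO, date(2026, 7, 1), end))  # prints: True
print(metadata_is_public(Access.CLOSED))                            # prints: True
```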

Where are the repository servers located?

Check where the DR server is located. It is generally advised to avoid depositing on a server operated by a US institution or located in the USA¹¹.

Tools & tutorials

Creative Commons: License chooser

Directories for choosing the most suitable data repository

More information

Data repositories

References

Confederation of Open Access Repositories. (2020, November 6). Cadre commun COAR de bonnes pratiques en matière d’entrepôts. Zenodo. https://doi.org/10.5281/zenodo.4118380
Corti, L., Van den Eynden, V., Bishop, L., & Woollard, M. (2020). Managing and sharing research data : A guide to good practice (Second edition). SAGE
Dedieu, L., & Barale, M. (2020). Déposer des données dans un entrepôt, en 6 points. CIRAD. https://doi.org/10.18167/coopist/0070
Deboin, M. C. (2017). Identifier et rechercher une publication ou un jeu de données par son DOI. CIRAD. https://doi.org/10.18167/COOPIST/0005
Fily, M.-F. (2015). Connaitre et utiliser les licences Creative Commons. CIRAD. https://doi.org/10.18167/XTNV-D457
Fonds national suisse. (2022). Open Research Data. FNS. https://www.snf.ch/fr/dMILj9t4LNk8NwyR/dossier/open-research-data
Guirlet, M. (2020). Guide décisionnel et vade-mecum pour la mise à disposition d’un dépôt de données de recherche ouvertes en Suisse [Haute école de gestion. Travail de Master]. https://doc.rero.ch/record/329696/files/Guirlet-M_moire-Vdef.pdf
Jouhar, M., Melly, P., Trombert, A., Elmers, J., & Beaulieu, M.-C. (2023, June 4). Mission RDM: the ultimate quest before the holidays: complementary guide. In. Mission RDM: a card-based educational escape game on research data management. Zenodo. https://doi.org/10.5281/zenodo.8002770
Scientific Data. (2022a). Data Policies. Nature. https://www.nature.com/sdata/policies/data-policies
Scientific Data. (2022b). Data Repository Guidance. Nature. https://www.nature.com/sdata/policies/repositories
Préposé fédéral à la protection des données et à la transparence (2022). European-U.S. Data Privacy Framework (EU-U.S. DPF). https://www.edoeb.admin.ch/edoeb/fr/home/kurzmeldungen/2022/20221007_eu_us_dpf.html

Criteria for choosing a data repository: guideline
Guidelines working group of the HES-SO Open Science Community