Share Data
This page provides guidance on best practices for sharing infectious disease and pandemic preparedness data. Overall, data should be made as open as possible and as closed as necessary (see national guidelines on Open Science). Contact our helpdesk for tailored support with sharing your data.
Where to share data
Data should be shared in well-established, data-type specific repositories wherever possible. This not only makes the data more findable, but also ensures that relevant metadata standards and recommended file formats are applied, which makes the data easier to reuse.
Locating a suitable repository:
- The data repositories page lists where infectious disease data can be deposited, according to data type.
- The interactive data submission tool on the European COVID-19 Data Portal suggests which repository to use, based on information about your data.
- Re3data.org lists research data repositories available for use.
If no data-type specific repository is available, or you are unsure where to share your data, contact our helpdesk for support. Alternatively, you can deposit data in a general-purpose repository. For example, the SciLifeLab Data Repository accepts life science data from Swedish researchers. Data can also be stored using services such as SciLifeLab FAIR Storage, which provides Swedish researchers with secure, high-performance storage designed for large-scale life science datasets.
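The selection logic described above (prefer a data-type specific repository, otherwise fall back to a general-purpose one) can be sketched as a simple lookup. This is a minimal illustration only: the mapping is a small, non-authoritative sample, and the data-type keys and the function name `suggest_repository` are hypothetical. The data repositories page remains the reference for where to deposit.

```python
# Illustrative sample only; consult the data repositories page for the
# authoritative, up-to-date list of repositories per data type.
DATA_TYPE_REPOSITORIES = {
    "nucleotide sequence": "European Nucleotide Archive (ENA)",
    "proteomics": "PRIDE",
    "metabolomics": "MetaboLights",
    "bioimaging": "BioImage Archive",
}

# General-purpose fallback named in the text above.
GENERALIST_FALLBACK = "SciLifeLab Data Repository"


def suggest_repository(data_type: str) -> str:
    """Return a data-type specific repository if one is known, else the fallback."""
    key = data_type.strip().lower()
    return DATA_TYPE_REPOSITORIES.get(key, GENERALIST_FALLBACK)


print(suggest_repository("Proteomics"))
print(suggest_repository("some other data type"))
```

A fallback result here only means that the sample mapping has no entry, not that no specialised repository exists, so it is still worth checking with the helpdesk.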
Standards for data sharing
Locating advice
Infection biology research often involves sensitive, complex, and large-scale data. It can be difficult to understand the ramifications of sharing particular data types, and how data of different types should be shared. Tailored guidance is available through the Swedish Pathogens Portal helpdesk. In addition, several resources have been created to help with sharing data related to infectious disease and pandemic preparedness.
The Infectious Disease Toolkit (IDTk) was created to collate information on best practices in data management from across infectious disease research. It covers topics such as biosafety, ethical considerations, and data flows for handling sensitive pathogen and patient-derived data. IDTk also links to tools, repositories, and policies relevant to Sweden, making it directly applicable for researchers based in Sweden.
In many cases, general research data management (RDM) guidance also applies to infectious disease research. For advice on RDM relevant to Swedish research, visit the SciLifeLab Research Data Management Guidelines, which include information on how to submit data to potentially relevant repositories. See, for example, the ENA submission tutorial.
Metadata Standards
When sharing data, it is important to follow relevant metadata standards to ensure that your data is reusable. Metadata standards are often outlined by data repositories. The following table provides an overview of some key metadata standards separated by data type.
| Data Type | Standards | Description |
|---|---|---|
| Genomics | MIxS (Minimum Information about any 'x' Sequence) | Developed by Genomic Standards Consortium; used for describing sequences from different environments (e.g., host-associated, environmental). |
| Genomics | MINSEQE (Minimum Information about a High-Throughput Nucleotide Sequencing Experiment) | Recommended by FGED for RNA-seq and other sequencing metadata. |
| Genomics | ENA Checklists | Specific checklists for submission to European Nucleotide Archive (e.g., pathogen, human, metagenome). |
| Genomics | ISA-Tab / ISA-JSON | Framework for describing experimental metadata, often used with bioinformatics tools and databases. |
| Proteomics | MIAPE (Minimum Information About a Proteomics Experiment) | Developed by HUPO-PSI; covers mass spectrometry, sample processing, informatics. |
| Proteomics | PSI-MI XML / MITAB | For molecular interaction data formats (used in interaction databases). |
| Proteomics | mzML / mzIdentML / mzTab | Standard formats for raw data, identifications, vocabulary and quantification results in the field of mass spectrometry-based proteomics. |
| Imaging | OME-TIFF / OME-XML | Developed by Open Microscopy Environment; widely used for storing microscopy images and associated metadata. |
| Imaging | REMBI (Recommended Metadata for Biological Images) | Designed to enable reproducibility and data reuse for imaging datasets. |
| Imaging | DICOM (Digital Imaging and Communications in Medicine) | Standard for handling, storing, and transmitting medical imaging information (e.g., CT, MRI). |
| Bioassays / Experimental Data | MIACA (Minimum Information About a Cellular Assay) | For reporting cellular assays, including experimental context and protocols. |
| Bioassays / Experimental Data | MIABE (Minimum Information About a Bioactive Entity) | For small molecule screening and bioactivity reporting. |
| Bioassays / Experimental Data | BAO (BioAssay Ontology) | Ontology that enables uniform annotation of bioassays and protocols. |
| Clinical & Health Data | CDISC standards (e.g., SDTM, ADaM, SEND) | Industry standards for clinical trial data exchange and analysis. |
| Clinical & Health Data | HL7 / FHIR (Fast Healthcare Interoperability Resources) | Widely adopted in EHR systems for structured health data. |
| Clinical & Health Data | LOINC / SNOMED CT / ICD-10 | Controlled vocabularies for lab tests, symptoms, diagnoses. |
| Clinical & Health Data | MIMIC-IV Metadata Guidelines | For structured ICU/clinical datasets in open research. |
| Clinical & Health Data | Dublin Core / DCAT-AP-SE | Metadata cataloging for health data in national repositories. |
| Omics Imaging (e.g., Spatial Transcriptomics, Multi-modal) | STOMIC (Spatial Transcriptomics Open Metadata and Image Convention) | A proposed standard for organizing spatial omics data. |
| Omics Imaging (e.g., Spatial Transcriptomics, Multi-modal) | ISA-Tab / OME-XML | For integrating omics and imaging data. |
| Omics Imaging (e.g., Spatial Transcriptomics, Multi-modal) | HUPO-B/D Standards | For multimodal single-cell data and proteogenomics metadata. |
| Metabolomics Data | ISA-Tab / ISA-JSON | Describes experimental design, sample preparation, and data files. |
| Metabolomics Data | Metabolomics Standards Initiative (MSI) | Offers domain-specific guidelines for metadata and reporting. |
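
Because repositories typically check submissions against a metadata checklist, it can help to run a similar completeness check locally before depositing. The sketch below is a minimal illustration under stated assumptions: the field names in `REQUIRED_FIELDS` and the function `missing_fields` are hypothetical and are not taken from MIxS, the ENA checklists, or any other standard in the table. Substitute the mandatory fields defined by the checklist you are actually submitting against.

```python
# Hypothetical required fields for illustration only; real checklists
# (for example, those published by a repository) define their own.
REQUIRED_FIELDS = ("sample_id", "collection_date", "geographic_location", "host")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in a sample record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


# Example record with one blank field.
sample = {
    "sample_id": "S001",
    "collection_date": "2024-03-15",
    "geographic_location": "Sweden",
    "host": "",
}

print(missing_fields(sample))  # ['host']
```

A check like this only confirms that fields are present. It does not validate controlled vocabulary terms or date formats, which the target repository will usually enforce on submission.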
Licensing
Selecting the correct licence when sharing data is important for enabling data reuse whilst protecting your rights as the data creator.
What does a licence do?
- Defines how others may access, use, and distribute your data.
- Ensures you receive proper credit and attribution for your work.
- Supports FAIR principles, making data Findable, Accessible, Interoperable, and Reusable.
- Reduces legal uncertainty for both creators and users.
Before selecting a licence, check:
- Institutional or funder requirements (e.g., EU Horizon, SciLifeLab, or Swedish Research Council mandates).
- Whether your data includes sensitive, personal, or third-party components that cannot be made fully open.
- Any journal- or repository-specific policies (Re3data.org lists repositories and their licensing policies).
What else to consider:
- Include a clear licence statement in your metadata and documentation.
- When in doubt, consult your institutional data steward or legal team before assigning a licence.
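One way to include a clear licence statement in your metadata is to record both a standard licence identifier and the URL of the licence text. The following is a minimal sketch, assuming a simple dictionary-based metadata record: the `licence` field layout and the `add_licence` helper are hypothetical, and your repository may expect a different field name or format. The identifiers shown are SPDX identifiers for two Creative Commons licences.

```python
import json

# SPDX identifiers mapped to the canonical licence URLs.
LICENCES = {
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
}


def add_licence(metadata: dict, spdx_id: str) -> dict:
    """Return a copy of the metadata with an explicit licence statement added."""
    if spdx_id not in LICENCES:
        raise ValueError(f"Unknown licence identifier: {spdx_id}")
    updated = dict(metadata)  # leave the caller's record unchanged
    updated["licence"] = {"id": spdx_id, "url": LICENCES[spdx_id]}
    return updated


record = add_licence({"title": "Example dataset"}, "CC-BY-4.0")
print(json.dumps(record, indent=2))
```

Rejecting unknown identifiers keeps free-text or misspelled licence names out of the record, which makes the statement easier for both people and machines to interpret.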
Other useful information on licensing