Open data resources

Here we list all the PaN (photon and neutron) related open data resources we know of: domain-specific repositories and the open data portals of our individual facilities. The list is limited to repositories that host data, so aggregators such as the PaNOSC data portal and the EOSC search portal are excluded.

Domain-specific repositories

AlphaFold

Protein Structure Database

AlphaFold, the state-of-the-art AI system developed by DeepMind, computationally predicts protein structures with unprecedented accuracy and speed. In partnership with EMBL’s European Bioinformatics Institute (EMBL-EBI), DeepMind has released over 200 million AlphaFold protein structure predictions that are freely and openly available to the global scientific community. They cover nearly all catalogued proteins known to science, with the potential to increase humanity’s understanding of biology by orders of magnitude.

API CC-BY-4.0
BMRB

Biological Magnetic Resonance Bank

BMRB makes bio-NMR data FAIR. It collects, annotates, archives, and disseminates spectral and quantitative data derived from NMR spectroscopic investigations of biological macromolecules and metabolites.

API
CSD

Cambridge Structural Database

The Cambridge Structural Database, or CSD, has been curated since 1965 from the published literature, direct deposition, and sources such as patents and PhD theses.
The world’s largest database of small-molecule organic and metal-organic crystal structure data, the CSD is managed by the Cambridge Crystallographic Data Centre (CCDC).

crystallography subscription-based
CXIDB

Coherent X-ray Imaging Data Bank

CXIDB offers scientists from all over the world a unique opportunity to access data from Coherent X-ray Imaging (CXI) experiments. The website also serves as the reference for the CXI file format, in which most of the experimental data in the database is stored.

coherent x-ray imaging free
EMDB

Electron Microscopy Data Bank

EMDB (the Electron Microscopy Data Bank) is a public repository for electron cryo-microscopy maps and tomograms of macromolecular complexes and subcellular structures. It covers a variety of techniques, including single-particle analysis, electron tomography, sub-tomogram averaging, fibre diffraction and electron crystallography.

API electron microscopy free
Human Organ Atlas

Human Organ Atlas

The Human Organ Atlas is making Hierarchical Phase-Contrast Tomography (HiP-CT) 3D scans of entire organs, with ca. 20 micron voxels, open access.

CC-BY-4.0 free
PED

Protein Ensemble Database

PED is a platform for the intrinsically disordered proteins (IDP) community where ensembles and their corresponding primary data can be stored and used as benchmarking datasets to facilitate the development of new ensemble calculation methods.

API
Perovskite DB

The Perovskite Database Project

The Perovskite Database Project aims to make all perovskite device data, both past and future, available in a form that adheres to the FAIR data principles, i.e. findable, accessible, interoperable, and reusable. In the initial phase, the project team went through the more than 16,000 perovskite papers published up to the end of February 2020 and extracted data for every adequately described perovskite solar cell it could find. For papers published after that date, the database relies on authors to upload their own data. The project is built around an open database and open-source tools that enable anyone, even without programming experience, to interactively explore, search, filter, analyse, and visualise the data. The core of those tools is a set of interactive graphics hosted by MaterialsZone. To reach the graphics, you need to create a free account by filling out a registration form; an invitation arrives by email shortly afterwards.

CC-BY-4.0 free after registering
RefXAS

RefXAS: an open access database of XAS spectra

Within the framework of DAPHNE4NFDI, an X-ray absorption spectroscopy (XAS) reference database called RefXAS has been set up. It provides users with well-curated XAS reference spectra, the related metadata fields, and online processing tools for visualising the data. Users can submit a raw dataset together with its metadata through a dedicated website for inclusion in the database. A distinctive feature is the set of quality criteria formulated for uploaded reference data, which make users aware of how usable the data are. The same criteria drive an automatic quality check of each upload, followed by manual curation at the interface. The implementation uploads metadata to the Scientific-Catalogue and files to object storage, and offers automated query capabilities through a web server as well as visualisation of the data and files. A prototype with integrated quality control for uploaded spectra has been created; it can process common data types for X-ray absorption spectra and uses a standardised metadata schema.

CC-BY-4.0 x-ray absorption spectroscopy free
SASBDB

Small Angle Scattering Biological Data Bank

SASBDB is a curated repository of freely accessible and fully searchable SAS experimental data, which are deposited together with the relevant experimental conditions, sample details, instrument characteristics and derived models. The quality of the deposited experimental data and the accuracy of the models obtained from SAS and complementary techniques are assessed by the site developers.

API small angle scattering free
TomoBank

X-Ray Tomography Data Bank

The X-ray Tomography Data Bank, or TomoBank, provides a repository of experimental and simulated data sets. Its aims are to foster collaboration among computational scientists, beamline scientists and experimentalists, to accelerate the development of tomographic reconstruction and 3D visualization methods, and to speed up their implementation in the data analysis software packages of the various synchrotron facilities.

CC-BY-4.0 with exceptions tomography free
wwPDB

Protein Data Bank

CoreTrustSeal

Since 1971, the Protein Data Bank archive (PDB) has served as the single repository of information about the 3D structures of proteins, nucleic acids, and complex assemblies.

CC0-1.0

PaN facilities repositories

Facility | Open data repository | OAI-PMH endpoint | PaN search API endpoint
--- | --- | --- | ---
ALBA | data.cells.es/... | Active (last check: 2024-12-22) | Datasets: 1 (last check: 2024-12-22)
Elettra | vuo.elettra.eu/pls/vuo/op... | Active (last check: 2024-12-22) | Datasets: 20 (last check: 2024-12-22)
ESRF | data.esrf.fr/... (CoreTrustSeal) | Active (last check: 2024-12-22) | Datasets: 418,636 (last check: 2024-12-22)
ESS | scicat.ess.eu/... | Active (last check: 2024-12-22) | Error (last check: 2024-12-22)
EuXFEL | in.xfel.eu/metadata/... | Active (last check: 2024-12-22) | Datasets: 100 (last check: 2024-12-22)
HZB | (none listed) | Active (last check: 2024-12-22) | (no status; last check not recorded)
HZDR | rodare.hzdr.de/... | Active (last check: 2024-12-22) | Datasets: 47 (last check: 2024-12-22)
ILL | data.ill.eu/... | Active (last check: 2024-12-22) | Error (last check: 2024-12-22)
ISIS | data.isis.stfc.ac.uk/data... | Error (last check: 2024-12-22) | Datasets: 165,267 (last check: 2024-12-22)
MAX IV | scicat.maxiv.lu.se/... | Error (last check: 2024-12-22) | Datasets: 100 (last check: 2024-12-22)
PSI | doi.psi.ch/... | Active (last check: 2024-12-22) | Datasets: 3,401 (last check: 2024-12-22)
SESAME | access.sesame.org.jo/get-... | (no status; last check not recorded) | (no status; last check not recorded)
SOLEIL | datacatalog.synchrotron-s... | Error (last check: 2024-12-22) | Error (last check: 2024-12-22)
Total datasets: 587,572
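
The OAI-PMH endpoints listed above follow a standard metadata harvesting protocol, so the same client logic should in principle work against any facility that exposes one. As a rough illustration, the Python sketch below builds a ListRecords request and reads record identifiers and the paging token from a response. The base URL (https://example.org/oai), the sample response, and the helper names list_records_url and parse_list_records are hypothetical and not taken from any facility; real endpoints may support different metadata formats, so treat this as a starting point rather than a tested harvester.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (helper name is hypothetical)."""
    if resumption_token:
        # When paging, resumptionToken is sent without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty resumptionToken element marks the last page.
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hand-written sample response, for illustration only (not real facility data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    ids, token = parse_list_records(SAMPLE_RESPONSE)
    print(ids, token)
```

To harvest a full repository, one would fetch the first URL, parse the response, and repeat with the returned token until it comes back empty.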