
Metadata Specification

History

Current version: 1.0.0. Previous versions can be found in the Agreement history.

Sharing metadata is a critical practice in scientific research and data management. High-quality metadata helps others understand and correctly use datasets, improves the reproducibility of scientific results, and fosters collaboration across disciplines. To uphold these principles, the Austrian NeuroCloud (ANC) adheres to the FAIR (Findable, Accessible, Interoperable, Reusable) data management guidelines. Following the FAIR principles keeps the metadata we share findable and reusable, and promotes greater transparency and efficiency in research.

The metadata specification is based on the Brain Imaging Data Structure (BIDS), the only data format currently supported in the Austrian NeuroCloud. BIDS is a standardized format that organizes and describes neuroimaging and related data, enabling seamless sharing and analysis. By adopting the BIDS format, we ensure compatibility with a wide range of tools and platforms used in the neuroimaging community, further enhancing the utility and impact of our shared metadata.

We are committed to protecting the privacy of individuals by ensuring that no personal data is ever shared or published. All metadata shared will be anonymized and stripped of any personally identifiable information, in compliance with ethical standards and legal regulations. This commitment safeguards the privacy of research participants while allowing the scientific community to benefit from shared data.

Additionally, we will only share metadata that does not reveal any specific scientific contributions, thereby securing the intellectual property of the scientists.

Metadata sources

The metadata is categorized into four different levels: dataset, participant, measurement, and event. For each of these levels, specific files in a BIDS-formatted dataset are listed and used as the source of the metadata. We do not publish the files as they are, and in most cases we distribute only a summary of the information they contain. We use POSIX syntax and glob patterns to specify the files and their paths relative to the BIDS dataset root directory ./.
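As a rough illustration of how the file patterns listed in the tables below could be resolved against a dataset, here is a minimal Python sketch. It is not the ANC implementation: the dictionary `METADATA_PATTERNS` and the function `find_metadata_sources` are hypothetical names introduced for this example, and the leading `./` of each pattern is dropped because `pathlib.Path.glob` expects patterns relative to the directory it is called on.

```python
from pathlib import Path

# File patterns copied from the tables in this specification, grouped by
# metadata level. The leading "./" is omitted: each pattern is interpreted
# relative to the BIDS dataset root. (Hypothetical name, for illustration.)
METADATA_PATTERNS = {
    "dataset": ["README.md", "CITATION.cff", "dataset_description.json"],
    "participant": ["participants.tsv", "participants.json"],
    "measurement": [
        "sub-*/*sessions.tsv",
        "sub-*/ses-*/*scans.tsv",
        "sub-*/ses-*/*/*.json",
    ],
    "event": ["sub-*/ses-*/*/*events.tsv", "*events.json"],
}


def find_metadata_sources(bids_root):
    """Return {level: sorted POSIX paths relative to bids_root}."""
    root = Path(bids_root)
    found = {}
    for level, patterns in METADATA_PATTERNS.items():
        matches = set()
        for pattern in patterns:
            for path in root.glob(pattern):
                # Keep regular files only; directories are not metadata sources.
                if path.is_file():
                    matches.add(path.relative_to(root).as_posix())
        found[level] = sorted(matches)
    return found
```

Calling `find_metadata_sources("/path/to/dataset")` returns one list of relative paths per level. Note that a broad pattern such as `sub-*/ses-*/*/*.json` also matches JSON sidecars of event files located in the same directories, so a file can appear under more than one level.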

Dataset level

The metadata provides a comprehensive description of the dataset as a whole, including its name, authors, identifiers, and related publications.

| File pattern | Metadata definition |
| --- | --- |
| `./README.md` | Entire content of the file. |
| `./CITATION.cff` | Entire content of the file. |
| `./dataset_description.json` | Entire content of the file. |

Participant level

The metadata provides basic information about the participants' demographics and completed assessment tools. Participant-level metadata is published only in summarized form and is never disclosed for individual participants. Data of individual participants may be stored in a database, but query results reveal only summarized information about participants.

| File pattern | Metadata definition |
| --- | --- |
| `./participants.tsv` | Columns and their values representing demographic data of the participants, including but not limited to age, sex, and diagnosis. Columns representing the availability and a score of a completed assessment tool. |
| `./participants.json` | Definitions of the columns representing demographic data and the availability of a completed assessment tool. |
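To make "summarized form" more concrete, the following Python sketch reduces the content of a `participants.tsv` file to per-column aggregates. The specification does not state how summaries are computed, so the aggregation rules here (value counts for categorical columns; count, minimum, maximum, and mean for numeric columns) are assumptions, and `summarize_participants` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def summarize_participants(tsv_text):
    """Aggregate participants.tsv content column by column.

    Assumed rules (not taken from the specification): numeric columns are
    reduced to n/min/max/mean, all other columns to value counts. The
    participant_id column is dropped so no row can be traced to a person.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    columns = {}
    for row in reader:
        for name, value in row.items():
            if name == "participant_id":
                continue
            columns.setdefault(name, []).append(value)

    summary = {}
    for name, values in columns.items():
        # "n/a" is the BIDS convention for missing values.
        present = [v for v in values if v not in (None, "", "n/a")]
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = None  # at least one non-numeric value: treat as categorical
        if numbers:
            summary[name] = {
                "n": len(numbers),
                "min": min(numbers),
                "max": max(numbers),
                "mean": sum(numbers) / len(numbers),
            }
        else:
            summary[name] = dict(Counter(present))
    return summary
```

For a file with three participants aged 20, 30, and `n/a`, the `age` entry of the result would contain `n = 2` and a mean of 25.0, while a `sex` column would be reported as counts per category.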

Measurement level

The metadata provides information about the data modalities available in a dataset, their specific acquisition parameters, and the number of acquisitions within and between multiple sessions. Acquisition times and dates are not published.

| File pattern | Metadata definition |
| --- | --- |
| `./sub-*/*sessions.tsv` | The number of different acquisition sessions. |
| `./sub-*/ses-*/*scans.tsv` | Data modalities within a session. |
| `./sub-*/ses-*/*/*.json` | Acquisition parameters of any modality. |
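The sketch below illustrates one way the session count and the data modalities could be derived from `*sessions.tsv` and `*scans.tsv` without touching acquisition times. It assumes the standard BIDS layout of these files: one row per session in `*sessions.tsv`, and a `filename` column in `*scans.tsv` whose first path component is the data type directory (for example `func` or `anat`), which is used here as a stand-in for the modality. The function names are hypothetical.

```python
import csv
import io
from collections import Counter
from pathlib import PurePosixPath


def count_sessions(sessions_tsv_text):
    """Return the number of sessions (data rows) in a *sessions.tsv file."""
    reader = csv.DictReader(io.StringIO(sessions_tsv_text), delimiter="\t")
    return sum(1 for _ in reader)


def modalities_in_scans(scans_tsv_text):
    """Count acquisitions per data type directory in a *scans.tsv file.

    Only the filename column is read. The acq_time column, if present, is
    deliberately ignored because acquisition times are not published.
    """
    reader = csv.DictReader(io.StringIO(scans_tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        # e.g. "func/sub-01_ses-01_task-rest_bold.nii.gz" -> "func"
        datatype = PurePosixPath(row["filename"]).parts[0]
        counts[datatype] += 1
    return dict(counts)
```

Applied to every subject and session, these two helpers yield the number of acquisitions within and between sessions that this level describes.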

Event level

The metadata lists and describes the events that occurred during the data acquisition, in particular the stimuli to which the subjects were exposed. This metadata is essential for semantic interoperability of the datasets, especially for finding similar experiments. This metadata will be published in a summarized form, and we will not redistribute the entire event logs.

| File pattern | Metadata definition |
| --- | --- |
| `./sub-*/ses-*/*/*events.tsv` | Stimuli and subject actions. |
| `./*events.json` | Descriptions of events and their HED annotations. |
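As an illustration of what a summarized event description might look like, the following Python sketch counts the distinct `trial_type` values of one events file and attaches the description and HED annotation found in the JSON sidecar, while discarding onsets and durations. It assumes the common BIDS sidecar layout in which the `trial_type` entry carries `Levels` and `HED` dictionaries keyed by event label; the function name and the output structure are hypothetical and not prescribed by this specification.

```python
import csv
import io
import json
from collections import Counter


def summarize_events(events_tsv_text, events_json_text):
    """Summarize one events file as a list of per-label records.

    Onset and duration columns are not copied into the result, so the
    full event log cannot be reconstructed from the summary.
    """
    sidecar = json.loads(events_json_text).get("trial_type", {})
    levels = sidecar.get("Levels", {})
    hed = sidecar.get("HED", {})
    if not isinstance(hed, dict):
        hed = {}  # a single HED string applies to the whole column, not a label

    reader = csv.DictReader(io.StringIO(events_tsv_text), delimiter="\t")
    # Rows without a trial_type value are grouped under "n/a".
    counts = Counter(row.get("trial_type") or "n/a" for row in reader)

    return [
        {
            "trial_type": label,
            "count": count,
            "description": levels.get(label),
            "HED": hed.get(label),
        }
        for label, count in sorted(counts.items())
    ]
```

Each record lists an event label, how often it occurred, and its annotation, which is the kind of information needed to find datasets with similar experiments.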

Dissemination channels

Metadata is disseminated through several channels, as outlined in the following table. These channels are designed to ensure broad and effective access to the metadata and the underlying datasets, and to enhance their findability within the scientific community.

| Channel | Description | Metadata levels |
| --- | --- | --- |
| DOI registration | The Austrian NeuroCloud uses DataCite as its DOI registration service via the Library of the Paris Lodron University of Salzburg. Certain minimal dataset-level metadata is required for registering a DOI with DataCite. All applicable dataset-level metadata will be shared with DataCite. | Dataset |
| DataCite Commons | DataCite Commons exposes the metadata of the DOIs registered with DataCite for querying. The metadata of each DOI registered with the Austrian NeuroCloud will be exposed in DataCite Commons. | Dataset |
| Dataset website | Each dataset has an automatically generated website. This website is associated with the dataset DOI. The metadata is rendered on the website and embedded in its source code, complying with the FAIR assessment guidelines and ensuring a high FAIRness score. | Dataset, Participant, Measurement |
| ANC querying interface (work in progress) | To increase findability, the metadata will be stored in a database and exposed in a custom-built ANC querying interface. | All levels |
| Neurobagel node | To enable interoperability with other datasets at a participant level, the metadata will be stored in a database and exposed for querying using Neurobagel. | Dataset, Participant |