Metadata Specification
History
1.0.0
Previous versions can be found in the Agreement history.
Sharing metadata is a critical practice in scientific research and data management. High-quality metadata facilitates the understanding and proper utilization of datasets, enhancing the reproducibility of scientific results and fostering collaboration across disciplines. To uphold these principles, Austrian NeuroCloud adheres to the FAIR (Findable, Accessible, Interoperable, Reusable) data management guidelines. By following the FAIR principles, we ensure that the metadata we share meets the highest standards of data stewardship, promoting greater transparency and efficiency in research.
The metadata specification is based on the Brain Imaging Data Structure (BIDS), the only data format currently supported in the Austrian NeuroCloud. BIDS is a standardized format that organizes and describes neuroimaging and related data, enabling seamless sharing and analysis. By adopting the BIDS format, we ensure compatibility with a wide range of tools and platforms used in the neuroimaging community, further enhancing the utility and impact of our shared metadata.
We are committed to protecting the privacy of individuals by ensuring that no personal data is ever shared or published. All metadata shared will be anonymized and stripped of any personally identifiable information, in compliance with ethical standards and legal regulations. This commitment safeguards the privacy of research participants while allowing the scientific community to benefit from shared data.
Additionally, we will only share metadata that does not reveal any specific scientific contributions, thereby securing the intellectual property of the scientists.
Metadata sources
The metadata is categorized into four different levels: dataset, participant, measurement, and event. For each of these levels, specific files in a BIDS-formatted dataset are listed and used as the source of the metadata. We do not publish the files as they are, and in most cases we distribute only a summary of the information they contain. We use POSIX syntax and glob patterns to specify the files and their paths relative to the BIDS dataset root directory (./).
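As an illustration of how the glob patterns in this specification can be resolved against a dataset, the following is a minimal Python sketch, not the Austrian NeuroCloud implementation. The patterns are copied from the level tables in this section; the names `SOURCE_PATTERNS` and `find_sources` are hypothetical and introduced only for this example.

```python
from pathlib import Path

# File patterns copied from the level tables of this specification.
# The dictionary name and layout are assumptions made for this sketch.
SOURCE_PATTERNS = {
    "dataset": ["./README.md", "./CITATION.cff", "./dataset_description.json"],
    "participant": ["./participants.tsv", "./participants.json"],
    "measurement": [
        "./sub-*/*sessions.tsv",
        "./sub-*/ses-*/*scans.tsv",
        "./sub-*/ses-*/*/*.json",
    ],
    "event": ["./sub-*/ses-*/*/*events.tsv", "./*events.json"],
}


def find_sources(bids_root):
    """Return {level: sorted list of matching file paths relative to bids_root}."""
    root = Path(bids_root)
    found = {}
    for level, patterns in SOURCE_PATTERNS.items():
        matches = set()
        for pattern in patterns:
            # Drop the leading "./" so the pattern is relative to the root.
            relative = pattern[2:] if pattern.startswith("./") else pattern
            for path in root.glob(relative):
                if path.is_file():
                    matches.add(path.relative_to(root).as_posix())
        found[level] = sorted(matches)
    return found
```

Calling `find_sources("/path/to/dataset")` would return the source files per level. Note that `*` does not cross directory separators here, so each pattern only matches at the directory depth it spells out.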
Dataset level
The metadata provides a comprehensive description of the dataset as a whole, including its name, authors, identifiers, and related publications.
File pattern | Metadata definition |
---|---|
./README.md | Entire content of the file. |
./CITATION.cff | Entire content of the file. |
./dataset_description.json | Entire content of the file. |
Participant level
The metadata provides basic information about the participant's demographics and completed assessment tools. Participant level metadata is published only in summarized form and never disclosed for individual participants. Data of individual participants may be stored in a database, but the query results reveal only summarized information about participants.
File pattern | Metadata definition |
---|---|
./participants.tsv | Columns and their values representing demographic data of the participants, including but not limited to age, sex, and diagnosis. Columns representing the availability and score of a completed assessment tool. |
./participants.json | Definitions of the columns representing demographic data and the availability of a completed assessment tool. |
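To make the idea of publishing only summarized participant metadata concrete, here is a minimal Python sketch that reduces a participants.tsv to aggregate values. It assumes the columns are named `age` and `sex` and that missing values are written as `n/a`, as is conventional in BIDS. The function name `summarize_participants` and the shape of the returned summary are hypothetical and do not describe the actual ANC pipeline.

```python
import csv
import io
from collections import Counter
from statistics import mean


def summarize_participants(tsv_text):
    """Aggregate a participants.tsv so that no individual row is exposed."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

    ages = []
    for row in rows:
        value = (row.get("age") or "").strip()
        try:
            ages.append(float(value))
        except ValueError:
            pass  # skip "n/a", empty, or otherwise non-numeric ages

    sexes = Counter(
        row["sex"].strip()
        for row in rows
        if row.get("sex") and row["sex"].strip() != "n/a"
    )

    return {
        "n_participants": len(rows),
        "age": {"min": min(ages), "max": max(ages), "mean": mean(ages)} if ages else None,
        "sex_counts": dict(sexes),
    }
```

Only counts and ranges leave the function, and participant identifiers are never copied into the result. A production system would likely also suppress very small groups, which this sketch does not attempt.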
Measurement level
The metadata provides information about the data modalities available in a dataset, their specific acquisition parameters, and the number of acquisitions within and between multiple sessions. Acquisition times and dates are not published.
File pattern | Metadata definition |
---|---|
./sub-*/*sessions.tsv | The number of different acquisition sessions. |
./sub-*/ses-*/*scans.tsv | Data modalities within a session. |
./sub-*/ses-*/*/*.json | Acquisition parameters of any modality. |
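The following Python sketch shows one way the data modalities of a session could be derived from a scans.tsv without touching acquisition times. It assumes the BIDS `filename` column holds paths such as `func/sub-01_ses-01_task-rest_bold.nii.gz`, so that the first path component names the data type. The function name `summarize_scans` is hypothetical.

```python
import csv
import io
from collections import Counter


def summarize_scans(scans_tsv_text):
    """Count acquisitions per data type directory listed in a scans.tsv."""
    reader = csv.DictReader(io.StringIO(scans_tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        # "func/sub-01_ses-01_task-rest_bold.nii.gz" -> "func"
        datatype = row["filename"].split("/", 1)[0]
        counts[datatype] += 1
    # Only the filename column is read; acq_time is never copied.
    return dict(counts)
```

Because the summary is built from the `filename` column alone, any `acq_time` values present in the file stay out of the published metadata.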
Event level
The metadata lists and describes the events that occurred during the data acquisition, in particular the stimuli to which the subjects were exposed. This metadata is essential for semantic interoperability of the datasets, especially for finding similar experiments. This metadata will be published in a summarized form, and we will not redistribute the entire event logs.
File pattern | Metadata definition |
---|---|
./sub-*/ses-*/*/*events.tsv | Stimuli and subject actions. |
./*events.json | Descriptions of events and their HED annotations. |
Dissemination channels
Metadata is disseminated through several channels, as outlined in the following table. These channels are designed to ensure broad and effective access to the metadata and the underlying datasets, and to enhance their findability within the scientific community.
Channel | Description | Metadata levels |
---|---|---|
DOI registration | The Austrian NeuroCloud uses DataCite as its DOI registration service via the Library of the Paris Lodron University of Salzburg. Certain minimal dataset-level metadata is required for registering a DOI with DataCite. All applicable dataset-level metadata will be shared with DataCite. | Dataset |
DataCite Commons | DataCite Commons exposes the metadata of the DOIs registered with DataCite for querying. The metadata of each DOI registered with the Austrian NeuroCloud will be exposed in DataCite Commons. | Dataset |
Dataset website | Each dataset has an automatically generated website. This website is associated with the dataset DOI. The metadata is rendered on the website and embedded in its source code, complying with the FAIR assessment guidelines and ensuring a high FAIRness score. | Dataset, Participant, Measurement |
ANC querying interface (work in progress) | To increase findability, the metadata will be stored in a database and exposed in a custom-built ANC querying interface. | All levels |
Neurobagel node | To enable interoperability with other datasets at a participant level, the metadata will be stored in a database and exposed for querying using Neurobagel. | Dataset, Participant |