Demographic data
Demographic data, also called participant data, is stored in the BIDS participants files, participants.tsv and participants.json, which are located at the top level of a BIDS dataset. The ANC also supports Neurobagel annotations, which make your data findable based on participant characteristics via the Neurobagel query interface.
participants.tsv
The default participants.tsv file contains three columns: participant_id, age, and sex. These three variables are the minimum requirement. Adding further demographic columns for a more extensive subject description is recommended. Whenever data belonging to a new participant is added to a dataset, a new row should be added to this file.
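Adding a row for a new participant can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function name append_participant is hypothetical and not part of any BIDS or ANC tooling. Columns that the new participant has no value for are filled with n/a.

```python
import csv
import io


def append_participant(tsv_text, participant):
    """Return participants.tsv content with one extra row appended.

    `participant` maps column names to values; columns that are not
    given are written as n/a. Unknown column names raise ValueError.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    rows.append(participant)
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        delimiter="\t",
        restval="n/a",          # fill columns missing from `participant`
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

In practice the text would be read from and written back to the participants.tsv file at the top level of the dataset.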
The participants.tsv file has the following rules:
- Use a tab character as the column separator.
- Use lower case m, f and o for the sex values. According to BIDS these values refer to the phenotypical sex.
- Age should be an integer.
- Use 89 to indicate ages over 89 to prevent participant identification. Do not indicate age as 89+ or any other string value.
- Missing values are always indicated via n/a.
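The rules above can be checked automatically. The following is a minimal sketch, assuming the file content has already been read into a string; the function name check_participants_tsv and the wording of the messages are hypothetical, and the check is not a substitute for the BIDS validator.

```python
import csv
import io

ALLOWED_SEX = {"m", "f", "o"}
MISSING = "n/a"


def check_participants_tsv(text):
    """Return a list of rule violations found in participants.tsv content."""
    problems = []
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    # The three minimum columns must be present when splitting on tabs.
    for column in ("participant_id", "age", "sex"):
        if column not in (reader.fieldnames or []):
            problems.append(f"missing column: {column}")
    if problems:
        return problems
    for line_number, row in enumerate(reader, start=2):  # header is line 1
        age = row["age"] or ""
        sex = row["sex"] or ""
        # Age is an integer of at most 89, or n/a.
        if age != MISSING:
            if not (age.isascii() and age.isdigit()):
                problems.append(f"line {line_number}: age {age!r} is not an integer")
            elif int(age) > 89:
                problems.append(f"line {line_number}: age {age} should be recorded as 89")
        # Sex is lower-case m, f, or o, or n/a.
        if sex != MISSING and sex not in ALLOWED_SEX:
            problems.append(f"line {line_number}: sex {sex!r} is not one of m, f, o")
    return problems
```

An empty list means none of the checked rules were violated. A file separated by commas instead of tabs is reported as missing all three columns, because the header is then read as a single column name.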
Example
participant_id | age | sex | deafness | hearing_months |
---|---|---|---|---|
sub-450207866cba | 28 | m | no | n/a |
sub-9fb5e49bc1b9 | 28 | m | no | n/a |
sub-51d9a4b8abed | 38 | m | yes | n/a |
sub-fa7f791ac1f8 | 60 | m | yes | 3 |
sub-c06323af4171 | 19 | f | yes | n/a |
sub-29863894b750 | 55 | m | ci_pre | 24 |
sub-a415d9dc6d83 | 39 | m | no | n/a |
sub-09d43edda56a | 23 | f | no | n/a |
sub-503a5c6607c5 | 28 | m | no | n/a |
sub-a7cee54229be | 58 | m | yes | n/a |
participants.json
The default participants.json contains the description of the participants.tsv columns. The default annotations are in line with the annotations required for querying summarized participant-level information via Neurobagel.
Example
{
"participant_id": {
"Description": "A participant ID",
"Annotations": {
"IsAbout": {
"Label": "Subject Unique Identifier",
"TermURL": "nb:ParticipantID"
},
"Identifies": "participant"
}
},
"age": {
"Annotations": {
"IsAbout": {
"Label": "Age in years",
"TermURL": "nb:Age"
},
"Transformation": {
"Label": "integer value",
"TermURL": "nb:FromInt"
},
"MissingValues": ["n/a"]
},
"Description": "The age of the participant at data acquisition",
"Units": "years"
},
"sex": {
"Annotations": {
"IsAbout": {
"Label": "Sex",
"TermURL": "nb:Sex"
},
"Levels": {
"m": {
"TermURL": "snomed:248153007",
"Label": "Male"
},
"f": {
"TermURL": "snomed:248152002",
"Label": "Female"
},
"o": {
"TermURL": "snomed:32570681000036106",
"Label": "Other"
}
},
"MissingValues": ["n/a"]
},
"Description": "The biological gender assigned at birth"
},
"deafness": {
"Description": "Grouping we used in this study as we investigated different levels of audiovisual listening experience",
"Levels": {
"yes": "Congenitally deaf",
"ci_pre": "Did hear something but turned deaf (bilaterally or single-sided) at some point",
"no": "No hearing problems are reported"
},
"Annotations": {
"IsAbout": {
"Label": "Diagnosis",
"TermURL": "nb:Diagnosis"
},
"Levels": {
"yes": {
"TermURL": "snomed:95828007",
"Label": "Congenital deafness"
},
"ci_pre": {
"TermURL": "snomed:343087000",
"Label": "Partial deafness"
},
"no": {
"TermURL": "ncit:C94342",
"Label": "Healthy Control"
}
},
"MissingValues": ["n/a"]
}
},
"hearing_months": {
"Description": "Calculated months that participants were exposed to audiovisual listening experience",
"Units": "months"
}
}
age values that are not integers
The template assumes that the age column has integer values. If your age values are of a different type, this has to be indicated in the Transformation section of participants.json marked below, according to this table.
"age": {
"Annotations": {
"IsAbout": {
"Label": "Age",
"TermURL": "nb:Age"
},
"Transformation": {
"Label": "integer value", <-------
"TermURL": "nb:FromInt" <--------
},
"MissingValues": ["n/a"]
},
"Description": "The age of the participant at data acquisition",
"Units": "years"
}
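Whether the default integer transformation applies to a dataset can be checked before editing the JSON. The following is a minimal sketch; the function name age_values_are_integers is hypothetical and not part of BIDS or Neurobagel tooling, and it only tells you whether the default fits, not which other transformation to choose.

```python
def age_values_are_integers(values, missing="n/a"):
    """Return True if every non-missing age value is a non-negative integer string."""
    present = [value for value in values if value != missing]
    return all(value.isascii() and value.isdigit() for value in present)
```

If the function returns False for the age column, the Transformation entry has to be changed as described above.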
Additional columns
Your participants.tsv may contain additional columns describing your participants. In that case, participants.json has to be extended with descriptions of all additional columns.
Additional columns MUST NOT contain personal data.
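To confirm that every column has a description, the header of participants.tsv can be compared with the top-level keys of participants.json. The following is a minimal sketch, assuming both files have been read into strings; the function name undescribed_columns is hypothetical.

```python
import csv
import io
import json


def undescribed_columns(tsv_text, sidecar_text):
    """Return participants.tsv columns that have no entry in participants.json."""
    # The first tab-separated line holds the column names.
    header = next(csv.reader(io.StringIO(tsv_text), delimiter="\t"), [])
    sidecar = json.loads(sidecar_text)
    return [column for column in header if column not in sidecar]
```

An empty result means that every column in participants.tsv has a matching entry in participants.json. It does not check whether the descriptions themselves are complete.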
group column with the participant's diagnosis
An additional group column describes the participant's diagnosis. Use the annotation tool provided by Neurobagel to generate the column description. An example can be found in the categorical example below.
Continuous data columns
Description of an example column hearing_months with values in months:
"hearing_months": {
"Description": "Calculated months that participants were exposed to audiovisual listening experience",
"Units": "months"
}
Categorical data columns
Description of an example column deafness with three categorical values (yes, no, ci_pre):
"deafness": {
"Description": "Grouping we used in this study as we investigated different levels of audiovisual listening experience",
"Levels": {
"yes": "Congenitally deaf",
"ci_pre": "Did hear something but turned deaf (bilaterally or single-sided) at some point",
"no": "No hearing problems are reported"
},
"Annotations": {
"IsAbout": {
"Label": "Diagnosis",
"TermURL": "nb:Diagnosis"
},
"Levels": {
"yes": {
"TermURL": "snomed:95828007",
"Label": "Congenital deafness"
},
"ci_pre": {
"TermURL": "snomed:343087000",
"Label": "Partial deafness"
},
"no": {
"TermURL": "ncit:C94342",
"Label": "Healthy Control"
}
},
"MissingValues": ["n/a"]
}
}
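For a categorical column, each value used in participants.tsv should appear either as a key under Levels or in the MissingValues list. The following is a minimal sketch of such a check, assuming the column description has been loaded from participants.json into a dictionary; the function name undescribed_levels and the value ci_post in the usage example are hypothetical.

```python
def undescribed_levels(values, column_description):
    """Return the distinct column values that are neither a described
    level nor a declared missing value, in sorted order."""
    levels = column_description.get("Levels", {})
    missing = column_description.get("Annotations", {}).get("MissingValues", [])
    return sorted(
        {value for value in values if value not in levels and value not in missing}
    )


# Usage with a shortened version of the deafness description above.
deafness = {
    "Levels": {
        "yes": "Congenitally deaf",
        "ci_pre": "Did hear something but turned deaf at some point",
        "no": "No hearing problems are reported",
    },
    "Annotations": {"MissingValues": ["n/a"]},
}
unknown = undescribed_levels(["no", "yes", "ci_pre", "n/a", "ci_post"], deafness)
```

Here unknown contains only ci_post, which would need either a Levels entry or a correction in participants.tsv.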