Demographic data

Demographic data, or participant data, is stored in the BIDS participants files, specifically participants.tsv and participants.json, which sit at the top level of a BIDS dataset. The ANC also supports Neurobagel annotations, which make your data findable by participant characteristics via the Neurobagel query interface.

participants.tsv

The default participants.tsv file contains three columns: participant_id, age, and sex. These three variables are the minimum requirement. Adding further demographic columns for a more extensive description of the participants is recommended. Whenever data belonging to a new participant is added to a dataset, a new row should be added to this file.

The participants.tsv file must follow these rules:

  • Use the tab character as the column separator.
  • Use lower case m, f and o for the sex values. According to BIDS, these values refer to the phenotypical sex.
  • Age should be an integer.
  • Use 89 for any age of 89 or above to prevent participant identification. Do not write the age as 89+ or any other string value.
  • Always indicate missing values with n/a.
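The rules above can be checked automatically before upload. The following is a minimal sketch, not an official ANC or BIDS validator: the function name `check_participants_tsv` and the wording of the messages are assumptions made for illustration, and only the rules listed above are covered.

```python
import csv
import io

ALLOWED_SEX = {"m", "f", "o"}
MISSING = "n/a"
REQUIRED = ("participant_id", "age", "sex")


def check_participants_tsv(text):
    """Return a list of problems found in the content of a participants.tsv file."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames or []
    # A file that is not tab-separated ends up with one merged header field,
    # so the required columns are reported as missing.
    problems = [f"missing required column: {c}" for c in REQUIRED if c not in header]
    if problems:
        return problems
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        age = row["age"] or ""
        sex = row["sex"] or ""
        if age != MISSING:
            if not (age.isascii() and age.isdigit()):
                problems.append(f"line {line_no}: age {age!r} is not an integer")
            elif int(age) > 89:
                problems.append(f"line {line_no}: age {age} should be written as 89")
        if sex != MISSING and sex not in ALLOWED_SEX:
            problems.append(f"line {line_no}: sex {sex!r} is not one of m, f, o")
    return problems


example = "participant_id\tage\tsex\nsub-01\t89+\tM\nsub-02\t93\tf\n"
for problem in check_participants_tsv(example):
    print(problem)
```

An empty list means that none of the listed rules were violated. It does not replace a full BIDS validation.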

Example

participant_id age sex deafness hearing_months
sub-450207866cba 28 m no n/a
sub-9fb5e49bc1b9 28 m no n/a
sub-51d9a4b8abed 38 m yes n/a
sub-fa7f791ac1f8 60 m yes 3
sub-c06323af4171 19 f yes n/a
sub-29863894b750 55 m ci_pre 24
sub-a415d9dc6d83 39 m no n/a
sub-09d43edda56a 23 f no n/a
sub-503a5c6607c5 28 m no n/a
sub-a7cee54229be 58 m yes n/a

participants.json

The default participants.json describes the columns of participants.tsv. Its default annotations match those that Neurobagel requires for querying summarized participant-level information.

Example

    {
        "participant_id": {
            "Description": "A participant ID",
            "Annotations": {
                "IsAbout": {
                    "Label": "Subject Unique Identifier",
                    "TermURL": "nb:ParticipantID"
                },
                "Identifies": "participant"
            }
        },
        "age": {
            "Annotations": {
                "IsAbout": {
                    "Label": "Age in years",
                    "TermURL": "nb:Age"
                },
                "Transformation": {
                    "Label": "integer value",
                    "TermURL": "nb:FromInt"
                },
                "MissingValues": ["n/a"]
            },
            "Description": "The age of the participant at data acquisition",
            "Units": "years"
        },
        "sex": {
            "Annotations": {
                "IsAbout": {
                    "Label": "Sex",
                    "TermURL": "nb:Sex"
                },
                "Levels": {
                    "m": {
                        "TermURL": "snomed:248153007",
                        "Label": "Male"
                    },
                    "f": {
                        "TermURL": "snomed:248152002",
                        "Label": "Female"
                    },
                    "o": {
                        "TermURL": "snomed:32570681000036106",
                        "Label": "Other"
                    }
                },
                "MissingValues": ["n/a"]
            },
            "Description": "The biological gender assigned at birth"
        },
        "deafness": {
            "Description": "Grouping we used in this study as we investigated different levels of audiovisual listening experience",
            "Levels": {
                "yes": "Congenitally deaf",
                "ci_pre": "Did hear something but turned deaf (bilaterally or single-sided) at some point",
                "no": "No hearing problems are reported"
            },
            "Annotations": {
                "IsAbout": {
                    "Label": "Diagnosis",
                    "TermURL": "nb:Diagnosis"
                },
                "Levels": {
                    "yes": {
                        "TermURL": "snomed:95828007",
                        "Label": "Congenital deafness"
                    },
                    "ci_pre": {
                        "TermURL": "snomed:343087000",
                        "Label": "Partial deafness"
                    },
                    "no": {
                        "TermURL": "ncit:C94342",
                        "Label": "Healthy Control"
                    }
                },
                "MissingValues": ["n/a"]
            }
        },
        "hearing_months": {
            "Description": "Calculated months that participants were exposed to audiovisual listening experience",
            "Units": "months"
        }
    }

Age values other than integers

The template assumes that the age column contains integer values. If your age values are of a different type, indicate this in the Transformation section of the age entry in participants.json (marked below), using the values given in this table.

    "age": {
        "Annotations": {
            "IsAbout": {
                "Label": "Age",
                "TermURL": "nb:Age"
            },
            "Transformation": {
                "Label": "integer value", <-------
                "TermURL": "nb:FromInt" <--------
            },
            "MissingValues": ["n/a"]
        },
        "Description": "The age of the participant at data acquisition",
        "Units": "years"
    }
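Whether the default transformation still fits can be checked against the actual age values. The sketch below assumes the sidecar has already been loaded into a Python dictionary. The helper name `needs_other_transformation` is hypothetical, and the only transformation term it knows about is the `nb:FromInt` value shown above.

```python
def needs_other_transformation(sidecar, ages, missing="n/a"):
    """Return True if the sidecar declares integer ages but the data disagree.

    sidecar: participants.json content as a dictionary
    ages:    raw age strings as they appear in participants.tsv
    """
    term = sidecar["age"]["Annotations"]["Transformation"]["TermURL"]
    all_integers = all(
        a.isascii() and a.isdigit() for a in ages if a != missing
    )
    return term == "nb:FromInt" and not all_integers


sidecar = {
    "age": {
        "Annotations": {
            "Transformation": {"Label": "integer value", "TermURL": "nb:FromInt"}
        }
    }
}
print(needs_other_transformation(sidecar, ["28", "n/a", "60"]))
print(needs_other_transformation(sidecar, ["28.5", "60"]))
```

If the function returns True, the Label and TermURL of the Transformation entry need to be replaced with the values that match your age format.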

Additional columns

Your participants.tsv may contain additional columns describing your participants. In that case, participants.json has to be extended with a description of every additional column.

Additional columns MUST NOT contain personal data.
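The requirement that every column be described in participants.json can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the helper name `undescribed_columns` is hypothetical, and a column counts as described only if its entry has a `Description` key, spelled with a capital D as in the template above.

```python
import json


def undescribed_columns(header_line, sidecar_text):
    """Return the participants.tsv columns that lack a Description in participants.json.

    header_line:  first line of participants.tsv
    sidecar_text: full text of participants.json
    """
    columns = header_line.rstrip("\r\n").split("\t")
    sidecar = json.loads(sidecar_text)
    missing = []
    for column in columns:
        entry = sidecar.get(column)
        if not isinstance(entry, dict) or "Description" not in entry:
            missing.append(column)
    return missing


header = "participant_id\tage\tsex\tdeafness\n"
sidecar_text = json.dumps({
    "participant_id": {"Description": "A participant ID"},
    "age": {"Description": "The age of the participant at data acquisition"},
    "sex": {"Description": "Sex of the participant"},
})
print(undescribed_columns(header, sidecar_text))
```

Because JSON keys are case-sensitive, an entry that uses a lower-case `description` key is reported as undescribed.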

group column with patient's diagnosis

An additional group column describes the participant's diagnosis. Use the annotation tool provided by Neurobagel to generate the column description. See the categorical data example below.

Continuous data columns

Description of an example column hearing_months with values in months:

    "hearing_months": {
        "Description": "Calculated months that participants were exposed to audiovisual listening experience",
        "Units": "months"
    }
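An entry like this can also be generated rather than typed by hand. This helps keep the spelling consistent with the capitalized `Description` and `Units` keys defined by BIDS. The helper `continuous_column` below is a hypothetical name used only for this sketch.

```python
import json


def continuous_column(description, units):
    """Build the participants.json entry for a continuous column."""
    return {"Description": description, "Units": units}


entry = {
    "hearing_months": continuous_column(
        "Calculated months that participants were exposed to "
        "audiovisual listening experience",
        "months",
    )
}
# The printed fragment can be merged into participants.json.
print(json.dumps(entry, indent=4))
```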

Categorical data columns

Description of an example column deafness with three categorical values (yes, no, ci_pre):

        "deafness": {
            "Description": "Grouping we used in this study as we investigated different levels of audiovisual listening experience",
            "Levels": {
                "yes": "Congenitally deaf",
                "ci_pre": "Did hear something but turned deaf (bilaterally or single-sided) at some point",
                "no": "No hearing problems are reported"
            },
            "Annotations": {
                "IsAbout": {
                    "Label": "Diagnosis",
                    "TermURL": "nb:Diagnosis"
                },
                "Levels": {
                    "yes": {
                        "TermURL": "snomed:95828007",
                        "Label": "Congenital deafness"
                    },
                    "ci_pre": {
                        "TermURL": "snomed:343087000",
                        "Label": "Partial deafness"
                    },
                    "no": {
                        "TermURL": "ncit:C94342",
                        "Label": "Healthy Control"
                    }
                },
                "MissingValues": ["n/a"]
            }
        }
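A categorical entry has two places that must stay in sync with the data: the plain `Levels` and the `Levels` inside `Annotations`. The sketch below compares both against the values that occur in the column. The function name `check_categorical` is an assumption made for illustration and is not part of any Neurobagel or BIDS tool.

```python
def check_categorical(values, entry, missing="n/a"):
    """Return a list of inconsistencies between a column and its description.

    values: raw strings from the column in participants.tsv
    entry:  the column's participants.json entry as a dictionary
    """
    problems = []
    levels = set(entry.get("Levels", {}))
    annotated = set(entry.get("Annotations", {}).get("Levels", {}))
    if levels != annotated:
        differing = sorted(levels ^ annotated)
        problems.append(f"levels described but not annotated, or vice versa: {differing}")
    unknown = sorted(set(values) - levels - {missing})
    if unknown:
        problems.append(f"values without a level description: {unknown}")
    return problems


entry = {
    "Levels": {"yes": "Congenitally deaf", "no": "No hearing problems are reported"},
    "Annotations": {"Levels": {"yes": {}, "no": {}}},
}
print(check_categorical(["yes", "no", "ci_pre", "n/a"], entry))
```

In this usage example the value ci_pre occurs in the column but is not described, so a single problem is reported.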