Neurobagel Integration

Purpose

The Neurobagel integration generates .jsonld files from participants.tsv and participants.json files across all ANC dataset repositories. These files are consumed by the Neurobagel service hosted for the ANC, enabling participant-level federated querying across datasets.

Source code: neurobagel_jsonld_generation

Staying current

We monitor developments in the Neurobagel ecosystem via:

  • Repository notifications on the Neurobagel GitHub organisation
  • Direct contact with the core developers: Sebastian Urchs, JB Poline, Alyssa Dai, Arman Jahanmanpour

When the Neurobagel data model or CLI changes, update the pipeline and verify that existing participants.json files remain compatible.

How it works

A scheduled CI/CD pipeline runs once a day:

  1. Uses the python-gitlab API to connect to the ANC data repository
  2. Clones all dataset repositories in the bids-datasets group without fetching Git LFS files
  3. Crawls each repository for a participants.tsv and a matching participants.json
  4. If both files exist, applies the Neurobagel CLI (bagel pheno) to generate .jsonld files
  5. Exposes the resulting .jsonld files as zipped pipeline artefacts
  6. The Ansible playbook hosting the Neurobagel service downloads the artefacts and updates the service
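
Steps 1 to 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from neurobagel_jsonld_generation: the function names are hypothetical, the participant files are assumed to sit at the root of each repository (the BIDS convention), each output file is named after its repository directory, and the bagel pheno flags shown should be checked against `bagel pheno --help` for the installed CLI version.

```python
import os
import subprocess
from pathlib import Path


def list_dataset_repo_urls(gitlab_url, token, group_path="bids-datasets"):
    """Step 1: list clone URLs for every project in the group (hypothetical helper)."""
    import gitlab  # python-gitlab, imported lazily so the rest runs without it

    gl = gitlab.Gitlab(gitlab_url, private_token=token)
    group = gl.groups.get(group_path)
    # get_all=True disables pagination in recent python-gitlab releases.
    return [p.http_url_to_repo for p in group.projects.list(get_all=True)]


def clone_without_lfs(repo_url, dest):
    """Step 2: clone while leaving Git LFS pointers unresolved."""
    # GIT_LFS_SKIP_SMUDGE=1 stops git-lfs from downloading objects on checkout.
    env = {**os.environ, "GIT_LFS_SKIP_SMUDGE": "1"}
    subprocess.run(["git", "clone", repo_url, str(dest)], check=True, env=env)


def find_participant_files(repo_dir):
    """Step 3: return (tsv, json) only if both exist at the repository root."""
    tsv = repo_dir / "participants.tsv"
    dictionary = repo_dir / "participants.json"
    if tsv.is_file() and dictionary.is_file():
        return tsv, dictionary
    return None


def bagel_pheno_command(tsv, dictionary, name, output):
    """Step 4: build the CLI call. Flag names are an assumption, verify locally."""
    return [
        "bagel", "pheno",
        "--pheno", str(tsv),
        "--dictionary", str(dictionary),
        "--name", name,
        "--output", str(output),
    ]


def process_repositories(clone_root, out_dir, run=subprocess.run):
    """Run bagel pheno for every cloned repository that has both files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    generated = []
    for repo_dir in sorted(p for p in clone_root.iterdir() if p.is_dir()):
        found = find_participant_files(repo_dir)
        if found is None:
            continue  # skip repositories missing either file
        output = out_dir / f"{repo_dir.name}.jsonld"
        run(bagel_pheno_command(*found, repo_dir.name, output), check=True)
        generated.append(output)
    return generated
```

Passing `run` as a parameter makes it possible to dry-run the crawl without the Neurobagel CLI installed, for example by supplying a function that only records the commands it receives.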

Key assumption: participants.tsv and participants.json are complete and conform to the Neurobagel data model.
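
One inexpensive way to catch an obvious violation of this assumption before the CLI runs is to confirm that every column in participants.tsv has an entry in participants.json. The sketch below is only an illustration with a hypothetical function name. It checks that the keys are present, and does not validate the annotations against the Neurobagel data model.

```python
import csv
import json
from pathlib import Path


def undocumented_columns(tsv_path, dictionary_path):
    """Return TSV column names that have no entry in the data dictionary."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # Only the header row is needed; an empty file yields no columns.
        header = next(csv.reader(handle, delimiter="\t"), [])
    with open(dictionary_path, encoding="utf-8") as handle:
        dictionary = json.load(handle)
    return [column for column in header if column not in dictionary]
```

An empty result means every column is described. A non-empty result lists the columns a dataset maintainer would need to add to participants.json.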