Neurobagel Integration
Purpose
The Neurobagel integration generates .jsonld files from participants.tsv and participants.json files across all ANC dataset repositories. These files are consumed by the Neurobagel service hosted for the ANC, enabling participant-level federated querying across datasets.
Source code: neurobagel_jsonld_generation
Staying current
We monitor developments in the Neurobagel ecosystem via:
- Repository notifications on the Neurobagel GitHub organisation
- Direct contact with the core developers: Sebastian Urchs, JB Poline, Alyssa Dai, Arman Jahanmanpour
When the Neurobagel data model or CLI changes, update the pipeline and verify that existing participants.json files remain compatible.
How it works
A scheduled CI/CD pipeline runs once a day:
- Uses the python-gitlab API to connect to the ANC data repository
- Clones all dataset repositories in the bids-datasets group, without fetching Git LFS files
- Crawls each repository for a participants.tsv and a matching participants.json
- If both files exist, applies the Neurobagel CLI (bagel pheno) to generate .jsonld files
- Exposes the resulting .jsonld files as zipped pipeline artefacts
- The Ansible playbook hosting the Neurobagel service downloads the artefacts and updates the service
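The pipeline steps above can be sketched roughly as follows. This is an illustrative outline, not the code in neurobagel_jsonld_generation: the function names are hypothetical, the shallow clone (--depth 1) is an assumption, and the bagel pheno flags shown (--pheno, --dictionary, --output, --name) should be checked against the installed bagel-cli version, since the CLI changes between releases.

```python
import os
from pathlib import Path


def list_group_repo_urls(gitlab_url, token, group_path="bids-datasets"):
    """Return HTTPS clone URLs for every project in a GitLab group."""
    # python-gitlab is imported lazily so the helpers below work without it.
    import gitlab

    gl = gitlab.Gitlab(gitlab_url, private_token=token)
    group = gl.groups.get(group_path)
    return [project.http_url_to_repo for project in group.projects.list(get_all=True)]


def clone_without_lfs(repo_url, dest):
    """Build a clone command and environment that skip Git LFS downloads."""
    # GIT_LFS_SKIP_SMUDGE=1 leaves LFS pointer files in place instead of
    # downloading the large objects. The shallow clone is an assumption.
    env = dict(os.environ, GIT_LFS_SKIP_SMUDGE="1")
    return ["git", "clone", "--depth", "1", repo_url, str(dest)], env


def find_participant_files(repo_dir):
    """Return (tsv, json) for the first participants.tsv with a sibling
    participants.json, or None if no such pair exists."""
    for tsv in sorted(Path(repo_dir).rglob("participants.tsv")):
        dictionary = tsv.with_suffix(".json")
        if dictionary.is_file():
            return tsv, dictionary
    return None


def bagel_pheno_command(tsv, dictionary, output, dataset_name):
    """Build the bagel pheno invocation (flags assumed, see lead-in)."""
    return [
        "bagel", "pheno",
        "--pheno", str(tsv),
        "--dictionary", str(dictionary),
        "--output", str(output),
        "--name", dataset_name,
    ]
```

Each command list can be passed to subprocess.run(cmd, env=env, check=True), so a failing clone or bagel run stops the job rather than silently producing an incomplete artefact. Repositories for which find_participant_files returns None are simply skipped.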
Key assumption: participants.tsv and participants.json are complete and conform to the Neurobagel data model.
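Because the pipeline relies on this assumption, a lightweight pre-check can flag obvious mismatches before bagel pheno runs. The sketch below is hypothetical and deliberately shallow: it assumes that annotated columns in participants.json carry an "Annotations" key, as produced by the Neurobagel annotation tool, and it does not replace the validation that bagel pheno itself performs.

```python
import csv
import json
from pathlib import Path


def check_participant_files(tsv_path, json_path):
    """Return a list of human-readable problems; an empty list means the
    basic checks passed."""
    problems = []

    # Read only the header row of the TSV.
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle, delimiter="\t"), [])
    if "participant_id" not in header:
        problems.append("participants.tsv has no participant_id column")

    try:
        dictionary = json.loads(Path(json_path).read_text(encoding="utf-8"))
    except json.JSONDecodeError as error:
        return problems + [f"participants.json is not valid JSON: {error}"]
    if not isinstance(dictionary, dict):
        return problems + ["participants.json is not a JSON object"]

    # Assumption: Neurobagel annotations live under an "Annotations" key.
    annotated = [
        column for column, entry in dictionary.items()
        if isinstance(entry, dict) and "Annotations" in entry
    ]
    if not annotated:
        problems.append("participants.json contains no annotated columns")
    for column in annotated:
        if column not in header:
            problems.append(
                f"annotated column {column!r} is missing from participants.tsv"
            )
    return problems
```

Logging the returned problems per dataset makes it easier to tell an incomplete data dictionary apart from a genuine CLI failure.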