Scientific Knowledge Graph

The ANC Knowledge Graph provides a way to query metadata of ANC datasets using GraphQL.

The metadata is stored in a Neo4j graph database and exposed through a GraphQL API.

This allows you to:

  • explore datasets and their relationships
  • query authors and affiliations
  • access datatype metadata
  • access HED metadata
  • find datasets based on HED tags

What this tool provides

The API represents ANC metadata as a graph structure.

Main entities:

  • Dataset — dataset in the ANC
  • Author — dataset authors
  • Affiliation — author institutions
  • DatatypeItem — datatype metadata
  • HedItem — HED annotations

Main relationships:

  • Dataset → Authors
  • Author → Affiliation
  • Dataset → DatatypeItems
  • Dataset → HedItems
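
For client code, the entities and relationships above can be mirrored as plain data types. This is only an illustrative sketch: the field names are taken from the example queries on this page, while the Python types (strings, lists, optional values) and the one-affiliation-per-author shape are assumptions, not the published schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Affiliation:
    name: Optional[str] = None
    ror: Optional[str] = None


@dataclass
class Author:
    given_names: Optional[str] = None
    family_names: Optional[str] = None
    orcid: Optional[str] = None
    # Author -> Affiliation (assumed to be a single object)
    affiliation: Optional[Affiliation] = None


@dataclass
class DatatypeItem:
    datatypes: List[str] = field(default_factory=list)  # assumed list of strings


@dataclass
class HedItem:
    hed_tags: List[str] = field(default_factory=list)  # assumed list of strings


@dataclass
class Dataset:
    title: Optional[str] = None
    dataset_id: Optional[str] = None
    doi: Optional[str] = None
    access_status: Optional[str] = None
    authors: List[Author] = field(default_factory=list)  # Dataset -> Authors
    datatype_items: List[DatatypeItem] = field(default_factory=list)
    hed_items: List[HedItem] = field(default_factory=list)


# Hypothetical values, for illustration only (not real ANC data).
example = Dataset(
    title="Example dataset",
    dataset_id="example-id",
    authors=[
        Author(
            given_names="Ada",
            family_names="Example",
            affiliation=Affiliation(name="Example University"),
        )
    ],
)
print(example.authors[0].affiliation.name)
```

Every field defaults to empty because a GraphQL query only returns the fields it asks for.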

Querying the Knowledge Graph

You can query the data using GraphQL.

For example, the following query lists every dataset with its title, identifier, DOI, and access status:

query Datasets {
  datasets {
    title
    datasetId
    doi
    accessStatus
  }
}
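
To run a query like the one above from a script, a GraphQL request is normally sent as an HTTP POST with a JSON body containing a `query` field. The sketch below uses only the Python standard library. The endpoint URL is a placeholder, because this page does not state it, and the helper names `build_payload` and `run_query` are illustrative.

```python
import json
import urllib.request

# Placeholder: replace with the GraphQL endpoint of your deployment.
GRAPHQL_URL = "https://example.org/graphql"

DATASETS_QUERY = """
query Datasets {
  datasets {
    title
    datasetId
    doi
    accessStatus
  }
}
"""


def build_payload(query, variables=None):
    """Encode a GraphQL query as the JSON body of a POST request."""
    body = {"query": query}
    if variables:
        body["variables"] = variables
    return json.dumps(body).encode("utf-8")


def run_query(query, variables=None, url=GRAPHQL_URL, timeout=30):
    """POST the query and return the `data` part of the response."""
    request = urllib.request.Request(
        url,
        data=build_payload(query, variables),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    # GraphQL servers report query errors in the response body.
    if result.get("errors"):
        raise RuntimeError(result["errors"])
    return result["data"]
```

With a reachable endpoint, `run_query(DATASETS_QUERY)["datasets"]` would return a list of dictionaries, one per dataset, containing only the requested fields.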

Common use cases

1. Explore datasets

query {
  datasets {
    title
    datasetId
  }
}

2. Get authors of a dataset

query {
  datasets {
    title
    authors {
      givenNames
      familyNames
      orcid
    }
  }
}

3. Get affiliations

query {
  datasets {
    title
    authors {
      affiliation {
        name
        ror
      }
    }
  }
}

4. Query datatype metadata

query {
  datasets {
    title
    datatypeItems {
      datatypes
    }
  }
}

5. Query HED metadata

query {
  datasets {
    title
    hedItems {
      hedTags
    }
  }
}

6. Find datasets by HED tag

query {
  datasetsByHed(hedId: "YOUR_HED_ID") {
    title
    datasetId
    hedItems {
      hedTags
    }
  }
}
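
When the HED identifier comes from a script rather than being typed by hand, it should be quoted safely before it is placed in the query. A minimal sketch in Python: for ordinary text such as identifiers, `json.dumps` produces a double-quoted, escaped string that is also a valid GraphQL string literal, so the sketch does not need to assume how the `hedId` argument is typed in the schema. The function name `datasets_by_hed_query` is hypothetical.

```python
import json


def datasets_by_hed_query(hed_id):
    """Return a datasetsByHed query with hed_id embedded as a quoted literal."""
    # json.dumps adds the surrounding quotes and escapes quotes and backslashes.
    literal = json.dumps(hed_id)
    return (
        "query {\n"
        f"  datasetsByHed(hedId: {literal}) {{\n"
        "    title\n"
        "    datasetId\n"
        "    hedItems {\n"
        "      hedTags\n"
        "    }\n"
        "  }\n"
        "}\n"
    )


print(datasets_by_hed_query("YOUR_HED_ID"))
```

The resulting string can be sent to the API like any other query.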

How the data is structured

The Knowledge Graph follows this structure:

  • datasets are connected to authors
  • authors are connected to affiliations
  • datasets are linked to datatype metadata
  • datasets are linked to HED annotations

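Because these links can be followed within a single query, you can retrieve a dataset together with its authors, their affiliations, and its datatype and HED metadata in one request.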
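
Results come back nested in the same shape as the graph, so analysis code often flattens them into rows first. The sketch below walks Dataset → Author → Affiliation in a response to a query that combines use cases 2 and 3. The sample response is invented for illustration. The code assumes that `givenNames` and `familyNames` are strings, and it tolerates `affiliation` being a single object, a list, or null, since the exact response shape is not documented here.

```python
def flatten_authors(data):
    """Yield (dataset title, author name, affiliation name) rows."""
    for dataset in data.get("datasets") or []:
        for author in dataset.get("authors") or []:
            name = " ".join(
                part
                for part in (author.get("givenNames"), author.get("familyNames"))
                if part
            )
            # Keep authors without an affiliation as a row with None.
            affiliations = author.get("affiliation") or [None]
            if isinstance(affiliations, dict):
                affiliations = [affiliations]
            for affiliation in affiliations:
                yield (
                    dataset.get("title"),
                    name,
                    affiliation.get("name") if affiliation else None,
                )


# Invented sample in the shape of the queries above (not real ANC data).
sample = {
    "datasets": [
        {
            "title": "Example dataset",
            "authors": [
                {
                    "givenNames": "Ada",
                    "familyNames": "Example",
                    "affiliation": {"name": "Example University", "ror": None},
                },
                {"givenNames": "Ben", "familyNames": "Sample", "affiliation": None},
            ],
        }
    ]
}

rows = list(flatten_authors(sample))
for row in rows:
    print(row)
```

Rows in this form can be written to CSV or loaded into a data frame for further analysis.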


Technical overview

The system consists of:

  • Neo4j — graph database
  • GraphQL API — query interface
  • Docker setup — deployment

The API is built using:


Setup (for developers)

The services are run with Docker Compose:

docker compose up -d

This starts:

  • the GraphQL API
  • the Neo4j database

Data ingestion

Metadata is generated using the BIDS Indexer and imported into Neo4j.

The import process:

  1. Generate CSV files using the BIDS Indexer
  2. Import the CSV files into Neo4j using neo4j-admin
  3. Start the services with Docker

Accessing the interfaces

After starting the services:


When should I use this?

Use the Knowledge Graph if you want to:

  • explore available datasets
  • search datasets by metadata
  • find datasets with specific HED annotations
  • integrate ANC data into analysis pipelines