Download data

Downloading your dataset to a local system is done by cloning your repository. A clone is a local copy of your dataset repository. It stays linked to the repository, so you can make changes locally and 'push' them back. See our guide on how to work locally.

We use git LFS for handling large files

If you are familiar with git and GitLab or GitHub, most of this will be known to you. However, note that our dataset repositories carry large files, and we use git LFS to handle these. If you are unfamiliar with git LFS, we recommend you still read through the following guide.

To handle data repositories with large files, or LFS files(1), the git LFS extension is needed. With the extension, the files are tracked correctly and, if preferred, repositories can be cloned without downloading the large files.

  1. LFS stands for Large File Storage. You can see an LFS label behind the large .nii files.

Install git LFS

Open the command line interface, and enter the following command:

git lfs install --skip-smudge

Using the --skip-smudge flag ensures that the contents of LFS files are not downloaded by default when you clone. We recommend installing git LFS this way regardless of whether LFS files are needed, as it prevents you from accidentally pulling LFS files.

Warning

All our guides assume the --skip-smudge flag has been used!
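To check whether the flag took effect, the global git configuration can be inspected. The following is a minimal sketch, assuming that git LFS records the flag by adding `--skip` to the `filter.lfs.smudge` setting; the helper name `smudge_is_skipped` is made up for illustration.

```shell
# Return success if a smudge filter setting skips LFS downloads.
# Argument: the value of `git config --global --get filter.lfs.smudge`.
smudge_is_skipped() {
  case "$1" in
    *"--skip"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read the current setting; an empty value means git LFS is not installed globally.
current="$(git config --global --get filter.lfs.smudge 2>/dev/null || true)"

if smudge_is_skipped "$current"; then
  echo "LFS downloads are skipped by default"
else
  echo "LFS downloads are NOT skipped; run: git lfs install --skip-smudge"
fi
```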

Access token

To successfully clone a dataset repository, an access token is required.

This token is generated within your GitLab profile. Navigate to the profile settings, open Access Tokens, and click the Add new token button on the right.

This token is needed later as the password on the command line.

Cloning a repository without LFS files

You can clone a repository without the content of the LFS files. They then appear as small pointer files in your cloned repository. These pointer files can be moved, renamed, or deleted, and any operation performed on them will also affect the corresponding LFS files. However, the actual content of the LFS files remains inaccessible.

Open the command line interface, navigate to the directory where the repository should be stored, and enter the following command:

git clone <yourrepositorylink>
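After such a clone it can be useful to tell pointer files apart from real content. The sketch below assumes the standard git LFS pointer format, a short text file whose first line is `version https://git-lfs.github.com/spec/v1`. The helper name `is_lfs_pointer` and the demo file names are hypothetical.

```shell
# Return success if the given file is a git LFS pointer rather than real content.
# Only the first bytes are read, so this is cheap even for large files.
is_lfs_pointer() {
  [ -f "$1" ] && head -c 64 "$1" | grep -qF 'version https://git-lfs.github.com/spec/v1'
}

# Demo on hand-written files (the oid is a made-up placeholder of 64 zeros).
demo_dir="$(mktemp -d)"
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%064d\nsize 12345\n' 0 \
  > "$demo_dir/pointer.nii"
printf 'real image bytes would be here\n' > "$demo_dir/real.nii"

for f in "$demo_dir"/*.nii; do
  if is_lfs_pointer "$f"; then
    echo "$(basename "$f"): pointer only"
  else
    echo "$(basename "$f"): full content"
  fi
done
```

Inside a real clone, the same check would be called on a file from the dataset, for example `is_lfs_pointer path/to/file.nii`.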

Getting the LFS files

If the contents of the LFS files are required (for example, for analysis), they can be pulled. You can pull all LFS files in your dataset, or just a subset.

Cloning the entire repository with all LFS files

Disclaimer

Be aware that depending on the size of your dataset, this process may require significant resources, including network bandwidth, storage capacity, and time.

Open the command line interface, navigate to the directory where the repository should be stored, and enter the following command:

GIT_LFS_SKIP_SMUDGE=0 git clone <yourrepositorylink>

Setting GIT_LFS_SKIP_SMUDGE=0 overrides the --skip-smudge default for this one command. The smudge filter runs again, so the pointer files are replaced with the actual file contents during checkout.
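Because a full download can be large, it may help to estimate its size first from a clone that contains only pointer files. The following sketch sums the `size` field of every pointer file below a directory. It assumes the standard pointer format and that pointer files are smaller than 1 KiB; the function name `lfs_pointer_total_bytes` is hypothetical.

```shell
# Print the total size in bytes of the LFS content referenced by pointer files
# below a directory (default: current directory). The .git folder is skipped.
lfs_pointer_total_bytes() {
  find "${1:-.}" -name .git -prune -o -type f -size -1024c -exec awk '
    # A file counts as a pointer only if its first line is the LFS version line.
    FNR == 1 { is_ptr = ($0 == "version https://git-lfs.github.com/spec/v1") }
    is_ptr && $1 == "size" { total += $2 }
    END { printf "%.0f\n", total }
  ' {} + | awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Run from the root of a pointer-only clone.
echo "LFS content to download: $(lfs_pointer_total_bytes .) bytes"
```

Files that are already downloaded are no longer pointers, so they are not counted.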

Pulling specific LFS files

It is also possible to pull only specific LFS files by using filenames or filename patterns.

Open the command line interface, navigate into the cloned repository (clone it without LFS files first if you have not done so yet), and enter the following command:

git lfs pull --include "file1,file2"

This saves resources when only a small fraction of the files is needed.
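The --include option expects a single comma-separated list rather than several separate arguments. When the file names come from a script, a small helper can build that list. This is a sketch only: `join_with_commas` and `lfs_pull_files` are hypothetical names, the file paths are made up, and file names that themselves contain commas are not handled.

```shell
# Join all arguments with commas, the list format that --include expects.
join_with_commas() {
  local IFS=','
  printf '%s\n' "$*"
}

# Pull only the named LFS files; call this from inside the cloned repository.
lfs_pull_files() {
  git lfs pull --include "$(join_with_commas "$@")"
}

# Show the list that would be passed to --include (illustrative paths).
join_with_commas sub-01/anat/T1w.nii sub-02/anat/T1w.nii
```

Inside a clone, `lfs_pull_files sub-01/anat/T1w.nii sub-02/anat/T1w.nii` would then run the pull with the joined list.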

You can also use a file pattern, for example to pull all LFS files of the first subject. Quote the pattern so that the shell passes it to git LFS unchanged:

git lfs pull --include "sub-01/*"
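To confirm which files were actually downloaded, the output of `git lfs ls-files` can be summarised. In that listing an asterisk after the object ID marks a file whose content is present, and a minus marks a file that is still a pointer. The helper name `lfs_status_summary` is hypothetical, and the demo listing below is hand-written with made-up IDs and paths.

```shell
# Count downloaded ("*") and pointer-only ("-") entries in `git lfs ls-files` output.
# The listing is read from stdin so the command output can be piped in.
lfs_status_summary() {
  awk '$2 == "*" { full++ } $2 == "-" { ptr++ }
       END { printf "downloaded: %d, pointer only: %d\n", full, ptr }'
}

# Inside a real clone: git lfs ls-files | lfs_status_summary
# Demo on a hand-written listing:
printf '%s\n' \
  '1a2b3c4d5e * sub-01/anat/T1w.nii' \
  '6f7a8b9c0d - sub-02/anat/T1w.nii' \
  '0d9c8b7a6f - sub-03/anat/T1w.nii' | lfs_status_summary
```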