
athms / learning-from-brains

79 stars · 3 watching · 17 forks · 14.02 GB

Self-supervised learning techniques for neuroimaging data inspired by prominent learning frameworks in natural language processing + One of the broadest neuroimaging datasets used for pre-training to date.

Languages: Dockerfile 0.43%, Python 99.57%
Topics: neuroimaging, research-project, transfer-learning

learning-from-brains's People

Contributors

athms

learning-from-brains's Issues

Arguments for other datasets?

Cool work! Could you also provide the arguments (command-line flags) you used for the other datasets, for both the upstream and downstream stages?

Not able to download your pretrained model due to your LFS bandwidth

Hi! When I used Git LFS to fetch your pretrained model, it reported that the repository has exceeded its LFS data quota (bandwidth):

Downloading results/models/upstream/BERT_lrs-4_hds-12_embd-768_train-BERT_lr-0001_bs-96_drp-01_msk-02/model_final/pytorch_model.bin (128 MB)
Error downloading object: results/models/upstream/BERT_lrs-4_hds-12_embd-768_train-BERT_lr-0001_bs-96_drp-01_msk-02/model_final/pytorch_model.bin (6df0990): Smudge error: Error downloading results/models/upstream/BERT_lrs-4_hds-12_embd-768_train-BERT_lr-0001_bs-96_drp-01_msk-02/model_final/pytorch_model.bin (6df0990175a1cb7df29a504c2be9a8fd2f95d3d816e746a80d1aeef6eeabaf9f): batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.

Could you kindly fix it or let us know any other alternative way to get your pretrained weights? Thank you.
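When the quota error above occurs, the checkout is typically left with small text pointer stubs in place of the real weight files. A minimal sketch of how to tell a stub from the actual binary (the filename here is illustrative; the snippet creates a fake stub to demonstrate the check):

```shell
# A Git LFS pointer stub is a small text file beginning with this header;
# a real pytorch_model.bin would start with binary (zip/pickle) bytes.
f="pytorch_model.bin"
printf 'version https://git-lfs.github.com/spec/v1\n' > "$f"  # simulate a stub
if head -c 7 "$f" | grep -q 'version'; then
  echo "LFS pointer stub - weights not downloaded"
fi
```

As a workaround while the quota is exhausted, `GIT_LFS_SKIP_SMUDGE=1 git clone …` at least lets the clone finish with stubs in place; they can be resolved later with `git lfs pull` once bandwidth is available.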

Estimated preprocessing time for parcellating fMRI data

Hello, thank you for your excellent work!
I am currently trying to apply your model to my own dataset. However, the preprocessing seems quite slow (for DiFuMo with 1024 networks).
Could you please provide an estimate of the preprocessing (parcellation) time required to convert fMRI data into vectors?
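For intuition on why 1024 components is slow: the dominant step in maps-based parcellation is a least-squares projection of every volume onto the atlas maps, whose cost grows with both the voxel count and the number of components. A rough numpy sketch with synthetic shapes (this is not the repo's actual pipeline, which uses nilearn's NiftiMapsMasker on real images):

```python
import numpy as np

# Hypothetical shapes: 200 timepoints, 3000 voxels, 1024 DiFuMo components.
rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 3000))   # voxel time series (T x V)
maps = rng.standard_normal((1024, 3000))  # probabilistic atlas maps (K x V)

# Maps-masker-style signal extraction: least-squares fit of each volume
# onto the atlas maps, i.e. solve maps.T @ x = bold.T for x.
parcellated, *_ = np.linalg.lstsq(maps.T, bold.T, rcond=None)
parcellated = parcellated.T               # (T x K) component time series
print(parcellated.shape)  # (200, 1024)
```

With full-resolution fMRI (hundreds of thousands of voxels) and K = 1024, this solve is repeated per run, which is where most of the preprocessing time goes.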

UnpicklingError exception when running train.py

We were very impressed by the results reported in the paper and wanted to replicate the work and try to apply it in our own area.

However, there's an UnpicklingError exception when running train.py, which perhaps indicates a problem with the specification of requirements.txt and the Dockerfile?

We downloaded the repo, and created environments in two different ways:

  • Adapted requirements.txt into an environment.yml
  • Used the Dockerfile to build a Docker image, converted that to a Singularity container, and ran inside it

In both environments we run into the same UnpicklingError exception:

> python3 scripts/train.py \
>     --data 'data/downstream/ds002105' \
>     --n-train-subjects-per-dataset 11 \
>     --n-val-subjects-per-dataset 3 \
>     --n-test-subjects-per-dataset 9 \
>     --architecture 'GPT' \
>     --pretrained-model 'results/models/upstream/GPT_lrs-4_hds-12_embd-768_train-CSM_lr-0005_bs-192_drp-01/model_final/pytorch_model.bin' \
>     --training-style 'decoding' \
>     --decoding-target 'task_label.pyd' \
>     --num-decoding-classes 26 \
>     --training-steps 10000 \
>     --per-device-training-batch-size 64 \
>     --learning-rate 1e-4 \
>     --log-dir 'results/models/downstream/ds002105' \
>     --log-every-n-steps 1000
/usr/local/lib/python3.8/dist-packages/nilearn/input_data/__init__.py:27: FutureWarning: The import path 'nilearn.input_data' is deprecated in version 0.9. Importing from 'nilearn.input_data' will be possible at least until release 0.13.0. Please import from 'nilearn.maskers' instead.
  warnings.warn(message, FutureWarning)
Saving tarfile split to results/models/downstream/ds002105/GPT_lrs-4_hds-12_embd-768_train-decoding_lr-0001_bs-64_drp-01_2022-10-13_16-43-26/tarfile_paths_split.json
Loading pretrained model from results/models/upstream/GPT_lrs-4_hds-12_embd-768_train-CSM_lr-0005_bs-192_drp-01/model_final/pytorch_model.bin
loading the following pre-trained path:
results/models/upstream/GPT_lrs-4_hds-12_embd-768_train-CSM_lr-0005_bs-192_drp-01/model_final/pytorch_model.bin
cudaisavailable is True
loading with cpu
Traceback (most recent call last):
  File "scripts/train.py", line 1077, in <module>
    trainer = train()
  File "scripts/train.py", line 237, in train
    trainer = make_trainer(
  File "/learningfrombrains/scripts/../src/trainer/make.py", line 207, in make_trainer
    trainer = Trainer(
  File "/learningfrombrains/scripts/../src/trainer/base.py", line 14, in __init__
    super().__init__(**kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 313, in __init__
    model = self.call_model_init()
  File "/usr/local/lib/python3.8/dist-packages/transformers/trainer.py", line 980, in call_model_init
    model = self.model_init(trial)
  File "scripts/train.py", line 226, in model_init
    return make_model(model_config)
  File "scripts/train.py", line 366, in make_model
    model.from_pretrained(model_config["pretrained_model"])
  File "/learningfrombrains/scripts/../src/model.py", line 69, in from_pretrained
    pretrained = torch.load(pretrained_path, map_location=torch.device('cpu'))
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.8/dist-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

An aside: Along the way, we did make a small tweak to https://github.com/athms/learning-from-brains/blob/master/src/model.py#L60 , which we modified from

if next(self.parameters()).is_cuda:

to

if torch.cuda.is_available():

because with the first version, our CUDA device wasn't being recognized. However, the UnpicklingError exception occurs regardless of which version we use (i.e., whether we call pretrained = torch.load(pretrained_path) or pretrained = torch.load(pretrained_path, map_location=torch.device('cpu'))).

I don't know why we're getting an UnpicklingError here. My first guess is a version mismatch: the packages pinned in the Dockerfile and requirements.txt may not be the ones the torch loading code actually expects. It would help if you could confirm that the versions pinned in the Dockerfile and requirements.txt match the versions your own setup actually runs on.
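One hedged observation: `invalid load key, 'v'` is commonly the signature of `torch.load` being handed a Git LFS pointer stub rather than the real checkpoint, since the stub is plain text starting with `version https://git-lfs…` and pickle chokes on the leading `v`. A small stdlib-only check that can be run before loading (the paths below are illustrative; the snippet fabricates a stub to show the behavior):

```python
from pathlib import Path

def is_lfs_pointer(path: str) -> bool:
    """Return True if the file looks like a Git LFS pointer stub
    rather than an actual binary checkpoint."""
    head = Path(path).read_bytes()[:24]
    return head.startswith(b"version https://git-lfs")

# Simulate a pointer stub (real weights start with zip/pickle magic bytes):
Path("stub.bin").write_bytes(b"version https://git-lfs.github.com/spec/v1\n")
print(is_lfs_pointer("stub.bin"))  # True
```

If this returns True for the pytorch_model.bin in question, the fix is `git lfs pull` (or re-cloning with LFS enabled), not a change to the loading code.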
