
model-tools's Introduction


Brain-Score Model Tools

Utility for generic models to interact with brain data.

Environment variables

Environment variables are prefixed with MT_.

Variable          Description
MT_MULTITHREAD    whether or not to use multi-threading
MT_HOME           path to the framework root
MT_IMAGENET_PATH  path to the ImageNet file containing the validation image set
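
For orientation, a minimal sketch of how code might read these variables (illustrative only; the defaults shown are assumptions, not necessarily what model-tools does):

import os

multithread = os.getenv('MT_MULTITHREAD', '1') == '1'              # default assumed here
home = os.getenv('MT_HOME', os.path.expanduser('~/.model-tools'))  # default assumed here
imagenet_path = os.getenv('MT_IMAGENET_PATH')                      # no default; must point to the validation set file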

model-tools's People

Contributors

franzigeiger, jjpr-mit, mike-ferguson, mschrimpf, stothe2


model-tools's Issues

Cannot install model-tools

Collecting git+https://github.com/brain-score/model-tools.git
Cloning https://github.com/brain-score/model-tools.git to /tmp/pip-req-build-10va2fci
Running command git clone --filter=blob:none --quiet https://github.com/brain-score/model-tools.git /tmp/pip-req-build-10va2fci
Resolved https://github.com/brain-score/model-tools.git to commit 75365b5
Preparing metadata (setup.py) ... done
Collecting brainio@ git+https://github.com/brain-score/brainio (from model-tools==0.1.0)
Cloning https://github.com/brain-score/brainio to /tmp/pip-install-4lpjkv0w/brainio_a2958f4ac95540b0abc3f8fa26316e6c
Running command git clone --filter=blob:none --quiet https://github.com/brain-score/brainio /tmp/pip-install-4lpjkv0w/brainio_a2958f4ac95540b0abc3f8fa26316e6c
Resolved https://github.com/brain-score/brainio to commit 9bc00b21a82f4b3637117a6329b1f629df3170cd
Preparing metadata (setup.py) ... done
Collecting brain-score@ git+https://github.com/brain-score/brain-score (from model-tools==0.1.0)
Cloning https://github.com/brain-score/brain-score to /tmp/pip-install-4lpjkv0w/brain-score_3025ffe44f7448c480537b97c3a59fd2
Running command git clone --filter=blob:none --quiet https://github.com/brain-score/brain-score /tmp/pip-install-4lpjkv0w/brain-score_3025ffe44f7448c480537b97c3a59fd2
Resolved https://github.com/brain-score/brain-score to commit 25c9abde4479c1422cdffab7c0380dd05d21d125
Preparing metadata (setup.py) ... done
Collecting result_caching@ git+https://github.com/brain-score/result_caching (from model-tools==0.1.0)
Cloning https://github.com/brain-score/result_caching to /tmp/pip-install-4lpjkv0w/result-caching_fbb0aaa46ed94ae6a68a7d954ea8bd95
Running command git clone --filter=blob:none --quiet https://github.com/brain-score/result_caching /tmp/pip-install-4lpjkv0w/result-caching_fbb0aaa46ed94ae6a68a7d954ea8bd95
Resolved https://github.com/brain-score/result_caching to commit 27ace7e892a2cbfbcb654d027e8d108e168986d4
Preparing metadata (setup.py) ... done
Requirement already satisfied: h5py in /home/atuin/b112dc/b112dc10/software/privat/conda/envs/model-training/lib/python3.9/site-packages (from model-tools==0.1.0) (3.1.0)
Requirement already satisfied: Pillow in /home/hpc/b112dc/b112dc10/.local/lib/python3.9/site-packages (from model-tools==0.1.0) (10.0.0)
Requirement already satisfied: numpy in /home/atuin/b112dc/b112dc10/software/privat/conda/envs/model-training/lib/python3.9/site-packages (from model-tools==0.1.0) (1.26.1)
Requirement already satisfied: tqdm in /home/atuin/b112dc/b112dc10/software/privat/conda/envs/model-training/lib/python3.9/site-packages (from model-tools==0.1.0) (4.66.1)
Requirement already satisfied: torch in /home/hpc/b112dc/b112dc10/.local/lib/python3.9/site-packages (from model-tools==0.1.0) (2.0.1)
Requirement already satisfied: torchvision in /home/hpc/b112dc/b112dc10/.local/lib/python3.9/site-packages (from model-tools==0.1.0) (0.15.2)
INFO: pip is looking at multiple versions of model-tools to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement tensorflow==1.15 (from model-tools) (from versions: 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0, 2.11.1, 2.12.0rc0, 2.12.0rc1, 2.12.0, 2.12.1, 2.13.0rc0, 2.13.0rc1, 2.13.0rc2, 2.13.0, 2.13.1, 2.14.0rc0, 2.14.0rc1, 2.14.0)
ERROR: No matching distribution found for tensorflow==1.15

StimulusSet in brainio does not have get_image

File "/home/allagash/miniconda3/envs/fuzzy-spoon/lib/python3.7/site-packages/pandas/core/generic.py", line 5487, in getattr
return object.getattribute(self, name)
AttributeError: 'StimulusSet' object has no attribute 'get_image'

stimuli_paths = [str(stimulus_set.get_stimulus(stimulus_id)) for stimulus_id in stimulus_set['stimulus_id']]

This line appears to be referencing a function that no longer exists in StimulusSet
https://github.com/brain-score/brainio/blob/main/brainio/stimuli.py#L10
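
A hedged compatibility shim (a sketch, not a committed fix) that handles both accessor names, with stimulus_set as in the line quoted above:

def stimulus_path(stimulus_set, stimulus_id):
    # newer brainio exposes get_stimulus; fall back to get_image for older versions
    if hasattr(stimulus_set, 'get_stimulus'):
        return str(stimulus_set.get_stimulus(stimulus_id))
    return str(stimulus_set.get_image(stimulus_id))

stimuli_paths = [stimulus_path(stimulus_set, stimulus_id)
                 for stimulus_id in stimulus_set['stimulus_id']]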

key error when layers have different flattened coordinates

I was running brainscore on some transformer models and ran into an issue with channel_x not being a key in a dictionary. It seems to be noted in the code:

# using these names/keys for all assemblies results in KeyError if the first layer contains flatten_coord_names

Log here for the failed model:
http://braintree.mit.edu:8080/job/run_benchmarks/3861/parsed_console/log.html

I also found that I could not include the final fc (logits) layer as one of the places where I grab activations, as this caused a KeyError ('embedding' was missing). I just removed these parts of the model from scoring, but it seems like something others might run into.
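
A minimal, self-contained illustration of the failure mode (not model-tools code): convolutional layers yield flatten coordinates such as channel_x/channel_y, while fc or embedding layers do not, so coordinate keys taken from the first layer processed are missing for later layers:

conv_coords = {'channel': [0, 1], 'channel_x': [0, 0], 'channel_y': [0, 1]}
fc_coords = {'channel': [0, 1]}  # fc/embedding layers have no channel_x / channel_y

coord_names = list(conv_coords)  # coordinate names fixed by the first layer processed
for layer_coords in (conv_coords, fc_coords):
    try:
        values = [layer_coords[name] for name in coord_names]
    except KeyError as error:
        print('missing coordinate:', error)  # -> missing coordinate: 'channel_x'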

FileNotFoundError with Majaj V4 but not Majaj IT

I have a script which fits models against Majaj IT and Majaj V4. IT runs fine, but when I try specifying V4 instead, I receive the following stack trace and error:

  File "python3.7/site-packages/model_tools/activations/core.py", line 79, in _from_paths_stored
    return self._from_paths(layers=layers, stimuli_paths=stimuli_paths)
  File "python3.7/site-packages/model_tools/activations/core.py", line 85, in _from_paths
    layer_activations = self._get_activations_batched(stimuli_paths, layers=layers, batch_size=self._batch_size)
  File "python3.7/site-packages/model_tools/activations/core.py", line 135, in _get_activations_batched
    batch_activations = hook(batch_activations)
  File "python3.7/site-packages/model_tools/activations/pca.py", line 23, in __call__
    self._ensure_initialized(batch_activations.keys())
  File "python3.7/site-packages/model_tools/activations/pca.py", line 40, in _ensure_initialized
    n_components=self._n_components)
  File "python3.7/site-packages/result_caching/__init__.py", line 231, in wrapper
    self.save(result, function_identifier)
  File "python3.7/site-packages/result_caching/__init__.py", line 125, in save
    os.rename(savepath_part, path)
FileNotFoundError: [Errno 2] No such file or directory: '/om2/user/rylansch/FieteLab-Reg-Eff-Dim/.result_caching/model_tools.activations.pca.LayerPCA._pcas/identifier=architecture:RF-100-cosine-bernoulli-b-ns|task:None|kind:Rand|source:RS|lyr:mlp|agg:pca|n_comp:1000,n_components=1000.pkl.filepart' -> '/om2/user/rylansch/FieteLab-Reg-Eff-Dim/.result_caching/model_tools.activations.pca.LayerPCA._pcas/identifier=architecture:RF-100-cosine-bernoulli-b-ns|task:None|kind:Rand|source:RS|lyr:mlp|agg:pca|n_comp:1000,n_components=1000.pkl'

I'm not familiar with result_caching. Could someone please help me understand why this problem emerges for V4 but not IT, and how to fix it?
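
For context, the traceback shows result_caching saving via a write-then-rename pattern. A simplified sketch of that pattern (not the library's actual code): the result is pickled to '<path>.filepart' and then renamed to '<path>'. Plausibly, if the pickle dump fails partway (e.g. disk quota) or a concurrent job renames or removes the part file first, the final os.rename raises exactly this FileNotFoundError:

import os
import pickle

def save(result, path):
    savepath_part = path + '.filepart'
    with open(savepath_part, 'wb') as f:
        pickle.dump(result, f)           # part file may be missing if this fails
    os.rename(savepath_part, path)       # FileNotFoundError if the part file vanished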

speed up activations retrieval

When we started writing model-tools, we were primarily thinking of neuroscience stimulus sets with only a couple thousand images. Speed was therefore less of an issue: even with a suboptimal implementation, a few thousand images pass through the network quickly.

We are now evaluating models on increasingly large ML benchmarks (e.g. brain-score/vision#232), and due to the slow activations retrieval, evaluation takes a very long time (days), sometimes timing out on the cluster. We therefore need to speed up the activations extraction.

Models already use CUDA when possible; I believe the main bottleneck is actually the loading of images, which is currently single-threaded. We should profile the code to confirm this and, if true, use multiple workers to load the images passed into the model, ideally using existing tools such as PyTorch's DataLoader (see the sketch below).
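
A minimal sketch of the proposed multi-worker loading with torch.utils.data.DataLoader (the class and function names here are illustrative, not existing model-tools API):

import torch
from torch.utils.data import Dataset, DataLoader
from PIL import Image

class ImagePathDataset(Dataset):
    """Loads and preprocesses one image per path; DataLoader workers parallelize this."""
    def __init__(self, paths, transform):
        self.paths, self.transform = paths, transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        image = Image.open(self.paths[index]).convert('RGB')
        return self.transform(image)

def iter_batches(stimuli_paths, transform, batch_size=64, num_workers=8):
    loader = DataLoader(ImagePathDataset(stimuli_paths, transform),
                        batch_size=batch_size, num_workers=num_workers,
                        pin_memory=torch.cuda.is_available())
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    for batch in loader:  # loading/preprocessing happens in worker processes
        yield batch.to(device)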

delete submitted models and score consistency

Hey,
I just realized that the average model score differs across the three places (profile, model page, competition leaderboard). I guess this is due to benchmarks that have been added in some places but not in others. It would be clearer and more comprehensive to have this unified across all tables.

Also, I was wondering whether you might want to implement a function to delete submitted models from the database, as the number of model submissions is likely to increase.

Thanks!

//EDIT wrong repo :( see here: brain-score/brain-score.web#129

Error encountered while testing a base model.

Hello,

I have encountered an error while testing a base_models.py implementation. Here is the log and the line where it breaks. There is a comment there that may be describing the issue, but I'm not sure how to interpret it.

It looks like this traces back to the behavioral benchmark, specifically here. Is this assuming the model has a layer named 'logits'? The model I am testing does not have this.

Any advice on how to debug this would be very appreciated.

Thank you,
Cory
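
A hedged sketch of one possible workaround: explicitly declaring which layer serves as the behavioral readout instead of relying on a layer named 'logits'. This assumes ModelCommitment accepts a behavioral_readout_layer argument; check the model-tools source for the exact signature, and the layer names here are examples for AlexNet:

import functools
import torchvision.models
from model_tools.activations.pytorch import PytorchWrapper, load_preprocess_images
from model_tools.brain_transformation import ModelCommitment

net = torchvision.models.alexnet(pretrained=True)
preprocessing = functools.partial(load_preprocess_images, image_size=224)
activations_model = PytorchWrapper(identifier='alexnet', model=net, preprocessing=preprocessing)
model = ModelCommitment(identifier='alexnet', activations_model=activations_model,
                        layers=['features.12'],
                        behavioral_readout_layer='classifier.6')  # instead of a layer named 'logits'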

model-tools 0.2.0 version and library dependency

In setup.py, the version number still says 0.1.0 rather than 0.2.0. Also, in the release tar.gz file, the result_caching dependency links to result_caching @ git+https://github.com/mschrimpf/result_caching rather than result_caching @ git+https://github.com/brain-score/result_caching.
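
For reference, a sketch of the corrected setup.py fields implied by this issue (a fragment only; the other arguments are omitted and assumed unchanged):

from setuptools import setup, find_packages

setup(
    name='model-tools',
    version='0.2.0',  # was still 0.1.0
    packages=find_packages(),
    install_requires=[
        # was: result_caching @ git+https://github.com/mschrimpf/result_caching
        "result_caching @ git+https://github.com/brain-score/result_caching",
    ],
)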

reduce memory requirements

Running large models like ResNet takes enormous amounts of memory (~450 GB). This is probably because we collect all the layer activations across batches in memory; it could be optimized by continuously writing batches to disk (see the sketch below).
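
A minimal sketch of the batch-to-disk idea using h5py (already a dependency); the names are assumptions, not a committed design:

import h5py

def append_batch(h5file, layer_name, batch_activations):
    """Append one batch of activations, flattened to (batch, features), to disk."""
    flat = batch_activations.reshape(len(batch_activations), -1)
    if layer_name not in h5file:
        h5file.create_dataset(layer_name, data=flat,
                              maxshape=(None, flat.shape[1]), chunks=True)
    else:
        dataset = h5file[layer_name]
        dataset.resize(dataset.shape[0] + flat.shape[0], axis=0)
        dataset[-flat.shape[0]:] = flat

# usage: with h5py.File('activations.h5', 'a') as f: append_batch(f, 'layer4', batch)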

OOM error caused by minimal memory requirements?

I'm getting an OOM error saying that 1.19 GiB cannot be allocated, even though I'm running SLURM jobs with ~80 GB.

How can I investigate the cause? Is it possible that previous layers' activations are consuming memory? If so, is there some flag or some mechanism to free that memory?

Traceback (most recent call last):
  File "scripts/compute_eigenspectra_and_fit_encoding_model.py", line 63, in <module>
    activations_extractor=model,
  File "/home/gridsan/rschaeffer/FieteLab-Reg-Eff-Dim/regression_dimensionality/custom_model_tools/eigenspectrum.py", line 44, in fit
    image_transform_name=transform_name)
  File "/home/gridsan/rschaeffer/FieteLab-Reg-Eff-Dim/regdim_venv/lib/python3.7/site-packages/result_caching/__init__.py", line 223, in wrapper
    result = function(**reduced_call_args)
  File "/home/gridsan/rschaeffer/FieteLab-Reg-Eff-Dim/regression_dimensionality/custom_model_tools/eigenspectrum.py", line 141, in _fit
    activations = self._extractor(image_paths, layers=[layer])
  File "/home/gridsan/rschaeffer/FieteLab-Reg-Eff-Dim/regdim_venv/lib/python3.7/site-packages/model_tools/activations/pytorch.py", line 41, in __call__
    return self._extractor(*args, **kwargs)
  File "/home/gridsan/rschaeffer/FieteLab-Reg-Eff-Dim/regdim_venv/lib/python3.7/site-packages/model_tools/activations/core.py", line 43, in __call__
    return self.from_paths(stimuli_paths=stimuli, layers=layers, stimuli_identifier=stimuli_identifier)
  File "/home/gridsan/rschaeffer/FieteLab-Reg-Eff-Dim/regdim_venv/lib/python3.7/site-packages/model_tools/activations/core.py", line 73, in from_paths
    activations = fnc(layers=layers, stimuli_paths=reduced_paths)
  File "/home/gridsan/rschaeffer/FieteLab-Reg-Eff-Dim/regdim_venv/lib/python3.7/site-packages/model_tools/activations/core.py", line 85, in _from_paths
    layer_activations = self._get_activations_batched(stimuli_paths, layers=layers, batch_size=self._batch_size)
  File "/home/gridsan/rschaeffer/FieteLab-Reg-Eff-Dim/regdim_venv/lib/python3.7/site-packages/model_tools/activations/core.py", line 141, in _get_activations_batched
    layer_activations[layer_name] = np.concatenate((layer_activations[layer_name], layer_output))
  File "<__array_function__ internals>", line 6, in concatenate
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 1.19 GiB for an array with shape (1216, 256, 32, 32) and data type float32 
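
The traceback shows one np.concatenate per batch (core.py line 141), which reallocates the full array each time and briefly holds two copies in memory. A minimal sketch of a common alternative, collecting batches in a list and concatenating once at the end (the stand-ins are hypothetical, not model-tools code):

import numpy as np
from collections import defaultdict

def get_activations(batch):  # hypothetical stand-in for the per-batch extractor
    return {'mlp': np.zeros((len(batch), 256, 32, 32), dtype=np.float32)}

batches = [np.zeros(64)] * 4  # hypothetical stand-in for batched stimuli
layer_batches = defaultdict(list)
for batch in batches:
    for layer_name, layer_output in get_activations(batch).items():
        layer_batches[layer_name].append(layer_output)
# a single concatenate at the end instead of one per batch:
layer_activations = {name: np.concatenate(outputs)
                     for name, outputs in layer_batches.items()}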

check_submission not working with Stochastic models

I get an error when testing a stochastic model for model submission. From what I could infer, the MockBenchmark does not average over trial presentations and then fails with a conflicting-size error. Check the log below.

Traceback (most recent call last):
  File "brain_models.py", line 50, in <module>
    check_models.check_brain_models(name)
  File "/braintree/home/tmarques/brainscore/model-tools/model_tools/check_submission/check_models.py", line 24, in check_brain_models
    check_brain_model_processing(model)
  File "/braintree/home/tmarques/brainscore/model-tools/model_tools/check_submission/check_models.py", line 30, in check_brain_model_processing
    score = benchmark(model, do_behavior=True)
  File "/braintree/home/tmarques/brainscore/model-tools/model_tools/check_submission/check_models.py", line 88, in __call__
    candidate.look_at(self.assembly.stimulus_set)
  File "/braintree/home/tmarques/brainscore/model-tools/model_tools/brain_transformation/__init__.py", line 53, in look_at
    return self.behavior_model.look_at(stimuli)
  File "/braintree/home/tmarques/brainscore/model-tools/model_tools/brain_transformation/behavior.py", line 22, in look_at
    return self.current_executor.look_at(stimuli, *args, **kwargs)
  File "/braintree/home/tmarques/brainscore/model-tools/model_tools/brain_transformation/behavior.py", line 50, in look_at
    dims=['choice', 'presentation'])
  File "/braintree/home/tmarques/anaconda3/envs/model-submission/lib/python3.6/site-packages/brainio_base/assemblies.py", line 24, in __init__
    super(DataAssembly, self).__init__(*args, **kwargs)
  File "/braintree/home/tmarques/anaconda3/envs/model-submission/lib/python3.6/site-packages/xarray/core/dataarray.py", line 230, in __init__
    coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
  File "/braintree/home/tmarques/anaconda3/envs/model-submission/lib/python3.6/site-packages/xarray/core/dataarray.py", line 81, in _infer_coords_and_dims
    'coordinate %r' % (d, sizes[d], s, k))
ValueError: conflicting sizes for dimension 'choice': length 1 on the data but length 20 on coordinate 'synset'
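
A hedged sketch of the averaging the issue describes as missing: collapsing repeated trial presentations per stimulus before constructing the assembly. The coordinate names follow Brain-Score conventions but are assumptions here, not MockBenchmark's actual code:

import numpy as np
import xarray as xr

assembly = xr.DataArray(
    np.random.rand(6, 3),  # 3 stimuli x 2 repeated presentations, 3 neuroids
    dims=['presentation', 'neuroid'],
    coords={'stimulus_id': ('presentation', ['a', 'a', 'b', 'b', 'c', 'c']),
            'neuroid_id': ('neuroid', [0, 1, 2])})
averaged = assembly.groupby('stimulus_id').mean('presentation')  # one row per stimulus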
