Hi, I use python 3.12.2 and torch 2.2.2 on macOS 12.7.4. The ReFinED version is 1.

TypeError: cannot pickle 'Environment' object

amazon-science commented on July 24, 2024

TypeError: cannot pickle 'Environment' object

Comments (4)

shern2 commented on July 24, 2024 1

(I'm not from Amazon).

The Environment object is likely lmdb's Environment class: https://lmdb.readthedocs.io/en/release/#environment-class
Without the full stack trace, I can only guess.
My guess is that something is trying to save the processor, which includes the preprocessor, which in turn holds the lookups into the lmdb tables. Perhaps it is the checkpointing step.
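
To illustrate that guess without lmdb, here is a minimal sketch. It assumes a hypothetical `Preprocessor` class that keeps an open handle as an attribute, and uses `threading.Lock` purely as a stand-in for an unpicklable lmdb handle. None of these names come from ReFinED. It only shows how one unpicklable attribute makes the whole containing object unpicklable.

```python
import pickle
import threading


class Preprocessor:
    """Hypothetical container; not ReFinED's actual class."""

    def __init__(self):
        # Stand-in for an open database handle that cannot be pickled.
        self.env = threading.Lock()


def try_pickle(obj):
    """Return the error message if pickling fails with TypeError, else None."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as err:
        return str(err)


message = try_pickle(Preprocessor())
print(message)
```

The error names the innermost unpicklable object, not the container, which would be consistent with seeing 'Environment' in the message even if the object being saved is something larger.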

Nevertheless, I suggest that you use a machine with a GPU, because this is research code and is not 'battle-tested' in different environments (e.g. CPU-only training). I have tested inference, and it works CPU-only, but I didn't try training or fine-tuning on CPU alone.

shern2 commented on July 24, 2024 1

My guess is that the torch data loader is trying to spin up a number of worker processes to prepare the batches of data.
The problem is likely here:
https://github.com/amazon-science/ReFinED/blob/main/src/refined/dataset_reading/entity_linking/wikipedia_dataset.py

You can look it up; I think others have also encountered similar issues with lmdb pickling when using multiple workers:
pytorch/vision#689 (comment)
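
One common way around this kind of error is to open the handle lazily and drop it when the object is pickled, so each worker process opens its own. The sketch below is only an illustration under stated assumptions: `LazyLookup`, its `lookup` method, and the path are hypothetical, and `threading.Lock` again stands in for the lmdb handle. It is not how ReFinED is implemented.

```python
import pickle
import threading


class LazyLookup:
    """Hypothetical lookup wrapper; threading.Lock stands in for an lmdb handle."""

    def __init__(self, path):
        self.path = path   # plain, picklable configuration
        self._env = None   # the unpicklable handle is opened on first use

    def _get_env(self):
        if self._env is None:
            # Real code would open the database here (assumption).
            self._env = threading.Lock()
        return self._env

    def __getstate__(self):
        # Drop the handle so the object can be sent to a worker process.
        state = self.__dict__.copy()
        state["_env"] = None
        return state

    def lookup(self, key):
        with self._get_env():
            return f"{self.path}:{key}"


lookup = LazyLookup("/tmp/example.lmdb")    # hypothetical path
lookup.lookup("Q42")                        # opens the handle in this process
clone = pickle.loads(pickle.dumps(lookup))  # pickling succeeds despite the open handle
print(clone.lookup("Q42"))                  # the copy reopens its own handle
```

A simpler alternative, if throughput allows, is to construct the `DataLoader` with `num_workers=0` so that no worker processes are started and nothing needs to be pickled. Whether the fine-tuning script exposes that setting is not something this thread confirms.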

Cheers

yayamamo commented on July 24, 2024

This happens when there are no GPUs, but I am not sure how to work around it.

yayamamo commented on July 24, 2024

Thanks, it may be reasonable to try on a GPU machine.

Just for reference, I put the full stack trace below.

/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/site-packages/torch/cuda/amp/grad_scaler.py:126: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
  warnings.warn(
14:35:33 - __main__ - INFO - Fine-tuning end-to-end EL
14:36:00 - __main__ - INFO - Fine-tuning end-to-end EL
INFO:__main__:Fine-tuning end-to-end EL
  0%|                                                                                                                                                                                                                                  | 0/10 [00:00<?, ?it/s]14:36:02 - __main__ - INFO - Starting epoch number 0
INFO:__main__:Starting epoch number 0
14:36:02 - __main__ - INFO - lr: 0.0
INFO:__main__:lr: 0.0
14:36:02 - __main__ - INFO - lr: 0.0
INFO:__main__:lr: 0.0
14:36:02 - __main__ - INFO - lr: 0.0
INFO:__main__:lr: 0.0
14:36:02 - __main__ - INFO - lr: 0.0
INFO:__main__:lr: 0.0
14:36:02 - __main__ - INFO - lr: 0.0
INFO:__main__:lr: 0.0
  0%|                                                                                                                                                                                                                                  | 0/10 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/Users/yayamamo/git/ReFinED/src/refined/training/fine_tune/fine_tune.py", line 207, in <module>
    main()
  File "/Users/yayamamo/git/ReFinED/src/refined/training/fine_tune/fine_tune.py", line 44, in main
    start_fine_tuning_task(refined=refined,
  File "/Users/yayamamo/git/ReFinED/src/refined/training/fine_tune/fine_tune.py", line 95, in start_fine_tuning_task
    run_fine_tuning_loops(refined=refined, fine_tuning_args=fine_tuning_args,
  File "/Users/yayamamo/git/ReFinED/src/refined/training/fine_tune/fine_tune.py", line 114, in run_fine_tuning_loops
    for step, batch in tqdm(enumerate(training_dataloader), total=len(training_dataloader)):
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 439, in __iter__
    return self._get_iterator()
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 387, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/site-packages/torch/utils/data/dataloader.py", line 1040, in __init__
    w.start()
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/multiprocessing/context.py", line 289, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/yayamamo/.pyenv/versions/3.12.2/lib/python3.12/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'Environment' object
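
The `popen_spawn_posix.py` frames in the trace point to the `spawn` start method, which is the default on macOS since Python 3.8 and pickles the process object before launching a worker. With `fork`, the default on Linux for Python 3.12, that pickling step does not happen. So the operating system's start method, and not only the lack of a GPU, may be involved. The following is a small check using only the standard library, offered as a diagnostic sketch and not as a fix:

```python
import multiprocessing as mp

# Report the start method that worker processes will use by default.
method = mp.get_start_method()
print(f"default start method: {method}")

# Under "spawn" and "forkserver", the process object and its arguments are
# pickled, so every object reachable from them must be picklable.
needs_pickling = method in ("spawn", "forkserver")
print(f"objects sent to workers are pickled: {needs_pickling}")
```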
