
Comments (6)

akanimax commented on August 21, 2024

Please let me take a look then 😢. Seems like something is broken.

from big-discriminator-batch-spoofing-gan.

akanimax commented on August 21, 2024

Could you please try switching to Python 3.5.6?

JCBrouwer commented on August 21, 2024

Alright, first I created a new env with Python 3.5.6 and reinstalled PyTorch 1.0.1, cudatoolkit 10, etc., but on running I get the following error:

Starting the training process ...

Epoch: 1
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/resource_sharer.py", line 139, in _serve
    signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG))
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/signal.py", line 60, in pthread_sigmask
    sigs_set = _signal.pthread_sigmask(how, mask)
ValueError: signal number 32 out of range

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/resource_sharer.py", line 139, in _serve
    signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG))
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/signal.py", line 60, in pthread_sigmask
    sigs_set = _signal.pthread_sigmask(how, mask)
ValueError: signal number 32 out of range

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/resource_sharer.py", line 139, in _serve
    signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG))
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/signal.py", line 60, in pthread_sigmask
    sigs_set = _signal.pthread_sigmask(how, mask)
ValueError: signal number 32 out of range

^CTraceback (most recent call last):
  File "train.py", line 310, in <module>
    main(parse_arguments())
  File "train.py", line 304, in main
    fid_batch_size=args.fid_batch_size
  File "/home/hans/BBMSG-GAN/sourcecode/MSG_GAN/GAN.py", line 539, in train
    while real_data_store.hasnext() and batch_counter < limit:
  File "/home/hans/BBMSG-GAN/sourcecode/MSG_GAN/utils/iter_utils.py", line 31, in hasnext
    self._thenext = next(self.it)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
    idx, batch = self._get_batch()
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 610, in _get_batch
    return self.data_queue.get()
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/queues.py", line 113, in get
    return ForkingPickler.loads(res)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/site-packages/torch/multiprocessing/reductions.py", line 256, in rebuild_storage_fd
    fd = df.detach()
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/connection.py", line 493, in Client
    answer_challenge(c, authkey)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/connection.py", line 732, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/hans/.conda/envs/bbmsg/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt

I found some answers online saying that either upgrading to Python 3.7 or setting num_workers=0 fixes it. However, running again with num_workers=0 left me in the same situation, with zombie processes left on my GPU.
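
For anyone trying to reproduce the "signal number 32 out of range" error above: the failing call asks to block every number in `range(1, signal.NSIG)`, and the traceback suggests that this interpreter and platform combination rejects number 32. Below is a minimal diagnostic sketch, not part of this repository, that lists which numbers in that range the platform does not treat as valid signals. It assumes Python 3.8 or newer (for `signal.valid_signals()`), so it would have to be run in a newer interpreter on the same machine rather than in the 3.5.6 env. The function name `invalid_signal_numbers` is made up for illustration.

```python
import signal


def invalid_signal_numbers():
    """List numbers in range(1, signal.NSIG) that are not valid signals here.

    Hypothetical helper for illustration. It relies on
    signal.valid_signals(), which exists only in Python 3.8+.
    It only inspects signal numbers and does not change the signal mask.
    """
    valid = {int(sig) for sig in signal.valid_signals()}
    return [num for num in range(1, signal.NSIG) if num not in valid]


if __name__ == "__main__":
    # Numbers printed here are the ones that a blanket
    # range(1, signal.NSIG) mask would include but the platform rejects.
    print("NSIG:", signal.NSIG)
    print("Not valid on this platform:", invalid_signal_numbers())
```

If 32 shows up in the printed list, that matches the number the traceback complains about.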

akanimax commented on August 21, 2024

Hmmm ... 😮. Can you try upgrading PyTorch to 1.1.0?

JCBrouwer commented on August 21, 2024

No joy, still the same threading errors :(

JCBrouwer commented on August 21, 2024

My entire env, for reference:

# packages in environment at /home/hans/.conda/envs/bbmsg:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
blas                      1.0                         mkl  
ca-certificates           2019.5.15                     1    anaconda
certifi                   2018.8.24                py35_1    anaconda
cffi                      1.11.5           py35he75722e_1  
cudatoolkit               10.0.130                      0  
freetype                  2.9.1                h8a8886c_1  
intel-openmp              2019.4                      243  
jpeg                      9b                   h024ee3a_2  
libedit                   3.1.20181209         hc058e9b_0  
libffi                    3.2.1                hd88cf55_4  
libgcc-ng                 9.1.0                hdf63c60_0  
libgfortran-ng            7.3.0                hdf63c60_0  
libpng                    1.6.37               hbc83047_0  
libstdcxx-ng              9.1.0                hdf63c60_0  
libtiff                   4.0.10               h2733197_2  
mkl                       2019.4                      243  
ncurses                   6.1                  he6710b0_1  
ninja                     1.8.2            py35h6bb024c_1  
numpy                     1.14.2           py35hdbf6ddf_0  
olefile                   0.46                     py35_0  
openssl                   1.0.2t               h7b6447c_1    anaconda
pillow                    5.2.0            py35heded4f4_0  
pip                       10.0.1                   py35_0  
protobuf                  3.9.1                    pypi_0    pypi
pycparser                 2.19                     py35_0  
python                    3.5.6                hc3d631a_0  
pytorch                   1.1.0           py3.5_cuda10.0.130_cudnn7.5.1_0    pytorch
readline                  7.0                  h7b6447c_5  
scipy                     1.3.1                    pypi_0    pypi
setuptools                40.2.0                   py35_0  
six                       1.11.0                   py35_1  
sqlite                    3.29.0               h7b6447c_0  
tensorboardx              1.8                      pypi_0    pypi
tk                        8.6.8                hbc83047_0  
torchvision               0.3.0           py35_cu10.0.130_1    pytorch
tqdm                      4.35.0                   pypi_0    pypi
wheel                     0.31.1                   py35_0  
xz                        5.2.4                h14c3975_4  
zlib                      1.2.11               h7b6447c_3  
zstd                      1.3.7                h0b5b093_0  
