Comments (13)
@penguinshin I fixed this bug by adding 64 GB of swap memory. When the data loader forks workers to load data, memory usage increases rapidly. You can try setting num_workers=1 first, then try allocating more swap space.
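As a rough illustration of the advice above, here is a minimal sketch that caps the worker count by a memory budget before it is handed to the data loader. The helper `pick_num_workers` and the per-worker memory figure are hypothetical and not part of resnext.pytorch or PyTorch; the real footprint of a worker depends on the dataset and transforms, so measure it rather than trusting these numbers.

```python
import os

GIB = 1024 ** 3

def pick_num_workers(available_bytes, per_worker_bytes, cpu_count=None):
    """Hypothetical helper: cap DataLoader workers by a rough memory budget."""
    if per_worker_bytes <= 0:
        raise ValueError("per_worker_bytes must be positive")
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    by_memory = available_bytes // per_worker_bytes
    # Never go below one worker, matching the "start with num_workers=1" advice.
    return max(1, min(cpus, by_memory))

# Assumed figures for illustration only: 8 GiB free, about 3 GiB per worker.
num_workers = pick_num_workers(8 * GIB, 3 * GIB, cpu_count=16)
print(num_workers)
```

The result would then be passed as `num_workers` when constructing `torch.utils.data.DataLoader`. If the error disappears with a small value, peak memory during worker start-up is the likely cause.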
from resnext.pytorch.
**Fixed this** by allocating 4 GB of swap memory. You can try allocating more if 4 GB does not suffice.
Follow this blog to allocate swap memory on your device.
https://www.digitalocean.com/community/tutorials/how-to-add-swap-space-on-ubuntu-16-04
from resnext.pytorch.
Fixed the problem by allocating 64GB of swap memory from the external disk.
from resnext.pytorch.
OSError: [Errno 12] Cannot allocate memory
sounds more like a RAM problem, not a GPU problem. Check you have enough RAM/SWAP, and the correct user permissions.
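To check RAM and swap as suggested, one option on Linux is to read `/proc/meminfo`. The sketch below is only an illustration: the function name `parse_meminfo` is made up here, and it assumes the usual `Key: value kB` layout of that file.

```python
import os

def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of byte counts."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue
        value = int(fields[0])
        # Most entries are in kB; a few (e.g. HugePages_Total) carry no unit.
        if len(fields) > 1 and fields[1] == "kB":
            value *= 1024
        info[key.strip()] = value
    return info

# Only read the real file where it exists (Linux).
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    for key in ("MemTotal", "MemAvailable", "SwapTotal", "SwapFree"):
        print(key, mem.get(key, 0) // (1024 ** 2), "MiB")
```

A `SwapTotal` of 0 MiB together with a low `MemAvailable` while the data loader is starting would be consistent with the error above.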
from resnext.pytorch.
Fix is almost always:
echo 1 > /proc/sys/vm/overcommit_memory
https://stackoverflow.com/a/52311756/1391392
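Before changing the setting, it can help to see which overcommit mode the kernel is currently using. The sketch below reads the same file the `echo` command writes to; `describe_overcommit` is a hypothetical helper name, and the mode descriptions paraphrase the Linux kernel documentation. Writing to the file requires root, and the change does not persist across reboots unless it is also set through sysctl configuration.

```python
import os

# Meaning of vm.overcommit_memory values, per the Linux kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the commit limit",
}

def describe_overcommit(raw):
    """Turn the raw file content into (mode, description)."""
    mode = int(raw.strip())
    return mode, OVERCOMMIT_MODES.get(mode, "unknown mode")

path = "/proc/sys/vm/overcommit_memory"
# Only read the real file where it exists (Linux).
if os.path.exists(path):
    with open(path) as f:
        print(describe_overcommit(f.read()))
```

In mode 0 the kernel may refuse a `fork()` from a large process even when free memory exists, which would be consistent with the reports in this thread of the error appearing with plenty of RAM left.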
from resnext.pytorch.
Yeah, exactly!
Meanwhile, I collected the output of the lspci command (for an NVIDIA 1080 Ti):
bansa01@vita:~/pytorch_resnext/tmp$ lspci -v -s 89:00.0
89:00.0 VGA compatible controller: NVIDIA Corporation Device 1b06 (rev a1) (prog-if 00 [VGA controller])
Subsystem: ZOTAC International (MCO) Ltd. Device 1470
Flags: bus master, fast devsel, latency 0, IRQ 105
Memory at f4000000 (32-bit, non-prefetchable) [size=16M]
Memory at 2ff80000000 (64-bit, prefetchable) [size=256M]
Memory at 2ff90000000 (64-bit, prefetchable) [size=32M]
I/O ports at b000 [size=128]
[virtual] Expansion ROM at f5000000 [disabled] [size=512K]
Capabilities: <access denied>
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_384_drm, nvidia_384
Does this give us any information as to where we might be going wrong? Can I change anything myself (given that I have root permission) that could help prevent this issue?
from resnext.pytorch.
Have you fixed that? I am facing the same issue.
from resnext.pytorch.
I am also running into the same problem, although I am running everything on a CPU. I have more than enough memory (the error occurs when I'm using only 10G out of 32G)
from resnext.pytorch.
Hi! As @ZhuFengdaaa confirms, this seems to be a peak memory problem, although I am not able to reproduce it. Again as @ZhuFengdaaa suggests, it seems to be linked to the number of data loader workers (also see https://discuss.pytorch.org/t/guidelines-for-assigning-num-workers-to-dataloader/813/6).
from resnext.pytorch.
Another related thread: ruotianluo/self-critical.pytorch#11
from resnext.pytorch.
OSError: [Errno 12] Cannot allocate memory
sounds more like a RAM problem, not a GPU problem. Check you have enough RAM/SWAP, and the correct user permissions.
What does this have to do with permissions, and what should I do about them?
from resnext.pytorch.
Why would swap be needed? It slows everything down.
I am having the same problem while trying to allocate 30 GB even though I have 1 TB free...
from resnext.pytorch.
This problem comes from CPU (host) memory allocation. Check your system RAM.
from resnext.pytorch.