Comments (6)
Are the tensors you are saving views of a bigger tensor? In that case, torch will save the whole tensor. You can avoid that by calling .clone() on all tensor views before saving.
Yes. state_dict_save_during_training contains 64 parameters, and I have tried to save every parameter to a .bin file with the code below:

for k, v in state_dict_save_during_training.items():
    torch.save(v, f'./gpu_{k}.bin')
    torch.save(v.cpu(), f'./cpu_{k}.bin')

Every gpu_{k}.bin has a size of 1.6 GB, while cpu_{k}.bin is much smaller. I want to know: is there any difference between using gpu_{k}.bin and cpu_{k}.bin for testing and subsequent training?
There should not be any difference in the tensors' data; you'll just discard some metadata (like the tensor's CUDA device) by using cpu_{k}.bin. As I said above, you can also try to call .clone() instead of .cpu() and see if that fixes the problem.
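The effect of .clone() on a view can be sketched in a few lines (a minimal illustration, not from the thread; sizes assume float32 and a smaller tensor for speed):

```python
import torch

x = torch.ones(10_000_000)  # 10M float32 values, 40 MB of storage
y = x[:5]                   # a view: shares x's backing storage
z = y.clone()               # a copy: owns its own 5-element storage

# The view still references the full 40 MB backing storage...
print(y.untyped_storage().nbytes())  # 40000000
# ...while the clone only holds its own 5 elements.
print(z.untyped_storage().nbytes())  # 20
```

Since torch.save serializes a tensor's backing storage, this storage size is what ends up on disk.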
I tried .clone():

state_dict_save_during_training = torch.load('./ip_adapter_weight.bin')  # load weights saved during training
state_dict_to_cpu = {k: v.clone() for k, v in state_dict_save_during_training.items()}  # clone every tensor
torch.save(state_dict_to_cpu, './temp_state_dict_cloned.bin')  # save the cloned weights

The resulting temp_state_dict_cloned.bin is 98 MB. Thanks! But I am still surprised that metadata is so large!
It's not metadata; your saved tensors seem to be views into a bigger tensor.

x = torch.ones(1000**3)
y = x[:5]  # y is a view of the first 5 entries of x

Calling torch.save(y, ...) will still save the whole x tensor.
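The file-size effect described above can be reproduced end to end (a minimal sketch using a smaller tensor and temporary files; the file names are arbitrary):

```python
import os
import tempfile

import torch

x = torch.ones(10_000_000)  # 40 MB of float32 data
y = x[:5]                   # view into x

with tempfile.TemporaryDirectory() as d:
    view_path = os.path.join(d, 'view.bin')
    clone_path = os.path.join(d, 'clone.bin')
    torch.save(y, view_path)           # serializes x's whole backing storage
    torch.save(y.clone(), clone_path)  # serializes only the 5 cloned elements
    print(os.path.getsize(view_path))   # roughly 40 MB
    print(os.path.getsize(clone_path))  # roughly 1 KB of format overhead
```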
Thank you very much, I understand!