i404788 / s5-pytorch
PyTorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5)
License: Mozilla Public License 2.0
I created two instances of the S5 model and tried to print the gradients of both:
from s5 import S5
import torch.nn as nn
import torch

input_dim, output_dim, state_width, sequence_length = 1, 1, 1, 1
x1 = torch.randn(1, 1, 1)
y1 = torch.randn(1, 1, 1)
x2 = torch.randn(1, 1, 1)
y2 = torch.randn(1, 1, 1)

# Two independent S5 instances, each trained on its own toy target.
model1 = S5(width=input_dim, state_width=state_width)
model2 = S5(width=input_dim, state_width=state_width)
criterion = nn.MSELoss()
loss1 = criterion(model1(x1), y1)
loss2 = criterion(model2(x2), y2)
loss1.backward()
loss2.backward()

# Print every parameter together with its gradient.
for name, param in model1.named_parameters():
    print(name, param.data, param.grad)
for name, param in model2.named_parameters():
    print(name, param.data, param.grad)
The output is as follows:
seq.Lambda tensor([-0.5000+0.j]) tensor([-3.8413e-08-6.8010e-07j])
seq.B tensor([[[0.1249, 0.0000]]]) tensor([[[-0.0001, -0.0022]]])
seq.C tensor([[0.0452-0.8008j]]) tensor([[-0.0003+0.j]])
seq.D tensor([0.9082]) tensor([-0.5527])
seq.log_step tensor([-5.3020]) tensor([-1.5514e-05])
seq.Lambda tensor([-0.5000+0.j]) None
seq.B tensor([[[-0.4863, 0.0000]]]) None
seq.C tensor([[0.2791+0.8068j]]) tensor([[-0.0208+0.j]])
seq.D tensor([0.5738]) tensor([4.2298])
seq.log_step tensor([-4.5913]) None
I don't understand why some of model2's gradients are missing.
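For anyone hitting the same thing, a quick sanity check (plain PyTorch, nothing S5-specific) makes the missing entries explicit; in general, param.grad stays None when a parameter has requires_grad=False or simply never entered the graph that produced the loss:

# Flag parameters that did not receive a gradient after backward().
for name, param in model2.named_parameters():
    got_grad = param.grad is not None
    print(f"{name}: requires_grad={param.requires_grad}, grad={'ok' if got_grad else 'MISSING'}")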
Hi there,
Thanks for publishing this library!
This is a question rather than an issue. I'm using […]
Rather than passing the last latent representation to a FF layer, I can also get the SSM to predict […]
[Edit] Nvm, I think this functionality is exposed in S5.forward(). [/Edit]
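For anyone landing here later, a minimal sketch of that usage, assuming the shape convention used elsewhere in these issues (S5 maps [batch, seq_len, width] to [batch, seq_len, width]); the sizes below are illustrative:

import torch
from s5 import S5

model = S5(width=32, state_width=64)   # illustrative sizes
x = torch.randn(2, 256, 32)            # [batch, seq_len, width]
y = model(x)                           # same shape: one output per timestep
y_last = y[:, -1]                      # or keep only the final step for a FF head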
Thanks for the good work. Doesn't the encoder come before the S5 layer, or am I missing something here?
Line 352 in 5aad8a1
I tried running the following script and found that S5 is far slower than PyTorch's LSTM. Is this supposed to be the case? Perhaps the scale at which I'm testing it is too small to realize the benefit?
from datetime import datetime
import os

import torch
from s5 import S5

os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

L = 1200
B = 256
x_dim = 128

m = S5(x_dim, 512).cuda()
# batch_first=True so the LSTM consumes the same (B, L, x_dim) layout as S5.
lstm = torch.nn.LSTM(x_dim, 512, batch_first=True).cuda()
x = torch.randn(B, L, x_dim).cuda()

t0 = datetime.now()
for i in range(10):
    y, _ = lstm(x)
    torch.sum(y).backward()
t1 = datetime.now()
print(t1 - t0)

t2 = datetime.now()
for i in range(10):
    y = m(x)
    torch.sum(y).backward()
t3 = datetime.now()
print(t3 - t2)
I would greatly appreciate any comment on this. Thanks in advance, and thanks for the implementation!
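One caveat before reading too much into the numbers (a general CUDA-benchmarking note, not anything specific to this repo): kernels launch asynchronously, so datetime.now() may not measure what you expect. Timing with CUDA events and explicit synchronization, e.g. as sketched below, gives a fairer comparison:

import torch

def time_cuda(fn, iters=10):
    # Time fn() on the GPU with explicit synchronization; returns milliseconds.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()   # drain pending work before starting the clock
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()   # wait until the timed work has actually finished
    return start.elapsed_time(end)

# Reusing m, lstm, x from the script above:
# print(time_cuda(lambda: torch.sum(lstm(x)[0]).backward()))
# print(time_cuda(lambda: torch.sum(m(x)).backward()))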
Hi,
I was trying to use the S5Block with an Adam optimizer with weight decay. However, I hit a strange bug where the sizes of parameters and gradients mismatch. The error only occurs with CUDA tensors/models, and only when weight_decay is enabled. Below is a minimal script that reproduces the bug:
from s5 import S5Block
import torch

x = torch.randn(16, 64, 256).cuda()
a = S5Block(256, 128, False).cuda()
a.train()

# h = torch.optim.Adam(a.parameters(), lr=0.001)  # this works
h = torch.optim.Adam(a.parameters(), lr=0.001, weight_decay=0.0001)  # this doesn't work

out = a(x)  # x is already on the GPU
out.sum().backward()
h.step()
After a lot of digging I found the cause: the handling of complex parameters on the device is faulty in _multi_tensor_adam in the newest PyTorch release, 2.0.1. Specifically, line 442 of torch/optim/adam.py used the wrong variable when computing the weight decay.
This seems to have been fixed on May 9 with this commit, so a newer PyTorch version should work; in 2.0.1, however, it remains broken.
Just posting this here in case anyone else is having this issue.
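Until upgrading is an option, one workaround that should dodge the faulty path (my assumption, since the bug lives in _multi_tensor_adam) is to force the single-tensor implementation via Adam's foreach flag:

# foreach=False selects the single-tensor Adam path, bypassing
# _multi_tensor_adam and its broken complex weight-decay handling in 2.0.1.
h = torch.optim.Adam(a.parameters(), lr=0.001, weight_decay=0.0001, foreach=False)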
Hello, how can I use prev_state in the apply_ssm function? I see it is currently purely feed-forward.
Ideally I would want x, states = s5(x, states), where apply_ssm carries the state across calls so that I can train with memory (see the sketch below).
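For context, here is the training pattern I have in mind, written against the hypothetical stateful signature s5(x, state) -> (y, state) from above (the library's actual API may differ); the key detail is detaching the carried state at chunk boundaries so backpropagation stays within one chunk:

state = None
for chunk, target in chunked_batches:   # hypothetical loader yielding consecutive chunks
    y, state = s5(chunk, state)         # hypothetical stateful call
    loss = criterion(y, target)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    state = state.detach()              # truncate BPTT at the chunk boundary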
Dear @i404788,
do you perhaps have an S5 variant that works with irregularly sampled data, i.e. one that can be trained on it, as in the pendulum task?
If so, could you share that branch?
Thanks.
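Not the author, but one lead from the code itself: S5.forward takes a step_scale argument (visible in the traceback in the next issue, where a scalar is broadcast over the batch), so irregular sampling might be expressible as a time-step scale. Whether anything beyond a scalar is supported is an assumption on my part:

# Hypothetical: shrink or stretch the SSM discretization step to match
# the sampling interval; only a scalar step_scale is demonstrably supported.
y = model(x, step_scale=0.5)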
I cloned the repo, and running the code as-is from the example.py file fails with RuntimeError: accessing `data` under vmap transform is not allowed.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
File /opt/conda/lib/python3.11/site-packages/torchinfo/torchinfo.py:295, in forward_pass(model, x, batch_dim, cache_forward_pass, device, mode, **kwargs)
    294 if isinstance(x, (list, tuple)):
--> 295     _ = model(*x, **kwargs)
    296 elif isinstance(x, dict):
File /opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
   1517 else:
-> 1518     return self._call_impl(*args, **kwargs)
File /opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py:1568, in Module._call_impl(self, *args, **kwargs)
   1566 args = bw_hook.setup_input_hook(args)
-> 1568 result = forward_call(*args, **kwargs)
   1569 if _global_forward_hooks or self._forward_hooks:
File ~/ML-data/s5_pytorch/s5_model.py:426, in S5Block.forward(self, x)
    425 res = fx.clone()
--> 426 x = F.gelu(self.s5(fx)) + res
    427 x = self.attn_dropout(x)
File /opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
   1517 else:
-> 1518     return self._call_impl(*args, **kwargs)
File /opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py:1568, in Module._call_impl(self, *args, **kwargs)
   1566 args = bw_hook.setup_input_hook(args)
-> 1568 result = forward_call(*args, **kwargs)
   1569 if _global_forward_hooks or self._forward_hooks:
File ~/ML-data/s5_pytorch/s5_model.py:377, in S5.forward(self, signal, step_scale)
    375 step_scale = torch.ones(signal.shape[0], device=signal.device) * step_scale
--> 377 return torch.vmap(lambda s, ss: self.seq(s, step_scale=ss))(signal, step_scale)
File /opt/conda/lib/python3.11/site-packages/torch/_functorch/apis.py:188, in vmap.<locals>.wrapped(*args, **kwargs)
    187 def wrapped(*args, **kwargs):
--> 188     return vmap_impl(func, in_dims, out_dims, randomness, chunk_size, *args, **kwargs)
File /opt/conda/lib/python3.11/site-packages/torch/_functorch/vmap.py:266, in vmap_impl(func, in_dims, out_dims, randomness, chunk_size, *args, **kwargs)
    265 # If chunk_size is not specified.
--> 266 return _flat_vmap(
    267     func, batch_size, flat_in_dims, flat_args, args_spec, out_dims, randomness, **kwargs
    268 )
File /opt/conda/lib/python3.11/site-packages/torch/_functorch/vmap.py:38, in doesnt_support_saved_tensors_hooks.<locals>.fn(*args, **kwargs)
     37 with torch.autograd.graph.disable_saved_tensors_hooks(message):
---> 38     return f(*args, **kwargs)
File /opt/conda/lib/python3.11/site-packages/torch/_functorch/vmap.py:379, in _flat_vmap(func, batch_size, flat_in_dims, flat_args, args_spec, out_dims, randomness, **kwargs)
    378 batched_inputs = _create_batched_inputs(flat_in_dims, flat_args, vmap_level, args_spec)
--> 379 batched_outputs = func(*batched_inputs, **kwargs)
    380 return _unwrap_batched(batched_outputs, out_dims, vmap_level, batch_size, func)
File ~/ML-data/s5_pytorch/s5_model.py:377, in S5.forward.<locals>.<lambda>(s, ss)
    375 step_scale = torch.ones(signal.shape[0], device=signal.device) * step_scale
--> 377 return torch.vmap(lambda s, ss: self.seq(s, step_scale=ss))(signal, step_scale)
File /opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
   1517 else:
-> 1518     return self._call_impl(*args, **kwargs)
File /opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py:1581, in Module._call_impl(self, *args, **kwargs)
   1580 else:
-> 1581     hook_result = hook(self, args, result)
   1583 if hook_result is not None:
File /opt/conda/lib/python3.11/site-packages/torchinfo/torchinfo.py:597, in construct_hook.<locals>.hook(module, inputs, outputs)
    596 info.calculate_num_params()
--> 597 info.input_size, _ = info.calculate_size(inputs, batch_dim)
    598 info.output_size, elem_bytes = info.calculate_size(outputs, batch_dim)
File /opt/conda/lib/python3.11/site-packages/torchinfo/layer_info.py:104, in LayerInfo.calculate_size(inputs, batch_dim)
    100 # pack_padded_seq and pad_packed_seq store feature into data attribute
    101 elif (
    102     isinstance(inputs, (list, tuple))
    103     and inputs
--> 104     and hasattr(inputs[0], "data")
    105     and hasattr(inputs[0].data, "size")
    106 ):
    107     size = list(inputs[0].data.size())
RuntimeError: accessing `data` under vmap transform is not allowed

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
Cell In[7], line 11
      8 # model = S5(32, 32)
      9 model = S5Block(dim, 512, block_count=8, bidir=False)
---> 11 print(torchinfo.summary(model, (2, 8192, dim), device='cpu', depth=5))
     13 for i in range(5):
     14     y = model(x)
File /opt/conda/lib/python3.11/site-packages/torchinfo/torchinfo.py:223, in summary(model, input_size, input_data, batch_dim, cache_forward_pass, col_names, col_width, depth, device, dtypes, mode, row_settings, verbose, **kwargs)
    216 validate_user_params(
    217     input_data, input_size, columns, col_width, device, dtypes, verbose
    218 )
    220 x, correct_input_size = process_input(
    221     input_data, input_size, batch_dim, device, dtypes
    222 )
--> 223 summary_list = forward_pass(
    224     model, x, batch_dim, cache_forward_pass, device, model_mode, **kwargs
    225 )
    226 formatting = FormattingOptions(depth, verbose, columns, col_width, rows)
    227 results = ModelStatistics(
    228     summary_list, correct_input_size, get_total_memory_used(x), formatting
    229 )
File /opt/conda/lib/python3.11/site-packages/torchinfo/torchinfo.py:304, in forward_pass(model, x, batch_dim, cache_forward_pass, device, mode, **kwargs)
    302 except Exception as e:
    303     executed_layers = [layer for layer in summary_list if layer.executed]
--> 304     raise RuntimeError(
    305         "Failed to run torchinfo. See above stack traces for more details. "
    306         f"Executed layers up to: {executed_layers}"
    307     ) from e
    308 finally:
    309     if hooks:
RuntimeError: Failed to run torchinfo. See above stack traces for more details. Executed layers up to: [LayerNorm: 1]
Other toy models tested with torchinfo.summary() work without issue. Any idea?
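One workaround while this incompatibility stands: the failure comes from torchinfo's shape hooks touching .data inside torch.vmap, so skipping the forward shape pass and counting parameters directly with plain PyTorch avoids the crash:

# Parameter counts without running torchinfo's hooked forward pass.
n_params = sum(p.numel() for p in model.parameters())
n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"params: {n_params:,} (trainable: {n_trainable:,})")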