Comments (25)
I'm getting an error on the positional embedding when using the default settings. Do we need to specify any additional parameters when calling the run_class_finetuning.py script?
pos_embed_used = self.pos_embed[:, input_chans] if input_chans is not None else self.pos_embed
                 ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable
I think you should set abs_pos_emb to True in the args. I recommend using the script provided in the README:
OMP_NUM_THREADS=1 torchrun --nnodes=1 --nproc_per_node=8 run_class_finetuning.py \
    --output_dir ./checkpoints/finetune_tuab_base/ \
    --log_dir ./log/finetune_tuab_base \
    --model labram_base_patch200_200 \
    --finetune ./checkpoints/labram-base.pth \
    --weight_decay 0.05 \
    --batch_size 64 \
    --lr 5e-4 \
    --update_freq 1 \
    --warmup_epochs 5 \
    --epochs 50 \
    --layer_decay 0.65 \
    --drop_path 0.1 \
    --dist_eval \
    --save_ckpt_freq 5 \
    --disable_rel_pos_bias \
    --abs_pos_emb \
    --dataset TUAB \
    --disable_qkv_bias \
    --seed 0
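The traceback in the question is consistent with the absolute positional embedding never being created when `--abs_pos_emb` is left off. Below is a minimal pure-Python sketch of that failure mode. It assumes (this thread does not confirm it) that the model leaves `pos_embed` as `None` unless the flag is set. `ToyEncoder` and its fields are hypothetical stand-ins, not LaBraM code.

```python
class ToyEncoder:
    """Hypothetical stand-in for a model with an optional positional embedding."""

    def __init__(self, num_positions, use_abs_pos_emb):
        # Assumption: the table only exists when the flag is enabled.
        self.pos_embed = list(range(num_positions)) if use_abs_pos_emb else None

    def select_positions(self, input_chans=None):
        if self.pos_embed is None:
            # Fail early with an actionable message instead of a bare TypeError.
            raise ValueError("pos_embed is None; enable the absolute positional embedding")
        if input_chans is None:
            return self.pos_embed
        return [self.pos_embed[i] for i in input_chans]


enabled = ToyEncoder(num_positions=8, use_abs_pos_emb=True)
print(enabled.select_positions([0, 3, 5]))  # prints [0, 3, 5]

disabled = ToyEncoder(num_positions=8, use_abs_pos_emb=False)
try:
    disabled.select_positions([0, 3, 5])
except ValueError as err:
    print(err)
```

An explicit check like this turns the opaque "'NoneType' object is not subscriptable" into a message that points at the missing setting.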
from labram.
I'm sorry for the inconvenience and I appreciate your suggestion. I will add some annotations for better understanding in the following days.
from labram.
Small update @935963004: I think we solved the issue with the number of channels and the positional embedding (I opened a PR).
We are still facing problems with use_rel_pos_bias and use_shared_rel_pos_bias.
from labram.
I think you should set use_abs_pos_emb=True, use_rel_pos_bias=False, use_shared_rel_pos_bias=False. Does this work for you?
from labram.
Yes @935963004, with PR #2 it will work, but I don't really understand the reason for having options that aren't used anywhere. I am also not sure whether the modifications are okay for you.
Another thing I was wondering about is how the patches are built. Currently there is no patch construction within the network, i.e. the network already expects input of shape [batch, n_chans, num_patch, patch size]. Why is this not learned during training, as in ViT or BEiT (1, 2 or 3)?
I really appreciate your input on this! 🙏
from labram.
The input x is [batch, n_chans, num_patch, patch size]. In TemporalConv, x is first reshaped to [batch, n_chans * num_patch, patch size]. Then, to use torch.nn.Conv2d(), x is unsqueezed to [batch, 1, n_chans * num_patch, patch size], where 1 is the in_chans (the counterpart of the RGB channel axis for images), so it is fixed and cannot be changed. After several convolutional layers, x is transformed back to [batch, n_chans * num_patch, patch size], which can be passed into the Transformer encoder as input. Does this explanation help?
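The reshapes described here can be traced with plain shape arithmetic. This sketch covers the bookkeeping only, with no real convolution. `temporal_conv_shapes` is a hypothetical helper, and the example sizes (23 electrodes, 4 patches of 200 samples) are made up for illustration. The final shape is the one stated in the comment, not something verified against the code.

```python
def temporal_conv_shapes(batch, n_chans, num_patch, patch_size):
    """Return the tensor shapes at each step described in the comment."""
    tokens = n_chans * num_patch  # electrodes and patches are merged into one axis
    return {
        "input": (batch, n_chans, num_patch, patch_size),
        "flattened": (batch, tokens, patch_size),
        # The singleton axis is the Conv2d in_chans (the image-channel counterpart).
        "conv_input": (batch, 1, tokens, patch_size),
        # Shape after the convolutions, as stated in the thread.
        "encoder_input": (batch, tokens, patch_size),
    }


shapes = temporal_conv_shapes(batch=2, n_chans=23, num_patch=4, patch_size=200)
for name, shape in shapes.items():
    print(f"{name}: {shape}")
```

Changing `n_chans` only changes the token count (`n_chans * num_patch`); the singleton convolution channel stays at 1.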
from labram.
@935963004 Is the number of input channels always expected to be 1? In the code, TemporalConv is only called with in_chan = 1.
from labram.
Exactly
from labram.
@935963004 I am a bit confused, what about multi-channel EEGs?
from labram.
The input x [batch, n_chans, num_patch, patch size] is the multi-channel EEG. in_chan and n_chans are two different things: in_chan exists only for the convolution operation and is therefore set to 1 (we simply reshape the original input from [batch, n_chans, num_patch, patch size] to [batch, 1, n_chans * num_patch, patch size]), while n_chans is the number of electrodes in the multi-channel EEG.
from labram.
@935963004 Thank you very much! @bruAristimunha I guess we are good without my changes then.
from labram.
Ok, thanks @935963004 and @RashikShahjahan!
One last thing from me: could you please clean up the code a little, or add docstrings to the model?
The names of the variables within the model are not very obvious, and I'm pretty sure this will lead other users to open more issues or send emails to you or to the other authors.
I truly understand and appreciate all the effort you've put into your model, and I also understand that during development some decisions are not always optimal. However, I would like to thank you in advance for any effort you can make to ensure easier reproduction.
Have a nice day!
from labram.
Hey @935963004,
I have some more questions for you:
In the temporal embedding, you define the embedding with a space of 16 items. What is the reason for choosing this number? I couldn't find it anywhere in the code or the paper. It looks like you always have the same number of patches; is this correct? It seems to be linked to the number of patches, but I'm not sure.
The same question applies to the position embedding. There seem to be always 128 positions, and I couldn't work out the math behind these numbers.
https://github.com/935963004/LaBraM/blob/main/modeling_finetune.py#L283
from labram.
These numbers are set to meet the maximum requirements of our paper. In fact, you can set them to any number if you like as long as they meet your maximum requirements.
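One way to read this answer: the embedding tables are upper bounds, so an input only has to fit inside them. The sketch below checks that, using the sizes mentioned in the question (16 temporal slots, 128 positions). It assumes one position per electrode and one temporal slot per patch, which is an interpretation rather than something stated in the thread. `check_embedding_capacity` is a hypothetical helper, not part of the repository.

```python
def check_embedding_capacity(n_chans, num_patch, max_positions=128, max_time_slots=16):
    """Return a list of problems; an empty list means the input fits."""
    problems = []
    if n_chans > max_positions:
        problems.append(f"{n_chans} electrodes exceed {max_positions} positions")
    if num_patch > max_time_slots:
        problems.append(f"{num_patch} patches exceed {max_time_slots} temporal slots")
    return problems


print(check_embedding_capacity(n_chans=23, num_patch=10))  # prints []
print(check_embedding_capacity(n_chans=23, num_patch=20))
```

If a dataset exceeds either bound, the corresponding table size would have to be raised to cover it, as the reply suggests.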
from labram.
Hi all, I'm also facing problems with reproduction.
I am currently not working on the TUH EEG datasets but am hoping to be able to use the LaBraM embeddings for other BCI tasks.
What is the proper format for inputs to the dataset maker and to the model?
from labram.
There are various ways to plug in your own dataset. Just make sure the dataloader and ch_names fit our implementation. You can refer to run_class_finetuning.py and replace the get_dataset function with your own.
from labram.
Have you made any progress, in terms of processing your own dataset?
from labram.
Yes! With some tweaks to the provided script I was able to process my own dataset. I was working on the MindBigData dataset but unfortunately I was not able to get good results for the task in the dataset. I think it could be due to insufficient signals in the dataset for the task, or that the embeddings were not suitable for the dataset. I was only able to get about 30+% accuracy in a 10-class classification. Better than pure chance but not good enough for anything major I think.
from labram.
I think this may be because the pretrained model was never trained on your task, so the accuracy leaves something to be desired. Try pre-training on your own data. Also, what should I do to feed my own data into the original model? For example, using .cnt files for a classification task.
from labram.
Hello, could you tell me how you made those adjustments?
from labram.
I first converted my dataset into signal and label pairs, and then defined a new Dataset and a loader function myself:
```python
import os
import pickle

import numpy as np
import torch


class MIND2BLoader(torch.utils.data.Dataset):
    def __init__(self, root, files, sampling_rate=128):
        self.root = root
        self.files = files
        self.default_rate = 128
        self.sampling_rate = sampling_rate

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        # each pickle file holds one sample: {"signal": ..., "label": ...}
        with open(os.path.join(self.root, self.files[index]), "rb") as f:
            sample = pickle.load(f)
        X = sample["signal"]
        Y = int(sample["label"])
        X = torch.FloatTensor(X)
        return X, Y


def prepare_MIND_2B_dataset(root):
    # set random seed
    seed = 4523
    np.random.seed(seed)

    train_files = os.listdir(os.path.join(root, "train"))
    val_files = os.listdir(os.path.join(root, "val"))
    test_files = os.listdir(os.path.join(root, "test"))

    # prepare training and test data loader
    train_dataset = MIND2BLoader(
        os.path.join(
            root, "train"), train_files, sampling_rate=128
    )
    # NOTE (as posted): "val" is paired with test_files here,
    # and "test" with val_files below
    test_dataset = MIND2BLoader(
        os.path.join(
            root, "val"), test_files, sampling_rate=128
    )
    val_dataset = MIND2BLoader(
        os.path.join(
            root, "test"), val_files, sampling_rate=128
    )
    print(len(train_files), len(val_files), len(test_files))
    return train_dataset, test_dataset, val_dataset
```
from labram.
    test_dataset = MIND2BLoader(
        os.path.join(
            root, "val"), test_files, sampling_rate=128
    )
    val_dataset = MIND2BLoader(
        os.path.join(
            root, "test"), val_files, sampling_rate=128
    )
Are these two folders named incorrectly, val and test?
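If the folder and file-list mismatch flagged here is unintended, one way to rule it out is to list each split's files and its directory in the same step. This sketch assumes that `root` contains `train`, `val`, and `test` subfolders, as in the posted snippet. `list_split_files` is a hypothetical helper.

```python
import os


def list_split_files(root, splits=("train", "val", "test")):
    """Map each split name to (directory, sorted file names) from that same directory."""
    result = {}
    for split in splits:
        split_dir = os.path.join(root, split)
        # Files always come from the directory they are paired with.
        result[split] = (split_dir, sorted(os.listdir(split_dir)))
    return result
```

Each pair can then be passed on to the dataset class, for example `MIND2BLoader(*list_split_files(root)["val"])`, so the `val` files can only be combined with the `val` folder.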
from labram.
Could you tell me how your data labels are set up? Do they start from 0? (I have four labels: 1, 2, 3, 4.)
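For context on the 0-based question: PyTorch's `torch.nn.CrossEntropyLoss` expects class indices in the range 0 to C-1 when given integer targets, so labels such as 1 to 4 are commonly remapped first. Whether this repository's finetuning script requires that for a given dataset is not confirmed in this thread. A minimal sketch with a hypothetical helper `remap_labels`:

```python
def remap_labels(labels):
    """Map arbitrary integer labels to contiguous indices starting at 0."""
    classes = sorted(set(labels))
    to_index = {label: idx for idx, label in enumerate(classes)}
    return [to_index[label] for label in labels], to_index


indices, mapping = remap_labels([1, 2, 3, 4, 2, 1])
print(indices)  # prints [0, 1, 2, 3, 1, 0]
print(mapping)  # prints {1: 0, 2: 1, 3: 2, 4: 3}
```

Keeping the returned mapping makes it possible to translate predictions back to the original label values.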
from labram.