
RuntimeError: Given groups=1, weight of size [48, 37, 11], expected input[8, 691, 18] to have 37 channels, but got 691 channels instead (issue on transformer, open, 6 comments)

Abdelsater commented on June 11, 2024
RuntimeError: Given groups=1, weight of size [48, 37, 11], expected input[8, 691, 18] to have 37 channels, but got 691 channels instead
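(For context: this error comes from a 1-D convolution. PyTorch's nn.Conv1d expects input of shape (batch, channels, length), and a weight of size [48, 37, 11] means 37 input channels, 48 output channels and kernel size 11. The minimal reproduction below is illustrative, not code from the repo.)

import torch
import torch.nn as nn

conv = nn.Conv1d(in_channels=37, out_channels=48, kernel_size=11)
ok = torch.zeros(8, 37, 672)   # (batch, channels, length): works
print(conv(ok).shape)          # torch.Size([8, 48, 662])
bad = torch.zeros(8, 691, 18)  # 691 on the channel axis: raises the RuntimeError above
conv(bad)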


Comments (6)

maxjcohen commented on June 11, 2024

Hi, as you can see from what was printed before the error, your input vector has shape (8, 18, 691), whereas you would like an input of shape (batch_size, time_length, d_input). My guess is that you tried to combine R with Z along the time axis in your data preprocessing, which would explain the size 691 = 672 + 19 of your current input vector.

I suggest double-checking your data processing function, and concatenating Z and R on the d_input dimension, so as to obtain an input vector of shape (7500, 18+19=37, 672). Note that you may need to broadcast R.
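(A minimal sketch of that concatenation, assuming, as the shapes in this thread suggest, that Z holds 18 observed series of length 672 and R holds 19 static features per sample; only the array names come from the thread.)

import numpy as np

m, K = 7500, 672                            # samples, time-series length
Z = np.zeros((m, 18, K), dtype=np.float32)  # observed time series
R = np.zeros((m, 19), dtype=np.float32)     # static features, one value per sample

# Broadcast R along the time axis, then stack it with Z on the feature axis.
R_time = np.repeat(R[:, :, np.newaxis], K, axis=2)  # (7500, 19, 672)
inputs = np.concatenate([Z, R_time], axis=1)        # (7500, 37, 672)
print(inputs.shape)                                 # (7500, 37, 672)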


Abdelsater commented on June 11, 2024

Thank you for your clarification, but I am still not quite sure how to address it. This is how the function converts the dataset from csv to npz:

import json
from os import path

import numpy as np
import pandas as pd

TIME_SERIES_LENGTH = 672  # 672 in this dataset (see shapes below)

def csv2npz(dataset_x_path, dataset_y_path, output_path, filename, labels_path='labels.json'):
    """Load input dataset from csv and create x_train tensor."""
    # Load dataset as csv
    x = pd.read_csv(dataset_x_path)
    y = pd.read_csv(dataset_y_path)

    # Load labels, file can be found in challenge description
    with open(labels_path, "r") as stream_json:
        labels = json.load(stream_json)

    m = x.shape[0]
    K = TIME_SERIES_LENGTH  # Can be found through csv

    # Create R and Z
    R = x[labels["R"]].values
    R = np.tile(R, 672)
    R = R.astype(np.float32)

    X = y[[f"{var_name}_{i}" for var_name in labels["X"]
           for i in range(K)]]
    X = X.values.reshape((m, -1, K))
    X = X.astype(np.float32)

    Z = x[[f"{var_name}_{i}" for var_name in labels["Z"]
           for i in range(K)]]
    Z = Z.values.reshape((m, -1, K))
#     Z = Z.transpose((0, 2, 1))
    Z = Z.astype(np.float32)

    np.savez(path.join(output_path, filename), R=R, X=X, Z=Z)
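(One detail worth flagging in the function above: np.tile(R, 672) flattens R from (m, 19) to (m, 19*672) = (m, 12768), which matches the R shape reported further down. A sketch of a variant that keeps R three-dimensional instead, assuming 19 static features and K = 672 as the shapes in this thread suggest:)

# Keep R as (samples, features, time) instead of flattening it with np.tile:
R = x[labels["R"]].values.astype(np.float32)   # (m, 19)
R = np.repeat(R[:, :, np.newaxis], K, axis=2)  # (m, 19, 672)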

My input and output dimensions after reading the csv files look like the following:

d_input : 37
d_output : 8

Could you please point out the problem, or edit the code directly? Thank you again for sharing the code and for your support in troubleshooting.


maxjcohen commented on June 11, 2024

My input and output dimensions after reading the csv files look like the following:

d_input : 37
d_output : 8

This seems good to me, so the problem probably isn't in the csv2npz function, but rather in your dataloader and how it handles the R and Z variables. Sorry for the late answer; please tell me if that helps.
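(A quick way to check this, sketched here with the dataloader names from the notebook further down: pull one batch and print its shape before it ever reaches the network.)

# Inspect one batch before training; the network expects d_input = 37 features.
x, y = next(iter(dataloader_train))
print(x.shape)  # expected (batch, time, d_input) = (8, 672, 37) per the earlier comment
print(y.shape)  # should end in d_output = 8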


Abdelsater commented on June 11, 2024

As I stated before, my npz dataset dimensions look like this:

[('R', (7500, 12768), dtype('float32')), ('X', (7500, 8, 672), dtype('float32')), ('Z', (7500, 18, 672), dtype('float32'))]

The benchmark notebook that I am trying to use is taken from your repo on GitHub; the relevant code, including the dataloader, looks like this:

import datetime

import numpy as np
from matplotlib import pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from tqdm import tqdm
import seaborn as sns

from tst.loss import OZELoss

from src.benchmark import BiGRU, ConvGru
from src.dataset import OzeDataset
from src.utils import compute_loss
from src.visualization import map_plot_function, plot_values_distribution, plot_error_distribution, plot_errors_threshold, plot_visual_sample
# Training parameters
DATASET_PATH = 'Output-Dataset.npz'
BATCH_SIZE = 8
NUM_WORKERS = 4
LR = 1e-4
EPOCHS = 30

# Model parameters
d_model = 48 # Latent dim
N = 2 # Number of layers
dropout = 0.2 # Dropout rate

d_input = 37 # From dataset
d_output = 8 # From dataset

# Config
sns.set()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device {device}")
# prints: Using device cuda:0

# Training
# Load dataset
ozeDataset = OzeDataset(DATASET_PATH)

dataset_train, dataset_val, dataset_test = random_split(ozeDataset, (5500, 1000, 1000))
dataloader_train = DataLoader(dataset_train,
                              batch_size=BATCH_SIZE,
                              shuffle=True,
                              num_workers=NUM_WORKERS,
                              pin_memory=False
                             )

dataloader_val = DataLoader(dataset_val,
                            batch_size=BATCH_SIZE,
                            shuffle=True,
                            num_workers=NUM_WORKERS
                           )

dataloader_test = DataLoader(dataset_test,
                             batch_size=BATCH_SIZE,
                             shuffle=False,
                             num_workers=NUM_WORKERS
                            )
# Load network: ConvGru with Adam optimizer and OZE loss function
net = ConvGru(d_input, d_model, d_output, N, dropout=dropout, bidirectional=True).to(device)
optimizer = optim.Adam(net.parameters(), lr=LR)
loss_function = OZELoss(alpha=0.3)
# Train
model_save_path = f'models/model_LSTM_{datetime.datetime.now().strftime("%Y_%m_%d__%H%M%S")}.pth'
val_loss_best = np.inf

# Prepare loss history
hist_loss = np.zeros(EPOCHS)
hist_loss_val = np.zeros(EPOCHS)
for idx_epoch in range(EPOCHS):
    running_loss = 0
    with tqdm(total=len(dataloader_train.dataset), desc=f"[Epoch {idx_epoch+1:3d}/{EPOCHS}]") as pbar:
        for idx_batch, (x, y) in enumerate(dataloader_train):
            optimizer.zero_grad()

            # Propagate input
            netout = net(x.to(device))

            # Compute loss
            loss = loss_function(y.to(device), netout)

            # Backpropagate loss
            loss.backward()

            # Update weights
            optimizer.step()

            running_loss += loss.item()
            pbar.set_postfix({'loss': running_loss/(idx_batch+1)})
            pbar.update(x.shape[0])
        
        train_loss = running_loss/len(dataloader_train)
        val_loss = compute_loss(net, dataloader_val, loss_function, device).item()
        pbar.set_postfix({'loss': train_loss, 'val_loss': val_loss})
        
        hist_loss[idx_epoch] = train_loss
        hist_loss_val[idx_epoch] = val_loss
        
        if val_loss < val_loss_best:
            val_loss_best = val_loss
            torch.save(net.state_dict(), model_save_path)
        
plt.plot(hist_loss, 'o-', label='train')
plt.plot(hist_loss_val, 'o-', label='val')
plt.legend()
print(f"model exported to {model_save_path} with loss {val_loss_best:5f}")'

The error that I am getting is the following:

RuntimeError: Given groups=1, weight of size [48, 37, 11], expected input[8, 13440, 18] to have 37 channels, but got 13440 channels instead
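(The npz stores R flattened to (7500, 12768), which is likely why the channel axis ends up wrong here. Below is a hedged sketch of a Dataset that rebuilds the (time, d_input) layout from the saved arrays; it is illustrative only, and the real OzeDataset implementation may differ.)

import numpy as np
import torch
from torch.utils.data import Dataset

class NpzOzeDataset(Dataset):
    """Illustrative dataset: rebuilds (672, 37) inputs from the npz above."""

    def __init__(self, npz_path, K=672):
        data = np.load(npz_path)
        m = data["Z"].shape[0]
        Z = data["Z"]                                # (m, 18, 672)
        R = data["R"].reshape(m, K, -1)              # (m, 672, 19), undoing np.tile
        R = R.transpose(0, 2, 1)                     # (m, 19, 672)
        x = np.concatenate([Z, R], axis=1)           # (m, 37, 672)
        # (batch, time, d_input) layout, per the maintainer's note above
        self._x = torch.from_numpy(x.transpose(0, 2, 1).copy())          # (m, 672, 37)
        self._y = torch.from_numpy(data["X"].transpose(0, 2, 1).copy())  # (m, 672, 8)

    def __len__(self):
        return len(self._x)

    def __getitem__(self, idx):
        return self._x[idx], self._y[idx]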

This error is giving me a hard time, since I had already tried several transformations before. But since you confirmed the same input and output dimensions, how can we make this work? By the way, I tried the original benchmark using the csv files directly and it worked; the code looks like this:

import datetime

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from tqdm import tqdm
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from pathlib import Path
import sys
import psutil

from src.dataset import OzeDataset, OzeEvaluationDataset, OzeNPZDataset
from src.utils import npz_check, compute_loss, csv2npz
from src.model import BenchmarkLSTM
BATCH_SIZE = 100
# NUM_WORKERS = psutil.cpu_count() # Use this to get number of logical processing units
NUM_WORKERS = psutil.cpu_count(logical=False) # Use this to get number of physical Cores
LR = 1e-2
EPOCHS = 30
HIDDEN_DIM = 100
NUM_LAYERS = 3

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device {device}")

#dataset = OzeNPZDataset(dataset_path=npz_check(Path('datasets'), 'dataset'), labels_path="labels.json")
dataset = OzeDataset(dataset_x_path="Datasets/x_train_LsAZgHU.csv", dataset_y_path="Datasets/y_train_EFo1WyE.csv", labels_path="labels.json")
#K = dataset.time_series_length
K= 672

# More info about memory pinning here: https://pytorch.org/docs/stable/data.html#memory-pinning
is_cuda = device == torch.device("cuda:0")
num_workers = 0 if is_cuda else NUM_WORKERS
dataloader = DataLoader(dataset,
                        batch_size=BATCH_SIZE,
                        shuffle=True,
                        pin_memory=is_cuda,
                        num_workers=num_workers)

m, M = dataloader.dataset.m, dataloader.dataset.M

d_input = dataset.get_x_shape()[2]  # From dataset
print('d_input : {}'.format(d_input))
d_output = dataset.get_y_shape()[2]  # From dataset
print('d_output : {}'.format(d_output))
# Load benchmark network with Adam optimizer and MSE loss function
net = BenchmarkLSTM(input_dim=d_input, hidden_dim=HIDDEN_DIM, output_dim=d_output, num_layers=NUM_LAYERS).to(device)
loss_function = nn.MSELoss()
optimizer = optim.Adam(net.parameters(), lr=LR)

model_save_path = f'model_{datetime.datetime.now().strftime("%Y_%m_%d__%H%M%S")}.pth'

def fit():
    """
    Fits selected network
    """
    loss_best = np.inf
    # Prepare loss history
    hist_loss = np.zeros(EPOCHS)
    for idx_epoch in range(EPOCHS):
        running_loss = 0
        with tqdm(total=len(dataloader.dataset), desc=f"[Epoch {idx_epoch+1:3d}/{EPOCHS}]") as pbar:
            for idx_batch, (inp, out) in enumerate(dataloader):
                optimizer.zero_grad()

                # Propagate input
                net_out = net(inp.to(device))

                # Compute loss
                loss = loss_function(out.to(device), net_out)

                # Backpropagate loss
                loss.backward()

                # Update weights
                optimizer.step()

                running_loss += loss.item()
                pbar.set_postfix({'loss': running_loss/(idx_batch+1)})
                pbar.update(inp.shape[0])

            train_loss = running_loss/len(dataloader)
            pbar.set_postfix({'loss': train_loss})

            hist_loss[idx_epoch] = train_loss

            if train_loss < loss_best:
                loss_best = train_loss
                torch.save(net.state_dict(), model_save_path)
    print(f"\nmodel exported to {model_save_path} with loss {loss_best:5f}")
    return hist_loss

try:
    hist_loss = fit()
except RuntimeError as err:
    if str(err).startswith('CUDA out of memory.'):
        print('\nSwitching device to cpu to workaround CUDA out of memory problem.')
        device = torch.device("cpu")
        net = net.to(device)
        dataloader = DataLoader(dataset,
                                batch_size=BATCH_SIZE,
                                shuffle=True,
                                pin_memory=False,
                                num_workers=NUM_WORKERS)
        hist_loss = fit()
    else:
        sys.exit()

plt.plot(hist_loss, 'o-', label='train')
plt.legend()

Thank you for debugging this with me. My goal is to re-run your experiment so that I can build my own transformer in the end, so understanding your experiment will help me a lot.

