Simple Code Snippets to load different datasets

Hi,
It would be great if you can provide code snippets to load datasets from the h5 file for all different datasets. (As it helps how the dataset is stored and how we can split it into train, test set etc)

Unable to Get How its Mesh Invariant??

Hi,
I went through the paper and codebase. I have doubts about how it is invariant to mesh size if we train on 48x48 then how to transfer the solution to 85x85 like superresolution in FNO??

Code running error

Setting up PyTorch plugin "filtered_lrelu_plugin"... Failed!

Trouble reproducing results shown in the paper

Overview

For whatever reason, I can not seem to reproduce the results that are shown in the paper. I have a feeling that I am doing something specific that is messing up results as I can not get the same accuracy for UNet, FNO, nor CNO. I am using my own setup, so not a clone of the repo, but I am using the same models and setup, so I should get the same results. Unfortunately, my results for any of the models for the NS case is minimum 10% error, and not 3% error like shown in the paper. In any case, I will outline what exactly is my setup and if there is a glaring mistake, I would appreciate any feedback.

Data

Dataset: Navier Stokes 64x64 dataset. (Assume this is the case for everything below)

Data splits: I split up the data into training and testing like what you do :

# Import data
f = h5py.File('../NavierStokes_64x64_IN.h5', 'r')
x = []
y = []
for key in f.keys():
    x.append(f[key]['input'])
    y.append(f[key]['output'])
X = np.array(x)
y = np.array(y)

# Permute axis
X = X[:, np.newaxis, ...]
y = y[:, np.newaxis, ...]

# Split data up into train, val, test
X_train, y_train = X[:768], y[:768]
X_valid, y_valid = X[768:768 + 128], y[768:768 + 128]
X_test, y_test = X[768 + 128:768 + 128*2], y[768 + 128:768 + 128*2]

# Transform data (it seems you this this normalization based off your code)
min_data, max_data = np.min(X_train), np.max(X_train)
#min_model, max_model = np.min(y_train), np.max(y_train) (this is not used - I only apply the transformation on X)

class NormalizeMinMax(torch.nn.Module):
    def __init__(self, img_min, img_max):
        self.img_min = img_min
        self.img_max = img_max
        
    def __call__(self, img):
        new_img = (img - self.img_min) / (self.img_max - self.img_min)
        return new_img
    
transform = transforms.Compose(
    [
        NormalizeMinMax(min_data, max_data),
    ]
)

# Convert to tensors
X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.float32)

X_valid = torch.tensor(X_valid, dtype=torch.float32)
y_valid = torch.tensor(y_valid, dtype=torch.float32)

X_train = transform(X_train)
X_valid = transform(X_valid)

Hyperparameters

Here, I use the models supplied in the repository with a PyTorch Lightning setup. Below, I will provide some psuedocode to explain the hyperparameter/optimization setups. I apply the LR scheduler per epoch so each epoch the LR becomes LR * 0.98.

loss_function = nn.L1Loss() # CNO and UNet
loss_function = nn.SmoothL1Loss() # FNO

batch_size = 32 # CNO and FNO
batch_size = 10 # UNet

optimizer = torch.optim.AdamW(self.parameters(), lr=1e-3, weight_decay=1e-6) # FNO
optimizer = torch.optim.AdamW(self.parameters(), lr=1e-3, weight_decay=1e-10) # CNO
optimizer = torch.optim.AdamW(self.parameters(), lr=5e-4, weight_decay=1e-6) # UNet

scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.98) #unet, cno, and fno

Model Setup

Here are some code snippets that show how I initialize each of the models. Each of these are using the best performing hyperparameters you mention in the paper.

# CNO
class CNO2d(nn.Module):
    def __init__(self,
                in_dim = 1,                    # Number of input channels.
                out_dim = 1,                   # Number of input channels.
                size = 64,                      # Input and Output spatial size (required )
                N_layers = 3,                  # Number of (D) or (U) blocks in the network
                N_res = 1,                 # Number of (R) blocks per level (except the neck)
                N_res_neck = 8,            # Number of (R) blocks in the neck
                channel_multiplier = 32,   # How the number of channels evolve?
                use_bn = False,             # Add BN? We do not add BN in lifting/projection layer
                ):
# FNO
class FNO2d(nn.Module):
    def __init__(self, in_channels = 1, out_channels = 1, device=None):
        super(FNO2d, self).__init__()

        """
        The overall network. It contains 4 layers of the Fourier layer.
        1. Lift the input to the desire channel dimension by self.fc0 .
        2. 4 layers of the integral operators u' = (W + K)(u).
            W defined by self.w; K defined by self.conv .
        3. Project from the channel space to the output space by self.fc1 and self.fc2 .

        input: the solution of the coefficient function and locations (a(x, y), x, y)
        input shape: (batchsize, x=s, y=s, c=3)
        output: the solution
        output shape: (batchsize, x=s, y=s, c=1)
        """
        self.modes1 = 16 #16
        self.modes2 = 16 #16
        self.width = 128 #64
        self.n_layers = 5 #5
        self.retrain_fno = 4 #4
        self.padding = 0 #0
        self.include_grid = 1 #1
        self.input_dim = in_channels
        self.act  = nn.LeakyReLU() 
        self.device = device
 # UNet
 class UNet(nn.Module):
    def __init__(self, n_channels = 1, n_classes = 1, bilinear=False):
        super(UNet, self).__init__()
        self.n_channels = n_channels
        self.n_classes = n_classes
        self.bilinear = bilinear

        self.inc = (DoubleConv(n_channels, 64))
        self.down1 = (Down(64, 128))
        self.down2 = (Down(128, 256))
        self.down3 = (Down(256, 512))
        factor = 2 if bilinear else 1
        self.down4 = (Down(512, 1024 // factor))
        self.up1 = (Up(1024, 512 // factor, bilinear))
        self.up2 = (Up(512, 256 // factor, bilinear))
        self.up3 = (Up(256, 128 // factor, bilinear))
        self.up4 = (Up(128, 64, bilinear))
        self.outc = (OutConv(64, n_classes))

Error Calculations

Finally, I compute the Relative L1 Error as mentioned in your paper. I Apply the below function for each sample in my test set via a simple enumerated for loop, iterating through y_predicted.

X_test = torch.tensor(X_test, dtype=torch.float32)
X_test = transform(X_test)
test_loader = DataLoader(list(X_test), shuffle=False, batch_size=1)
y_predicted = trainer.predict(network, test_loader)

def relative_l1_error(y_pred, y_true):
    return torch.mean(torch.abs(y_pred - y_true)) / np.mean(np.abs(y_true))

Results

Now, comes the questions. As explained above, I can not get results lower than 10% on any of the models. I am attempting to get the 2.5-3.5% Relative L1 errors outlined in the paper for all CNO, FNO, and UNet.

My training accuracy for the SmoothL1Loss goes down to 1e-5 and the validation accuracy goes down to 0.005; however, if I use the L1Loss, I can not seem to break 10% and the loss just seems to stagnate, even with the hyperparameters given.

All models were trained without early stopping and up to 1000 epochs, but most of their loss stopped converging so I stopped it much sooner.

Conclusion

If there is anything within this setup that is clearly an issue, I would appreciate the feedback. If my understanding of any of the results are also incorrect, that would be good to know as well. My main questions personally lie within the model setups as the code in the repo and the paper were quite different, so I was not sure which setup was intended. In this case, I assumed the paper to be the correct setup and followed each table regardless of the code in the repo. Thank you very much for taking the time to read this until now - I hope that the fix is quite simple and that it was just a simple mistake on my part - but regardless, I am interested to hear what you all say. Have a great rest of your day :)

bogdanraonic3 / convolutionalneuraloperator Goto Github PK

convolutionalneuraloperator's Issues

Simple Code Snippets to load different datasets

Unable to Get How its Mesh Invariant??

Code running error

Trouble reproducing results shown in the paper

Overview

Data

Hyperparameters

Model Setup

Error Calculations

Results

Conclusion

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent