stefanonardo / pytorch-esn

An Echo State Network module for PyTorch.

License: MIT License

pytorch esn machine-learning echo-state-networks deep-learning recurrent-neural-networks neural-networks reservoir-computing python

pytorch-esn's Introduction

PyTorch-ESN

PyTorch-ESN is a PyTorch module, written in Python, implementing Echo State Networks with leaky-integrated units. The implementation with more than one layer is based on DeepESN. The readout can be trained offline by ridge regression or online with PyTorch's optimizers.

Its development started as part of my master's thesis, "An Empirical Comparison of Recurrent Neural Networks on Sequence Modeling", supervised by Prof. Alessio Micheli and Dr. Claudio Gallicchio at the University of Pisa.

Prerequisites

  • PyTorch

Basic Usage

Offline training (ridge regression)

SVD

Mini-batch mode is not allowed with this method.

from torchesn.nn import ESN
from torchesn.utils import prepare_target

# prepare target matrix for offline training
flat_target = prepare_target(target, seq_lengths, washout)

model = ESN(input_size, hidden_size, output_size)

# train
model(input, washout, hidden, flat_target)

# inference
output, hidden = model(input, washout, hidden)
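To make the example above runnable end to end, here is a minimal sketch with every piece filled in. The sizes are illustrative, the layout is the default (seq_len, batch, input_size), and washout is a list with one entry per batch sample:

import torch
from torchesn.nn import ESN
from torchesn.utils import prepare_target

input_size, hidden_size, output_size = 1, 100, 1
seq_len, batch = 200, 1

input = torch.rand(seq_len, batch, input_size)
target = torch.rand(seq_len, batch, output_size)
seq_lengths = [seq_len] * batch
washout = [30] * batch  # discard the first 30 reservoir states of each sample

# flatten the target so it lines up with the post-washout states
flat_target = prepare_target(target, seq_lengths, washout)

model = ESN(input_size, hidden_size, output_size)

# train: passing the target fits the readout in a single forward pass
model(input, washout, None, flat_target)

# inference: washed-out steps are dropped, so output has seq_len - 30 steps
output, hidden = model(input, washout)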

Cholesky or inverse

from torchesn.nn import ESN
from torchesn.utils import prepare_target

# prepare target matrix for offline training
flat_target = prepare_target(target, seq_lengths, washout)

model = ESN(input_size, hidden_size, output_size, readout_training='cholesky')

# accumulate matrices for ridge regression
for batch in batch_iter:
    model(batch, washout[batch], hidden, flat_target)

# train
model.fit()

# inference
output, hidden = model(input, washout, hidden)
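A note on the flow above: with readout_training='cholesky' (or 'inv'), each forward call that receives a target only accumulates the matrices of the ridge-regression system; the readout weights are solved only when model.fit() is called, so any number of mini-batches can be streamed before fitting.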

Classification tasks

For classification, just use one of the previous methods and pass 'mean' or 'last' as the output_steps argument.

model = ESN(input_size, hidden_size, output_size, output_steps='mean')

For more information see docstrings or section 4.7 of "A Practical Guide to Applying Echo State Networks" by Mantas Lukoševičius.
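As a rough sketch of the alternative (n_classes is a placeholder for your class count), 'last' trains the readout only on the final reservoir state of each sequence:

model = ESN(input_size, hidden_size, n_classes, output_steps='last')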

Online training (PyTorch optimizer)

Same as any other PyTorch module; a minimal sketch follows.
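The loop below is illustrative (sizes, data, and loss are placeholders); with readout_training='gd' the readout is an ordinary trainable layer, so the usual PyTorch pattern applies:

import torch
from torchesn.nn import ESN

model = ESN(input_size, hidden_size, output_size, readout_training='gd')
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(epochs):
    # the forward pass returns outputs for the post-washout steps only
    output, hidden = model(input, washout)
    loss = criterion(output, target[washout[0]:])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()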

pytorch-esn's People

Contributors

jmmaroli, jnschnkr, stefanonardo, xinzezhang


pytorch-esn's Issues

Insights about improving accuracy for MNIST dataset

If I want to improve classification accuracy on the MNIST dataset (say, to around 95%), do I need to tune the hyperparameters, or do I need to change anything in the implementation? Can you give some insight into what I may need to improve?

risky judgment on solving the output with 'inv' method

Hi, stefanonardo.

In your source file torchesn/nn/echo_state_network.py, at line 248:

if torch.det(A) != 0:

Using the determinant to judge whether the matrix A has an inverse is sound in principle. However, det(A) can be very small while not exactly equal to 0, so the implementation above can treat a numerically singular A as invertible.

An improvement could be:

col = X.size(1)
orig_rank = torch.matrix_rank(A).item()
tag = 'Inverse' if orig_rank == col else 'Pseudo-inverse'

if tag == 'Inverse':
    W = torch.mm(torch.inverse(A), self.XTy).t()
else:
    W = torch.mm(torch.pinverse(A), self.XTy).t()
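It may also be worth noting that on recent PyTorch versions these helpers live under torch.linalg (torch.linalg.matrix_rank, torch.linalg.pinv), and the system can be solved without forming an explicit inverse at all, which is generally safer numerically. A sketch, reusing the A and self.XTy names from the snippet above:

# Least-squares solve of A @ W.t() = XTy; the default CPU driver ('gelsy')
# also handles rank-deficient A, covering both branches above.
W = torch.linalg.lstsq(A, self.XTy).solution.t()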

How to handle multiple data points in one sample

Thank you for your sharing, but I don't know how to handle data shaped (time_step, batch, input_size), where time_step is the number of data points per sample, time_step differs from sample to sample, and input_size = 2.
I also don't know the meaning of "for i in steps" in the Recurrent function: does it mean 5000 samples, or a time_step of 5000?
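(For reference: in the default time-first (seq_len, batch, input_size) layout, the loop over steps in Recurrent iterates over time, so a value of 5000 there is the sequence length, not the number of samples.)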

Regarding classification task

We have run the code and put the features into a tensor object for classification, but we are getting the following exception: AttributeError: 'int' object has no attribute 'dim'. Can you help us understand why this exception occurs? These are the lines at which it is raised:

if input.dim() == 2 and bias is not None:
    # fused op is marginally faster

How to efficiently predict chaotic time series, like Lorenz?

Hi,
Referring to the examples you provide, I have successfully written code to predict the Lorenz time series, though the result is terrible. Um... I still don't understand some details, as follows.

  1. Why do we set the washout variable? For example, when we set washout to 10 and try to use trX (length 30) to predict trY, we get an output of length 20 (30 - 10 = 20). BTW, I have read your comments in echo_state_network.py, but still don't understand the reason.

  2. How is the 'Deep' concept realized? I have glanced at the DeepESN paper, and it seems that multiple reservoirs are connected to make the structure deeper. I have read your code, but cannot see this idea.

I am looking forward to your reply. :) @stefanonardo

Regarding Results Evaluation

We have run your algorithm and got the following results on the sample model:

tensor([[ 0.4779],
        [ 0.2621],
        [ 0.1902],
        ...,
        [ 0.3912],
        [ 0.3592],
        [ 0.3210]])

We do not know how to get the classification accuracy from this; please guide us, we would be really thankful to you.
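One way to read such scores (a sketch assuming a binary task with one score per example in [0, 1]; labels is a hypothetical tensor of ground-truth 0/1 classes):

pred = (output > 0.5).long().view(-1)              # threshold scores into predicted classes
accuracy = (pred == labels).float().mean().item()  # fraction of correct predictions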

ONNX inference assistance

I managed to train an ESN model for time-series prediction and exported it to an ONNX file. Later, during inference, it ran into an issue.

[screenshot of the ONNX inference error]

'aux' is a reserved filename on Windows

'/torchesn/utils/aux.py' is not valid on Windows because 'aux' is a reserved filename, so the module cannot be used or installed.

And since version 0.4.0, PyTorch officially supports Windows, so I think it would be better to change the filename.

Problems with mackey example

Here is what I've found:

1.

if dtype == torch.double:
    data = np.loadtxt('../../datasets/mg17.csv', delimiter=',', dtype=np.float64)
elif dtype == torch.float:
    data = np.loadtxt('../../datasets/mg17.csv', delimiter=',', dtype=np.float32)

These paths to the dataset are wrong, since it is in the same directory, so it should be:

if dtype == torch.double:
    data = np.loadtxt('datasets/mg17.csv', delimiter=',', dtype=np.float64)
elif dtype == torch.float:
    data = np.loadtxt('datasets/mg17.csv', delimiter=',', dtype=np.float32)

2. Next:

X_data = np.expand_dims(data[:, [0]], axis=1)
Y_data = np.expand_dims(data[:, [1]], axis=1)

I guess it should be

X_data = np.expand_dims(data[:, [1]], axis=1)
Y_data = np.expand_dims(data[:, [0]], axis=1)

or am I wrong?

3. When launching with these fixes I'm getting

Training error: 0.051316608186617714
Test error: 0.052949286276803606

That's quite big. Moreover, when I try to plot the real and predicted outputs on the test data, I get the following:

[plot: 5000 test outputs]

[plot: closer look at 100 outputs]

Am I doing something wrong?

How to train the ESN in online mode?

Using svd to train on the Mackey-Glass data, the error can decrease to 3.2e-11.

When I switch to gradient descent, after 100 epochs the error is only about 1e-4. I have no idea how to tune the parameters to make the error drop that low. Can anybody help?

    ## transform into dataset class
    train_dataset = torch.utils.data.dataset.TensorDataset(trX, trY)
    test_dataset = torch.utils.data.dataset.TensorDataset(tsX, tsY)

    ## transform dataset into dataloader
    train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=1024, shuffle=False)
    test_dataloader = torch.utils.data.DataLoader(test_dataset, batch_size=1024, shuffle=False)


    model = ESN(input_size, hidden_size, output_size, readout_training='gd')
    model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=0)
    epochs = 100
    for epoch in range(epochs):
        hidden = None
        train_loss = 0
        for batch in train_dataloader:
            x, y = batch
            x = x.to(device)
            y = y.to(device)
            output, hidden = model(x, washout, hidden)
            loss = loss_fcn(output, y[washout[0]:])
            opt.zero_grad()
            loss.backward()
            opt.step()
            train_loss += loss.item()
        print("Training error:", train_loss/len(train_dataloader))

Error: TypeError: 'module' object is not callable

Hi,
I recently wanted to predict time-series data using reservoir computing and have referred to your code mackey-glass.py.
My PyTorch version is '0.4.0', so some code lines, like "torch.device" and "to(device)", have been replaced because of version problems. But it still raises an error, as follows.

Traceback (most recent call last):
  File "mackey-glass.py", line 37, in <module>
    trY_flat = utils.prepare_target(trY.clone(), [trX.size(0)], washout)
  File "/usr/local/lib/python3.5/dist-packages/pytorch_esn-1.2.1-py3.5.egg/torchesn/utils/utilities.py", line 25, in prepare_target
TypeError: 'module' object is not callable

BTW, I have tried moving the trX and trY data to CPU/GPU, but it did not work.

It would be helpful if you could give me some advice. Thanks :)

Thesis/Reference Paper access

Hi, could you please post a link to your master thesis - "An Empirical Comparison of Recurrent Neural Networks on Sequence Modeling"?

The grad is None, please help me.

I don't know why the grad is None. This is my net code (CNN + ESN):

class CNN_ESN(nn.Module):
    def __init__(self, output_size, channel_num, device, drop_prob=0.5):
        super(CNN_ESN, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channel_num, 64, (5, 5), padding='same'),
            nn.ReLU(),
            nn.Conv2d(64, 128, (4, 4), padding='same'),
            nn.ReLU(),
            nn.Conv2d(128, 256, (4, 4), padding='same'),
            nn.ReLU(),
            nn.Conv2d(256, 64, (1, 1), padding='same'),
            nn.ReLU(),
            nn.MaxPool2d((2, 2)),
            nn.Flatten(),
            nn.Linear(1024, 512)
        ).to(device)
        self.esn = ESN(4, 128, 128, output_steps='mean', readout_training='svd').to(device)
        self.fc = nn.Linear(128, output_size).to(device)
        self.dropout = nn.Dropout(drop_prob)
        self.sig = nn.Sigmoid()
        self.washout_rate = 0.2

    def forward(self, x):
        conv_result = self.conv(x).reshape(128, x.shape[0], -1)
        conv_result = self.dropout(conv_result)
        washout_lst = [int(self.washout_rate * conv_result.size(0))] * conv_result.size(1)
        out, hn = self.esn(conv_result, washout_lst)
        hn = hn.transpose(1, 0)
        logit = self.fc(hn).squeeze(1)
        return self.sig(logit).squeeze(1)

    for inputs, labels in train_loader: 
        net.train()
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()  
        output = net(inputs) 
        pred = torch.round(output)
        loss = loss_func(output, labels)
        pred_list = [float(i) for i in pred.tolist()]
        for p, l in zip(pred_list, labels.tolist()):
            tr_pre.append(int(p))
            tr_lab.append(l)
        tr_loss.append(loss.item())
        loss.backward()
        optimizer.step()

and I printed the grads as below; this is the result at the 2nd epoch:

conv.0.weight None
conv.0.bias None
conv.2.weight None
conv.2.bias None
conv.4.weight None
conv.4.bias None
conv.6.weight None
conv.6.bias None
conv.10.weight None
conv.10.bias None
esn.reservoir.weight_ih_l0 None
esn.reservoir.weight_hh_l0 None
esn.reservoir.bias_ih_l0 None
esn.readout.weight None
esn.readout.bias None
fc.weight tensor([[ 0.0480, -0.0079,  0.0929,  0.1358,  0.1022, -0.1127,  0.0495,  0.1056,
                    ...,
                   -0.1013, -0.0392, -0.1062,  0.1276, -0.0686, -0.1200, -0.0669,  0.0424]])
fc.bias tensor([0.1520])

Could you tell me the reason?
