ethanfetaya / nri
Neural relational inference for interacting systems - pytorch
License: MIT License
Hi,
Great paper. Do you plan to make your code compatible with PyTorch 1.x?
Thanks.
Following Section 5.1 of the original paper, I use the code by Laszuk (https://github.com/laszukdawid/Dynamical-systems/blob/master/kuramoto.py) to simulate the Kuramoto model. The settings are as follows:
N = 5 # number of particles
intrinsic frequencies \omega uniformly sampled from [1, 10)
initial phases \phi uniformly sampled from [0, 2\pi)
coupling constants k_{ij} = 1 with probability 0.5
subsample factor = 10
length of trajectories T = 50
particle states x = (d\phi / dt, sin \phi, \omega)
For normalization, I use the function load_kuramoto_data from utils.py.
Some important settings of NRI are listed as follows.
encoder: CNN
decoder: MLP
skip_first = True
lr = 5e-4
prediction_step = 10 # teacher forcing at every 10th time step
It seems I've strictly followed the settings of the original paper, but the accuracy gets stuck at around 54%, and the MSE gets stuck at the level of 1e-1. There must be some mistake in my simulation or training. Do you have any advice? Would you mind providing a copy of the Kuramoto dataset to help me out?
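For reference, this is how I sample the settings above (a minimal numpy sketch; the integration itself is done by the linked kuramoto.py, and the symmetric coupling is my assumption):

import numpy as np

N = 5                                             # number of particles
omega = np.random.uniform(1.0, 10.0, size=N)      # intrinsic frequencies in [1, 10)
phi0 = np.random.uniform(0.0, 2 * np.pi, size=N)  # initial phases in [0, 2*pi)
K = (np.random.rand(N, N) < 0.5).astype(float)    # coupling constants k_ij = 1 w.p. 0.5
K = np.tril(K, -1) + np.tril(K, -1).T             # symmetric, zero diagonal (assumed undirected)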
Thanks for your amazing code. How can I plot beautiful trajectories like your Figure 1?
First, thanks a lot for sharing this great repo.
I have two questions about the computation of relation prediction accuracy:
1. The reported accuracy changes with the batch-size parameter (however, it should not be influenced by batch-size because the model does not change), especially when the number of test examples is not very large. The reason could be that not all batches have batch-size examples (if num_test_example % batch-size != 0). I feel it would be better for edge_accuracy() in utils.py to return the average accuracy together with the number of examples in the batch, and then compute the (weighted) average in the main script by taking the division.
2. Should the accuracy be computed as max(acc, 1.0-acc)? Besides, I wonder whether you have some ideas on computing the accuracy in the multiple (>2) relation case? (The current edge_accuracy() function seems only suitable for the two-relation case.)
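On the first point, a sketch of the suggested change (the name edge_accuracy_counts is mine, not the repo's): return raw counts per batch and take the ratio once over the whole test set. It also works unchanged for more than two edge types.

def edge_accuracy_counts(preds, target):
    # preds: [batch, num_edges, num_edge_types] logits; target: [batch, num_edges]
    _, hard = preds.max(-1)                           # argmax over edge types
    correct = hard.eq(target.view_as(hard).long()).sum().item()
    return correct, target.size(0) * target.size(1)

# In the main script, accumulate and divide once:
#   total_correct += correct; total_edges += count
#   acc = total_correct / total_edges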
flake8 testing of https://github.com/ethanfetaya/NRI on Python 3.6.3
$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics
./utils.py:459:24: F821 undefined name 'args'
const = np.log(args.edge_types)
^
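A likely minimal fix, assuming the constant is meant to be the log of the number of edge types: pass that number in as a parameter instead of reading the global args, which is undefined inside utils.py (the helper name here is illustrative, not the repo's):

import numpy as np

def uniform_prior_const(num_edge_types):
    # log(K) term for a uniform prior over K edge types; previously this
    # was computed as np.log(args.edge_types) with no `args` in scope.
    return np.log(num_edge_types)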
In Appendix A.2, unsupervised learning was done:
To test whether our model can infer an empty graph, we create a test set of 1000 simulations with 5 non-interacting particles and test an unsupervised NRI model which was trained on the spring simulation dataset with 5 particles as before. We find that it achieves an accuracy of 98.4% in identifying "no interaction" edges (i.e. the empty graph).
Can someone point out how to do unsupervised learning with the code in this repo?
What does edge_type mean?
What does the argument num_atoms mean in the code? The term atom does not appear in the paper.
Many thanks for the interesting work.
Indeed, I am trying to use your model on large biological graphs (more than 10K nodes) but I am facing memory limits.
Basically, you are using the one-hot encoding for all the edges in a fully connected graph to exchange the messages and to facilitate the optimization of the ELBO. For very large graphs such encoding is not an option.
I tried using sparse tensors but the missing strides for torch.matmul (requires contiguous representation for the data) and the unsupported broadcasting for matrix multiplication with torch.mm limited my efforts to patch your implementation.
Do you have please an idea on how we could extend the application of your model on large graphs?
Thank you very much in advance.
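For a sense of scale (a back-of-the-envelope sketch, assuming float32 and the repo's dense one-hot edge encoding):

num_nodes = 10_000
num_edges = num_nodes * (num_nodes - 1)   # ~1e8 directed edges in a fully connected graph
rel_rec_entries = num_edges * num_nodes   # one-hot matrix of shape [num_edges, num_nodes]
print(rel_rec_entries * 4 / 1e12)         # ~4.0 TB for rel_rec alone at 4 bytes/entry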
For a system in which two particles either interact or not, such as the spring experiments, if we use z_{ij} = [0,1] to denote interaction and z_{ij} = [1,0] to denote non-interaction (no message between nodes i and j), should the decoder only consider the interaction edge type, i.e., h^t_{(i,j)} = z_{ij,1} f_e([x^t_i, x^t_j]), since no message passes along a non-interaction edge?
It doesn't look like the temperature is annealed in your Gumbel softmax. Is there a reason for this, as annealing is the standard practice? @tkipf
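For reference, the schedule from the Gumbel-softmax paper (Jang et al., 2017) is a simple exponential decay clamped from below; a sketch, with all hyperparameter values illustrative rather than from this repo:

import math

def anneal_temperature(step, tau0=2.0, rate=1e-4, tau_min=0.5):
    # Exponentially decay the Gumbel-softmax temperature, clamped below.
    return max(tau_min, tau0 * math.exp(-rate * step))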
Dear ethanfetaya:
I studied the code of RNNDecoder and found some differences from Eq. (14)-(16) in your paper. In your code, you do not concatenate MSG and x as the input of the GRU, and there is no additional hidden state. Why? Which is right?
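For concreteness, here is my reading of Eq. (14)-(16) as a sketch (illustrative, not the repo's code; gru is an nn.GRUCell whose input size matches the concatenation, and f_out maps the hidden state back to the state dimension):

import torch

def rnn_decoder_step(x, msgs, hidden, gru, f_out):
    # One RNN-decoder step as the paper writes it:
    gru_input = torch.cat([msgs, x], dim=-1)  # GRU input is [MSG_j^t, x_j^t]
    hidden = gru(gru_input, hidden)           # Eq. (15): recurrent hidden-state update
    pred = x + f_out(hidden)                  # Eq. (16): predict the state delta
    return pred, hidden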
Hi, thanks for your really great code!
It seems you only implement the physics simulation datasets in your code. I want to apply it to reasoning in video/images, and I don't know the meaning of the npy files:
'edges_valid_springs5.npy' is (10000, 5, 5); what does the last (5, 5) mean for video?
'loc_valid_springs5.npy' is (10000, 49, 2, 5); what does the last (2, 5) mean for video?
'vel_valid_springs5.npy' is (10000, 49, 2, 5); what does the last (2, 5) mean for video?
Also, can those nodes be the output of region proposals like ROIAlign?
Looking forward to your reply.
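For what it's worth, my reading of those shapes, going by the generator's conventions (10000 simulations, 5 particles, 49 subsampled timesteps, 2 spatial dimensions):

import numpy as np

# edges: [num_sims, num_atoms, num_atoms]        -> 5x5 interaction matrix per simulation
# loc:   [num_sims, num_timesteps, 2, num_atoms] -> (x, y) position per particle
# vel:   [num_sims, num_timesteps, 2, num_atoms] -> (vx, vy) velocity per particle
edges = np.load('edges_valid_springs5.npy')  # (10000, 5, 5)
loc = np.load('loc_valid_springs5.npy')      # (10000, 49, 2, 5)
print(loc[0, :, :, 3])                       # full trajectory of particle 3 in simulation 0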
There is no supervision of edges during training. How do we know that the first type is the "existing edge" type and the second type is the "non-existing edge" type?
def edge_accuracy(preds, target):
    _, preds = preds.max(-1)  # preds: torch.Size([32, 20, 2]) -> after argmax: torch.Size([32, 20])
    correct = preds.float().data.eq(
        target.float().data.view_as(preds)).cpu().sum()
    return np.float(correct) / (target.size(0) * target.size(1))
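Since training is unsupervised, nothing forces a particular ordering of the latent edge types; one common remedy (a sketch, not from this repo) is to report accuracy maximized over relabelings of the types, which for two types reduces to max(acc, 1 - acc):

from itertools import permutations
import torch

def edge_accuracy_permuted(preds, target, num_types=2):
    # Accuracy maximized over relabelings of the latent edge types.
    _, hard = preds.max(-1)                 # [batch, num_edges] argmax labels
    target = target.view_as(hard).long()
    best = 0.0
    for perm in permutations(range(num_types)):
        relabeled = hard.clone()
        for old, new in enumerate(perm):
            relabeled[hard == old] = new
        best = max(best, relabeled.eq(target).float().mean().item())
    return best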
What does the logits shape mean?
logits = encoder(pts, rel_rec, rel_send)
logits - torch.Size([32, 182, 3])
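A hedged reading of that shape (14 atoms is my inference from the 182, not stated in the question):

num_atoms = 14
num_edges = num_atoms * (num_atoms - 1)  # all directed pairs, no self-edges
assert num_edges == 182                  # so logits is [batch=32, num_edges=182, num_edge_types=3]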
Why does the my_softmax function seem to normalize along the batch dimension instead of the class dimension?
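If the goal is simply to normalize over the edge-type dimension, recent PyTorch makes this explicit with the dim argument; a sketch of the intended behavior, without the axis transposing my_softmax does:

import torch
import torch.nn.functional as F

logits = torch.randn(32, 182, 3)   # [batch, num_edges, num_edge_types]
probs = F.softmax(logits, dim=-1)  # normalize over edge types, not over the batch
assert torch.allclose(probs.sum(-1), torch.ones(32, 182))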
The step x = F.elu(self.fc1(inputs)) raises an error. When using forward in the MLP class, the error says "mat1 and mat2 shapes cannot be multiplied (640x16 and 196x512)".
Hi,
I could generate the data using this command:
python generate_dataset.py
But when I want to run this command:
--simulation charged
It gives me this error:
error: '--simulation' is not recognized as an internal or external command, operable program or batch file.
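For what it's worth, that Windows error means --simulation was executed as its own command; the flag has to go on the same line as the script (assuming the flag name matches generate_dataset.py):

python generate_dataset.py --simulation charged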
In the test phase, the encoder sees ground-truth data that it should not see, resulting in higher accuracy. May I ask for some explanation?
Below is the code snippet of MLPDecoder.
I think prediction is ended with Eq. 11 in the paper.
I can't find the code of Eq. 12.
Am I missing something in this code??
Thanks in advance.
def single_step_forward(self, single_timestep_inputs, rel_rec, rel_send,
single_timestep_rel_type):
# single_timestep_inputs has shape
# [batch_size, num_timesteps, num_atoms, num_dims]
# single_timestep_rel_type has shape:
# [batch_size, num_timesteps, num_atoms*(num_atoms-1), num_edge_types]
# Node2edge
receivers = torch.matmul(rel_rec, single_timestep_inputs)
senders = torch.matmul(rel_send, single_timestep_inputs)
# Eq 10 [x_i^t, x_j^t] [#sims(batch_size), #tsteps_indexed, #edges, #dims*2]
pre_msg = torch.cat([senders, receivers], dim=-1)
# self.msg_out_shape = #node_features
all_msgs = Variable(torch.zeros(pre_msg.size(0), pre_msg.size(1),
pre_msg.size(2), self.msg_out_shape))
if single_timestep_inputs.is_cuda:
all_msgs = all_msgs.cuda()
if self.skip_first_edge_type:
start_idx = 1
else:
start_idx = 0
# Run separate MLP for every edge type
        # NOTE: To exclude one edge type, simply offset range by 1
# Eq 10 MLP
for i in range(start_idx, len(self.msg_fc2)):
msg = F.relu(self.msg_fc1[i](pre_msg))
msg = F.dropout(msg, p=self.dropout_prob)
msg = F.relu(self.msg_fc2[i](msg))
msg = msg * single_timestep_rel_type[:, :, :, i:i + 1] #element-wise product with broadcast
all_msgs += msg
# Aggregate all msgs to receiver
# Eq 11 / rel_rec [#edges, #nodes]
agg_msgs = all_msgs.transpose(-2, -1).matmul(rel_rec).transpose(-2, -1)
agg_msgs = agg_msgs.contiguous()
# Skip connection
aug_inputs = torch.cat([single_timestep_inputs, agg_msgs], dim=-1)
# Output MLP
pred = F.dropout(F.relu(self.out_fc1(aug_inputs)), p=self.dropout_prob)
pred = F.dropout(F.relu(self.out_fc2(pred)), p=self.dropout_prob)
pred = self.out_fc3(pred)
# Predict position/velocity difference / Eq 11 >> Where is Eq 12??
return single_timestep_inputs + pred
def forward(self, inputs, rel_type, rel_rec, rel_send, pred_steps=1):
# NOTE: Assumes that we have the same graph across all samples.
# Input shape: [num_sims, num_atoms, num_timesteps, num_dims] > [#sims, #tsteps, #nodes, #dims]
inputs = inputs.transpose(1, 2).contiguous()
sizes = [rel_type.size(0), inputs.size(1), rel_type.size(1),
rel_type.size(2)]
rel_type = rel_type.unsqueeze(1).expand(sizes)
time_steps = inputs.size(1)
assert (pred_steps <= time_steps)
preds = []
# Only take n-th timesteps as starting points (n: pred_steps)
last_pred = inputs[:, 0::pred_steps, :, :]
curr_rel_type = rel_type[:, 0::pred_steps, :, :]
# NOTE: Assumes rel_type is constant (i.e. same across all time steps).
# Run n prediction steps / Eq 10~11
for step in range(0, pred_steps):
last_pred = self.single_step_forward(last_pred, rel_rec, rel_send,
curr_rel_type)
preds.append(last_pred)
sizes = [preds[0].size(0), preds[0].size(1) * pred_steps,
preds[0].size(2), preds[0].size(3)]
output = Variable(torch.zeros(sizes))
if inputs.is_cuda:
output = output.cuda()
# Re-assemble correct timeline
for i in range(len(preds)):
output[:, i::pred_steps, :, :] = preds[i]
# last prediction is one step beyond input
pred_all = output[:, :(inputs.size(1) - 1), :, :]
return pred_all.transpose(1, 2).contiguous()
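On the Eq. 12 question above: as far as I can tell, the Gaussian output distribution never appears in the decoder; the decoder returns only the mean, and Eq. 12 is realized as the training loss. A sketch of such a fixed-variance Gaussian negative log-likelihood (this should match nll_gaussian in utils.py up to additive constants):

def nll_gaussian(preds, target, variance):
    # Eq. 12: N(mu, sigma^2 I) with fixed variance reduces, up to a
    # constant, to a scaled squared error between mean and target.
    neg_log_p = (preds - target) ** 2 / (2 * variance)
    return neg_log_p.sum() / (target.size(0) * target.size(1))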
Hello, thank you for your great work and nice code.
I saw the supplementary material, and it said that NRI can learn "known" 3 edge types (no-interaction, weak spring, strong spring).
In this sentence, does "known" mean that NRI can learn the relations only in a supervised manner, not in an unsupervised manner?
In the source code, is it right that relation-supervised training is not implemented?
Again, thank you for your great work!
Hello,
I have read the paper and the code and I'm fascinated about this tool and their possible applications.
In my biological set-up I have different objects from which I want to create an interaction graph. Unfortunately, not all biological objects have the same number of attributes: e.g., fibrines have their morphometry defined but not their phenotype, and cells have their phenotype defined but not their morphology. I would like to know whether there is any relation between them.
I have thought about creating an attribute vector containing all the available features. Following the example: fibrines would have a vector of two attributes with their morphometry, leaving their phenotype undefined (using zeros or random numbers), and cells would have their phenotype defined, leaving the morphometry undefined.
Can you give me any suggestions about this approach based on your experience?
Thank you,
Daniel Jiménez.
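One common way to realize the zero-padding idea, as a sketch (entirely illustrative, including the attribute sizes): give every object every attribute slot plus a binary presence mask, so the model can distinguish "missing" from "measured as zero".

import numpy as np

def encode(morphometry=None, phenotype=None):
    # Fixed layout: 2 morphometry slots, 1 phenotype slot, 2 mask bits.
    morph = np.zeros(2) if morphometry is None else np.asarray(morphometry, float)
    pheno = np.zeros(1) if phenotype is None else np.asarray(phenotype, float)
    mask = np.array([morphometry is not None, phenotype is not None], float)
    return np.concatenate([morph, pheno, mask])

fibrine = encode(morphometry=[0.3, 1.2])  # phenotype slot masked out
cell = encode(phenotype=[2.0])            # morphometry slots masked out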
Hi, I cannot reproduce the experimental results on the charged simulation dataset. The accuracy is only 50+% and I didn't modify the code (I only replaced Variable() to fit newer PyTorch versions). Also, when I try to reproduce the results on the spring simulation dataset, the accuracy is not good when I do not apply --skip_first (only about 70%). Can you help me out? Thank you very much!
Hi, thanks for the code release.
To make sure that I am running the code properly, I am trying to reproduce some of the paper results. What's the correspondence between the results returned by the code and those reported in the paper? My understanding is as follows: np.mean(acc_test) should correspond to the reported edge accuracy, while mse_test and mean_mse correspond to the reported MSE. Specifically, np.mean(mse_test) should be similar to the first column of Table 2 (because a prediction step of 1 is being used, see line 323 of train.py), and np.mean(mean_mse) should be similar to the third column of Table 2 (because a prediction step of 20 is being used, see line 351 of train.py). Is this correct? Thank you!
Hi,
Thanks for your great work,
Can you provide a link to, or a copy of, the basketball dataset you used in your paper?
You also mentioned that you focused on the PnR instances of the game. How did you find these instances?
Best,
Line 93: os.mkdir ----> os.makedirs
Line 46: default='logs' ----> default='./logs'
Not a big problem, just mentioning it here for others' convenience.
Hi, thanks for your outstanding work and contribution. I have a question: can we use dynamic_graph in the training step? If yes, can you give me some implementation guidance? Thank you very much!
It took me nearly 8 hours to generate the data. Is this normal?
The CPU utilization is very low during generation, so I suppose the program could be further optimized.
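Generation is indeed a sequential Python loop over simulations, so one option is to parallelize it per simulation; a sketch with multiprocessing (sample_trajectory is a stand-in for whatever per-simulation sampler the real script calls, e.g. the SpringSim class):

from multiprocessing import Pool
import numpy as np

def sample_trajectory(seed):
    # Stand-in for one simulation run; the real script would call the
    # SpringSim/ChargedParticlesSim sampler with its own RNG seed.
    rng = np.random.RandomState(seed)
    return rng.randn(49, 2, 5)  # placeholder trajectory

if __name__ == '__main__':
    with Pool() as pool:
        trajectories = pool.map(sample_trajectory, range(10000))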
I was wondering if we can fix the latent graph to be an undirected graph. The schematic in Figure 1 suggests that this would be possible, but I can't see an option for that in the code. Thanks!
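One way to impose this without touching the architecture, as a sketch: average the encoder logits for each directed pair (i, j) with those of (j, i) before sampling, assuming edges are enumerated row-major over all ordered pairs with i != j, as the repo's rel_rec/rel_send construction does.

import torch

def reverse_edge_index(n):
    # Permutation mapping each directed edge (i, j) to its reverse (j, i).
    idx, k = {}, 0
    for i in range(n):
        for j in range(n):
            if i != j:
                idx[(i, j)] = k
                k += 1
    return torch.tensor([idx[(j, i)] for (i, j) in sorted(idx, key=idx.get)])

def symmetrize(logits, n):
    # logits: [batch, num_edges, num_edge_types]; ties z_ij and z_ji.
    perm = reverse_edge_index(n)
    return 0.5 * (logits + logits[:, perm, :])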
Hi, why is the prior uniformly distributed?