bnn-upc / gnnetworkingchallenge

RouteNet baseline for the Graph Neural Networking Challenge (https://bnn.upc.edu/challenge/)

License: Apache License 2.0

communication-networks computer-networks gnn graph-neural-networks machine-learning ml networking

gnnetworkingchallenge's Issues

cannot get the expected outputs

Thanks for your work on the GNN and for organizing such a big competition. I followed the quick start on a GPU server, but I cannot get the expected outputs. Instead, it shows:

Traceback (most recent call last):
  File "/data/yj/yes/envs/test1/lib/python3.7/site-packages/tensorflow_core/python/ops/gradients_util.py", line 331, in _MaybeCompile
    xla_compile = op.get_attr("_XlaCompile")
  File "/data/yj/yes/envs/test1/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 2330, in get_attr
    raise ValueError(str(e))
ValueError: Operation 'route_net_model/UnsortedSegmentSum_6' has no attr named '_XlaCompile'.

Due to space limitations, the other exceptions are not listed. I would appreciate any ideas!

Why are all my predictions -1?

I have tried several values for the maximum number of training steps (5000, 100000, 5000000), but I get the same result of -1 every time.

Problem with the Simulator when changing the Max topology number and the max bandwidth

Hello,

I'm a PhD student and I'm trying to use your model.

I also want to create a custom dataset for validation and prediction, but when network_size > 10 or bandwidth > 100000,
the simulator does not work properly and raises an exception about the folder /data.{ctr} not existing (it is never created), even though I changed the max topology size and max bandwidth parameters in the Simulate.py file.

So my questions are:
-- Are there other parameters that I should change in the simulator?
-- How did you create your validation dataset (50, 100, 150... nodes)?

Thanks in advance for your help.

Regards,
Sofiane MESSAOUDI

Regarding using index

Hello, I'm a participant in the 2023 GNN Challenge.

How do I get access to the dictionaries and arrays for the path or flow indices?

It seems that DataGenerator only returns Samples.

Thank you!

Same value for all predictions

Hi,

I have been trying to train the model using the code as is from the repository. The loss does go down for a few epochs, but when I check the predictions (using predictions = model.predict(ds_test, steps=2)), all the values in the array are the same. I have tried with the sample data and also with the challenge data.

Is there something I am missing?

Many thanks

TF version warnings

Hi,
I installed the TF version of the challenge. It runs, but I get some warnings:
C:\Users\ia-te_5pzizb8\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\framework\indexed_slices.py:437: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/GatherV2_1_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/GatherV2_1_grad/Reshape:0", shape=(None, 32), dtype=float32), dense_shape=Tensor("gradients/GatherV2_1_grad/Cast:0", shape=(2,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory. "shape. This may consume a large amount of memory." % value)

I tried some solutions from the internet, but they didn't work for me:
https://stackoverflow.com/questions/35892412/tensorflow-dense-gradient-explanation

Do you have any recommendations for solving this problem?

I am using:
OS: Windows 10
TF version: 2.4.1
CUDA version: 11.2
Python version: 3.7.0
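
If the goal is only to hide this message, Python's standard `warnings` module can filter it. The sketch below is a minimal illustration, assuming the message is emitted as a `UserWarning` whose text starts with "Converting sparse IndexedSlices", as in the log quoted above; the helper name `silence_indexed_slices_warning` is hypothetical and not part of the repository.

```python
import warnings


def silence_indexed_slices_warning():
    """Ignore UserWarnings whose text starts with the IndexedSlices message.

    Assumption: the warning is raised through Python's `warnings` module,
    as the "UserWarning:" prefix in the quoted log suggests.
    """
    warnings.filterwarnings(
        "ignore",
        message="Converting sparse IndexedSlices",  # regex matched at the start of the text
        category=UserWarning,
    )


# Call this once, before building and training the model.
silence_indexed_slices_warning()
```

This only suppresses the message. It does not change how the gradient is computed, so it is not a fix if memory use is actually a problem.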

The performance of this baseline project

I have tried this baseline project, but I have doubts about its performance. Without changing the training parameters, the final MAPE is not satisfactory, and you have not published the MAPE that this baseline is expected to reach. Is it necessary to modify the model to improve performance, or is the problem that my local training did not converge?

Epoch 100/100
1000/1000 [==============================] - 193s 193ms/step - loss: 0.0218 - mean_absolute_percentage_error: 68.2392 - val_loss: 0.0286 - val_mean_absolute_percentage_error: 52.7838
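
For reference when reading the log above, the mean absolute percentage error can be reproduced by hand. The following is a minimal sketch in plain Python, assuming the usual definition (the mean of |y - ŷ| / |y|, times 100). The function name `mape` and the sample values are illustrative and are not taken from the repository or the challenge data.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    # Assumes no true value is exactly zero; such an entry would divide by zero.
    total = sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Made-up delays for illustration only.
delays_true = [0.10, 0.20, 0.40]
delays_pred = [0.15, 0.10, 0.40]
print(mape(delays_true, delays_pred))  # about 33.3: two predictions are off by 50%, one is exact
```

Under this definition, a value of 68 means the predictions are off by 68% of the true value on average.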

implementation in the given code

Hello,
I am trying to improve the TensorFlow code using our approach.
I have some doubts about the code you have given as a reference.

Since I am using the line " }, list(nx.get_node_attributes(D_G, 'delay').values())", which uses delay, do I need to modify the prediction output?
Whenever I train on all the data files at once, I get errors. Please help me overcome this problem.

With regards,
Raju
IIT Madras

`predict.py` causing errors

ValueError: Unable to load weights saved in HDF5 format into a subclassed Model which has not created its variables yet. Call the Model first, then load the weights.

This error occurs when I call predict.py directly.

Traffic matrix extracted from simulation?

Hi,

I was having a look at the paper and the implementation and there is something I would like to understand a bit better. In this paper, you describe the traffic matrix as the "bandwidth between each pair of nodes in the network". In this other paper, the traffic matrix is defined as follows:

being TM(src, dst) the traffic exchanged by every src-dst pair.

In the implementation I can see that you're loading the traffic matrix here, by using the AvgBw of the flow as described here.

So I was wondering whether the AvgBw is an output of the simulation? Or is that calculated before the simulation (as described in the paper) and then used to generate the simulation?

Many thanks,
Diego
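
To make the definition in the question concrete, the sketch below builds TM(src, dst) from a list of per-flow average bandwidths. It is only an illustration of the data structure: the tuple layout and the name `build_traffic_matrix` are assumptions, not part of the repository or its dataset API, and the sketch does not settle whether AvgBw is an input or an output of the simulator.

```python
def build_traffic_matrix(num_nodes, flows):
    """Return an n x n matrix whose entry [src][dst] is the summed average bandwidth.

    `flows` is a hypothetical list of (src, dst, avg_bw) tuples.
    """
    tm = [[0.0] * num_nodes for _ in range(num_nodes)]
    for src, dst, avg_bw in flows:
        if src == dst:
            raise ValueError("a flow needs distinct source and destination nodes")
        # Several flows may share the same src-dst pair, so accumulate.
        tm[src][dst] += avg_bw
    return tm


# Made-up flows for illustration only.
flows = [(0, 1, 500.0), (0, 1, 250.0), (2, 0, 100.0)]
tm = build_traffic_matrix(3, flows)
print(tm[0][1])  # 750.0
```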

What's the difference between "flow_type" in the data and the Google Drive CBR and MB folders themselves?

Hello, I'm a participant in the GNNNetworkingChallenge.

What's the difference between "flow_type" in the data and the Google Drive folders CBR and MB themselves?

I got confused because the data in the "CBR" folder has multiple flow types, and vice versa. Does this mean that the division into Google Drive folders is meaningless?

Thank you!
