
snap-research / graphless-neural-networks


[ICLR 2022] Code for Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation (GLNN)

License: MIT License

Python 93.09% Shell 6.91%
deep-learning distillation efficient-inference graph-algorithm graph-neural-networks knowledge-distillation pytorch gnn scalability

graphless-neural-networks's Issues

Cannot reproduce the results even with the same random seed

Thanks for sharing the code! The random seed in train_teacher.py does not seem to take effect: every run of python train_teacher.py --exp_setting tran --teacher SAGE --dataset cora produces different results, even with the same seed. As a result, we cannot reproduce the exact results stated in the paper when running bash experiments/sage_cpf.sh. This looks like a bug, since the point of a random seed is to make results reproducible. Could you please fix this?

Two different results obtained with the same random seed 0 (python train_teacher.py --exp_setting tran --teacher SAGE --dataset cora):
[image]

The results of bash experiments/sage_cpf.sh, which differ from those in the paper:
[image]
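
For context, a minimal sketch of the seeding calls that PyTorch training scripts commonly use. It is not taken from train_teacher.py and is not a confirmed diagnosis of this issue; the helper name seed_everything is hypothetical. NumPy and PyTorch are seeded only if they are installed. DGL also exposes its own seeding function (dgl.seed), which is not called here.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed the random number generators a training script typically relies on.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Only affects Python processes started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.

    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; nothing more to seed.

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    # Ask cuDNN for deterministic kernels and disable autotuning.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Two draws after re-seeding with the same value should match.
seed_everything(0)
first_draw = random.random()
seed_everything(0)
second_draw = random.random()
```

Even with all generators seeded, some GPU operations accumulate in a nondeterministic order, so small run-to-run differences can remain on CUDA.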

Error Unpickling the Cora.npz data (and others)

Hi! I am trying to run the code, but I hit the following error, which I can't figure out, when loading the cora .npz file:
Using backend: pytorch
WARNING:root:The OGB package is out of date. Your version is 1.3.3, while the latest version is 1.3.5.
Traceback (most recent call last):
  File "/home/aaron/anaconda3/envs/glnn/lib/python3.6/site-packages/numpy/lib/npyio.py", line 460, in load
    return pickle.load(fid, **pickle_kwargs)
_pickle.UnpicklingError: invalid load key, '\x0a'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_teacher.py", line 346, in <module>
    main()
  File "train_teacher.py", line 329, in main
    score = run(args)
  File "train_teacher.py", line 210, in run
    labelrate_val=args.labelrate_val,
  File "/home/aaron/graphless-neural-networks/dataloader.py", line 49, in load_data
    kwargs["labelrate_val"],
  File "/home/aaron/graphless-neural-networks/dataloader.py", line 85, in load_cpf_data
    data = load_npz_to_sparse_graph(data_path)
  File "/home/aaron/graphless-neural-networks/dataloader.py", line 526, in load_npz_to_sparse_graph
    with np.load(file_name, allow_pickle=True) as loader:
  File "/home/aaron/anaconda3/envs/glnn/lib/python3.6/site-packages/numpy/lib/npyio.py", line 463, in load
    "Failed to interpret file %s as a pickle" % repr(file))
OSError: Failed to interpret file PosixPath('/home/aaron/graphless-neural-networks/data/cora.npz') as a pickle

Could it have something to do with the file being saved with a different version of numpy? I used the exact same requirements.txt file for the conda environment.

Thanks!
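
A small diagnostic sketch that may help narrow this down. np.load checks the first bytes of a file: a zip signature means .npz, the \x93NUMPY prefix means .npy, and anything else falls through to pickle, which is the path taken in the traceback above. The load key '\x0a' is a newline, so the file on disk may be text (for example a saved web page) rather than a real archive. This is only a guess at the cause, and sniff_file is a hypothetical helper, not part of the repository.

```python
import os
import tempfile
import zipfile


def sniff_file(path):
    """Classify a file by its leading bytes (hypothetical helper)."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    if head.startswith(b"PK\x03\x04") or head.startswith(b"PK\x05\x06"):
        return "zip"      # what a real .npz archive looks like
    if head.startswith(b"\x93NUMPY"):
        return "npy"      # a single array, not an archive
    if head.lstrip().startswith(b"<"):
        return "html"     # markup, e.g. a downloaded web page
    return "unknown"


# Self-check on two synthetic files: a real zip and a text file named .npz.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.npz")
    with zipfile.ZipFile(good, "w") as zf:
        zf.writestr("x.npy", b"placeholder")
    bad = os.path.join(tmp, "bad.npz")
    with open(bad, "wb") as fh:
        fh.write(b"\n\n<!DOCTYPE html>")
    good_kind = sniff_file(good)
    bad_kind = sniff_file(bad)

# Usage on the real file would be: sniff_file("data/cora.npz")
```

If the result is anything other than "zip", re-downloading the data file is likely a better next step than changing the numpy version.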

speed comparison

As mentioned in the paper, the inductive inference time of GLNN is compared with that of other GNN inference acceleration methods on 10 randomly chosen nodes, but the code does not include these experiments. Could you please provide more details on how the inference time is measured and compared?
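
For reference, one generic way to time inference on a small batch is sketched below. This is not the authors' protocol, which the issue is asking about; time_inference and dummy_predict are hypothetical names, and the dummy model only stands in for a real forward pass.

```python
import statistics
import time


def time_inference(predict, batch, warmup=3, repeats=20):
    """Return the median latency in milliseconds and all raw timings."""
    # Warm-up calls are excluded so one-off setup costs do not skew the result.
    for _ in range(warmup):
        predict(batch)
    timings_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings_ms), timings_ms


def dummy_predict(nodes):
    """Stand-in for a model forward pass: sums each node's features."""
    return [sum(features) for features in nodes]


# Ten nodes with eight features each, mirroring the 10-node setting.
ten_nodes = [[float(i + j) for j in range(8)] for i in range(10)]
median_ms, all_ms = time_inference(dummy_predict, ten_nodes)
```

When the model runs on a GPU, the device has to be synchronized (for example with torch.cuda.synchronize()) before each clock read, otherwise only the kernel launch is timed.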

About min-cut

Hello, I tried to re-run Citeseer under the transductive setting.
The seeds are 1 2 3 4 5.

I get an average of 71.22, which confirms that my experiments are set up correctly.

However, for min-cut I get:
0.7159
0.6828
0.7444
0.9163
0.5613

It is highly unstable.

Meanwhile, for GLNN, I get:
0.9457
0.9499
0.9519
0.9670
0.9278

So maybe the min-cut just works well for GLNN and fails to capture the graph topology.
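
The spread of the two sets of values above can be quantified with a few lines of standard-library Python. The numbers are copied from this issue; the variable names are only for illustration.

```python
import statistics

# Min-cut values reported above, one per seed (1 to 5).
first_run = [0.7159, 0.6828, 0.7444, 0.9163, 0.5613]
glnn_run = [0.9457, 0.9499, 0.9519, 0.9670, 0.9278]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of floats."""
    return statistics.mean(values), statistics.stdev(values)


first_mean, first_std = summarize(first_run)
glnn_mean, glnn_std = summarize(glnn_run)

print("first set: mean=%.4f std=%.4f" % (first_mean, first_std))
print("GLNN set:  mean=%.4f std=%.4f" % (glnn_mean, glnn_std))
```

The first set averages about 0.724 with a sample standard deviation near 0.128, while the GLNN set averages about 0.948 with a standard deviation near 0.014, roughly nine times smaller.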

Undefined function

There is an undefined function in your code (dataloader.py, line 257, rand_train_test_idx). I can't find it in your code or in the imported packages. What is it?

Failed to build environment

The package list in requirement.txt is incomplete, and running bash ./prepare_env.sh fails to install some packages.

The function graph_split() seems to contradict the inductive scenarios.

I have a question about the function graph_split in the file utils.py.

According to the code, the tensors idx_test_ind and obs_idx_train may overlap.

def graph_split(idx_train, idx_val, idx_test, rate, seed):
    idx_test_ind, idx_test_tran = idx_split(idx_test, rate, seed)

    idx_obs = torch.cat([idx_train, idx_val, idx_test_tran])
    N1, N2 = idx_train.shape[0], idx_val.shape[0]
    obs_idx_all = torch.arange(idx_obs.shape[0])
    obs_idx_train = obs_idx_all[:N1]
    obs_idx_val = obs_idx_all[N1 : N1 + N2]
    obs_idx_test = obs_idx_all[N1 + N2 :]

    return obs_idx_train, obs_idx_val, obs_idx_test, idx_obs, idx_test_ind

For example, let V = [0,1,2,3,4,5] be all nodes in the graph and idx_train = [1,2], idx_val = [3,4], idx_test = [0, 5].

Suppose that idx_test_ind = [0] and idx_test_tran = [5] after the function idx_split(). Then we have idx_obs = [1,2,3,4,5], N1=2, N2 = 2, and obs_idx_all = [0,1,2,3,4]. Hence, the resulting observed sets are obs_idx_train = [0,1], obs_idx_val = [2,3], obs_idx_test = [4].

This means that idx_test_ind and obs_idx_train both have the element 0, which contradicts the inductive scenario.
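
The worked example above can be reproduced with plain Python lists, as in the sketch below. It is a list-based stand-in for the tensor code, with the result of idx_split passed in directly as in the example; graph_split_lists is a hypothetical name. One possible reading of the code is that the obs_idx_* values are positions within idx_obs rather than original node IDs, so the sketch also maps them back through idx_obs before checking for overlap.

```python
def graph_split_lists(idx_train, idx_val, idx_test_ind, idx_test_tran):
    """List-based stand-in for graph_split, taking the idx_split result as input."""
    idx_obs = idx_train + idx_val + idx_test_tran
    n1, n2 = len(idx_train), len(idx_val)
    obs_idx_all = list(range(len(idx_obs)))
    obs_idx_train = obs_idx_all[:n1]
    obs_idx_val = obs_idx_all[n1:n1 + n2]
    obs_idx_test = obs_idx_all[n1 + n2:]
    return obs_idx_train, obs_idx_val, obs_idx_test, idx_obs, idx_test_ind


# Same inputs as the example: idx_test = [0, 5] split into [0] and [5].
obs_train, obs_val, obs_test, idx_obs, test_ind = graph_split_lists(
    [1, 2], [3, 4], [0], [5]
)

# Comparing the raw values, as done in the example above.
overlap_raw = set(obs_train) & set(test_ind)

# Comparing after mapping positions in idx_obs back to original node IDs.
train_node_ids = [idx_obs[i] for i in obs_train]
overlap_node_ids = set(train_node_ids) & set(test_ind)

print(obs_train, train_node_ids, overlap_raw, overlap_node_ids)
```

Under that reading, position 0 in obs_idx_train refers to node idx_obs[0] = 1, so the raw values share the element 0 but the underlying node sets do not. Whether the rest of the code consistently treats obs_idx_* as positions into idx_obs is for the authors to confirm.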
