
bupt-gamma / cpf


The official code of WWW2021 paper: Extract the Knowledge of Graph Neural Networks and Go Beyond it: An Effective Knowledge Distillation Framework

Home Page: https://arxiv.org/pdf/2103.02885.pdf

Python 100.00%
graph-neural-networks knowledge-distillation

cpf's People

Contributors

j-cabin


cpf's Issues

The training of the student model

When training the teacher, the loss is computed only on the labeled nodes, exactly as your code comments say.
[screenshot]
My reproduced results match those reported in the paper.
[screenshot]
But when training the student, your comment again says the loss is on the labeled nodes, yet it is actually computed on the unlabeled nodes.
[screenshot]
When I changed the loss to be computed on the labeled nodes (idx_train), the results deviated a lot from the paper, even falling below the teacher model's performance.
[screenshot]
When the loss is computed on the unlabeled nodes, the results do match the paper.
[screenshot]
What causes this, or have I misunderstood something?

Why use idx_no_train

When training the student model, why use idx_no_train?

    if conf['model_name'] == 'PLP':
        # the student's loss is computed on the unlabeled nodes (idx_no_train)
        loss = my_loss(logp[idx_no_train], cas[-1][idx_no_train])
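
The masking pattern in question can be illustrated with a minimal, self-contained sketch using random tensors. Note that `idx_train`, `idx_no_train`, and the KL-divergence form of `my_loss` are assumptions based on the snippet above, not the repository's exact code:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

n_nodes, n_classes = 10, 3
idx_train = torch.arange(0, 4)      # labeled nodes
idx_no_train = torch.arange(4, 10)  # unlabeled nodes (validation + test)

logits = torch.randn(n_nodes, n_classes, requires_grad=True)      # student outputs
labels = torch.randint(0, n_classes, (n_nodes,))                  # hard labels
soft_labels = F.softmax(torch.randn(n_nodes, n_classes), dim=1)   # teacher soft labels (cas)

# teacher-style supervised loss: hard labels, labeled nodes only
teacher_loss = F.cross_entropy(logits[idx_train], labels[idx_train])

# student-style distillation loss: teacher soft labels, unlabeled nodes only
logp = F.log_softmax(logits, dim=1)
student_loss = F.kl_div(logp[idx_no_train], soft_labels[idx_no_train],
                        reduction='batchmean')
```

One common reading is that the student imitates the teacher's soft predictions precisely on the nodes where no hard labels exist; whether that is the authors' intent is best confirmed by them.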

About the teacher model GLP

Hi, when I tried to run GLP as the teacher model, I found that this model does not exist:
python train_dgl.py --dataset=cora --teacher=GLP
Also, I noticed the code contains a MoNet teacher model, but the paper does not report experimental results for it.
Looking forward to your reply, thank you very much!

Questions about the results on Citeseer

I tried to run the code on Citeseer and used --automl to search for the best hyperparameters, but only achieved 0.7276, far below the reported results. I used GCN as the teacher model and achieved 0.7110 on a TITAN RTX GPU, which matches the result in the paper. But when I used an RTX 2060 GPU, the result was different.
So I have the following two questions:

  1. Why can't I reach the result reported in the paper on the Citeseer dataset, even when using Optuna? I would appreciate it if you could share the hyperparameters that reproduce the paper's results.
  2. Why do the results differ when I run the code on different GPUs? I checked the seed but haven't found the reason yet.
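
On the second question: result drift across GPUs usually comes from nondeterministic CUDA/cuDNN kernels rather than the Python-level seed. A hedged sketch of the usual PyTorch determinism knobs (flag names vary slightly across PyTorch versions; this is not the repository's own seeding code):

```python
import random
import numpy as np
import torch

def set_deterministic(seed: int = 0) -> None:
    # seed every RNG the training pipeline may touch
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # force deterministic cuDNN kernels (may slow training down)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_deterministic(42)
a = torch.randn(3)
set_deterministic(42)
b = torch.randn(3)  # identical to `a` under the same seed
```

Even with every seed fixed, different GPU models can order floating-point reductions differently, so small numeric deviations, and hence different final accuracies after many epochs, are to be expected.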

Difference between the inductive and transductive settings

Hi, the datasets used in the paper are all node-classification datasets, such as Cora. As I understand it, inductive learning applies only to graph-classification tasks, where one can guarantee that the test graphs never appear during training. Yet the paper turns the transductive setting into an inductive one merely by adding Equation 6. What is the basis for the paper's distinction between the inductive and transductive settings?

Why use idx_no_train?

idx_no_train contains the test and validation nodes, but don't we usually use only the training nodes when computing the loss?

For example:

import torch
import torch.nn.functional as F

data = dataset[0]  # e.g. a torch_geometric dataset
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(200):
    pred = model(data.x, data.edge_index)
    # loss is computed only on the training nodes
    loss = F.cross_entropy(pred[data.train_mask], data.y[data.train_mask])

    # backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

code question

Hello,
I have been following this paper for a long time; thank you for sharing the code.
When running it, I hit a problem that seems to be caused by an incompatible version of the optuna module. Which optuna version did you use?
Thanks.

Reproduced results are much lower than the paper's

Hi, since the splits of Cora and the other datasets used in the paper differ from the standard splits, I modified the load-data code. For the teacher logits I directly used the output of a standalone GCNII model; the teacher's test accuracy is above 85%, but the final student PLP result is below 60%, much lower than reported in the paper. What could be the reason?

Part of my changes:
[screenshot]
For the teacher logits, I also modified load_cascades accordingly, so that the GCNII logits are passed through a softmax and used as cas.
Screenshot of the final results (teacher 85%, student below 60%):
[screenshot]
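
For reference, the conversion described in this issue ("GCNII logits passed through a softmax and used as cas") looks roughly like this; the tensor shapes are illustrative, and whether CPF applies a distillation temperature first is not stated here:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
teacher_logits = torch.randn(5, 3)  # e.g. saved GCNII outputs: 5 nodes, 3 classes

# plain softmax over classes, as described in the issue
cas = F.softmax(teacher_logits, dim=1)

# each row is now a probability distribution over the classes
row_sums = cas.sum(dim=1)
```

Some distillation pipelines divide the logits by a temperature T before the softmax to soften the targets; if a reproduction omits a temperature the original teacher used (or vice versa), the soft labels, and hence the student, can change noticeably.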

Code question

Hi, I would like to ask which parameter controls whether the student model is Parameterized Label Propagation (PLP), Feature Transformation (FT), or their combination during distillation. Also, when doing knowledge distillation, are the student's and teacher's base models the same or different?

Results on the Cora, Citeseer, and Pubmed datasets

Hello,
You write in the paper that the data split randomly samples 20 nodes from each class, rather than using the fixed split from GCN, but Papers with Code (https://paperswithcode.com/paper/extract-the-knowledge-of-graph-neural) reports that you used the fixed split, which confuses me.

"Following the experimental settings in previous work [23], we randomly sample 20 nodes from each class as labeled nodes, 30 nodes for validation and all other nodes for test."
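
The split quoted above (20 labeled nodes per class, 30 validation nodes, all remaining nodes as test) can be sketched as follows; this is a generic illustration over an arbitrary label tensor, not the repository's actual loader:

```python
import torch

torch.manual_seed(0)

def random_split(labels, per_class=20, n_val=30):
    """Randomly pick `per_class` labeled nodes per class, then `n_val`
    validation nodes from the remainder; everything else is test."""
    train_parts = []
    for c in labels.unique():
        nodes = (labels == c).nonzero(as_tuple=True)[0]
        perm = nodes[torch.randperm(nodes.numel())]
        train_parts.append(perm[:per_class])
    train_idx = torch.cat(train_parts)

    rest_mask = torch.ones(labels.numel(), dtype=torch.bool)
    rest_mask[train_idx] = False
    rest = rest_mask.nonzero(as_tuple=True)[0]
    rest = rest[torch.randperm(rest.numel())]
    return train_idx, rest[:n_val], rest[n_val:]

labels = torch.randint(0, 3, (200,))  # toy graph: 200 nodes, 3 classes
train_idx, val_idx, test_idx = random_split(labels)
```

Re-checking which split convention a reimplementation follows (random per-class vs. the fixed GCN split) is a common source of discrepancies like the one raised in this issue.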

dgl version

Hello,
I have another question about dgl: depending on its version, building a graph with the same dgl.Graph call can yield either a DGLGraph or a DGLHeteroGraph. What does the Hetero suffix mean? Does the latter behave differently from the former, and could it affect the overall experiments? Thanks.

About Datasets

Hi,
Could you please let me know where you downloaded the datasets for your work?
I would like to work with the PPI dataset, but I am unable to find it in your format.
