
sgformer's Introduction

SGFormer: Simplified Graph Transformers

The official implementation for the NeurIPS 2023 paper "SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations".

Related material: [Paper], [Blog], [Video]

SGFormer is a graph encoder backbone that efficiently computes all-pair interactions with one-layer attentive propagation.

SGFormer builds upon our previous works on scalable graph Transformers with linear complexity: NodeFormer (NeurIPS 2022, spotlight) and DIFFormer (ICLR 2023, spotlight).

What's new

[2023.10.28] We release the code for the model on large-graph benchmarks. More details will be added soon.

[2023.12.20] We add more details on how to run the code.

[2024.05.05] We add code for measuring training time and memory usage in ./medium/time_test.py.

Model and Results

The model adopts a simple architecture, comprising a one-layer global attention module and a shallow GNN.

[Figure: SGFormer architecture, a one-layer global attention combined with a shallow GNN]
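To make the design concrete, here is a minimal PyTorch sketch of this architecture. It is illustrative only: the class, the normalization, and the parameter names (e.g. alpha for the mixing weight, loosely corresponding to the --graph_weight flag used elsewhere on this page) are assumptions, not the repository's actual API.

    import torch
    import torch.nn as nn

    class SimplifiedGraphTransformer(nn.Module):
        """Sketch: one-layer linear global attention mixed with a shallow GNN."""
        def __init__(self, in_dim, hidden_dim, num_classes, gnn, alpha=0.5):
            super().__init__()
            self.q = nn.Linear(in_dim, hidden_dim)
            self.k = nn.Linear(in_dim, hidden_dim)
            self.v = nn.Linear(in_dim, hidden_dim)
            self.gnn = gnn          # any shallow message-passing module mapping in_dim -> hidden_dim
            self.alpha = alpha      # weight on the GNN branch (cf. --graph_weight)
            self.out = nn.Linear(hidden_dim, num_classes)

        def forward(self, x, edge_index):
            q, k, v = self.q(x), self.k(x), self.v(x)
            # Normalize queries/keys (a stand-in for the paper's exact normalization).
            q = q / (q.norm(dim=-1, keepdim=True) + 1e-6)
            k = k / (k.norm(dim=-1, keepdim=True) + 1e-6)
            # All-pair propagation in linear time: compute K^T V (a d x d matrix)
            # first, so the cost is O(N d^2) rather than O(N^2 d).
            kv = k.t() @ v          # [d, d]
            z = q @ kv              # [N, d]
            h = (1 - self.alpha) * z + self.alpha * self.gnn(x, edge_index)
            return self.out(h)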

The following tables present the results for standard node classification tasks on medium-sized and large-sized graphs.

[Tables: node classification results on medium-sized and large-sized graphs]

Requirements

For all datasets except ogbn-papers100M, we used the environment with the package versions listed in ./large/requirement.txt. For ogbn-papers100M, PyG >= 2.0 is required to run the code.
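For example, a typical setup (assuming a working PyTorch installation; adjust packages to your CUDA version as needed):

    pip install -r ./large/requirement.txt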

Dataset

One can download the datasets (Planetoid, Deezer, Pokec, Actor/Film) from the Google Drive link below:

https://drive.google.com/drive/folders/1rr3kewCBUvIuVxA6MJ90wzQuF-NnCRtf?usp=drive_link

For Chameleon and Squirrel, we use the new splits that filter out the overlapping nodes.

The OGB datasets will be downloaded automatically when the code is run.

Run the code

Please refer to the bash script run.sh in each folder to run the training and evaluation pipeline; a typical invocation is shown below.
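For example (assuming the folder layout referenced above, with medium/ and large/ subdirectories):

    cd medium && bash run.sh    # medium-sized graph benchmarks
    cd large && bash run.sh     # large-scale benchmarks (ogbn-*)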

Citation

If you find our code and model useful, please cite our work. Thank you!

      @inproceedings{wu2023sgformer,
        title={SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations},
        author={Qitian Wu and Wentao Zhao and Chenxiao Yang and Hengrui Zhang and Fan Nie and Haitian Jiang and Yatao Bian and Junchi Yan},
        booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
        year={2023}
      }


sgformer's Issues

RuntimeError: shape '[10, 18, 7, 18, 7, 32]' is invalid for input of size 5242880

Thank you for your contribution to science. I am having the following problem reproducing your code:
Traceback (most recent call last):
File "/tmp/pycharm_project_772/tools/train.py", line 195, in
main()
File "/tmp/pycharm_project_772/tools/train.py", line 184, in main
train_detector(
File "/tmp/pycharm_project_772/mmdet/apis/train.py", line 186, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
epoch_runner(data_loaders[i], **kwargs)
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 53, in train
self.run_iter(data_batch, train_mode=True, **kwargs)
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
outputs = self.model.train_step(data_batch, self.optimizer,
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 77, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/tmp/pycharm_project_772/mmdet/models/detectors/base.py", line 247, in train_step
losses = self(**data)
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 146, in new_func
output = old_func(*new_args, **new_kwargs)
File "/tmp/pycharm_project_772/mmdet/models/detectors/base.py", line 181, in forward
return self.forward_train(img, img_metas, **kwargs)
File "/tmp/pycharm_project_772/mmdet/models/detectors/two_stage.py", line 142, in forward_train
x = self.extract_feat(img)
File "/tmp/pycharm_project_772/mmdet/models/detectors/two_stage.py", line 82, in extract_feat
x = self.backbone(img)
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/tmp/pycharm_project_772/mmdet/models/backbones/sgformer.py", line 484, in forward
x, mask = blk(x, H, W, mask)
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in call_impl
return forward_call(*args, **kwargs)
File "/tmp/pycharm_project_772/mmdet/models/backbones/sgformer.py", line 263, in forward
x
, mask = self.attn(self.norm1(x), H, W, mask)
File "/opt/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/tmp/pycharm_project_772/mmdet/models/backbones/sgformer.py", line 150, in forward
q2, k2, v2 = window_partition(q2, q_window, H, W), window_partition(k2, window_size, H, W),
File "/tmp/pycharm_project_772/mmdet/models/backbones/sgformer.py", line 24, in window_partition
x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
RuntimeError: shape '[10, 18, 7, 18, 7, 32]' is invalid for input of size 5242880

I entered an image size of 512x512.
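For reference, the arithmetic in the error message points at the window size: the requested view 10 x (18*7) x (18*7) x 32 holds 10 x 126 x 126 x 32 = 5,080,320 elements, while the input tensor has 5,242,880 = 10 x 128 x 128 x 32 elements, i.e. a 128 x 128 feature map (consistent with a 512x512 input at 1/4 resolution) whose sides are not divisible by the window size 7. Note also that the traceback references mmdet/models/backbones/sgformer.py, a vision detection backbone, which appears to be a different SGFormer model from the graph Transformer in this repository.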

About Proof for Theorem 2

Hi~ I'm not very good at maths; I want to ask about the proof of equation (18) in the appendix of the paper.
[Screenshot: equation (18) from the appendix]

Reproducing results on proteins

Hi,

I ran the following command to reproduce the ogbn-proteins results:

python main-batch.py --method sgformer  --dataset ogbn-proteins --metric rocauc --lr 0.01 --hidden_channels 64 \
    --gnn_num_layers 2  --gnn_dropout 0. --gnn_weight_decay 0. --gnn_use_residual --gnn_use_weight --gnn_use_bn --gnn_use_act \
    --trans_num_layers 1 --trans_dropout 0. --trans_weight_decay 0. --trans_use_residual --trans_use_weight --trans_use_bn \
    --use_graph --graph_weight 0.5 \
    --batch_size 10000 --seed 123 --runs 5 --epochs 1000 --eval_step 9 --device 1

This is the result I am getting:

Chosen epoch: 65 Final Train: 81.17 Final Test: 72.85.

I haven't completed all 5 runs because this already seems very far from the reported number.

Is the command provided in the run.sh right?

RuntimeError: CUDA error: an illegal memory access was encountered

Hi,

Thanks for open-sourcing your implementation. I am trying to run your examples as pointed to in the run.sh file, but it throws the following error.

Traceback (most recent call last):
  File "/localscratch/SGFormer/large/main-batch.py", line 143, in <module>
    out_i = model(x_i, edge_index_i)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/localscratch/SGFormer/large/ours.py", line 268, in forward
    x2 = self.graph_conv(x, edge_index)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/localscratch/SGFormer/large/ours.py", line 86, in forward
    x = conv(x, edge_index, layer_[0])
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/localscratch/SGFormer/large/ours.py", line 33, in forward
    adj = SparseTensor(row=col, col=row, value=value, sparse_sizes=(N, N))
  File "/usr/local/lib/python3.10/dist-packages/torch_sparse/tensor.py", line 26, in __init__
    self.storage = SparseStorage(
  File "/usr/local/lib/python3.10/dist-packages/torch_sparse/storage.py", line 69, in __init__
    assert trust_data or int(row.max()) < M
RuntimeError: CUDA error: an illegal memory access was encountered

This usually happens because of incorrect SparseTensor construction, i.e. it expects the rows/columns to be sorted.

The error occurred with this command:

python main-batch.py --method sgformer  --dataset ogbn-proteins --metric rocauc --lr 0.01 --hidden_channels 64     --gnn_num_layers 2  --gnn_dropout 0. --gnn_weight_decay 0. --gnn_use_residual --gnn_use_weight --gnn_use_bn --gnn_use_act     --trans_num_layers 1 --trans_dropout 0. --trans_weight_decay 0. --trans_use_residual --trans_use_weight --trans_use_bn     --use_graph --graph_weight 0.5     --batch_size 10000 --seed 123 --runs 5 --epochs 1000 --eval_step 9 --device 1

Can you also clarify why the sparse adj needs to be reconstructed for every layer per forward pass?

    def forward(self, x, edge_index, x0):
        N = x.shape[0]
        row, col = edge_index
        # Symmetric normalization: each edge gets weight 1 / sqrt(d_in * d_out).
        d = degree(col, N).float()
        d_norm_in = (1. / d[col]).sqrt()
        d_norm_out = (1. / d[row]).sqrt()
        value = torch.ones_like(row) * d_norm_in * d_norm_out
        value = torch.nan_to_num(value, nan=0.0, posinf=0.0, neginf=0.0)
        # The normalized adjacency is rebuilt from edge_index on every call.
        adj = SparseTensor(row=col, col=row, value=value, sparse_sizes=(N, N))
        x = matmul(adj, x)  # [N, D]

        if self.use_init:
            x = torch.cat([x, x0], 1)  # concatenate the initial features
            x = self.W(x)
        elif self.use_weight:
            x = self.W(x)

        return x

Why can't you just cache the normalized adj matrix and reuse it?
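One possible pattern, sketched below under the assumption that edge_index is fixed across layers within a forward pass (this is not the repository's code; build_norm_adj and the layer loop are hypothetical):

    import torch
    from torch_geometric.utils import degree
    from torch_sparse import SparseTensor

    def build_norm_adj(edge_index, N):
        # Symmetrically normalized adjacency, built once per (mini-batch) graph.
        row, col = edge_index
        d = degree(col, N).float()
        d_norm_in = (1.0 / d[col]).sqrt()
        d_norm_out = (1.0 / d[row]).sqrt()
        value = torch.nan_to_num(d_norm_in * d_norm_out, nan=0.0, posinf=0.0, neginf=0.0)
        return SparseTensor(row=col, col=row, value=value, sparse_sizes=(N, N))

    # In the parent module's forward, before the layer loop (hypothetical):
    # adj = build_norm_adj(edge_index, x.shape[0])
    # for conv in self.convs:
    #     x = conv(x, adj, x0)  # each layer then calls matmul(adj, x) directly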

For amazon2m, I am getting a similar error:

Traceback (most recent call last):
  File "/localscratch/SGFormer/large/main-batch.py", line 143, in <module>
    out_i = model(x_i, edge_index_i)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/localscratch/SGFormer/large/ours.py", line 268, in forward
    x2 = self.graph_conv(x, edge_index)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/localscratch/SGFormer/large/ours.py", line 86, in forward
    x = conv(x, edge_index, layer_[0])
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/localscratch/SGFormer/large/ours.py", line 37, in forward
    x = torch.cat([x, x0], 1)
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Unable to read pokec.mat

Hi,

Thanks for your quick responses to my questions over the last couple of days.
I have been trying to load pokec.mat to train on Pokec, but I am having the following problem:

  File "/localscratch/specformer-dev/large/dataset.py", line 383, in load_pokec_mat
    fulldata = scipy.io.loadmat(f'{data_dir}/pokec/pokec.mat')
  File "/opt/conda/envs/sgformer/lib/python3.9/site-packages/scipy/io/matlab/_mio.py", line 226, in loadmat
    MR, _ = mat_reader_factory(f, **kwargs)
  File "/opt/conda/envs/sgformer/lib/python3.9/site-packages/scipy/io/matlab/_mio.py", line 74, in mat_reader_factory
    mjv, mnv = _get_matfile_version(byte_stream)
  File "/opt/conda/envs/sgformer/lib/python3.9/site-packages/scipy/io/matlab/_miobase.py", line 249, in _get_matfile_version
    raise ValueError('Unknown mat file type, version {}, {}'.format(*ret))
ValueError: Unknown mat file type, version 32, 99

Which scipy version are you using? I tried different ones but the error persists. Can you provide some guidance?

Thanks

About the medium datasets

Where can I download the medium-sized datasets that can be used with this code directly? Looking forward to the datasets!

Could you share the slides (PPT)?

Last night I came across your three works: NodeFormer, DIFFormer, and SGFormer. They are truly excellent pieces of work that connect into a single line of research, each building on the last. I watched your explanatory video on Bilibili, and the slides were very well made, but the GitHub link has expired. Could you update the link to the slides for these three works? 😶‍🌫️

About time complexity

Thank you very much for your outstanding contribution to the field of graph Transformers. I have a question about SGFormer: shouldn't the time complexity of equation (3) be $O(N \times N)$ because of the product of $K^\top$ and $V$? Is there anything wrong with my understanding? I want to figure it out! Thank you!
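For reference, the usual resolution for attention of this softmax-free form is associativity: materializing $(QK^\top)V$ costs $O(N^2 d)$, but computing $K^\top V$ first yields a $d \times d$ matrix in $O(N d^2)$, and applying $Q$ to it is again $O(N d^2)$, so the total is linear in $N$. A minimal numerical illustration (shapes and values are arbitrary; float64 is used so both orders agree to default tolerances):

    import torch

    N, d = 2_000, 64
    Q = torch.randn(N, d, dtype=torch.float64)
    K = torch.randn(N, d, dtype=torch.float64)
    V = torch.randn(N, d, dtype=torch.float64)

    # Quadratic order: (Q K^T) V materializes an N x N matrix -- O(N^2 d) time, O(N^2) memory.
    out_quadratic = (Q @ K.t()) @ V

    # Linear order: Q (K^T V) only ever builds a d x d matrix -- O(N d^2) time, O(d^2) extra memory.
    out_linear = Q @ (K.t() @ V)

    print(torch.allclose(out_quadratic, out_linear))  # True: same result, different cost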

Running out of memory on ogbn-arxiv

Hi,

I am currently trying to run the code in the script for ogbn-arxiv, but I am running out of memory. Right now, it attempts a matrix multiply between two tensors that are each 169k x 256, and it allocates 100 GB of VRAM to do so.

I am wondering what settings I need to change in the script to get it to run?

Specifically, the error occurs at this line:

    attention_num = torch.sigmoid(torch.einsum("nhm,lhm->nlh", qs, ks))  # [N, L, H]
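For reference, a back-of-the-envelope check is consistent with the reported allocation: this einsum materializes the full N x L (x H) attention-score tensor, and for ogbn-arxiv N = L = 169,343 (assuming float32 scores and a single head):

    N = 169_343                              # nodes in ogbn-arxiv
    score_bytes = N * N * 4                  # one float32 score per node pair
    print(f"{score_bytes / 1e9:.0f} GB")     # ~115 GB for a single head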

Am I supposed to use main-batch.py instead of main.py?

Thanks!

GN Block Impact on SGFormer Performance

Hello Qitian,

Firstly, I would like to express my appreciation for your inspiring work and the dedication evident in your code! My primary interest lies in understanding how the GN block influences the performance of SGFormer. To this end, I adjusted the graph_weight parameter within the range of [0, 0.5, 0.8, 1.0] and conducted experiments on medium-sized benchmarks. Below, I present both the reproduced results and those reported in your paper.
[Screenshot: reproduced vs. reported results across graph_weight settings]

Observations and Questions:

  1. Overall, I am quite satisfied as I achieved similar or nearly identical performance to what was reported in the paper for most datasets. However, there were some exceptions: notably lower results in the squirrel dataset and an unexpectedly higher score in chameleon. Could you please shed some light on why this might be the case?
  2. Furthermore, when setting the graph_weight to 1, SGFormer essentially transformed into a simple GCN model. In this scenario, the performance of the GCN model significantly surpassed what was reported in your paper, especially in the datasets cora, citeseer, pubmed, and squirrel. This observation seems to diminish the relative improvements of SGFormer over the standard GCN model.

Reproducing Data & Environment:

  • I used the datasets and splits from the provided shared link.
  • The following are details of my environment setup:
    • python=3.8.18=h955ad1f_0

    • pytorch=2.1.1=py3.8_cuda11.8_cudnn8.7.0_0

    • pytorch-cluster=1.6.3=py38_torch_2.1.0_cu118

    • pytorch-scatter=2.1.2=py38_torch_2.1.0_cu118

    • pytorch-sparse=0.6.18=py38_torch_2.1.0_cu118

Your insights or suggestions regarding these observations would be highly valuable.
