ICLR 2024 | Fully Hyperbolic Convolutional Neural Networks for Computer Vision | Official Implementation

Home Page: https://openreview.net/forum?id=ekz1hN5QNh

License: MIT License

Hyperbolic Computer Vision - ICLR 2024

Official PyTorch implementation of the ICLR 2024 paper Fully Hyperbolic Convolutional Neural Networks for Computer Vision.

By Ahmad Bdeir, Kristian Schwethelm, Niels Landwehr

Overview

Introduction

TL;DR. We propose HCNN, a generalization of the convolutional neural network that learns latent feature representations in hyperbolic space in every layer, fully leveraging the benefits of hyperbolic geometry. This leads to better image representations and performance.

Abstract. Real-world visual data exhibit intrinsic hierarchical structures that can be represented effectively in hyperbolic spaces. Hyperbolic neural networks (HNNs) are a promising approach for learning feature representations in such spaces. However, current HNNs in computer vision rely on Euclidean backbones and only project features to the hyperbolic space in the task heads, limiting their ability to fully leverage the benefits of hyperbolic geometry. To address this, we present HCNN, a fully hyperbolic convolutional neural network (CNN) designed for computer vision tasks. Based on the Lorentz model, we generalize fundamental components of CNNs and propose novel formulations of the convolutional layer, batch normalization, and multinomial logistic regression. Experiments on standard vision tasks demonstrate the promising performance of our HCNN framework in both hybrid and fully hyperbolic settings. Overall, we believe our contributions provide a foundation for developing more powerful HNNs that can better represent complex structures found in image data.

This repository. In this repository, we provide implementations of the main experiments from our paper. Additionally, we set up a library with many network components for HNNs in the Lorentz model. The following components are included:

Lorentz model:

Poincaré ball:

Additional features:

Note. The curvature K of the Lorentz model is defined differently in our paper and in Geoopt: geoopt.K = -1/K, where the K on the right-hand side is the curvature as defined in our paper.
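
The conversion in the note can be written as a small helper. This is a minimal sketch: the function names are hypothetical and are not part of this repository or of Geoopt, and the hyperboloid constraint <x, x>_L = 1/K with negative K is assumed to be the paper's convention.

```python
import math


def paper_to_geoopt_k(K: float) -> float:
    """Map the paper's curvature K (negative) to the Geoopt-style value -1/K."""
    if K >= 0:
        raise ValueError("the paper's curvature K is assumed to be negative")
    return -1.0 / K


def geoopt_to_paper_K(k: float) -> float:
    """Inverse mapping: Geoopt-style k (positive) back to the paper's K."""
    if k <= 0:
        raise ValueError("the Geoopt-style value k is assumed to be positive")
    return -1.0 / k


def lorentz_inner(x, y):
    """Lorentzian inner product, time component first: -x_0*y_0 + sum_i x_i*y_i."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def hyperboloid_origin(K: float, dim: int):
    """Origin of the hyperboloid under the assumed constraint <x, x>_L = 1/K."""
    return [math.sqrt(-1.0 / K)] + [0.0] * dim


if __name__ == "__main__":
    K = -0.5
    origin = hyperboloid_origin(K, dim=3)
    print(paper_to_geoopt_k(K))           # Geoopt-style value for K = -0.5
    print(lorentz_inner(origin, origin))  # should be close to 1/K
```

Keeping the conversion in one place avoids mixing the two conventions when a curvature value from the paper is passed to a Geoopt manifold.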

License

This code is released under the MIT License.

Citing our work

If you find our work useful for your research, please cite our paper as follows:

@inproceedings{Bdeir2024,
     title={Fully Hyperbolic Convolutional Neural Networks for Computer Vision},
     author={Ahmad Bdeir and Kristian Schwethelm and Niels Landwehr},
     booktitle={The Twelfth International Conference on Learning Representations},
     year={2024},
     url={https://openreview.net/forum?id=ekz1hN5QNh}
}

Main results

In this section, we provide our experimental results and configuration files for reproduction.

Classification

In this experiment, we evaluate the performance of HNNs on standard image classification tasks using ResNet-18 and three benchmark datasets: CIFAR-10, CIFAR-100, and Tiny-ImageNet. For this, we employ the training procedure of DeVries and Taylor (2017).

Table 1: Main results (Accuracy (%), estimated average and standard deviation from 5 runs).

| Model | CIFAR-10 | CIFAR-100 | Tiny-ImageNet | Links |
|---|---|---|---|---|
| Euclidean | 95.14 ± 0.12 | 77.72 ± 0.15 | 65.19 ± 0.12 | config |
| Hybrid Poincaré | 95.04 ± 0.13 | 77.19 ± 0.50 | 64.93 ± 0.38 | config |
| Hybrid Lorentz (Ours) | 94.98 ± 0.12 | 78.03 ± 0.21 | 65.63 ± 0.10 | config |
| HCNN Lorentz (Ours) | 95.14 ± 0.08 | 78.07 ± 0.17 | 65.71 ± 0.13 | config |
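
Entries of the form "average ± standard deviation" can be produced from per-run scores as in the sketch below. The run values are made-up placeholders, not results from the paper, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
import statistics


def summarize_runs(scores, digits=2):
    """Format per-run scores as 'mean ± std' using the sample standard deviation."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample estimator (n - 1); an assumption
    return f"{mean:.{digits}f} ± {std:.{digits}f}"


if __name__ == "__main__":
    placeholder_runs = [95.0, 95.1, 95.2, 95.3, 95.4]  # hypothetical accuracies
    print(summarize_runs(placeholder_runs))
```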

Image generation

In this experiment, we evaluate the performance of HNNs on image generation tasks using vanilla VAEs and three benchmark datasets: CIFAR-10, CIFAR-100, and CelebA.

Table 2: Main results (FID, estimated average and standard deviation from 5 runs).

| Model | CIFAR-10 Rec. FID | CIFAR-10 Gen. FID | CIFAR-100 Rec. FID | CIFAR-100 Gen. FID | CelebA Rec. FID | CelebA Gen. FID | Links |
|---|---|---|---|---|---|---|---|
| Euclidean | 61.21 ± 0.72 | 92.40 ± 0.80 | 63.81 ± 0.47 | 103.54 ± 0.84 | 54.80 ± 0.29 | 79.25 ± 0.89 | config |
| Hybrid Poincaré | 59.85 ± 0.50 | 90.13 ± 0.77 | 62.64 ± 0.43 | 98.19 ± 0.57 | 54.62 ± 0.61 | 81.30 ± 0.56 | config |
| Hybrid Lorentz | 59.29 ± 0.47 | 90.91 ± 0.84 | 62.14 ± 0.35 | 98.34 ± 0.62 | 54.64 ± 0.34 | 82.78 ± 0.93 | config |
| HCNN Lorentz (Ours) | 57.78 ± 0.56 | 89.20 ± 0.85 | 61.44 ± 0.64 | 100.27 ± 0.84 | 54.17 ± 0.66 | 78.11 ± 0.95 | config |

Installation

Requirements

  • Python>=3.8

    We recommend using Anaconda:

    conda create -n HCNN python=3.8 pip 
    conda activate HCNN
  • PyTorch, torchvision

    Get the correct install command for your system from the official PyTorch website. For example, for Linux systems with CUDA version 11.7:

    conda install pytorch torchvision pytorch-cuda=11.7 -c pytorch -c nvidia
  • Additional requirements:

    pip install -r requirements.txt
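
After installation, a quick check along the following lines can confirm that the interpreter and the core packages are visible. This is a sketch, not a script shipped with the repository: `check_environment` is a hypothetical helper, and it only tests that the packages can be located, not that their versions match requirements.txt.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "torchvision"), min_python=(3, 8)):
    """Report whether the Python version and the listed packages are available."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```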

Usage

Dataset preparation

We use the CIFAR-10/100 and CelebA implementations from torchvision. Tiny-ImageNet must be downloaded separately and organized as follows:

HyperbolicCV/
└── code/
    └── classification/
        └── data/
            └── tiny-imagenet-200/
                ├── train/
                │   └── images/
                │       ├── n01443537/
                │       │   ├── n01443537_0.JPEG
                │       │   └── ...
                │       └── ...
                └── val/
                    ├── images/
                    │   ├── n01443537/
                    │   │   ├── val_68.JPEG
                    │   │   └── ...
                    │   └── ...
                    └── val_annotations.txt

Use the following steps to download and organize the Tiny-ImageNet dataset automatically.

  1. Download dataset

    cd code/classification
    bash get_tinyimagenet.sh
  2. Organize dataset

    cd code/classification
    python org_tinyimagenet.py
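
Once the dataset is in place, the expected directory layout can be checked with a short script such as the one below. It is a sketch under stated assumptions: `check_tiny_imagenet_layout` is a hypothetical helper, not part of the repository, and it only checks the directories and the annotation file named in the tree above, not the image files themselves.

```python
from pathlib import Path


def check_tiny_imagenet_layout(root):
    """Return a list of problems found under the tiny-imagenet-200 directory."""
    root = Path(root)
    problems = []
    for split in ("train", "val"):
        images = root / split / "images"
        if not images.is_dir():
            problems.append(f"missing directory: {images}")
        elif not any(p.is_dir() for p in images.iterdir()):
            problems.append(f"no class folders in: {images}")
    annotations = root / "val" / "val_annotations.txt"
    if not annotations.is_file():
        problems.append(f"missing file: {annotations}")
    return problems


if __name__ == "__main__":
    # Path relative to the repository root, following the tree shown above.
    issues = check_tiny_imagenet_layout("code/classification/data/tiny-imagenet-200")
    print("layout looks complete" if not issues else "\n".join(issues))
```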

Training

  • For classification, choose a config file and adapt the following command.

    python code/classification/train.py -c classification/config/L-ResNet18.txt

    You can also add additional arguments without changing the config file. For example:

    python code/classification/train.py -c classification/config/L-ResNet18.txt\
       --output_dir classification/output --device cuda:1 --dataset CIFAR-10
  • For generation, choose a config file and adapt the following command.

    python code/generation/train.py -c generation/config/L-VAE/L-VAE-CIFAR.txt

    You can also add additional arguments without changing the config file. For example:

    python code/generation/train.py -c generation/config/L-VAE/L-VAE-CIFAR.txt\
       --output_dir generation/output --device cuda:1 --dataset CIFAR-100
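
When launching several runs, the commands above can be assembled programmatically. The sketch below only builds and prints the argument lists. `build_train_command` is a hypothetical helper, it uses only the flags shown above, and the assumption that the classification script accepts `CIFAR-100` as a `--dataset` value is not confirmed by the examples.

```python
import shlex


def build_train_command(task, config, **overrides):
    """Build an argument list for code/<task>/train.py with optional overrides."""
    if task not in ("classification", "generation"):
        raise ValueError(f"unknown task: {task}")
    command = ["python", f"code/{task}/train.py", "-c", config]
    for flag, value in overrides.items():
        # Each keyword becomes a --flag value pair, e.g. device="cuda:1".
        command += [f"--{flag}", str(value)]
    return command


if __name__ == "__main__":
    for dataset in ("CIFAR-10", "CIFAR-100"):
        cmd = build_train_command(
            "classification",
            "classification/config/L-ResNet18.txt",
            output_dir="classification/output",
            device="cuda:1",
            dataset=dataset,
        )
        print(shlex.join(cmd))  # pass cmd to subprocess.run to launch the run
```

Building a list of arguments rather than a single string avoids shell quoting issues if a path contains spaces.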

Evaluation

  • For classification, we provide a test script with the following options.

    • Test accuracy of a model. For example:

      python code/classification/test.py -c classification/config/L-ResNet18.txt\
        --mode test_accuracy --load_checkpoint PATH/TO/WEIGHTS.pth
    • Visualize embeddings of a model. For example:

      python code/classification/test.py -c classification/config/L-ResNet18.txt\
        --mode visualize_embeddings --load_checkpoint PATH/TO/WEIGHTS.pth --output_dir classification/output
    • Adversarial attacks (FGSM and PGD). For example:

      python code/classification/test.py -c classification/config/L-ResNet18.txt\
        --mode fgsm --load_checkpoint PATH/TO/WEIGHTS.pth
      python code/classification/test.py -c classification/config/L-ResNet18.txt\
        --mode pgd --load_checkpoint PATH/TO/WEIGHTS.pth
  • For generation, we provide a similar test script with the following options.

    • Test FID of a model. For example:

      python code/generation/test.py -c generation/config/L-VAE/L-VAE-CIFAR.txt\
        --mode test_FID --load_checkpoint PATH/TO/WEIGHTS.pth
    • Visualize latent embeddings of a model. For example:

      python code/generation/test.py -c generation/config/L-VAE/L-VAE-MNIST.txt\
        --mode visualize_embeddings --load_checkpoint PATH/TO/WEIGHTS.pth --output_dir generation/output
    • Visualize sample generations/reconstructions of a model. For example:

      python code/generation/test.py -c generation/config/L-VAE/L-VAE-CIFAR.txt\
        --mode generate --load_checkpoint PATH/TO/WEIGHTS.pth --output_dir generation/output
      python code/generation/test.py -c generation/config/L-VAE/L-VAE-CIFAR.txt\
        --mode reconstruct --load_checkpoint PATH/TO/WEIGHTS.pth --output_dir generation/output
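
The evaluation modes listed above differ between the two test scripts, so a small wrapper can reject a mode that does not belong to the chosen task before anything is launched. This is a sketch: `MODES` and `build_test_command` are hypothetical names, and the mode lists contain only the values that appear in the examples above, so the scripts may accept others.

```python
import shlex

# Modes taken from the example commands above; the scripts may support more.
MODES = {
    "classification": ("test_accuracy", "visualize_embeddings", "fgsm", "pgd"),
    "generation": ("test_FID", "visualize_embeddings", "generate", "reconstruct"),
}


def build_test_command(task, config, mode, checkpoint, output_dir=None):
    """Build an argument list for code/<task>/test.py after validating the mode."""
    if task not in MODES:
        raise ValueError(f"unknown task: {task}")
    if mode not in MODES[task]:
        raise ValueError(f"mode {mode!r} is not listed for task {task!r}")
    command = [
        "python", f"code/{task}/test.py", "-c", config,
        "--mode", mode, "--load_checkpoint", checkpoint,
    ]
    if output_dir is not None:
        command += ["--output_dir", output_dir]
    return command


if __name__ == "__main__":
    # Print the two adversarial-attack evaluations for one checkpoint.
    for mode in ("fgsm", "pgd"):
        cmd = build_test_command(
            "classification",
            "classification/config/L-ResNet18.txt",
            mode,
            "PATH/TO/WEIGHTS.pth",
        )
        print(shlex.join(cmd))
```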
