This project forked from tobran/galip


[CVPR2023] A faster, smaller, and better text-to-image model for large-scale training

License: MIT License




GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis (CVPR 2023)

A high-quality, fast, and efficient text-to-image model

Official PyTorch implementation of our paper GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis by Ming Tao, Bing-Kun Bao, Hao Tang, Changsheng Xu.

Generated Images

Requirements

  • Python 3.9
  • PyTorch 1.9
  • At least one 24 GB RTX 3090 GPU (for training)
  • CPU only (sufficient for sampling)

GALIP is a small, fast generative model that can generate multiple images per second, even on a CPU.
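As a quick sanity check of the requirements above, the following minimal sketch reports the Python version, whether PyTorch is importable, and whether a CUDA device is visible. The function name `check_environment` is hypothetical and not part of the GALIP code, and treating "Python 3.9" as a minimum version is an assumption.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 9)):
    """Report whether the listed requirements appear to be satisfied.

    Assumption: the README's "Python 3.9" is read as a minimum version.
    """
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A missing CUDA device only matters for training; according to the requirements list, sampling can run on the CPU.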

Installation

Clone this repo.

git clone https://github.com/tobran/GALIP
pip install -r requirements.txt

Install CLIP

Preparation (Same as DF-GAN)

Datasets

  1. Download the preprocessed metadata for birds and coco and extract it to data/
  2. Download the birds image data. Extract them to data/birds/
  3. Download coco2014 dataset and extract the images to data/coco/images/
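The steps above imply a directory layout under data/. A minimal sketch for verifying it before training might look like this; the helper `missing_dataset_dirs` is hypothetical, and it only checks the two image directories named above, since the layout of the extracted metadata is not specified here.

```python
from pathlib import Path

# Directories named in the dataset steps; anything deeper is not specified there.
EXPECTED_DIRS = ("birds", "coco/images")


def missing_dataset_dirs(data_root="data"):
    """Return the expected dataset directories that do not exist under data_root."""
    root = Path(data_root)
    return [str(root / rel) for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing directories:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected dataset directories are present.")
```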

Training

cd GALIP/code/

Train the GALIP model

  • For bird dataset: bash scripts/train.sh ./cfg/bird.yml
  • For coco dataset: bash scripts/train.sh ./cfg/coco.yml

Resume training process

If your training process is interrupted unexpectedly, set state_epoch, log_dir, and pretrained_model_path in train.sh to resume training.

TensorBoard

Our code supports automatic FID evaluation during training; the results are stored in TensorBoard files under ./logs. You can change the evaluation interval by setting test_interval in the YAML file.

  • For bird dataset: tensorboard --logdir=./code/logs/bird/train --port 8166
  • For coco dataset: tensorboard --logdir=./code/logs/coco/train --port 8177
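If you prefer to adjust test_interval programmatically rather than by hand, the sketch below rewrites that value in the text of a config file. It assumes the key appears on its own line as `test_interval: <value>`; the helper name `set_test_interval` and the sample config text are hypothetical, and the actual layout of cfg/bird.yml or cfg/coco.yml may differ.

```python
import re


def set_test_interval(yaml_text, interval):
    """Return yaml_text with the value on the `test_interval:` line replaced.

    Assumption: the key is written as `test_interval: <value>` on one line.
    Trailing comments on that line are left untouched.
    """
    pattern = re.compile(r"^([ \t]*test_interval[ \t]*:[ \t]*)\S+", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{interval}", yaml_text)
    if count == 0:
        raise KeyError("no test_interval key found")
    return new_text


if __name__ == "__main__":
    # Hypothetical config fragment, for illustration only.
    sample = "batch_size: 32\ntest_interval: 5  # epochs between FID runs\n"
    print(set_test_interval(sample, 10))
```

Editing the text directly, instead of loading and re-dumping the YAML, keeps comments and key order in the file unchanged.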

Evaluation

Download Pretrained Model

  • GALIP for COCO. Download and save it to ./code/saved_models/pretrained/
  • GALIP for CC12M. Download and save it to ./code/saved_models/pretrained/

Evaluate GALIP models

cd GALIP/code/

Set pretrained_model in test.sh.

  • For bird dataset: bash scripts/test.sh ./cfg/bird.yml
  • For COCO dataset: bash scripts/test.sh ./cfg/coco.yml
  • For CC12M (zero-shot on COCO) dataset: bash scripts/test.sh ./cfg/coco.yml

Performance

The released model improves on the paper version on COCO (lower FID, higher CS); the zero-shot FID of the CC12M model is unchanged.

Model            COCO-FID↓  COCO-CS↑  CC12M-ZFID↓
GALIP(paper)     5.85       0.3338    12.54
GALIP(released)  5.01       0.3379    12.54
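To put the table in relative terms, the short sketch below computes the change of each metric from the paper version to the released version. The numbers are copied from the table; the helper `relative_change` is introduced here for illustration only. Note that lower is better for the FID columns and higher is better for CS.

```python
def relative_change(old, new):
    """Return the fractional change from old to new (negative means a decrease)."""
    return (new - old) / old


# Values copied from the performance table above.
paper = {"COCO-FID": 5.85, "COCO-CS": 0.3338, "CC12M-ZFID": 12.54}
released = {"COCO-FID": 5.01, "COCO-CS": 0.3379, "CC12M-ZFID": 12.54}

if __name__ == "__main__":
    for metric, old in paper.items():
        change = relative_change(old, released[metric])
        print(f"{metric}: {change:+.1%}")
```

The COCO FID drops by roughly 14% and the COCO CS rises by roughly 1%, while the CC12M zero-shot FID stays the same.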

Sampling

Synthesize images from your text descriptions

  • Use the sample.ipynb notebook to generate images from your own text descriptions.

Citing GALIP

If you find GALIP useful in your research, please consider citing our paper:


@inproceedings{tao2023galip,
  title={GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis},
  author={Tao, Ming and Bao, Bing-Kun and Tang, Hao and Xu, Changsheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14214--14223},
  year={2023}
}

The code is released for academic research use only. For commercial use, please contact Ming Tao (陶明) ([email protected]).
