
EigenGAN-Tensorflow

A TensorFlow implementation of EigenGAN: Layer-Wise Eigen-Learning for GANs (ICCV 2021).

[Demo images showing traversals along learned dimensions: Gender, Bangs, Body Side, Pose (Yaw); Lighting, Smile, Face Shape, Lipstick Color; Painting Style, Pose (Yaw), Pose (Pitch), Zoom & Rotate; Flush & Eye Color, Mouth Shape, Hair Color, Hue (Orange-Blue)]

Usage

  • Environment

    • Python 3.6

    • TensorFlow 1.15

    • OpenCV, scikit-image, tqdm, oyaml

    • we recommend Anaconda or Miniconda; you can then create the environment with the commands below

      conda create -n EigenGAN python=3.6
      
      source activate EigenGAN
      
      conda install opencv scikit-image tqdm tensorflow-gpu=1.15
      
      conda install -c conda-forge oyaml
    • NOTICE: if you create a new conda environment, remember to activate it before running any other command

      source activate EigenGAN
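    • optional sanity check: a minimal sketch (not a script from this repository) to confirm the environment provides TensorFlow 1.15 and sees a GPU

      # verify the TensorFlow build and GPU visibility
      import tensorflow as tf

      print(tf.__version__)               # expect 1.15.x
      print(tf.test.is_gpu_available())   # True if the CUDA setup works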
  • Data Preparation

    • CelebA-unaligned (10.2GB, higher quality than the aligned data)

      • download the dataset

      • unzip and process the data

        7z x ./data/img_celeba/img_celeba.7z/img_celeba.7z.001 -o./data/img_celeba/
        
        unzip ./data/img_celeba/annotations.zip -d ./data/img_celeba/
        
        python ./scripts/align.py
    • Anime

      • download the dataset

        mkdir -p ./data/anime
        
        rsync --verbose --recursive rsync://176.9.41.242:873/biggan/portraits/ ./data/anime/original_imgs
      • process the data

        python ./scripts/remove_black_edge.py
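    • optional sanity check: a minimal sketch to count the prepared images; the paths are assumed from the training commands below

      # count the prepared images (paths taken from the training commands below)
      import glob

      print(len(glob.glob('./data/img_celeba/aligned/*/data/*.jpg')), 'aligned CelebA images')
      print(len(glob.glob('./data/anime/remove_black_edge_imgs/*')), 'processed anime images')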
  • Run (multi-GPU supported)

    • training on CelebA

      CUDA_VISIBLE_DEVICES=0,1 \
      python train.py \
      --img_dir ./data/img_celeba/aligned/align_size(572,572)_move(0.250,0.000)_face_factor(0.450)_jpg/data \
      --experiment_name CelebA
    • training on Anime

      CUDA_VISIBLE_DEVICES=0,1 \
      python train.py \
      --img_dir ./data/anime/remove_black_edge_imgs \
      --experiment_name Anime
    • testing

      CUDA_VISIBLE_DEVICES=0 \
      python test_traversal_all_dims.py \
      --experiment_name CelebA
    • loss visualization

      CUDA_VISIBLE_DEVICES='' \
      tensorboard \
      --logdir ./output/CelebA/summaries \
      --port 6006
  • Using Trained Weights

    • download the trained weights and move them to ./output/ (as ./output/*.zip)

    • unzip the file (CelebA.zip for example)

      unzip ./output/CelebA.zip -d ./output/
    • testing (see above)

Citation

If you find EigenGAN useful in your research work, please consider citing:

@inproceedings{he2021eigengan,
  title={EigenGAN: Layer-Wise Eigen-Learning for GANs},
  author={He, Zhenliang and Kan, Meina and Shan, Shiguang},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2021}
}


Issues

Resources required?

Hello,
Exciting project. Could you please share the minimum resource requirements to train this model?
I am getting memory errors when training on 500,000 256x256 images with four 40 GB A100 GPUs; CPU memory is 128 GB.

I tensorflow/stream_executor/stream.cc:1990] [stream=0x5570d9fd4600,impl=0x5570d9fd2050] did not wait for [stream=0x5570d9fd40a0,impl=0x5570d9fd2940]
2021-05-26 01:09:18.033272: I tensorflow/stream_executor/stream.cc:4925] [stream=0x5570d9fd4600,impl=0x5570d9fd2050] did not memcpy device-to-host; source: 0x2ac3e5d41a00
2021-05-26 01:09:18.033313: F tensorflow/core/common_runtime/gpu/gpu_util.cc:293] GPU->CPU Memcpy failed
/hpc/users/marxg01/.lsbatch/1622001386.34151325.shell: line 23: 389547 Aborted (core dumped)
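A common first step for TF 1.15 memory problems of this kind (a general suggestion, not a configuration taken from this repository) is to let the session allocate GPU memory incrementally instead of reserving it all up front:

# generic TF 1.15 session setup; allow_growth grows allocations as needed
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)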

maximum of n_traversal?

Hi, thanks for your great work!

I'd like to know: what is the maximum value of the parameter 'n_traversal' when testing the generator on the CelebA dataset?

Question on Output dimension of Generator

Hello! I would first like to thank you for sharing the code for such wonderful work.
I have some questions about the intermediate output dimensions of the feature maps in the generator.

According to Figure 16, I presumed the output dimension of f (φ1) should be (1024, height, width), since the first 1x1 convolution is noted as DeConv(2^(11-i), 1, 1). To my understanding, this would not match the noise input dimension of (512, 4, 4).

Examining your module.py code, however, the output dimension is controlled by the nd() function, which has an upper limit of 512, unlike the output dimension noted in Figure 16. I presume this function was introduced to solve the dimension mismatch mentioned above.

In addition, I noticed an outlier in the dimension of the orthonormal basis. To my understanding, the dimension of the orthonormal basis u_ij ∈ R^(Hi×Wi×Ci) should double when moving up one layer, since the height and width double while Ci is halved. However, with nd(height) as the parameter determining the dimension of U in module.py, the dimension quadruples when moving from layer 1 to layer 2 (8192 -> 32768), unlike the expected behavior.
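For concreteness, the per-layer basis dimension H*W*nd(H) implied by the nd() definition (quoted from the toy code below) works out as follows:

# per-layer basis dimension, using nd() as defined in the toy code below
import numpy as np

nd = lambda size: min(int(2 ** (12 - np.log2(size))), 512)
for i in range(6):
    h = 4 * 2 ** i
    print(f'layer {i + 1}: {h}x{h}x{nd(h)} -> {h * h * nd(h)}')
# layer 1: 4x4x512 -> 8192
# layer 2: 8x8x512 -> 32768 (x4, because nd() is capped at 512)
# layer 3+: x2 per layer once nd() starts halving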

Regarding the issues above, I would like to ask the following questions.

  1. Was the nd() function introduced just to match the dimensions, or are there other training gains from using these output dimensions?
  2. Are there any test results with the initial noise dimension set to 1024, i.e. eps_dim = 1024, so that the generator architecture works as given in Figure 16?
  3. Have I understood the dimension of the orthonormal basis correctly?

If there is anything I have misunderstood, please kindly point it out.
Thank you for your kind attention.

Attached below is the toy code I used to estimate the dimensions of the intermediate feature maps:

import numpy as np
import tensorflow as tf
from tensorflow.contrib import layers

# assumed aliases: the snippet's helper names are mapped onto tf.contrib.layers ops
transposed_convolution2d = layers.conv2d_transpose  # deconv
convolution = layers.conv2d

zs = [tf.random.normal([64, z_dim]) for z_dim in [6] * 6]  # 6 subspaces, 6 dims each
h = tf.random.normal([64, 4 * 4 * 512])
h = tf.reshape(h, [-1, 4, 4, 512])  # noise input epsilon

nd = lambda size: min(int(2 ** (12 - np.log2(size))), 512)  # channel count per resolution

print(f'Noise shape : {h[0].shape}')

for i, z in enumerate(zs):
    height = width = 4 * 2 ** i
    z_dim = int(z.shape[-1])
    # orthonormal basis U, importance L, and offset mu of layer i
    U = tf.compat.v1.get_variable(f'U_{i}', shape=[height, width, nd(height), z_dim],
                                  initializer=tf.initializers.orthogonal())
    L = tf.compat.v1.get_variable(f'L_{i}', shape=[z_dim],
                                  initializer=tf.initializers.constant([3 * j for j in range(z_dim, 0, -1)]))
    mu = tf.compat.v1.get_variable(f'mu_{i}', shape=[height, width, nd(height)],
                                   initializer=tf.initializers.zeros())
    print(f'basis dimension : {tf.reshape(U[:, :, :, 0], [-1]).shape}')

    # phi_i = U diag(L) z + mu, injected into the main branch
    h_ = tf.reduce_sum(U[None, ...] * (L[None, :] * z)[:, None, None, None, :], axis=-1) + mu[None, ...]
    h1 = transposed_convolution2d(h_, num_outputs=nd(height), kernel_size=1)              # 1x1 deconv
    h2 = transposed_convolution2d(h_, num_outputs=nd(height * 2), kernel_size=3, stride=2)

    h0 = transposed_convolution2d(h + h1, nd(height * 2), 3, 2)
    print(h1.shape, h2.shape, h0.shape)
    h = transposed_convolution2d(h0 + h2, nd(height * 2), 3, 1)
    print(f'output shape of layer {i + 1} : {h[0].shape}')

print(f'final result : {convolution(h, num_outputs=3, kernel_size=7)[0].shape}')  # 7x7 conv to RGB

Danbooru dataset

Hi~
Could you release a Baidu NetDisk link for the Danbooru dataset?
Thank you so much!

Projecting real images into the latent space

Hello!

I was wondering whether there is planned functionality to recover the latent variables (U, L, z, etc.) of a real image similar to the data the GAN was trained on; essentially, a way to recreate the metrics in Table 1 of the publication.
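One approach discussed in the GAN-inversion literature (a hedged sketch only; this functionality is not in the repository, and the `generator` below is a hypothetical stand-in for the trained EigenGAN G) is to freeze the generator and optimize its latent inputs to reconstruct the target image:

# optimization-based inversion sketch (TF 1.15 graph mode)
import tensorflow as tf

def generator(zs, eps):
    # hypothetical stub with the assumed interface; substitute the real,
    # weight-restored EigenGAN generator here
    h = tf.layers.dense(tf.concat(zs + [eps], axis=-1), 64 * 64 * 3)
    return tf.reshape(h, [-1, 64, 64, 3])

target = tf.placeholder(tf.float32, [1, 64, 64, 3])   # real image (toy resolution)
zs = [tf.get_variable(f'z_inv_{i}', [1, 6], initializer=tf.zeros_initializer())
      for i in range(6)]                              # per-layer subspace coordinates
eps = tf.get_variable('eps_inv', [1, 512], initializer=tf.zeros_initializer())
fake = generator(zs, eps)
loss = tf.reduce_mean(tf.square(fake - target))       # pixel reconstruction loss
step = tf.train.AdamOptimizer(1e-2).minimize(loss, var_list=zs + [eps])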
