I trained the network from scratch but got poor results. Below are my training par

Some details about the training parameters. about twingan HOT 5 CLOSED

jerryli27 commented on August 23, 2024

Some details about the training parameters.

Comments (5)

jerryli27 commented on August 23, 2024

I added two lines to the training script. It should work now.
--gradient_penalty_lambda=0.25 --use_unet=True

The whole script now looks like:

# Assume you have data like
# ./data/celeba/train-00000-of-00100.tfrecord,
# ./data/celeba/train-00001-of-00100.tfrecord ...
python pggan_runner.py \
  --program_name=twingan \
  --dataset_name="image_only" \
  --dataset_dir="./data/celeba/" \
  --unpaired_target_dataset_name="anime_faces" \
  --unpaired_target_dataset_dir="./data/anime_faces/" \
  --train_dir="./checkpoints/twingan_faces/" \
  --dataset_split_name=train \
  --preprocessing_name="danbooru" \
  --resize_mode=RESHAPE \
  --do_random_cropping=True \
  --learning_rate=0.0001 \
  --learning_rate_decay_type=fixed \
  --is_training=True \
  --generator_network="pggan" \
  --use_unet=True \
  --num_images_per_resolution=300000 \
  --loss_architecture=dragan \
  --gradient_penalty_lambda=0.25 \
  --pggan_max_num_channels=256 \
  --generator_norm_type=batch_renorm \
  --hw_to_batch_size="{4: 8, 8: 8, 16: 8, 32: 8, 64: 8, 128: 4, 256: 3, 512: 2}"
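For readers unsure how to read the --hw_to_batch_size value: it is a dict literal mapping image height/width to the batch size used at that resolution. The snippet below is only a minimal sketch of one way such a string could be parsed and queried. The helper name parse_hw_to_batch_size is hypothetical, and this is not necessarily how pggan_runner.py handles the flag internally.

```python
import ast


def parse_hw_to_batch_size(flag_value):
    """Parse a string such as "{4: 8, 8: 8}" into {resolution: batch_size}.

    Hypothetical helper for illustration; not taken from the TwinGAN code.
    """
    mapping = ast.literal_eval(flag_value)  # safely evaluates the dict literal
    if not isinstance(mapping, dict):
        raise ValueError("hw_to_batch_size must be a dict literal")
    return {int(hw): int(batch_size) for hw, batch_size in mapping.items()}


# The value used in the script above.
hw_to_batch_size = parse_hw_to_batch_size(
    "{4: 8, 8: 8, 16: 8, 32: 8, 64: 8, 128: 4, 256: 3, 512: 2}"
)

# Batch size that would be used while training at 256x256.
print(hw_to_batch_size[256])
```

Lowering the values for the largest resolutions is the usual first thing to try when memory runs out late in training.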

I haven't tested the multi-GPU setting thoroughly because of hardware limits, so there may well be a bug, but you can try adding the following flags.

--sync_replicas=False
--replicas_to_aggregate=1
--num_clones=2
--worker_replicas=1

I updated the training readme with the comments above.

jerryli27 commented on August 23, 2024

Hi @lionel3 I updated the training documentation. There was indeed a bug in my default parameters. After fixing that I am able to reproduce my previous results.

Please sync to the latest version and see https://github.com/jerryli27/TwinGAN/blob/master/docs/training.md .

The parameters I added are:

--do_pixel_norm=True
--l_content_weight=0.1
--l_cycle_weight=1.0

Please reopen this issue if you cannot reproduce. Thanks!

jerryli27 commented on August 23, 2024

Yes you're right. Sorry for the wrong documentation. I'll push a newer version shortly.

The --num_images_per_resolution I used was 300000. Of course 600000 should also work, but it takes longer to train.
Please change to --resize_mode=RESHAPE.
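
To get a rough feel for how much longer 600000 takes than 300000, the sketch below counts optimizer steps per resolution as images divided by batch size, using the batch sizes from the --hw_to_batch_size flag in the training script. It assumes every resolution consumes exactly num_images_per_resolution images in a single phase, which is a simplification: the real schedule (for example separate fade-in and stable phases) may count differently. The function name estimate_total_steps is made up for this example.

```python
import math

# Batch sizes per resolution, copied from the --hw_to_batch_size flag.
HW_TO_BATCH_SIZE = {4: 8, 8: 8, 16: 8, 32: 8, 64: 8, 128: 4, 256: 3, 512: 2}


def estimate_total_steps(num_images_per_resolution, hw_to_batch_size):
    """Sum ceil(images / batch_size) over all resolutions.

    Simplifying assumption: one phase per resolution, each consuming
    num_images_per_resolution images. Illustrative only.
    """
    return sum(
        math.ceil(num_images_per_resolution / batch_size)
        for batch_size in hw_to_batch_size.values()
    )


steps_300k = estimate_total_steps(300000, HW_TO_BATCH_SIZE)
steps_600k = estimate_total_steps(600000, HW_TO_BATCH_SIZE)
print(steps_300k, steps_600k)  # 512500 1025000
```

Under these assumptions the step count simply doubles, and most of the steps are spent at the 128, 256 and 512 resolutions where the batch size is smallest.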

FYI, --do_random_cropping=True is there in case the quality at inference time is poor because the face is not at the center of the image.

I am rerunning the exact code that I provided in the training example code. It will take a day or two for me to verify that it works.

lionel3 commented on August 23, 2024

Thanks for your answer.

Besides, when training with
'hw_to_batch_size', '{4: 16, 8: 16, 16: 16, 32: 16, 64: 12, 128: 12, 256: 12, 512: 6}',
I got ResourceExhaustedError: OOM when allocating tensor with ... during the fade-in phase from resolution 128 to 256. I got the same error when trying 2 GPUs.
I am not familiar with TensorFlow, so I guess there may be a bug in multi-GPU training.

I will try to reproduce the error and show more training details once I have an idle GPU.
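
As a rough sanity check on the OOM, the sketch below compares the batch sizes reported here with the ones in the maintainer's training script, using batch_size * height * width as a crude proxy for per-layer activation memory. This proxy is an assumption made for illustration: it ignores channel counts, the optimizer state, and the fact that two resolutions are live during fade-in, so treat the ratios as relative indications only.

```python
def relative_activation_cost(batch_size, hw):
    """Crude proxy for activation memory: number of pixels per batch."""
    return batch_size * hw * hw


# Batch sizes that produced the OOM in this comment.
reported = {4: 16, 8: 16, 16: 16, 32: 16, 64: 12, 128: 12, 256: 12, 512: 6}
# Batch sizes from the maintainer's training script.
suggested = {4: 8, 8: 8, 16: 8, 32: 8, 64: 8, 128: 4, 256: 3, 512: 2}

# How many times larger the reported setting is at each resolution.
ratios = {
    hw: relative_activation_cost(reported[hw], hw)
    / relative_activation_cost(suggested[hw], hw)
    for hw in sorted(reported)
}

for hw, ratio in ratios.items():
    print(f"{hw:>4}: reported/suggested = {ratio:.1f}x")
```

By this measure the reported setting is about 4x heavier at resolution 256, which is consistent with memory running out around the 128 to 256 transition even without any multi-GPU bug.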

lionel3 commented on August 23, 2024

Thanks, I will try it out asap.
