eladrich / pixel2style2pixel

Official Implementation for "Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation" (CVPR 2021) presenting the pixel2style2pixel (pSp) framework

Home Page: https://eladrich.github.io/pixel2style2pixel/

License: MIT License

Python 3.91% C++ 0.04% Cuda 0.26% Jupyter Notebook 95.75% Shell 0.04%

image-translation stylegan generative-adversarial-network stylegan-encoder cvpr2021 pixel2style2pixel psp-model psp-framework

pixel2style2pixel's Introduction

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

We present a generic image-to-image translation framework, pixel2style2pixel (pSp). Our pSp framework is based on a novel encoder network that directly generates a series of style vectors which are fed into a pretrained StyleGAN generator, forming the extended W+ latent space. We first show that our encoder can directly embed real images into W+, with no additional optimization. Next, we propose utilizing our encoder to directly solve image-to-image translation tasks, defining them as encoding problems from some input domain into the latent domain. By deviating from the standard "invert first, edit later" methodology used with previous StyleGAN encoders, our approach can handle a variety of tasks even when the input image is not represented in the StyleGAN domain. We show that solving translation tasks through StyleGAN significantly simplifies the training process, as no adversary is required, has better support for solving tasks without pixel-to-pixel correspondence, and inherently supports multi-modal synthesis via the resampling of styles. Finally, we demonstrate the potential of our framework on a variety of facial image-to-image translation tasks, even when compared to state-of-the-art solutions designed specifically for a single task, and further show that it can be extended beyond the human facial domain.

The proposed pixel2style2pixel framework can be used to solve a wide variety of image-to-image translation tasks. Here we show results of pSp on StyleGAN inversion, multi-modal conditional image synthesis, facial frontalization, inpainting and super-resolution.

Description

Official Implementation of our pSp paper for both training and evaluation. The pSp method extends the StyleGAN model to allow solving different image-to-image translation problems using its encoder.

Table of Contents

Description
Recent Updates
Applications
Getting Started
Training
Testing
Additional Applications
- Toonify
Repository structure
TODOs
Credits
Inspired by pSp
pSp in the Media
Citation

Recent Updates

2020.10.04: Initial code release
2020.10.06: Add pSp toonify model (Thanks to the great work from Doron Adler and Justin Pinkney)!
2021.04.23: Added several new features:

Added support for StyleGANs of different resolutions (e.g., 256, 512, 1024). This can be set using the flag --output_size, which is set to 1024 by default.
Added support for the MoCo-Based similarity loss introduced in encoder4editing (Tov et al. 2021). More details are provided below.

2021.07.06: Added support for training with Weights & Biases. See below for details.

Applications

StyleGAN Encoding

Here, we use pSp to find the latent code of real images in the latent domain of a pretrained StyleGAN generator.

Face Frontalization

In this application we want to generate a front-facing face from a given input image.

Conditional Image Synthesis

Here we wish to generate photo-realistic face images from ambiguous sketch images or segmentation maps. Using style-mixing, we inherently support multi-modal synthesis for a single input.

Super Resolution

Given a low-resolution input image, we generate a corresponding high-resolution image. As this too is an ambiguous task, we can use style-mixing to produce several plausible results.

Getting Started

Prerequisites

Linux or macOS
NVIDIA GPU + CUDA CuDNN (CPU may be possible with some modifications, but is not inherently supported)
Python 2 or 3

Installation

Clone this repo:

git clone https://github.com/eladrich/pixel2style2pixel.git
cd pixel2style2pixel

Dependencies:
We recommend running this repository using Anaconda. All dependencies for defining the environment are provided in environment/psp_env.yaml.

Inference Notebook

To help visualize the pSp framework on multiple tasks and to help you get started, we provide a Jupyter notebook found in notebooks/inference_playground.ipynb that allows one to visualize the various applications of pSp.
The notebook will download the necessary pretrained models and run inference on the images found in notebooks/images.
For the tasks of conditional image synthesis and super resolution, the notebook also demonstrates pSp's ability to perform multi-modal synthesis using style-mixing.

Pretrained Models

Please download the pre-trained models from the following links. Each pSp model contains the entire pSp architecture, including the encoder and decoder weights.

Path	Description
StyleGAN Inversion	pSp trained with the FFHQ dataset for StyleGAN inversion.
Face Frontalization	pSp trained with the FFHQ dataset for face frontalization.
Sketch to Image	pSp trained with the CelebA-HQ dataset for image synthesis from sketches.
Segmentation to Image	pSp trained with the CelebAMask-HQ dataset for image synthesis from segmentation maps.
Super Resolution	pSp trained with the CelebA-HQ dataset for super resolution (up to x32 down-sampling).
Toonify	pSp trained with the FFHQ dataset for toonification using StyleGAN generator from Doron Adler and Justin Pinkney.

If you wish to use one of the pretrained models for training or inference, you may do so using the flag --checkpoint_path.

In addition, we provide various auxiliary models needed for training your own pSp model from scratch as well as pretrained models needed for computing the ID metrics reported in the paper.

Path	Description
FFHQ StyleGAN	StyleGAN model pretrained on FFHQ taken from rosinality with 1024x1024 output resolution.
IR-SE50 Model	Pretrained IR-SE50 model taken from TreB1eN for use in our ID loss during pSp training.
MoCo ResNet-50	Pretrained ResNet-50 model trained using MOCOv2 for computing MoCo-based similarity loss on non-facial domains. The model is taken from the official implementation.
CurricularFace Backbone	Pretrained CurricularFace model taken from HuangYG123 for use in ID similarity metric computation.
MTCNN	Weights for MTCNN model taken from TreB1eN for use in ID similarity metric computation. (Unpack the tar.gz to extract the 3 model weights.)

By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models. However, you may use your own paths by changing the necessary values in configs/paths_config.py.

Training

Preparing your Data

Currently, we provide support for numerous datasets and experiments (encoding, frontalization, etc.).
- Refer to configs/paths_config.py to define the necessary data paths and model paths for training and evaluation.
- Refer to configs/transforms_config.py for the transforms defined for each dataset/experiment.
- Finally, refer to configs/data_configs.py for the source/target data paths for the train and test sets as well as the transforms.
If you wish to experiment with your own dataset, you can simply make the necessary adjustments in:
1. data_configs.py to define your data paths.
2. transforms_config.py to define your own data transforms.

As an example, assume we wish to run encoding using ffhq (dataset_type=ffhq_encode). We first go to configs/paths_config.py and define:

dataset_paths = {
    'ffhq': '/path/to/ffhq/images256x256',
    'celeba_test': '/path/to/CelebAMask-HQ/test_img',
}

The transforms for the experiment are defined in the class EncodeTransforms in configs/transforms_config.py.
Finally, in configs/data_configs.py, we define:

DATASETS = {
    'ffhq_encode': {
        'transforms': transforms_config.EncodeTransforms,
        'train_source_root': dataset_paths['ffhq'],
        'train_target_root': dataset_paths['ffhq'],
        'test_source_root': dataset_paths['celeba_test'],
        'test_target_root': dataset_paths['celeba_test'],
    },
}

When defining our datasets, we will take the values in the above dictionary.
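
As a minimal sketch of how such a lookup might work: the helper get_dataset_args below is hypothetical and not part of the repository, and a plain string stands in for the transforms class so the snippet runs on its own. The paths are the placeholders from the example above.

```python
# Placeholder paths, mirroring the example entries shown above.
dataset_paths = {
    'ffhq': '/path/to/ffhq/images256x256',
    'celeba_test': '/path/to/CelebAMask-HQ/test_img',
}

DATASETS = {
    'ffhq_encode': {
        # The real config references transforms_config.EncodeTransforms;
        # a string stands in for it here to keep the sketch self-contained.
        'transforms': 'EncodeTransforms',
        'train_source_root': dataset_paths['ffhq'],
        'train_target_root': dataset_paths['ffhq'],
        'test_source_root': dataset_paths['celeba_test'],
        'test_target_root': dataset_paths['celeba_test'],
    },
}


def get_dataset_args(dataset_type):
    """Hypothetical helper: return the config entry for a --dataset_type value."""
    if dataset_type not in DATASETS:
        raise ValueError(f'{dataset_type} is not a valid dataset_type')
    return DATASETS[dataset_type]


args = get_dataset_args('ffhq_encode')
print(args['train_source_root'])  # /path/to/ffhq/images256x256
```

An unknown dataset_type fails early with a ValueError instead of surfacing later as a missing path.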

Training pSp

The main training script can be found in scripts/train.py.
Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs.
Additionally, if you have tensorboard installed, you can visualize tensorboard logs in opts.exp_dir/logs.

Training the pSp Encoder

python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=/path/to/experiment \
--workers=8 \
--batch_size=8 \
--test_batch_size=8 \
--test_workers=8 \
--val_interval=2500 \
--save_interval=5000 \
--encoder_type=GradualStyleEncoder \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.1

Frontalization

python scripts/train.py \
--dataset_type=ffhq_frontalize \
--exp_dir=/path/to/experiment \
--workers=8 \
--batch_size=8 \
--test_batch_size=8 \
--test_workers=8 \
--val_interval=2500 \
--save_interval=5000 \
--encoder_type=GradualStyleEncoder \
--start_from_latent_avg \
--lpips_lambda=0.08 \
--l2_lambda=0.001 \
--lpips_lambda_crop=0.8 \
--l2_lambda_crop=0.01 \
--id_lambda=1 \
--w_norm_lambda=0.005

Sketch to Face

python scripts/train.py \
--dataset_type=celebs_sketch_to_face \
--exp_dir=/path/to/experiment \
--workers=8 \
--batch_size=8 \
--test_batch_size=8 \
--test_workers=8 \
--val_interval=2500 \
--save_interval=5000 \
--encoder_type=GradualStyleEncoder \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0 \
--w_norm_lambda=0.005 \
--label_nc=1 \
--input_nc=1

Segmentation Map to Face

python scripts/train.py \
--dataset_type=celebs_seg_to_face \
--exp_dir=/path/to/experiment \
--workers=8 \
--batch_size=8 \
--test_batch_size=8 \
--test_workers=8 \
--val_interval=2500 \
--save_interval=5000 \
--encoder_type=GradualStyleEncoder \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0 \
--w_norm_lambda=0.005 \
--label_nc=19 \
--input_nc=19

Note that no identity loss is used for conditional image synthesis (i.e., --id_lambda=0).

Super Resolution

python scripts/train.py \
--dataset_type=celebs_super_resolution \
--exp_dir=/path/to/experiment \
--workers=8 \
--batch_size=8 \
--test_batch_size=8 \
--test_workers=8 \
--val_interval=2500 \
--save_interval=5000 \
--encoder_type=GradualStyleEncoder \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.1 \
--w_norm_lambda=0.005 \
--resize_factors=1,2,4,8,16,32
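
The --resize_factors flag above takes a comma-separated list of down-sampling factors. The sketch below illustrates one plausible reading of it, under the assumption that a single factor is drawn at random for each training image and that inputs are 256x256. The helper names are hypothetical and are not taken from the repository.

```python
import random


def parse_resize_factors(value):
    """Parse a comma-separated flag value such as '1,2,4,8,16,32'."""
    return [int(f) for f in value.split(',')]


def low_res_size(full_size, factors, rng=random):
    """Pick one factor at random and return (factor, down-sampled side length)."""
    factor = rng.choice(factors)
    return factor, full_size // factor


factors = parse_resize_factors('1,2,4,8,16,32')
# Assumption: 256x256 training inputs; a seeded generator keeps the draw repeatable.
factor, side = low_res_size(256, factors, random.Random(0))
print(f'factor x{factor} -> {side}x{side} input')
```

With a factor of 32, a 256x256 image is reduced to 8x8 before being passed to the encoder, which is why the task is highly ambiguous.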

Additional Notes

See options/train_options.py for all training-specific flags.
See options/test_options.py for all test-specific flags.
If you wish to resume from a specific checkpoint (e.g. a pretrained pSp model), you may do so using --checkpoint_path.
By default, we assume that the StyleGAN used outputs images at resolution 1024x1024. If you wish to use a StyleGAN at a smaller resolution, you can do so by using the flag --output_size (e.g., --output_size=256).
If you wish to generate images from segmentation maps, please specify --label_nc=N and --input_nc=N where N is the number of semantic categories.
Similarly, for generating images from sketches, please specify --label_nc=1 and --input_nc=1.
Specifying --label_nc=0 (the default value), will directly use the RGB colors as input.
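
The choice of --output_size also determines how many style vectors the encoder has to predict. A minimal sketch, assuming the usual StyleGAN2 layout of 2*log2(size) - 2 style inputs (the function name is ours, for illustration only):

```python
import math


def n_styles_for(output_size):
    """Number of W+ style vectors for a StyleGAN2 generator at this resolution."""
    log_size = int(math.log2(output_size))
    if 2 ** log_size != output_size:
        raise ValueError('output_size must be a power of two')
    return 2 * log_size - 2


for size in (256, 512, 1024):
    print(size, n_styles_for(size))
# 256 14
# 512 16
# 1024 18
```

So a 256x256 generator works in a 14x512 W+ space, while the default 1024x1024 generator uses 18x512.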

Identity/Similarity Losses
In pSp, we introduce a facial identity loss using a pre-trained ArcFace network for facial recognition. When operating on the human facial domain, we highly recommend employing this loss objective by using the flag --id_lambda.
In a more recent paper, encoder4editing, the authors generalize this identity loss to other domains by using a MoCo-based ResNet to extract features instead of an ArcFace network. Applying this MoCo-based similarity loss can be done by using the flag --moco_lambda. We recommend setting --moco_lambda=0.5 in your experiments.
Please note, you cannot set both id_lambda and moco_lambda to be active simultaneously (e.g., to use the MoCo-based loss, you should specify, --moco_lambda=0.5 --id_lambda=0).
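
The constraint above can be expressed as a small validation step. This is an illustrative sketch only: the function name and error message are hypothetical and do not reproduce the repository's own check.

```python
def select_similarity_loss(id_lambda, moco_lambda):
    """Return which similarity loss is active, rejecting configs that enable both."""
    if id_lambda > 0 and moco_lambda > 0:
        raise ValueError('id_lambda and moco_lambda cannot both be greater than 0')
    if id_lambda > 0:
        return 'id'
    if moco_lambda > 0:
        return 'moco'
    return None


print(select_similarity_loss(id_lambda=0.1, moco_lambda=0))  # id
print(select_similarity_loss(id_lambda=0, moco_lambda=0.5))  # moco
```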

Weights & Biases Integration

To help track your experiments, we've integrated Weights & Biases into our training process. To enable Weights & Biases (wandb), first make an account on the platform's webpage and install wandb using pip install wandb. Then, to train pSp using wandb, simply add the flag --use_wandb.

Note that when running for the first time, you will be asked to provide your access key which can be accessed via the Weights & Biases platform.

Using Weights & Biases will allow you to visualize the training and testing loss curves as well as intermediate training results.

Testing

Inference

Having trained your model, you can use scripts/inference.py to apply the model on a set of images.
For example,

python scripts/inference.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/experiment/checkpoints/best_model.pt \
--data_path=/path/to/test_data \
--test_batch_size=4 \
--test_workers=4 \
--couple_outputs

Additional notes to consider:

During inference, the options used during training are loaded from the saved checkpoint and are then updated using the test options passed to the inference script. For example, there is no need to pass --dataset_type or --label_nc to the inference script, as they are taken from the loaded opts.
When running inference for segmentation-to-image or sketch-to-image, it is highly recommended to do so with style-mixing, as is done in the paper. This can simply be done by adding --latent_mask=8,9,10,11,12,13,14,15,16,17 when calling the script.
When running inference for super-resolution, please provide a single down-sampling value using --resize_factors.
Adding the flag --couple_outputs will save an additional image containing the input and output images side-by-side in the sub-directory inference_coupled. Otherwise, only the output image is saved to the sub-directory inference_results.
By default, the images will be saved at a resolution of 1024x1024, the original output size of StyleGAN. If you wish to save outputs resized to 256x256, you can do so by adding the flag --resize_outputs.
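
The first note above, on how training and test options are combined, can be sketched as follows. The dictionary contents are illustrative stand-ins, and merge_opts is a hypothetical helper rather than the script's actual code.

```python
from argparse import Namespace


def merge_opts(train_opts, test_opts):
    """Start from the training options, then let the test options override them."""
    merged = dict(train_opts)
    merged.update(vars(test_opts))
    return Namespace(**merged)


# Stand-in for the options stored in a checkpoint (values are illustrative).
train_opts = {'dataset_type': 'celebs_seg_to_face', 'label_nc': 19, 'test_batch_size': 8}
# Stand-in for the flags parsed by the inference script.
test_opts = Namespace(test_batch_size=4, couple_outputs=True)

opts = merge_opts(train_opts, test_opts)
print(opts.dataset_type, opts.label_nc, opts.test_batch_size)
# celebs_seg_to_face 19 4
```

Values given only at training time (dataset_type, label_nc) carry over, while anything passed at test time takes precedence.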

Multi-Modal Synthesis with Style-Mixing

Given a trained model for conditional image synthesis or super-resolution, we can easily generate multiple outputs for a given input image. This can be done using the script scripts/style_mixing.py.
For example, running the following command will perform style-mixing for a segmentation-to-image experiment:

python scripts/style_mixing.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=/path/to/experiment/checkpoints/best_model.pt \
--data_path=/path/to/test_data/ \
--test_batch_size=4 \
--test_workers=4 \
--n_images=25 \
--n_outputs_to_generate=5 \
--latent_mask=8,9,10,11,12,13,14,15,16,17

Here, we inject 5 randomly drawn vectors and perform style-mixing on the latents [8,9,10,11,12,13,14,15,16,17].

Additional notes to consider:

To perform style-mixing on a subset of images, you may use the flag --n_images. The default value of None will perform style mixing on every image in the given data_path.
You may also include the argument --mix_alpha=m where m is a float defining the mixing coefficient between the input latent and the randomly drawn latent.
When performing style-mixing for super-resolution, please provide a single down-sampling value using --resize_factors.
By default, the images will be saved at a resolution of 1024x1024, the original output size of StyleGAN. If you wish to save outputs resized to 256x256, you can do so by adding the flag --resize_outputs.
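
To make the roles of --latent_mask and --mix_alpha concrete, here is a simplified sketch on plain Python lists. It assumes that mix_alpha weights the randomly drawn latent and that omitting it replaces the masked styles outright; both conventions are assumptions for illustration. The random vectors are drawn directly here, whereas the real pipeline works with StyleGAN latents, and the style dimension is shortened from 512 to 4.

```python
import random


def mix_styles(codes, inject, latent_mask, mix_alpha=None):
    """Return a copy of `codes` with the masked style vectors mixed or replaced.

    codes, inject: lists of style vectors (each a list of floats).
    latent_mask:   indices of the style vectors to modify.
    mix_alpha:     weight of the injected latent; None means full replacement.
    """
    mixed = [list(vec) for vec in codes]
    for i in latent_mask:
        if mix_alpha is None:
            mixed[i] = list(inject[i])
        else:
            mixed[i] = [mix_alpha * r + (1 - mix_alpha) * c
                        for r, c in zip(inject[i], codes[i])]
    return mixed


rng = random.Random(0)
n_styles, dim = 18, 4  # 18 styles for a 1024x1024 generator; dim shortened from 512
codes = [[0.0] * dim for _ in range(n_styles)]
inject = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_styles)]
latent_mask = [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]

mixed = mix_styles(codes, inject, latent_mask, mix_alpha=0.5)
print(mixed[0])  # untouched: coarse styles 0-7 keep the encoded input
print(mixed[8])  # blended with the random latent
```

Masking only the later indices leaves the coarse styles, which carry the structure of the input, untouched and varies only the finer styles.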

Computing Metrics

Similarly, given a trained model and generated outputs, we can compute the loss metrics on a given dataset.
These scripts receive the inference output directory and ground truth directory.

Calculating the identity loss:

python scripts/calc_id_loss_parallel.py \
--data_path=/path/to/experiment/inference_outputs \
--gt_path=/path/to/test_images

Calculating LPIPS loss:

python scripts/calc_losses_on_images.py \
--mode lpips \
--data_path=/path/to/experiment/inference_outputs \
--gt_path=/path/to/test_images

Calculating L2 loss:

python scripts/calc_losses_on_images.py \
--mode l2 \
--data_path=/path/to/experiment/inference_outputs \
--gt_path=/path/to/test_images

Additional Applications

To better show the flexibility of our pSp framework we present additional applications below.

As with our main applications, you may download the pretrained models here:

Path	Description
Toonify	pSp trained with the FFHQ dataset for toonification using StyleGAN generator from Doron Adler and Justin Pinkney.

Toonify

Using the toonify StyleGAN built by Doron Adler and Justin Pinkney, we take a real face image and generate a toonified version of the given image. We train the pSp encoder to directly reconstruct real face images inside the toons latent space resulting in a projection of each image to the closest toon. We do so without requiring any labeled pairs or distillation!

This is trained exactly like the StyleGAN inversion task with several changes:

Change from FFHQ StyleGAN to toonified StyleGAN (can be set using --stylegan_weights)
- The toonify generator is taken from Doron Adler and Justin Pinkney and converted to PyTorch using rosinality's conversion script.
- For convenience, the converted generator PyTorch model may be downloaded here.
Increase id_lambda from 0.1 to 1
Increase w_norm_lambda from 0.005 to 0.025

We obtain the best results after around 6000 iterations of training (can be set using --max_steps)
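
Put together, the changes above amount to a handful of flag overrides on top of the StyleGAN inversion command. The sketch below assembles such a command. The generator path is a placeholder, build_command is a hypothetical helper, and the resulting flag set is our reading of the list above rather than a command published by the authors.

```python
# Flags from the StyleGAN inversion command shown earlier (True marks a switch).
inversion_flags = {
    'dataset_type': 'ffhq_encode',
    'exp_dir': '/path/to/experiment',
    'workers': 8,
    'batch_size': 8,
    'test_batch_size': 8,
    'test_workers': 8,
    'val_interval': 2500,
    'save_interval': 5000,
    'encoder_type': 'GradualStyleEncoder',
    'start_from_latent_avg': True,
    'lpips_lambda': 0.8,
    'l2_lambda': 1,
    'id_lambda': 0.1,
}

# Changes listed for the toonify experiment.
toonify_overrides = {
    'stylegan_weights': '/path/to/toonify_generator.pt',  # placeholder path
    'id_lambda': 1,
    'w_norm_lambda': 0.025,
    'max_steps': 6000,
}


def build_command(flags):
    """Hypothetical helper: turn a flag dictionary into an argument list."""
    parts = ['python', 'scripts/train.py']
    for name, value in flags.items():
        if value is True:
            parts.append(f'--{name}')  # boolean switch, no value
        else:
            parts.append(f'--{name}={value}')
    return parts


command = build_command({**inversion_flags, **toonify_overrides})
print(' \\\n'.join(command))
```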

Repository structure

Path	Description
pixel2style2pixel	Repository root folder
├ configs	Folder containing configs defining model/data paths and data transforms
├ criteria	Folder containing various loss criteria for training
├ datasets	Folder with various dataset objects and augmentations
├ environment	Folder containing Anaconda environment used in our experiments
├ models	Folder containing all the models and training objects
│ ├ encoders	Folder containing our pSp encoder architecture implementation and ArcFace encoder implementation from TreB1eN
│ ├ mtcnn	MTCNN implementation from TreB1eN
│ ├ stylegan2	StyleGAN2 model from rosinality
│ └ psp.py	Implementation of our pSp framework
├ notebooks	Folder with Jupyter notebook containing pSp inference playground
├ options	Folder with training and test command-line options
├ scripts	Folder with running scripts for training and inference
├ training	Folder with main training logic and Ranger implementation from lessw2020
└ utils	Folder with various utility functions

TODOs

Add multi-gpu support

Credits

StyleGAN2 implementation:
https://github.com/rosinality/stylegan2-pytorch
Copyright (c) 2019 Kim Seonghyeon
License (MIT) https://github.com/rosinality/stylegan2-pytorch/blob/master/LICENSE

MTCNN, IR-SE50, and ArcFace models and implementations:
https://github.com/TreB1eN/InsightFace_Pytorch
Copyright (c) 2018 TreB1eN
License (MIT) https://github.com/TreB1eN/InsightFace_Pytorch/blob/master/LICENSE

CurricularFace model and implementation:
https://github.com/HuangYG123/CurricularFace
Copyright (c) 2020 HuangYG123
License (MIT) https://github.com/HuangYG123/CurricularFace/blob/master/LICENSE

Ranger optimizer implementation:
https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
License (Apache License 2.0) https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer/blob/master/LICENSE

LPIPS implementation:
https://github.com/S-aiueo32/lpips-pytorch
Copyright (c) 2020, Sou Uchida
License (BSD 2-Clause) https://github.com/S-aiueo32/lpips-pytorch/blob/master/LICENSE

Please Note: The CUDA files under the StyleGAN2 ops directory are made available under the Nvidia Source Code License-NC

Inspired by pSp

Below are several works inspired by pSp that we found particularly interesting:

Reverse Toonification
Using our pSp encoder, artist Nathan Shipley transformed animated figures and paintings into real life. Check out his amazing work on his Twitter page and website.

Deploying pSp with StyleSpace for Editing
Awesome work from Justin Pinkney who deployed our pSp model on Runway and provided support for editing the resulting inversions using the StyleSpace Analysis paper. Check out his repository here.

Encoder4Editing (e4e)
Building on the work of pSp, Tov et al. design an encoder to enable high quality edits on real images. Check out their paper and code.

Style-based Age Manipulation (SAM)
Leveraging pSp and the rich semantics of StyleGAN, SAM learns non-linear latent space paths for modeling the age transformation of real face images. Check out the project page here.

ReStyle
ReStyle builds on recent encoders such as pSp and e4e by introducing an iterative refinement mechanism to gradually improve the inversion of real images. Check out the project page here.

pSp in the Media

Citation

If you use this code for your research, please cite our paper Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation:

@InProceedings{richardson2021encoding,
      author = {Richardson, Elad and Alaluf, Yuval and Patashnik, Or and Nitzan, Yotam and Azar, Yaniv and Shapiro, Stav and Cohen-Or, Daniel},
      title = {Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation},
      booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      month = {June},
      year = {2021}
}

pixel2style2pixel's Issues

stylegan2 pytorch weights

That's an excellent result for image encoding. I want to know how to convert StyleGAN2 weights from the official TensorFlow implementation to PyTorch. Thanks!

can you please share an example in colab to train.

I have a set of images.
can you share an example in colab to train a set of images

using a different stylegan weights

Hi, thanks for the awesome work.

I'd like to train an encoder using a pre-trained 256x256 stylegan model (https://github.com/rosinality/stylegan2-pytorch) but I'm not sure using this model would be okay for your training code. Since in my case the resolution only goes up to 256, there should be 14x512 latent vectors which are less than your latent vector size, 18x512. My question is, can I still use 256x256 stylegan pre-trained model for this work?

Thanks in advance.

WNormLoss

Thanks for open-sourcing your code. I want to know how the WNormLoss works during training, and whether leaving it out will influence the results.

Training for targets other than human faces?

Is it possible to train psp for any domain or does it only work for human faces?

UnpicklingError: invalid load key, '<'.

Hi?

I encountered this error message
UnpicklingError: invalid load key, '<'.

after I run this cell
model_path = EXPERIMENT_ARGS['model_path']
ckpt = torch.load(model_path, map_location='cpu')

Can someone help me?

I'm trying this in colab notebook.

Encoding my own images

Hello, newbie here.
I opened your project in Colab but couldn't manage to encode my own images. I tried manually uploading my images to the "inversion_images" folder, but got some undesirable results. I believe the reason is that the images need to be aligned and resized before being fed to pSp. So, how can I achieve this?

No instructions on how to train an inpainting model

pSp Encoder, Frontalization, Sketch to Face, Segmentation Map to Face and Super Resolution do have instructions, but there are none for inpainting. Would it be possible to add instructions to the readme? And I assume there is no mask-related code, since I can't find any. How are masks supposed to be used, or does this need to be implemented?

Pretrained Model for Inpainting

Hi,

I was wondering if there is a checkpoint for the inpainting model.

Thank you very much !

Running inference on a CPU

Thank you for sharing this code.

In the README you say it might be possible to run this on a CPU. I am specifically interested in running inference on a CPU. Can you point out what needs to be changed in order to adapt inference to run on a CPU?

Also a side question on parameters tuning for training.
What parameters should I tune in order to improve the model's ability to include more details like facial marks (freckles, moles, wrinkles)? It seems my model, trained with the parameters below, is omitting these details.
--lpips_lambda=0.8
--l2_lambda=1
--id_lambda=0
--w_norm_lambda=0.005
--lpips_lambda_crop=0.8

Thanks again!

hi ,how do what and get good result in ffhq_encode

hi
can u give some idea , how to use
{'batch_size': '8',
'board_interval': 50,
'checkpoint_path': None,
'dataset_type': 'ffhq_encode',
'device': 'cuda:0',
'encoder_type': 'GradualStyleEncoder',
'exp_dir': '',
'id_lambda': 0.1,
'image_interval': 100,
'input_nc': 3,
'l2_lambda': 1.0,
'l2_lambda_crop': 0,
'label_nc': 0,
'learn_in_w': False,
'learning_rate': 0.0001,
'lpips_lambda': 0.8,
'lpips_lambda_crop': 0,
'max_steps': 300000,
'optim_name': 'ranger',
'resize_factors': None,
'save_interval': '10000',
'start_from_latent_avg': True,
'stylegan_weights': '',
'test_batch_size': '6',
'test_workers': 2,
'train_decoder': False,
'val_interval': 2500,
'w_norm_lambda': 0,
'workers': '8'}

Colab Error

Hi!
Thanks for your wonderful work.

But I found that an error occurs when I execute your Colab example, even though I did not touch any code.
Error :

can you please check it?

Why does our model generate an image surrounded by a block shadow?

The Source Domain data of stylegan2, the pre-training model we adopted, is FFHQ dataset, while the target Domain dataset is some cartoon images. Stylegan2, on the other hand, produces good cartoon images. When we use this Stylegan2 to train the PSP model, both the Source Domain and target Domain datasets of the PSP model are FFHQ data. Something weird happens, and all the generated images are surrounded by a shadow of a square. We don't know if you have encountered this problem or if you have any solutions?

RuntimeError: psp_ffhq_toonify.pt is a zip archive (did you mean to use torch.jit.load()?)

environments:
python 3.6.7
torch 1.3.1
torchvision 0.4.2
CUDA 10.1

Traceback (most recent call last):
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 189, in nti
n = int(s.strip() or "0", 8)
ValueError: invalid literal for int() with base 8: 'ightq\x04ct'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 2297, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 1093, in fromtarfile
obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 1035, in frombuf
chksum = nti(buf[148:156])
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 191, in nti
raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/serialization.py", line 595, in _load
return legacy_load(f)
File "/root/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/serialization.py", line 506, in legacy_load
with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar,
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 1589, in open
return func(name, filemode, fileobj, **kwargs)
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 1619, in taropen
return cls(name, mode, fileobj, **kwargs)
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 1482, in __init__
self.firstmember = self.next()
File "/root/anaconda3/envs/psp_env/lib/python3.6/tarfile.py", line 2309, in next
raise ReadError(str(e))
tarfile.ReadError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "scripts/inference.py", line 131, in <module>
run()
File "scripts/inference.py", line 43, in run
ckpt = torch.load(test_opts.checkpoint_path, map_location='cpu')
File "/root/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/serialization.py", line 426, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/root/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/serialization.py", line 599, in _load
raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
RuntimeError: psp_ffhq_toonify.pt is a zip archive (did you mean to use torch.jit.load()?)

Toonify model seems not work

I used the Inference_playground.ipynb notebook and tried the various experiment types with default settings. All the other experiment types work well except "toonify":
it seems that no matter which kind of image I use, the results all look similar, like this:

how to get each picture's latent code?

I set the 'return_latents=True' in psp.py , but get TypeError ：

【
Traceback (most recent call last):
File "/home/zhz1/anaconda3/envs/env_36/lib/python3.7/site-packages/PIL/Image.py", line 2751, in fromarray
mode, rawmode = _fromarray_typemap[typekey]
KeyError: ((1, 1, 1, 256), '|u1')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "scripts/inference.py", line 136, in <module>
run()
File "scripts/inference.py", line 82, in run
result = tensor2im(result_batch[i])
File "./utils/common.py", line 23, in tensor2im
return Image.fromarray(var.astype('uint8'))
File "/home/zhz1/anaconda3/envs/env_36/lib/python3.7/site-packages/PIL/Image.py", line 2753, in fromarray
raise TypeError("Cannot handle this data type: %s, %s" % typekey) from e
TypeError: Cannot handle this data type: (1, 1, 1, 256), |u1
】

could you please tell me how to output each picture's latent code?

How to get W+ space for the encoded images ?

I tried running inference on a set of images for encoding in the FFHQ latent space, and it gave me a set of projected images. But there is no .npy file associated with them for the W+ space, as cited in the paper, to do further image manipulations.

Is it possible to get the W+ space vectors ?

Does this support 1024*1024?

Hi, thanks again for the great work.
I'm trying to train a Toonify model on it. But I just noticed that the default resolution is 256*256 according to the provided transforms_config.py. Can I train this model at 1024*1024 resolution by simply changing the transforms_config.py settings?

Latent representation of the encoded image

Hello,
I am a newbie. How can I find the latent representation of the encoded image? I want to play around with it but I really don't know where to find it.

Running on Windows

Hi,
Is it possible to run the codes on Windows 10?

Inference.py error

Hello again

I've moved from working in the jupyter notebook to working via terminal in the conda environment. I have been using style_mixing.py with no problems and with good results. However, when trying to run inference.py (which worked fine in the jupyter notebook) I am getting the following error:

Apologies for asking again!

How can we just transfer image style rather than get similar image?

Hello, we used your Toonify parameter settings to train a pSp model and got good results, but all the transferred images are merely similar to the original images, like this:
original image:

image transferred using pSp:

but what we actually want is an image like this:

So, how can we transfer only the image style rather than get a similar but different image?

dependency install issue

Hello. I just ran the following command on Windows 10 64-bit Home edition:
conda env create -f C:\Users\User\Downloads\pixel2style2pixel\environment\psp_env.yaml

And I get the following error
ResolvePackageNotFound:

ncurses==6.2=he6710b0_1
python==3.6.7=h0371630_0
libstdcxx-ng==9.1.0=hdf63c60_0
readline==7.0=h7b6447c_5
sqlite==3.31.1=h62c20be_1
zlib==1.2.11=h7b6447c_3
libedit==3.1.20181209=hc058e9b_0
xz==5.2.5=h7b6447c_0
libffi==3.2.1=hd88cf55_4
ninja==1.10.0=hc9558a2_0
tk==8.6.8=hbc83047_0
libgcc-ng==9.1.0=hdf63c60_0
openssl==1.1.1g=h516909a_0

How would I solve this issue? Thank you for any reply.
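
A possible explanation: each failing entry pins an exact build string (the part after the second "="), and build strings are platform-specific, so builds made for Linux cannot be resolved on Windows. A minimal sketch of stripping the build string from such specs; strip_build is a hypothetical helper, and some of the listed packages (for example libgcc-ng and libstdcxx-ng) are, as far as I know, Linux-only, so they may need to be removed from the file entirely rather than just relaxed:

```python
def strip_build(spec):
    """Drop the build string from a conda spec, keeping name and version.

    Handles both "name==version=build" (as printed in the error) and
    "name=version=build" (as usually written in an environment file).
    """
    if "==" in spec:
        name, rest = spec.split("==", 1)
        version = rest.split("=", 1)[0]
        return name + "==" + version
    parts = spec.split("=")
    if len(parts) >= 2:
        return "=".join(parts[:2])
    return spec


for spec in ["zlib==1.2.11=h7b6447c_3", "python==3.6.7=h0371630_0"]:
    print(strip_build(spec))
```

Applying this to the dependency list in psp_env.yaml and re-running conda env create is one way to test whether the pinned builds are the problem.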

Question for toonification model

Hi, thanks again for the great work.

"We train the pSp encoder to directly reconstruct real face images inside the toons latent space resulting in a projection of each image to the closest toon."

Regarding the above statement, did you mean that your pSp encoder was trained with a dataset consisting only of FFHQ images, not Toonified images? If so, were the reconstructed images (Toonified by the generator) and the input images (FFHQ) used for calculating the losses?

Colab Sketch To Face not working as expected

I've been playing with the Google Colab notebook and everything seems to be working fine with the celebs_sketch_to_face model, but if I change the sketch image path to my own source image, the output is always the same default image and doesn't seem to bear any resemblance to the source sketch. Is this maybe a file naming issue? I've tried various combinations of file names, and JPG or PNG formats, and nothing seems to resolve it.

The network output is slightly different from our expectations

Hi,
We used about 50k pairs of pictures generated by the Toonify StyleGAN for paired training, and used the Toonify model provided in the readme as the initial weights.
After training, the results in the log are all very good. However, other results are not as expected.

For example, this image is produced from the latent code optimized by StyleGAN.

However, pixel2style2pixel gives this one.

More training results from pixel2style2pixel:

Downloading models

I get a permission error when downloading the pretrained models from the provided Google Drive links.

How to use generator to generate random image?

Great job! @eladrich.
I was wondering how to use your Generator implementation to generate some random images.
I found the forward code in models/stylegan2/model.py and it looks like:

    def forward(
            self,
            styles,
            return_latents=False,
            return_features=False,
            inject_index=None,
            truncation=1,
            truncation_latent=None,
            input_is_latent=False,
            noise=None,
            randomize_noise=True,
    ):
        if not input_is_latent:
            styles = [self.style(s) for s in styles]

        if noise is None:
            if randomize_noise:
                noise = [None] * self.num_layers
            else:
                noise = [
                    getattr(self.noises, f'noise_{i}') for i in range(self.num_layers)
                ]

I am a little confused about what the styles argument is for. Can I generate noise via make_noise() and then generate a random face image?

Best regards!
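
As far as I can tell from the quoted forward(), styles is a list of latent tensors: when input_is_latent is False, each entry is passed through the mapping network (self.style). make_noise(), by contrast, appears to build the per-layer spatial noise maps that go into the noise argument, not the style input. A random face would therefore start from a random z vector, optionally pulled toward a mean latent by truncation. A framework-free sketch of those two steps in plain Python; sample_z and truncate are illustrative names, and the formula is the standard truncation trick rather than code copied from the repository:

```python
import random


def sample_z(dim=512, seed=None):
    """Draw one latent vector z with independent standard-normal entries."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def truncate(w, w_avg, psi):
    """Truncation trick: move w toward the average latent by factor psi.

    psi = 1 leaves w unchanged; psi = 0 collapses it onto w_avg.
    """
    return [a + psi * (x - a) for x, a in zip(w, w_avg)]


z = sample_z(seed=0)
print(len(z))
print(truncate([2.0, 4.0], [0.0, 0.0], 0.5))
```

With the repository's Generator, the analogous call would presumably pass a list containing one z tensor as styles, plus truncation and truncation_latent if desired. The exact tensor shapes are an assumption to verify.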

CelebAMask-HQ

Hello authors, thanks for this amazing repo!

I know it's not really your problem, but the CelebAMask-HQ Dataset is not currently available from the original authors (their Google Drive download link is broken).

Since you have a copy of the dataset, and the dataset isn't very big (I think it's < 3GB), would you be able to put it on Google Drive or some other platform?

Minimum GPU requirements for 1024

Thanks for your great work !!!

I want to train the StyleGAN encoding model with the FFHQ 1024x1024 dataset. What are the minimum GPU requirements for training?

Thank you.

Using inference for own images problem

Hi @eladrich. Thanks for sharing.
I would like to try inference with my own source images, but maybe I am doing something wrong, or the script is not meant for this.
Am I right that the input folder is given by the parameter:
--data_path
and that it points to the folder where my input images are (can there be more than one image?)

and that the output folder is:
--exp_dir

I tried with input_img.jpg in the input folder:
python inference.py --checkpoint_path ../pretrained_models/psp_celebs_super_resolution.pt --exp_dir ../../out --data_path ../../in/ --resize_factors 2 --test_batch_size 2

Loading pSp from checkpoint: ../pretrained_models/psp_celebs_super_resolution.pt
Loading dataset for celebs_super_resolution
Performing down-sampling with factors: [2]
0it [00:00, ?it/s]
/.../anaconda3/envs/pixel2style2pixel/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
/.../anaconda3/envs/pixel2style2pixel/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
/.../anaconda3/envs/pixel2style2pixel/lib/python3.6/site-packages/numpy/core/_methods.py:217: RuntimeWarning: Degrees of freedom <= 0 for slice
keepdims=keepdims)
/.../anaconda3/envs/pixel2style2pixel/lib/python3.6/site-packages/numpy/core/_methods.py:186: RuntimeWarning: invalid value encountered in true_divide
arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
/.../anaconda3/envs/pixel2style2pixel/lib/python3.6/site-packages/numpy/core/_methods.py:209: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
Runtime nan+-nan

Thanks
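
The "0it" progress line and the "Mean of empty slice" warnings suggest the loop ran over zero batches, that is, no images were found under --data_path (the NaN runtime is then just the mean of an empty list). A quick way to check what a directory scan would see; find_images and the extension list are assumptions for illustration, not the repository's own loader:

```python
import os
import sys

# Assumed set of accepted extensions; the repository's list may differ.
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".tiff")


def find_images(root):
    """Recursively collect image files under root, matched by extension."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(IMG_EXTENSIONS):
                found.append(os.path.join(dirpath, name))
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_images.py ../../in/
    paths = find_images(sys.argv[1])
    print(len(paths), "image(s) found under", os.path.abspath(sys.argv[1]))
```

Running it from the same working directory as inference.py also shows how the relative path ../../in/ is actually resolved.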

About alignment

Thanks for your excellent work! I have read your code, and I see that you don't use any alignment when training the encoder with id_loss. So if I want to train on my own datasets, alignment should be done before training, right?

Have you tried multi-GPU training successfully?

class EncodeTransforms in transforms_config.py

In configs/transforms_config.py line 16, I was wondering why no transformation is applied in the transform_source part. Is this part deleted by some chance?

    def get_transforms(self):
        transforms_dict = {
            'transform_gt_train': transforms.Compose([
                transforms.Resize((256, 256)),
                transforms.RandomHorizontalFlip(0.5),
                transforms.ToTensor(),
                transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]),
            'transform_source': None,
            'transform_test': transforms.Compose([
                transforms.Resize((256, 256)),
                transforms.ToTensor(),
                transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]),
            'transform_inference': transforms.Compose([
                transforms.Resize((256, 256)),
                transforms.ToTensor(),
                transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
        }
        return transforms_dict

Licensing

I know it's licensed under MIT.

However, would you consider adding a license file (which can be parsed automatically) and a copyright notice? The MIT license is confusing without a copyright notice.

In addition, users of this might not immediately be aware of the restrictions of those files under the Nvidia Source Code License-NC: using this repo as if it was wholly MIT-licensed isn't safe.

It'd be nice to put a mention about these files in the readme.

I think there are other files with third-party rights associated with them.

In general, the licensing status of this project is quite unclear. Would you consider clarifying it a bit?

Conda environment error

Hello

I am receiving an error when running this via jupyter notebook. I have installed the specified torch/torchvision versions from the .yaml (along with everything else) to my conda environment and even uninstalled/reinstalled to check. Is this error something to do with the conflicting and 'incompatible' versions listed in the .yaml or have I missed out something when installing them into the conda environment? I am new to both linux and gan/ml work.

Cheers!

Latents Portability

I was interested to pick your brain regarding the portability of the generated latent representation, especially across models and frameworks.

I did extensive experiments with StyleGAN, mostly in its Tensorflow version, and have seen code like the rosinality one, meant to convert model checkpoints.

To train a new psp model I need to specify a previously trained Pytorch StyleGAN2 model, and then I can simply obtain projected/encoded latents from the psp model and use them in my original Pytorch one. However these latents don't seem to be portable to a Tensorflow model, say one I converted to Pytorch with a utility as mentioned above.

Did you do any experiments regarding this? Would you suggest simply moving my latent-editing setup to PyTorch, or do you have any other ideas?

how to test toonify

Thank you for your great work and for sharing the code.
Can you give a brief explanation of how to test Toonify?

data set used in this work?

How can I get the training data (frontal faces, sketches) to train the pSp encoder?

how to perform loss for Generating face from sketches

I assume the input is the sketch image (A) but the generated image (B) is a normal face. Is the LPIPS loss computed between A and B? That doesn't seem to make sense to me. Could you please clarify how the loss is computed?
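
My understanding of the setup (worth confirming against the training code) is that the pixel and perceptual losses compare the generated face B with the ground-truth face paired with the sketch, not with the sketch A itself; the sketch only serves as the encoder input. The individual terms are then combined as a weighted sum. A minimal sketch of that combination, where total_loss is an illustrative name and the default weights are simply the ones from the ffhq_encode training command quoted elsewhere on this page, so they may differ for the sketch-to-face task:

```python
def total_loss(l2, lpips, identity,
               l2_lambda=1.0, lpips_lambda=0.8, id_lambda=0.1):
    """Weighted sum of per-batch loss terms.

    Each term is assumed to be computed between the generated image and
    the ground-truth target image, never against the input sketch.
    """
    return l2_lambda * l2 + lpips_lambda * lpips + id_lambda * identity


# Made-up loss values, only to show the arithmetic.
print(total_loss(l2=0.5, lpips=0.25, identity=1.0))
```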

ninja: build stopped: subcommand failed.

Thanks for your great work! I have some problems during training the psp.
I got the error
RuntimeError: Error building extension 'fused': [1/2] :/usr/local/cuda-9.0:/usr/local/cuda-10.0/bin/nvcc -DTORCH_EXTENSION_NAME=fused -D
TORCH_API_INCLUDE_EXTENSION_H -isystem /hd1/lvyueming/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/include -isystem /hd1/lvy
ueming/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /hd1/lvyueming/anaconda3/envs/ps
p_env/lib/python3.6/site-packages/torch/include/TH -isystem /hd1/lvyueming/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/incl
ude/THC -isystem :/usr/local/cuda-9.0:/usr/local/cuda-10.0/include -isystem /hd1/lvyueming/anaconda3/envs/psp_env/include/python3.6m -D_
GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constex
pr -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++11 -c /home/lvyueming/pixel2style2pixel/models/stylegan2/op/fu
sed_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
:/usr/local/cuda-9.0:/usr/local/cuda-10.0/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /hd1/lvyueming/
anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/include -isystem /hd1/lvyueming/anaconda3/envs/psp_env/lib/python3.6/site-packa
ges/torch/include/torch/csrc/api/include -isystem /hd1/lvyueming/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/include/TH -is
ystem /hd1/lvyueming/anaconda3/envs/psp_env/lib/python3.6/site-packages/torch/include/THC -isystem :/usr/local/cuda-9.0:/usr/local/cuda-
10.0/include -isystem /hd1/lvyueming/anaconda3/envs/psp_env/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -
D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 --compiler-opti
ons '-fPIC' -std=c++11 -c /home/lvyueming/pixel2style2pixel/models/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
/bin/sh: :/usr/local/cuda-9.0:/usr/local/cuda-10.0/bin/nvcc: No such file or directory
ninja: build stopped: subcommand failed.

Could you help me fix it? Thank you very much!
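
The log shows nvcc being invoked as ":/usr/local/cuda-9.0:/usr/local/cuda-10.0/bin/nvcc", which suggests that the CUDA_HOME environment variable holds a colon-separated list instead of a single CUDA install directory. PyTorch's extension builder derives the nvcc path from that variable, so the resulting path does not exist. A small diagnostic sketch for Linux; check_cuda_home is a hypothetical helper:

```python
import os


def check_cuda_home(cuda_home):
    """Return a list of problems found with a CUDA_HOME value (empty if none)."""
    if not cuda_home:
        return ["CUDA_HOME is not set"]
    if ":" in cuda_home:
        return ["CUDA_HOME must be a single directory, not a ':'-separated list"]
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    if not os.path.isfile(nvcc):
        return ["nvcc not found at " + nvcc]
    return []


print(check_cuda_home(os.environ.get("CUDA_HOME", "")))
```

If the check reports a separator problem, exporting CUDA_HOME as one directory (for example the CUDA 10.0 path from the log) before starting training is the first thing to try.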

Comparison with stylegan2-distillation

Great work!

Just wondering, any comparison data with https://github.com/EvgenyKashin/stylegan2-distillation

Question about ffhq_encode

I tried to train a toon model with source/target images as below
Source

Target

Result

I trained it for about 6000 iterations, as mentioned in the thread, with the same settings. The images above are just examples; the training data had about 1000 images.
It does not really give the expected output; it actually changes the structure of the whole face. Can you give me a rough idea of what I could be doing wrong? And how do I preserve the input face while making the eyes a bit larger?

Pausing for testing during training

I am in the process of retraining the encoder and was wondering what the procedure would be for testing the checkpoints since, as expected, there isn't enough memory to train and test at the same time.

It's been training for roughly 2/3 days now. Would I pause training and test, or terminate training and test and then if desired restart training from the best checkpoint?

Are there flags for pausing etc?

Cheers!

File not compiling

I ran the model before and it compiled and worked fine, but the very next day it stopped working. The model no longer compiles: I have to interrupt the compilation again and again, and sometimes the GPU even gets lost. Is there any solution to this?

environment error?

Hi, thanks for your amazing work. I am trying to follow your work but am currently running into some problems.
Based on experience, I guess it is an environment problem.
My environment:

pytorch - 1.3.1
torchvision - 0.4.2
cuda - 10.1.0
cudnn - 7.6.3

I hope to get your help. Many thanks!

Traceback (most recent call last):
File "/home/dongshichao/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1030, in _build_extension_module
check=True)
File "/usr/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/pudb/__init__.py", line 153, in runscript
dbg.runscript(mainpyfile)
File "/usr/local/lib/python3.6/dist-packages/pudb/debugger.py", line 468, in runscript
self.run(statement)
File "/usr/lib/python3.6/bdb.py", line 434, in run
exec(cmd, globals, locals)
File "", line 1, in
File "dsc_demo.py", line 18, in
from models.psp import pSp
File "../models/psp.py", line 9, in
from models.encoders import psp_encoders
File "../models/encoders/psp_encoders.py", line 8, in
from models.stylegan2.model import EqualLinear
File "../models/stylegan2/model.py", line 7, in
from models.stylegan2.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d
File "../models/stylegan2/op/__init__.py", line 1, in
from .fused_act import FusedLeakyReLU, fused_leaky_relu
File "../models/stylegan2/op/fused_act.py", line 13, in
os.path.join(module_path, 'fused_bias_act_kernel.cu'),
File "/home/dongshichao/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 661, in load
is_python_module)
File "/home/dongshichao/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 830, in jit_compile
with_cuda=with_cuda)
File "/home/dongshichao/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 883, in write_ninja_file_and_build
build_extension_module(name, build_directory, verbose)
File "/home/dongshichao/.local/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1043, in build_extension_module
raise RuntimeError(message)
RuntimeError: Error building extension 'fused': [1/3] :/data/cuda/cuda-10.1/cuda/:/data/cuda/cuda-10.1/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/TH -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/THC -isystem :/data/cuda/cuda-10.1/cuda/:/data/cuda/cuda-10.1/cuda/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++11 -c /data/jupyter/pixel2style2pixel/models/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
:/data/cuda/cuda-10.1/cuda/:/data/cuda/cuda-10.1/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/TH -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/THC -isystem :/data/cuda/cuda-10.1/cuda/:/data/cuda/cuda-10.1/cuda/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -std=c++11 -c /data/jupyter/pixel2style2pixel/models/stylegan2/op/fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
/bin/sh: 1: :/data/cuda/cuda-10.1/cuda/:/data/cuda/cuda-10.1/cuda/bin/nvcc: not found
[2/3] c++ -MMD -MF fused_bias_act.o.d -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/TH -isystem /home/dongshichao/.local/lib/python3.6/site-packages/torch/include/THC -isystem :/data/cuda/cuda-10.1/cuda/:/data/cuda/cuda-10.1/cuda/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /data/jupyter/pixel2style2pixel/models/stylegan2/op/fused_bias_act.cpp -o fused_bias_act.o
ninja: build stopped: subcommand failed.

Is it possible to run inference on cpu?

I tried to make it work (with beginner-level knowledge), but I keep getting different errors.

Permission Denied

Hello, when I run your project on the server, I get a permission error as follows:
(py3.6) beryl@delta:~/github/pixel2style2pixel$ python scripts/train.py --dataset_type=ffhq_encode --exp_dir=./experiment --workers=8 --batch_size=8 --test_batch_size=8 --test_workers=8 --val_interval=2500 --save_interval=5000 --encoder_type=GradualStyleEncoder --start_from_latent_avg --lpips_lambda=0.8 --l2_lambda=1 --id_lambda=0.1
Traceback (most recent call last):
File "scripts/train.py", line 14, in <module>
from training.coach import Coach
File "./training/coach.py", line 19, in <module>
from models.psp import pSp
File "./models/psp.py", line 9, in <module>
from models.encoders import psp_encoders
File "./models/encoders/psp_encoders.py", line 8, in <module>
from models.stylegan2.model import EqualLinear
File "./models/stylegan2/model.py", line 7, in <module>
from models.stylegan2.op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d
File "./models/stylegan2/op/__init__.py", line 1, in <module>
from .fused_act import FusedLeakyReLU, fused_leaky_relu
File "./models/stylegan2/op/fused_act.py", line 13, in <module>
os.path.join(module_path, 'fused_bias_act_kernel.cu'),
File "/mnt/lab/beryl/anaconda3/envs/py3.6/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 898, in load
is_python_module)
File "/mnt/lab/beryl/anaconda3/envs/py3.6/lib/python3.6/site-packages/torch/utils/cpp_extension.py", line 1075, in _jit_compile
if baton.try_acquire():
File "/mnt/lab/beryl/anaconda3/envs/py3.6/lib/python3.6/site-packages/torch/utils/file_baton.py", line 36, in try_acquire
self.fd = os.open(self.lock_file_path, os.O_CREAT | os.O_EXCL)
PermissionError: [Errno 13] Permission denied: '/tmp/torch_extensions/fused/lock'

But when I use the sudo command, another error occurs:
(py3.6) beryl@delta:~/github/pixel2style2pixel$ sudo python scripts/train.py --dataset_type=ffhq_encode --exp_dir=./experiment --workers=8 --batch_size=8 --test_batch_size=8 --test_workers=8 --val_interval=2500 --save_interval=5000 --encoder_type=GradualStyleEncoder --start_from_latent_avg --lpips_lambda=0.8 --l2_lambda=1 --id_lambda=0.1
Traceback (most recent call last):
File "scripts/train.py", line 14, in <module>
from training.coach import Coach
File "./training/coach.py", line 3, in <module>
import matplotlib
ImportError: No module named matplotlib
This error suggests that the script is not running in the proper environment: I am using an Anaconda environment, and that environment has the matplotlib package installed. Can you provide me some help? Thanks!
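
The first traceback fails while creating a lock file under /tmp/torch_extensions, which suggests that this shared directory was created by another user on the server. PyTorch's extension loader reads the TORCH_EXTENSIONS_DIR environment variable, so pointing it at a user-writable directory is one way around the problem without sudo (sudo typically runs a different python than the one in the conda environment, which would explain the missing matplotlib). A minimal sketch; use_private_extensions_dir and the default location are illustrative choices:

```python
import os


def use_private_extensions_dir(base=None):
    """Point PyTorch's JIT extension build cache at a user-writable directory.

    Must run before the modules that compile the CUDA ops are imported.
    """
    if base is None:
        # Assumed default location; any directory you own works.
        base = os.path.join(os.path.expanduser("~"), ".cache", "torch_extensions")
    os.makedirs(base, exist_ok=True)
    os.environ["TORCH_EXTENSIONS_DIR"] = base
    return base
```

Calling it at the very top of scripts/train.py, or exporting the variable in the shell before launching training, should have the same effect.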

What's the functionality of applying EqualLinear in psp_encoders

For psp_encoders.py L148
x = x.view(-1, 512)
I would expect the latent to be taken from here, but it seems you apply an EqualLinear in the next step:
x = self.linear(x)
What is its purpose? I see that all the encoders apply EqualLinear.
Are there any experimental differences if this layer is removed? I couldn't find this in the paper, so I am posting here.
Looking forward to your clarification. I appreciate it!
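
For context on what the layer computes: as far as I know, EqualLinear in StyleGAN2 ports is an ordinary fully connected layer whose weights are rescaled at run time by 1/sqrt(in_dim) (the "equalized learning rate" trick). In the encoder it would therefore act as a learned linear map from the pooled 512-dimensional feature to a style vector. A framework-free sketch of that computation; EqualLinearSketch is an illustrative re-implementation under these assumptions, not the repository's class:

```python
import math
import random


class EqualLinearSketch:
    """Linear layer with run-time weight scaling (equalized learning rate)."""

    def __init__(self, in_dim, out_dim, lr_mul=1.0, seed=0):
        rng = random.Random(seed)
        # Weights are stored at unit variance (divided by lr_mul) ...
        self.weight = [[rng.gauss(0.0, 1.0) / lr_mul for _ in range(in_dim)]
                       for _ in range(out_dim)]
        self.bias = [0.0] * out_dim
        # ... and scaled on every forward pass instead of at initialization.
        self.scale = (1.0 / math.sqrt(in_dim)) * lr_mul
        self.lr_mul = lr_mul

    def __call__(self, x):
        return [sum(w * self.scale * xi for w, xi in zip(row, x)) + b * self.lr_mul
                for row, b in zip(self.weight, self.bias)]


layer = EqualLinearSketch(in_dim=4, out_dim=2)
print(layer([1.0, 0.0, 0.0, 0.0]))
```

Whether dropping the layer changes results is an empirical question this sketch cannot answer.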

How do I get the output of the encoder (W = dlatents, also called W+)?

The function run_on_batch() only inverts the image and generates the fake image, but it does not output the image's latent code.

If I want the dlatents (or W+) from the FFHQ model, how do I get them?


environments: python 3.6.7 torch 1.3.1 torchvision 0.4.2 CUDA 10.1
