hairclip's Issues

Hairstyles can only show so much

First of all, thanks for your excellent work!
There are many hairstyles in hairstyle_list.txt, but after trying all of them I found only a few distinct styles in the result images. Most results more or less repeat the following images.

  • cornrows cut hairstyle
    [image]

  • crew cut hairstyle
    [image]
    (the dots on the left lens of the glasses in the right image are just the mouse cursor)

The following is my command:

python scripts/inference.py \
--exp_dir=../result/test_1/ \
--checkpoint_path=../pretrained_models/hairclip.pt \
--latents_test_path=../inference_data/test_1/latent.pt \
--editing_type=hairstyle \
--input_type=text \
--hairstyle_description="hairstyle_list.txt"

What's the problem? Should I train with my own dataset?

I list some hairstyles that produce the same effect:

    1. The same as cornrows: crown braid hairstyle, dreadlocks hairstyle, finger waves hairstyle, french braid hairstyle, and so on.
    2. The same as crew cut hairstyle: caesar cut hairstyle, dido flip hairstyle, extensions hairstyle, fade hairstyle, fauxhawk hairstyle, frosted tops hairstyle, full crown hairstyle, harvard clip hairstyle, high and tight hairstyle, hime cut hairstyle, hi-top fade hairstyle, and so on.

Demo Play?

Hi. 🤗
This is awesome work. 👍
Thanks to all of you, the contributors. 🌹
I am wondering whether you have any plans to make a public demo available on Hugging Face Spaces or a similar platform. 🤔

About training details

Hi,
I am trying to re-implement your paper but cannot get good results on either the image or the text path.
So I would like to verify some implementation details:

  1. Below is my implementation of the Modulation Module inside the Mapper (in PyTorch):
import torch
import torch.nn as nn
import torch.nn.functional as F
from models.stylegan2.model import EqualLinear

class MapperBlock(nn.Module):
    def __init__(self, channels=512):
        super(MapperBlock, self).__init__()
        self.fc = EqualLinear(channels,channels)
        self.f_gamma = nn.Sequential(
            EqualLinear(channels,channels), nn.LayerNorm(channels), nn.LeakyReLU(0.2),
            EqualLinear(channels,channels)
        )
        self.f_beta = nn.Sequential(
            EqualLinear(channels,channels), nn.LayerNorm(channels), nn.LeakyReLU(0.2),
            EqualLinear(channels,channels)
        )
        self.act = nn.LeakyReLU(0.2)
    
    def modulation(self, x, e):
        gamma = self.f_gamma(e)
        beta = self.f_beta(e)

        # norm x
        x = F.layer_norm(x, (x.shape[-1],))
        
        # modulation
        return (1.0 + gamma) * x + beta

    def forward(self, x, e):
        x = self.fc(x)
        x = self.modulation(x, e)
        return self.act(x)

Is it correct? (A quick shape-check of this block is sketched after this list.)

  2. According to your paper, the reference condition is randomly set to an image or a text. My understanding is that the image/text manipulation loss is only calculated when the corresponding image/text reference is used, but then the range of the total loss varies across conditions. Do the loss weights stay the same in all conditions, or do they need to be adjusted per condition?

  3. In your paper you write: "we also generated several edited images using our text-guided hair editing method to augment the diversity of the reference image set." Could you elaborate on this in more detail, or point to another reference paper?
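For reference, here is a quick shape-check of the MapperBlock above (a hypothetical usage sketch, not the authors' code; it assumes the EqualLinear import from the snippet is available):

import torch

block = MapperBlock(channels=512)
x = torch.randn(4, 512)   # one 512-dim chunk of the w+ latent code
e = torch.randn(4, 512)   # CLIP text/image embedding used as the condition
out = block(x, e)
print(out.shape)          # expected: torch.Size([4, 512])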

Thanks for your help.

Expected hair color change only, but the hairstyle of some results also changes

I want to change only the hair color on FFHQ data; however, the hairstyle of some of the results changes as well.
Did I do something wrong?
The following is my command:

python scripts/inference.py \
--exp_dir=./experiment \
--checkpoint_path=../pretrained_models/hairclip.pt \
--latents_test_path=./latents.pt \
--editing_type=color \
--input_type=text \
--color_description=red

[image: 00001-0000-red hair]

Ask about train hairstyles

Hi,

When I train my own hairstyle model, do I need to convert the images under the --hairstyle_ref_img_train_path=/path/to/celeba_hq_train parameter into latents through the e4e algorithm, instead of passing them via --latents_train_path=/path/to/train_faces.pt?

About the training details.

Thank you for your great project!

In this paper, you said “We train and evaluate our hair mapper on the CelebA-HQ dataset. Since we use e4e [43] as our inversion encoder, we follow its division of the training set and test set.” However, I found that e4e used the FFHQ dataset for training and the CelebA-HQ test dataset for evaluation. Hence, I am confused.
My question is: how do you split the CelebA-HQ dataset into training and test sets?

local variable 'shape' referenced before assignment

I tested the feature on Replicate but noticed that for some photos it fails with local variable 'shape' referenced before assignment. Is there any way we can fix this?

File "predict.py", line 168, in run_alignment
aligned_image = align_face(filepath=image_path, predictor=predictor)
File "/src/encoder4editing/utils/alignment.py", line 35, in align_face
lm = get_landmark(filepath, predictor)
File "/src/encoder4editing/utils/alignment.py", line 21, in get_landmark
t = list(shape.parts())
UnboundLocalError: local variable 'shape' referenced before assignment
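This error usually means no face was detected in the photo, so 'shape' is never assigned inside get_landmark. A minimal sketch (an assumption based on the e4e alignment code, using dlib's standard API, not an official fix) of guarding against that case:

import dlib
import numpy as np

def get_landmark_safe(filepath, predictor):
    # Detect faces first and fail with a clear message if none are found,
    # instead of falling through to an unassigned 'shape'.
    detector = dlib.get_frontal_face_detector()
    img = dlib.load_rgb_image(filepath)
    dets = detector(img, 1)
    if len(dets) == 0:
        raise ValueError(f"No face detected in {filepath}; cannot align.")
    shape = predictor(img, dets[0])          # 68-point landmarks for the first face
    return np.array([[p.x, p.y] for p in shape.parts()])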

Can I use my own image test?

Hello, can I use my own image for the test? I found that the input is test_faces.pt (the test data set?), and I did not find any input image content in the code. The only thing that looks like an input image is w (w = torch.Size([1, 18, 512])), but that is not the size of a picture.
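For context, a minimal sketch (an assumption, not the repo's code): test_faces.pt appears to store a stack of e4e w+ latent codes, one [18, 512] code per image, rather than pixel data.

import torch

latents = torch.load('test_faces.pt')   # hypothetical path; expected shape [N, 18, 512]
print(latents.shape)

# Hypothetical: saving the latent of your own image after inverting it with e4e
my_latent = torch.randn(1, 18, 512)     # placeholder for a real e4e w+ code
torch.save(my_latent, 'my_latents.pt')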

About Video Hair Editing

Thank you for your great work! Do you think video hair editing based on HairCLIP is achievable? I gave it a small try, but the hairstyle region is still hard to control, and consistency of the hairstyle across frames is quite difficult to maintain. Can you give me some insights about video hairstyle editing?

about pretrained unet infer

mask_512 = (torch.unsqueeze(torch.max(labels_predict, 1)[1], 1) == 13).float()

1. Why does hair correspond to 13 and the background not?
2. The U-Net inference results have 19 channels; what do they mean?
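A minimal sketch of what that masking line computes (assuming labels_predict holds the 19 per-class logits from the parsing network and that index 13 is the hair class in its label set, as the line above implies):

import torch

labels_predict = torch.randn(1, 19, 512, 512)              # hypothetical parsing logits, one channel per class
class_map = torch.max(labels_predict, 1)[1].unsqueeze(1)   # [1, 1, 512, 512]: argmax class index per pixel
hair_mask = (class_map == 13).float()                      # 1.0 where the predicted class is hair, else 0.0
print(hair_mask.shape)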

How to preserve facial details better, like "Barbershop"?

Hi, thanks for your work.
I have found that the HairCLIP algorithm is not as good at preserving facial details as work like "Barbershop: Hair Transfer with GAN-Based Image Compositing Using Segmentation Masks". I tried "HFGI: High-Fidelity GAN Inversion for Image Attribute Editing (CVPR 2022)" as the latent encoder, but the effect is not very good.

Can you give some advice or methods on how to preserve the facial details? Very much looking forward to your answer, thank you!

  • HairCLIP demo
    left image: input image, right image: result
    [image: hairclip_demo]

  • HairCLIP demo using HFGI latents
    left image: input image, right image: result
    [image: hairclip_hfgi_demo]

  • Barbershop demo
    [image: babershop_demo]

add web demo/models to Huggingface

Hi, would you be interested in adding HairCLIP to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models/datasets/Spaces (web demos) can be added to a user account or organization, similar to GitHub.

Example from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook

Example spaces with repos:
github: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/salesforce/BLIP

github: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore

And here are guides for adding Spaces/models/datasets to your org:

How to add a Space: https://huggingface.co/blog/gradio-spaces
How to add models: https://huggingface.co/docs/hub/adding-a-model
How to upload a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

Please let us know if you would be interested; if you have any questions, we can also help with the technical implementation.

Getting an error at inference when transferring the hairstyle of a reference image onto the input image.

[image: e1]
I am getting an error when I try to run inference. I am using this command:

python scripts/inference.py \
--exp_dir=/content/resultss \
--editing_type=both \
--input_type=image_image \
--hairstyle_ref_img_test_path=/content/oriental1.png \
--color_ref_img_test_path=/content/oriental1.png \
--num_of_ref_img 1 \
--checkpoint_path=/content/drive/MyDrive/data/hairclip.pt \
--latents_test_path=/content/drive/MyDrive/data/latents.pt

What I am trying to do is transfer the hairstyle of the reference image to the input image. I have converted the input image with e4e to get its latent code. Please do let me know. Thanks.

code

Can you provide a script file to input a single picture for final prediction?

Error when testing with two images

Input command:

E:\Linux\XSpace\papers\HairCLIP\mapper>python scripts/inference.py --exp_dir=E:\Linux\XSpace\papers\HairCLIP\data\exp --checkpoint_path=F:\Dataset\CelebA\Data\hairclip.pt --latents_test_path=F:\Dataset\CelebA\Data\test_faces.pt --editing_type=color --input_type=image --hairstyle_description="hairstyle_list.txt" --color_ref_img_test_path=E:\Linux\XSpace\papers\HairCLIP\data\ref

The error occurred at x = clip_model.encode_image(masked_generated_renormed) in latent_mappers.py. The error message is as follows:

*** RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/multimodal/model/multimodal_transformer/___torch_mangle_9591.py", line 19, in encode_image
_0 = self.visual
input = torch.to(image, torch.device("cuda:0"), 5, False, False, None)
return (_0).forward(input, )
~~~~~~~~~~~ <--- HERE
def encode_text(self: torch.multimodal.model.multimodal_transformer.___torch_mangle_9591.Multimodal,
input: Tensor) -> Tensor:
File "code/torch/multimodal/model/multimodal_transformer.py", line 34, in forward
x2 = torch.add(x1, torch.to(_4, 5, False, False, None), alpha=1)
x3 = torch.permute((_3).forward(x2, ), [1, 0, 2])
x4 = torch.permute((_2).forward(x3, ), [1, 0, 2])
~~~~~~~~~~~ <--- HERE
_15 = torch.slice(x4, 0, 0, 9223372036854775807, 1)
x5 = torch.slice(torch.select(_15, 1, 0), 1, 0, 9223372036854775807, 1)
File "code/torch/multimodal/model/multimodal_transformer/___torch_mangle_9477.py", line 8, in forward
def forward(self: torch.multimodal.model.multimodal_transformer.___torch_mangle_9477.Transformer,
x: Tensor) -> Tensor:
return (self.resblocks).forward(x, )
~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
def forward1(self: torch.multimodal.model.multimodal_transformer.___torch_mangle_9477.Transformer,
x: Tensor) -> Tensor:
File "code/torch/torch/nn/modules/container/___torch_mangle_9476.py", line 29, in forward
_8 = getattr(self, "3")
_9 = getattr(self, "2")
_10 = (getattr(self, "1")).forward((getattr(self, "0")).forward(x, ), )
~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_11 = (_7).forward((_8).forward((_9).forward(_10, ), ), )
_12 = (_4).forward((_5).forward((_6).forward(_11, ), ), )
File "code/torch/multimodal/model/multimodal_transformer/___torch_mangle_9376.py", line 13, in forward
_0 = self.mlp
_1 = self.ln_2
_2 = (self.attn).forward((self.ln_1).forward(x, ), )
~~~~~~~~~~~~~~~~~~ <--- HERE
x0 = torch.add(x, _2, alpha=1)
x1 = torch.add(x0, (_0).forward((_1).forward(x0, ), ), alpha=1)
File "code/torch/torch/nn/modules/activation/___torch_mangle_9369.py", line 38, in forward
_16 = [-1, int(torch.mul(bsz, CONSTANTS.c0)), _8]
v0 = torch.transpose(torch.view(_15, _16), 0, 1)
attn_output_weights = torch.bmm(q2, torch.transpose(k0, 1, 2))
~~~~~~~~~ <--- HERE
input = torch.softmax(attn_output_weights, -1, None)
attn_output_weights0 = torch.dropout(input, 0., True)

Traceback of TorchScript, original code (most recent call last):
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py(4294): multi_head_attention_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/activation.py(985): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(45): attention
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(48): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py(117): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(63): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(93): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(221): visual_forward
/opt/conda/lib/python3.7/site-packages/torch/jit/_trace.py(940): trace_module
(36): export_torchscript_models
(3):
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3418): run_code
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3338): run_ast_nodes
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3147): run_cell_async
/opt/conda/lib/python3.7/site-packages/IPython/core/async_helpers.py(68): _pseudo_sync_runner
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(2923): _run_cell
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(2878): run_cell
/opt/conda/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py(555): interact
/opt/conda/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py(564): mainloop
/opt/conda/lib/python3.7/site-packages/IPython/terminal/ipapp.py(356): start
/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py(845): launch_instance
/opt/conda/lib/python3.7/site-packages/IPython/init.py(126): start_ipython
/opt/conda/bin/ipython(8):
RuntimeError: cublas runtime error : unknown error at C:/cb/pytorch_1000000000000/work/aten/src/THC/THCBlas.cu:225
(Pdb) img_tensor.shape
torch.Size([1, 3, 1024, 1024])

Is the size of the input tensor wrong?
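A minimal sketch related to the size question (an assumption, not the repo's code): standard CLIP image encoders expect 224x224 inputs, so a 1024x1024 generator output is typically downsampled before being passed to encode_image. The cublas error itself can also stem from GPU/driver issues, so this is only one thing to check.

import torch
import torch.nn.functional as F

img = torch.randn(1, 3, 1024, 1024)                 # generator output resolution
img_224 = F.interpolate(img, size=(224, 224),
                        mode='bilinear', align_corners=False)
print(img_224.shape)                                # torch.Size([1, 3, 224, 224])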

F and C

Hello, boss. I noticed that the network structure diagram in the paper may be drawn incorrectly: F should stand for fine, i.e., high-level semantic information, and C should stand for coarse, i.e., low-level semantic information.

Hosting HairCLIP model

Hi!

First off, thank you for your work!

I'm trying to create a Colab notebook to play with your model, but since the weights are hosted on Google Drive, the download limits seem to prevent me from simply downloading them with gdown or wget.

Could I download them and move them to another hosting service (e.g. archive.org) to avoid this issue? Of course, I would add references to all the authors and parties involved.

Again, thanks for your work!

Is this normal speed?

[image: WeChat screenshot 20220412182041]
Hello, I want to ask whether the speed of running inference.py for testing is normal. This is the command I executed:

cd mapper
python scripts/inference.py \
--exp_dir=/home/ps/HairCLIP/mapper/path/to/experiment \
--checkpoint_path=/home/ps/HairCLIP/pretrained_models/hairclip.pt \
--latents_test_path=/home/ps/HairCLIP/mapper/path/to/test_faces.pt \
--editing_type=hairstyle \
--input_type=text \
--hairstyle_description="/home/ps/HairCLIP/mapper/hairstyle_list.txt"
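If it helps to narrow down where the time goes, here is a minimal timing sketch (hypothetical, not part of the repo) for measuring the average latency of a single GPU forward pass, independent of data loading and image saving:

import time
import torch

def time_gpu(fn, warmup=3, iters=10):
    # Warm up first, then average the wall time over several calls,
    # synchronizing so the GPU work is actually finished before timing stops.
    for _ in range(warmup):
        fn()
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize()
    return (time.time() - start) / iters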

about color_ref_img_in_domain_path

Hello, thanks for your excellent work. I have a question about color_ref_img_in_domain_path. I finished pre-training with the arguments --hairstyle_manipulation_prob=0 --color_manipulation_prob=1 --both_manipulation_prob=0 --hairstyle_text_manipulation_prob=0.5 --color_text_manipulation_prob=1. How should I set color_ref_img_in_domain_path? Should that path be logs/image_train? I got the error below, and I don't know where to find these files. Looking forward to your reply.

The error is:
FileNotFoundError: [Errno 2] No such file or directory: '/home/code/HairCLIP/logs/images_train/red hair/02951.jpg'

Question about the dataset split (train.pt and test.pt)

@wty-ustc Thank you for the amazing work!
I tried to split CelebA-HQ using the official list_eval_partition.txt. Eventually, I got 24183/2993/2824 images for the training/validation/testing split, but I found that the length of train.pt is 24176, so I'm confused about which data you used.

The generated image is quite different from the reference image

I tested the effect and found that the hairstyle of the generated image is quite different from that of the reference image. Here is my test script. The reference image is selected from the CelebAMask-HQ dataset. Is there a problem with my test process?

python scripts/inference.py \
--exp_dir=../outputs/0321/ \
--checkpoint_path=../pretrained_models/hairclip.pt \
--latents_test_path=../pretrained_models/test_faces.pt \
--editing_type=both \
--input_type=image_image \
--color_ref_img_test_path=../input/16 \
--hairstyle_ref_img_test_path=../input/16 \
--num_of_ref_img 1

[image]

what is ACD?

In your paper you mention using ACD as a measure of color differences. Where does this indicator come from? Is there any code we can use?

Error while training the model on my dataset.

[image: s3]

This is the command I am using.
%%shell
eval "$(conda shell.bash hook)"
conda activate myenviroment
python scripts/train.py \
--exp_dir=/content/outss \
--hairstyle_description="hairstyle_list.txt" \
--color_description=black,brown,yellow \
--checkpoint_path=/content/drive/MyDrive/data/hairclip.pt \
--ir_se50_weights=/content/drive/MyDrive/data/model_ir_se50.pth \
--latents_train_path=/content/drive/MyDrive/data/trainlatent/latents.pt \
--latents_test_path=/content/drive/MyDrive/data/testlatent/latents1.pt \
--hairstyle_ref_img_train_path=/content/inversions \
--hairstyle_ref_img_test_path=/content/test/inversions \
--color_ref_img_train_path=/content/inversions \
--color_ref_img_test_path=/content/test/inversions \
--color_ref_img_in_domain_path=/content/inversions \
--hairstyle_manipulation_prob=0.5 \
--color_manipulation_prob=0.2 \
--both_manipulation_prob=0.27 \
--hairstyle_text_manipulation_prob=0.5 \
--color_text_manipulation_prob=0 \
--color_in_domain_ref_manipulation_prob=0.25

Using images to edit hairstyle and color does not work

Using the pre-trained model you provided, I edit the hairstyle with text and the hair color with an image, but the hair color editing did not work. Do I have to retrain a new model myself? And how do I obtain the model specified by the test parameter "--parsenet_weights"?

How to run predict.py

Hi, thank you for your work.

What does this statement in line 11 mean: from cog import BasePredictor, Path, Input?
I get a red wavy line (unresolved import) under it in my editor.

Will stylegan inversion encoder be trained?

Will the StyleGAN inversion encoder be trained? I found that the CLIP image encoder and CLIP text encoder use detach() to keep them untrained. I look forward to your answer. Thank you!
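For context, a minimal sketch (generic PyTorch, not the repo's code) of the two common ways a sub-network is kept frozen during training: detaching its outputs and disabling gradients on its parameters.

import torch
import torch.nn as nn

encoder = nn.Linear(8, 8)              # stand-in for a frozen encoder (e.g. CLIP)
for p in encoder.parameters():
    p.requires_grad_(False)            # the encoder's weights are never updated

mapper = nn.Linear(8, 8)               # trainable module downstream of the frozen encoder
x = torch.randn(2, 8)
feat = encoder(x).detach()             # detach() additionally cuts the graph at the encoder output
loss = mapper(feat).sum()
loss.backward()                        # gradients reach mapper only, not encoder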

About modulation module

Hi,
Great work!
But I have a question about the modulation module of the mapper network.
I assume the dimensions of x and e are 1x1xC.
If so, over what are the mean and std of x computed: a channel-wise average?
And what are the output dimensions of f_gamma(e) and f_beta(e)?

Thanks.
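For what it's worth, a minimal shape sketch of the modulation described above (an assumption that x and e are 512-dim vectors per mapper chunk, not the authors' code):

import torch
import torch.nn.functional as F

x = torch.randn(4, 512)               # latent code chunk
e = torch.randn(4, 512)               # CLIP text/image embedding
gamma = torch.randn(4, 512)           # stands in for f_gamma(e), same shape as x
beta = torch.randn(4, 512)            # stands in for f_beta(e), same shape as x
x_norm = F.layer_norm(x, (512,))      # mean/std taken over the channel dimension
out = (1.0 + gamma) * x_norm + beta   # output keeps the [4, 512] shape
print(out.shape)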
