yijunmaverick / universalstyletransfer

The source code of NIPS17 'Universal Style Transfer via Feature Transforms'.

License: MIT License
Hi, there are patchy areas that jump around between images in video transfer.
This isn't so much a temporal issue with the style transfer of features as little patches of luminance change that flicker around. Is this something to do with the Whiten-Color Transform? It doesn't happen when using the AdaIN method with your implementation.
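For context, a minimal sketch of what AdaIN matches, assuming C x (H*W) encoder features (names are illustrative, not the repo's API). WCT additionally matches the full channel covariance via an eigendecomposition, which plausibly makes it more sensitive to small frame-to-frame changes in feature statistics:

```python
import torch

def adain(cF, sF, eps=1e-5):
    # AdaIN matches only the per-channel mean and std of the style features.
    # WCT goes further and matches the full covariance, and that extra
    # eigendecomposition can react strongly to small input changes.
    cs = (cF - cF.mean(1, keepdim=True)) / (cF.std(1, keepdim=True) + eps)
    return cs * sF.std(1, keepdim=True) + sF.mean(1, keepdim=True)
```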
Is there a license for this software?
Hi,
I'm trying to get the whitened content images without applying any target style, as shown in the paper.
I'm working on the PyTorch version, and to keep things simple I removed the portion of the code that I thought applies the style transfer, returning whiten_cF instead of targetFeature:
```python
......
whiten_cF = torch.mm(step2, cF)
s_d = (s_e[0:k_s]).pow(0.5)
targetFeature = torch.mm(torch.mm(torch.mm(s_v[:, 0:k_s], torch.diag(s_d)), (s_v[:, 0:k_s].t())), whiten_cF)
targetFeature = targetFeature + s_mean.unsqueeze(1).expand_as(targetFeature)
return targetFeature
```
That should correspond, in your original Torch code, to removing these lines:

```lua
local swap_latent = swap:forward(whiten_contentFeature:resize(sg[1], sg[2], sg[3])):clone()
local swap_latent1 = swap_latent:view(sg[1], sg[2]*sg[3])
targetFeature = (s_v[{{},{1,k_s}}]:cuda()) * (torch.diag(s_d1:cuda())) * (s_v[{{},{1,k_s}}]:t():cuda()) * swap_latent1
```
This leaves something like the following as the last lines of the function:

```lua
whiten_contentFeature = (c_v[{{},{1,k_c}}]:cuda()) * torch.diag(c_d:cuda()) * (c_v[{{},{1,k_c}}]:t():cuda()) * contentFeature1
local Whiten_contentFeature = whiten_contentFeature:resize(sg[1], sg[2], sg[3])
return Whiten_contentFeature
```
I tried this modification in the PyTorch version, but the output image is not whitened; it comes out the same as the content image. Could you suggest how to fix it?
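For comparison, a minimal whiten-only sketch under the usual conventions of the PyTorch ports (cF is the flattened C x (H*W) encoder feature; the function name and eps handling are my own, not the repo's):

```python
import torch

def whiten(cF, eps=1e-5):
    # Center the feature map, then rotate/rescale it so its channel
    # covariance becomes (approximately) the identity matrix.
    C, HW = cF.size()
    c_mean = cF.mean(1, keepdim=True)
    cF = cF - c_mean
    cov = torch.mm(cF, cF.t()).div(HW - 1) + eps * torch.eye(C, device=cF.device)
    c_e, c_v = torch.linalg.eigh(cov)   # older ports use torch.svd/symeig
    c_d = c_e.clamp(min=eps).pow(-0.5)
    # Note: the content mean is intentionally not re-added here.
    return torch.mm(torch.mm(torch.mm(c_v, torch.diag(c_d)), c_v.t()), cF)
```

One thing worth checking: in the common PyTorch port, the transform's return value is blended with the content feature afterwards (something like csF = alpha * targetFeature + (1 - alpha) * cF), so if a small alpha or a later coloring step is still applied, the decoded image can end up looking like the content image.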
Thank you for your work. Can you provide the training code? I want to know some training details, such as learning rates, optimization methods and so on.
I used the torch.utils.serialization.load_lua Python package to load vgg_normalised_conv5_1.t7; the net is as follows:
```
In [7]: vgg1 = load_lua('models/vgg_normalised_conv1_1.t7')
In [8]: vgg1
Out[8]:
nn.Sequential {
  [input -> (0) -> (1) -> (2) -> (3) -> output]
  (0): nn.SpatialConvolution(3 -> 3, 1x1)
  (1): nn.SpatialReflectionPadding(1, 1, 1, 1)
  (2): nn.SpatialConvolution(3 -> 64, 3x3)
  (3): nn.ReLU
}
```
The encoder code is as follows:
```python
import torch
import torch.nn as nn

class encoder1(nn.Module):
    def __init__(self, vgg1):
        super(encoder1, self).__init__()
        # dissemble vgg1 layer by layer,
        # then reassemble a new encoder network
        # 224 x 224
        self.conv1 = nn.Conv2d(3, 3, 1, 1, 0)
        self.conv1.weight = torch.nn.Parameter(vgg1.get(0).weight.float())
        self.conv1.bias = torch.nn.Parameter(vgg1.get(0).bias.float())
        # 224 x 224
        self.reflecPad1 = nn.ReflectionPad2d((1, 1, 1, 1))
        # 226 x 226
        self.conv2 = nn.Conv2d(3, 64, 3, 1, 0)
        self.conv2.weight = torch.nn.Parameter(vgg1.get(2).weight.float())
        self.conv2.bias = torch.nn.Parameter(vgg1.get(2).bias.float())
        self.relu = nn.ReLU(inplace=True)
        # 224 x 224

    def forward(self, x):
        out = self.conv1(x)
        out = self.reflecPad1(out)
        out = self.conv2(out)
        out = self.relu(out)
        return out
```
My question is: in the original VGG16 there is no 1x1 convolution and no padding to 226x226, but this code appears to do both. Did I misunderstand the net?
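To check my understanding, here is a sketch that runs the encoder above on a dummy input (load_lua is the legacy loader and needs a pre-1.0 PyTorch). The 1x1 conv (3 -> 3) keeps the spatial size, and the reflection padding to 226x226 exactly offsets the following 3x3 convolution, so the output is back at 224x224. My (hedged) understanding is that the extra 1x1 layer is a fixed input-preprocessing step baked into the "normalised" VGG weights rather than part of the stock VGG architecture:

```python
import torch
from torch.utils.serialization import load_lua  # legacy loader (pre-1.0 PyTorch)

vgg1 = load_lua('models/vgg_normalised_conv1_1.t7')
enc = encoder1(vgg1)

x = torch.randn(1, 3, 224, 224)
print(enc(x).shape)  # torch.Size([1, 64, 224, 224]) -- spatial size preserved
```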
One more thing: as you have said, the decoder is not perfect, so the generated picture is not very precise; as for artistic quality, I think using 4 decoders is better than using all 5.
I think the bottom one looks better.
Looking forward to your reply.
Your algorithm has been implemented in C# for use on Windows. You can find it here: https://github.com/ColorfulSoft/Demos/tree/master/Style%20Transfer/2017.%20Universal%20Style%20Transfer%20via%20Feature%20Transforms
Am I right in assuming that the usage of multiple masks which leave part of the picture untransformed (as demonstrated in the [first picture](https://github.com/Yijunmaverick/UniversalStyleTransfer/blob/master/figs/p3.jpg)) is not implemented in the Lua script?
The command from the readme uses a binary mask with styles for fg/bg, not multiple alpha masks, right?

```
th test_wct_mask.lua -content YourContentPath -style YourStylePath1,YourStylePath2 -mask YourBinaryMaskPath
```
This is not a problem as it's fairly easy to edit afterwards, but I'm curious as to whether I'm missing something.
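For what it's worth, my mental model of the fg/bg case is a single binary blend of two stylized feature maps; a hypothetical sketch (wct, the names, and the shapes are placeholders, not the repo's actual API):

```python
import torch

def masked_blend(cF, s1F, s2F, mask, wct):
    # cF, s1F, s2F: C x H x W features; mask: H x W binary map resized to
    # the feature resolution; wct: the whiten-color transform to apply.
    m = mask.unsqueeze(0)  # broadcast the mask over the channel dimension
    return m * wct(cF, s1F) + (1 - m) * wct(cF, s2F)
```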
I want to test the speed of UniversalStyleTransfer, so I evaluated the image below on a Titan V, but a 256x256 evaluation takes about 2 seconds. How do you measure speed?
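In case the discrepancy is a measurement artifact: CUDA calls are asynchronous, so naive wall-clock timing can include model loading and first-run warm-up. A sketch of one way to time the GPU path (the helper is mine, not the repo's):

```python
import time
import torch

def time_gpu(fn, *args, warmup=3, iters=10):
    # Warm up first (cuDNN autotuning, allocator growth), then synchronize
    # around the timed loop so kernels actually finish before the clock stops.
    for _ in range(warmup):
        fn(*args)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(iters):
        fn(*args)
    torch.cuda.synchronize()
    return (time.time() - start) / iters
```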
Hi, is there a CPU-only mode flag available?
Is it possible to use a Gram matrix to transfer style instead of CVD?
I downloaded the encoders & decoders and replaced the decoders with these decoders (for those without cuDNN). But I actually have cuDNN 7 (cuDNN 5 is not supported with my CUDA version, 9.1).
Then I run

```
th test_wct.lua -content input/content/04.jpg -style input/style/09.jpg -alpha 0.6
```

and get the following error:
```
/home/apogentus/Programs/Torch/install/bin/luajit: ...ntus/Programs/Torch/install/share/lua/5.1/trepl/init.lua:389: ...ntus/Programs/Torch/install/share/lua/5.1/trepl/init.lua:389: ...entus/Programs/Torch/install/share/lua/5.1/cudnn/ffi.lua:1603: 'libcudnn (R5) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
Then make sure files named as libcudnn.so.5 or libcudnn.5.dylib are placed in
your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)
Alternatively, set the path to libcudnn.so.5 or libcudnn.5.dylib
to the environment variable CUDNN_PATH and rerun torch.
For example: export CUDNN_PATH="/usr/local/cuda/lib64/libcudnn.so.5"
stack traceback:
  [C]: in function 'error'
  ...ntus/Programs/Torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
  test_wct.lua:9: in main chunk
  [C]: in function 'dofile'
  ...rams/Torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
  [C]: at 0x55e65adf5570
```
What can I do to run the style transfer?
Hey!
I've tried running this on a single GPU with 4 GB of memory, but I get:

```
cuda runtime error (2) : out of memory at ~/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
```

Before I break open my PC to install more cards, do you have a rough estimate of the GPU memory requirements?
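As a back-of-envelope sanity check (my own estimate, not an official figure): the early-layer activations dominate, and for a roughly 768px content image a single float32 relu1_1 feature map is already sizeable, with the five-level encoder/decoder pipeline holding several such tensors at once:

```python
# Rough activation size for ONE relu1_1 feature map at ~768x768 (float32).
# The full pipeline (encoders, decoders, SVD workspace) keeps several copies.
C, H, W = 64, 768, 768
print(C * H * W * 4 / 2**20, "MiB")  # ~144 MiB
```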
I see you explained in the paper that the model can achieve multi-style transfer using masks. Does that mean that without masks the model is not suitable for multi-style transfer? Thank you in advance.
Is there any evaluation script for SSIM metrics?
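In case it helps, a minimal sketch of how one might compute SSIM with scikit-image (paths are placeholders; older skimage versions use multichannel=True instead of channel_axis):

```python
import imageio.v2 as imageio
from skimage.metrics import structural_similarity as ssim

# Compare a stylized result against its content image (must be the same size).
content = imageio.imread("input/content/04.jpg")
stylized = imageio.imread("output/04_stylized.jpg")
print(ssim(content, stylized, channel_axis=-1))
```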