
Comments (22)

tadax commented on August 16, 2024

@ruiann I fixed the bugs and trained it again. I ran it on a GeForce GTX 1070 with CUDA 8 / TensorFlow 1.1 (Ubuntu 16.04). Running 20 epochs takes about 100 minutes.


ruiann commented on August 16, 2024

Or, to put it another way: since VGG19 contains 16 convolutional layers and 5 max-pooling layers, does the paper intend to use the intermediate convolutional feature maps to compute the content loss? So you need to feed the input image through VGG19 and extract those intermediate features for the comparison?


ruiann commented on August 16, 2024

Also, I can see that your VGG19 doesn't exactly follow the conv net configurations given in

VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION

Could you please explain the reason?


tadax commented on August 16, 2024

You are right. SRGAN uses the VGG19 to compute the content loss. It feeds both the real and the generated (fake) images into the VGG19 network and compares the feature maps obtained within it.

Also, I can see that your VGG19 doesn't exactly follow the conv net configurations given in
VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION
Could you please explain the reason?

I'm sorry, but that's my mistake. I built the conv network only roughly; you should build the VGG19 exactly.
But I suppose it doesn't matter much, since the conv network is used just for obtaining the feature maps.


ruiann commented on August 16, 2024

It looks like you take all the feature maps generated by every layer of your 6-layer VGG19. But the paper says to use the feature map obtained by the j-th convolution (after activation) before the i-th max-pooling layer within the VGG19 network, so I suppose you may have made a mistake in generating phi, or perhaps I'm misunderstanding what phi means. In fact, I don't really understand the meaning of i and j in the definition of phi in that paper, but the paper says you can choose different i and j, for example phi_2_2 and phi_5_4, so I still wonder what phi_i_j means.


tadax commented on August 16, 2024

As you said, I get all the feature maps generated by every layer (i.e. phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4 within the VGG19 network).
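For reference, the paper defines the VGG loss as

    l^{SR}_{VGG/i.j} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \Big( \phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}\big(G_{\theta_G}(I^{LR})\big)_{x,y} \Big)^2

where phi_{i,j} is the feature map obtained by the j-th convolution (after activation) before the i-th max-pooling layer, and W_{i,j}, H_{i,j} are its spatial dimensions.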
I also tried training with only one feature map, phi_5_4, but it did not go well.
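A rough sketch of how the multi-map version combines them (illustrative names, not the exact code; vgg19() is assumed to return the five activations listed above):

    # content loss accumulated over several VGG19 feature maps
    real_phis = vgg19(real_images)   # [phi_1_2, phi_2_2, phi_3_4, phi_4_4, phi_5_4]
    fake_phis = vgg19(fake_images)
    content_loss = tf.add_n([tf.reduce_mean(tf.square(r - f))
                             for r, f in zip(real_phis, fake_phis)])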


ruiann commented on August 16, 2024

Yeah, I finally understand what phi_i_j means.
Why doesn't it work? Also, you could use a pretrained VGG19 model instead, like this; I think it might work, and then there would be no need to retrain the VGG19.


tadax commented on August 16, 2024

I tried training the VGG19 and SRGAN again, but it doesn't work.
The output is blurry (as if merely upscaled) and the edges have no color.

[image 001: with 1 feature map (phi_5_4)]

[image 002: with 5 feature maps (phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4)]


ruiann commented on August 16, 2024

I guess that for small pictures you'd be better off using phi_2_2, since phi_5_4, which describes the global feature map, is a very small tensor for low-resolution pictures.

Just a guess.
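(Concretely: with the 96x96 crops used elsewhere in this thread, phi_5_4 is only 6x6 spatially after four pooling stages, while phi_2_2 is still 48x48.)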


ruiann commented on August 16, 2024

It looks like something is wrong with your loss function. Why do you use an L2 loss? The content loss and the generator loss seem to differ a lot from the paper.


tadax commented on August 16, 2024

This implementation adopts the least-squares loss function instead of the sigmoid cross-entropy loss function for the discriminator.
cf. Least Squares Generative Adversarial Networks

The results don't seem bad, but I haven't evaluated them yet.


ruiann commented on August 16, 2024

I haven't read that paper, but I think something may be wrong with your d_loss.

d_loss should push true_output toward 1 and fake_output toward 0, so I can't understand what you mean by

d_loss_fake = tf.reduce_mean(tf.nn.l2_loss(fake_output + tf.ones_like(fake_output)))

I think for least squares, it would be

d_loss_fake = tf.reduce_mean(tf.nn.l2_loss(tf.zeros_like(fake_output) - fake_output))

I use

alpha = 1e-3
g_loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=fake_output, labels=tf.ones_like(fake_output))
d_loss_true = tf.nn.sigmoid_cross_entropy_with_logits(logits=true_output, labels=tf.ones_like(true_output))
d_loss_fake = tf.nn.sigmoid_cross_entropy_with_logits(logits=fake_output, labels=tf.zeros_like(fake_output))
d_loss = (d_loss_true + d_loss_fake) / 2

Since the output tensor of shape [batch_size] hasn't gone through a sigmoid, I think this should work. But maybe I'm wrong, as I haven't read that paper.
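For comparison, my understanding of the least-squares (LSGAN) formulation is something like this sketch (not the actual code of this repo; true_output and fake_output are raw logits of shape [batch_size]):

    # least-squares GAN losses (cf. "Least Squares Generative Adversarial Networks")
    d_loss_true = 0.5 * tf.reduce_mean(tf.square(true_output - 1.0))  # push D(real) toward 1
    d_loss_fake = 0.5 * tf.reduce_mean(tf.square(fake_output))        # push D(fake) toward 0
    d_loss = d_loss_true + d_loss_fake
    g_loss = 0.5 * tf.reduce_mean(tf.square(fake_output - 1.0))       # G wants D(fake) near 1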


tadax commented on August 16, 2024

Thank you. I've made a mistake.
I'll deal with it within a few days.


ruiann commented on August 16, 2024

@tadax can you tell me the range of your content loss? I tried using the pretrained VGG19 linked in the posts above, and the content loss can reach up to 1e6 and completely dominates the gradients.


tadax commented on August 16, 2024

The content loss should be less than 1e4 in the beginning.

It may be a good idea to use batch normalization in the VGG19 network.
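For what it's worth, the SRGAN paper rescales the VGG feature maps by a factor of 1/12.75 (equivalent to weighting the VGG loss by about 0.006), which keeps the content loss on a scale comparable to the MSE loss. A minimal sketch, assuming phi_real and phi_fake are the extracted feature maps:

    # rescale VGG feature maps before the MSE, as in the SRGAN paper (factor 1/12.75)
    content_loss = tf.reduce_mean(tf.square(phi_real / 12.75 - phi_fake / 12.75))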


ruiann commented on August 16, 2024

Thanks. Also, can you tell me what hardware you trained on and how long it took to converge? My training doesn't seem to be converging; I've run 5 epochs but can't get a good result.


DunguLock commented on August 16, 2024

My result is not good either; there are some bad patches in the results, just like the images posted on the author's page.


joydeepdas commented on August 16, 2024

Can you please tell me how you decided the alpha multiplier in the loss functions? Also, the losses don't seem to decrease during training. Can you give an estimate of how many epochs to run to get a good, low generator loss?


tadax commented on August 16, 2024

Unfortunately, it's a rule of thumb, and I haven't decided whether to apply learning rate decay with Adam.
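If you do want decay, a common TensorFlow pattern is something like the following sketch (hypothetical hyperparameters, not what this repo uses):

    # exponential learning-rate decay with Adam (illustrative values)
    global_step = tf.Variable(0, trainable=False)
    lr = tf.train.exponential_decay(1e-4, global_step,
                                    decay_steps=100000, decay_rate=0.1, staircase=True)
    train_op = tf.train.AdamOptimizer(lr).minimize(loss, global_step=global_step)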


guker commented on August 16, 2024

I notice that you haven't fixed the mistake that @ruiann mentioned.


guker commented on August 16, 2024

I tried training the VGG19 and SRGAN again, but it doesn't work.
The output is blurry (as if merely upscaled) and the edges have no color.

[image 001: with 1 feature map (phi_5_4)]

[image 002: with 5 feature maps (phi_1_2, phi_2_2, phi_3_4, phi_4_4, and phi_5_4)]


I think this code has an issue related to color:

    def save_img(imgs, label, epoch):
        for i in range(batch_size):
            fig = plt.figure()
            for j, img in enumerate(imgs):
                im = np.uint8((img[i] + 1) * 127.5)  # pixel values here can exceed 255 and overflow
                im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)

Using a tanh activation on the generator output, together with preprocessing that normalizes the inputs to [-1, 1], solves the color problem:

    with tf.variable_scope('deconv5'):
        x = deconv_layer(x, [3, 3, 3, 16], [self.batch_size, 96, 96, 3], 1)
        x = tf.nn.tanh(x)
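For example, assuming the input images are loaded as uint8 (the division and the clip at save time are my assumptions here, not the repo's exact code):

    # normalize uint8 images to [-1, 1] to match the tanh output range
    x = tf.cast(x, tf.float32) / 127.5 - 1.0
    # at save time, clip before casting back to uint8 to avoid overflow
    im = np.uint8(np.clip((img[i] + 1) * 127.5, 0, 255))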

Also, I found a mistake and modified it as follows:

    def inference_adversarial_loss_with_sigmoid(real_output, fake_output):
        alpha = 1e-3
        g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.ones_like(fake_output), logits=fake_output))
        d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.ones_like(real_output), logits=real_output))
        d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
            labels=tf.zeros_like(fake_output), logits=fake_output))
        d_loss = d_loss_real + d_loss_fake
        return (g_loss * alpha, d_loss * alpha)


wqz960 commented on August 16, 2024

Hi @tadax @guker, after downloading the pretrained VGG19 model, where should I put it and how do I load it to restore the VGG19? Thank you!

