Comments (44)

imlixinyang commented on June 8, 2024

@oldrive I don't know if this is the reason. In my experiments, the real images are also resized to a specific resolution first and saved in a folder, just as @HyZhu39 did:

I resized and saved the images that I calculated FID against, as "easy_use.py" does:
transform = transforms.Compose([transforms.Resize(image_size), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
x = transform(Image.open('image_save_path here').convert('RGB')).unsqueeze(0)
vutils.save_image(((x + 1) / 2), save_path, padding=0)
Could you have a try?
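For reference, here is a self-contained version of that preprocessing (a minimal sketch; the folder paths and image_size are placeholders to adapt):

import os
from PIL import Image
import torchvision.transforms as transforms
import torchvision.utils as vutils

image_size = 128                      # resolution used for FID
real_dir = 'real_images'              # placeholder: folder of original real images
resized_dir = 'real_images_resized'   # placeholder: output folder
os.makedirs(resized_dir, exist_ok=True)

transform = transforms.Compose([
    transforms.Resize(image_size),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

for name in os.listdir(real_dir):
    x = transform(Image.open(os.path.join(real_dir, name)).convert('RGB')).unsqueeze(0)
    # map from [-1, 1] back to [0, 1] before saving, as easy_use.py does
    vutils.save_image((x + 1) / 2, os.path.join(resized_dir, name), padding=0)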

imlixinyang commented on June 8, 2024

There are indeed some differences between the cleaned code and the original code, but I do think they make things better rather than worse.
Sorry for that, and I will try my best to help you reproduce the quantitative results.
I will respond to you tomorrow, please wait.

HyZhu39 commented on June 8, 2024

There are indeed some differences between the cleaned code and the original code, but I do think they make things better rather than worse.
Sorry for that, and I will try my best to help you reproduce the quantitative results.
I will respond to you tomorrow, please wait.

Thanks for your attention and your quick reply; I look forward to your response!

imlixinyang commented on June 8, 2024

@HyZhu39 Hello, how did you get the FID between the images generated by the 5 style codes and the real images?
The generated images for the 5 style codes should be put into 5 separate folders, and the average FID calculated between each of them and the real images. Each folder has the same number of images as the original source images.
For disentanglement in our experiments, the reference-guided style codes are randomly sampled from all images with bangs.
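In code, the protocol could look like this (a sketch using StarGANv2's calculate_fid_given_paths, the FID script we use; the folder names are placeholders):

from metrics.fid import calculate_fid_given_paths  # StarGANv2's metrics/fid.py

real_dir = 'real_with_bangs'                                   # placeholder: folder of real images
fake_dirs = ['fake_with_bangs_style%d' % i for i in range(5)]  # one folder per style code

fids = [calculate_fid_given_paths([real_dir, fake_dir], img_size=128, batch_size=32)
        for fake_dir in fake_dirs]
print('average FID:', sum(fids) / len(fids))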

HyZhu39 commented on June 8, 2024

@HyZhu39 Hello, how did you get the FID between the images generated by the 5 style codes and the real images?
The generated images for the 5 style codes should be put into 5 separate folders, and the average FID calculated between each of them and the real images. Each folder has the same number of images as the original source images.
For disentanglement in our experiments, the reference-guided style codes are randomly sampled from all images with bangs.

Actually, I did put them all in one folder and calculated the FID between the two folders as the result, and for the disentanglement experiments I just selected reference images from the test images with bangs. Thanks for pointing that out; I'll try it as you said and report the results.
I think what you said is actually the point. Thanks again.

imlixinyang commented on June 8, 2024

You're welcome. Since the same identities appear multiple times in one folder, the FID (which uses the variance of the image features) will definitely get bigger.

HyZhu39 commented on June 8, 2024

You're welcome. Since the same identities appear multiple times in one folder, the FID (which uses the variance of the image features) will definitely get bigger.

Sorry for bothering you again. I put the generated images into separate folders according to the style code they used and tested again, but it seems the results got worse... which is weird, I think.
I ran two groups of experiments with the self-trained model from my first comment.

experiment 1:
realism:
(input images: all images with attribute "without_bangs" in the test images (first 3000 images), translated to "with_bangs";
reference images: 5 images with attribute "with_bangs" randomly sampled from all images;
FID calculated against: all images with attribute "with_bangs" in the test images, resized to 128×128)
L: R: G:
0: 26.45 26.59
1: 26.47 26.64
2: 26.44 27.04
3: 26.84 28.99
4: 25.90 26.38
average: 26.42 27.13 0.71
(randomly chosen reference images: 5645.jpg, 6245.jpg, 13652.jpg, 14380.jpg, 27363.jpg)

disentanglement:
(input images: all images with attributes "without_bangs", "young", "male" in the test images, translated to "with_bangs";
reference images: 5 images with attribute "with_bangs" randomly sampled from all images;
FID calculated against: all images with attributes "with_bangs", "young", "male" in the test images, resized to 128×128)
L: R: G:
0: 88.79 87.49
1: 88.28 85.61
2: 87.23 92.51
3: 89.40 86.07
4: 88.30 88.11
average: 88.40 87.96 0.44
(randomly chosen reference images: 426.jpg, 19849.jpg, 22869.jpg, 26513.jpg, 28732.jpg)

experiment 2:
realism: same settings as experiment 1;
L: R: G:
0: 27.53 26.78
1: 32.38 26.40
2: 25.72 31.98
3: 28.18 27.48
4: 26.58 27.02
average: 28.08 27.93 0.17
(randomly chosen reference images: 5645.jpg, 6245.jpg, 13652.jpg, 14380.jpg, 27363.jpg)

disentanglement: same settings as experiment 1;
L: R: G:
0: 86.59 86.61
1: 89.13 90.18
2: 85.41 94.21
3: 89.02 87.94
4: 86.36 91.12
average: 87.30 90.01 2.71
(randomly chosen reference images: 923.jpg, 1232.jpg, 12886.jpg, 24491.jpg, 26797.jpg)

I resized and saved the images that I calculated FID against, as "easy_use.py" does:
transform = transforms.Compose([transforms.Resize(image_size), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
x = transform(Image.open('image_save_path here').convert('RGB')).unsqueeze(0)
vutils.save_image(((x + 1) / 2), save_path, padding=0)
By the way, I trained the model on a single GTX 1080 Ti 11 GB GPU for 200,000 iterations, following the config file celeba-hq.yaml.

imlixinyang commented on June 8, 2024

Actually, you need to randomly sample the reference images for each source image. If you sample only one reference image to translate all the source images into 'with_bangs', the bangs in the translated folder will all be the same, right?
So the process should look like this:

for i in range(5):
    for x in source_images:                  # every source image is translated
        y = random.choice(reference_images)  # sample a new reference for each source
        translate(x, y)                      # translate x using y as the reference
    # calculate FID between this run's outputs and the real images
# average the 5 FIDs

So the problem may be that you sampled the reference image y before the loop over the source images.

HyZhu39 commented on June 8, 2024

Actually, you need to randomly sample the reference images for each source image. If you sample only one reference image to translate all the source images into 'with_bangs', the bangs in the translated folder will all be the same, right?
So the process should look like this:

for i in range(5):
    for x in source_images:                  # every source image is translated
        y = random.choice(reference_images)  # sample a new reference for each source
        translate(x, y)                      # translate x using y as the reference
    # calculate FID between this run's outputs and the real images
# average the 5 FIDs

So the problem may be that you sampled the reference image y before the loop over the source images.

Thank you very much for your patience and help; I will try again as soon as possible and give you feedback.

HyZhu39 commented on June 8, 2024

Sorry for the mistakes I made and my misunderstanding of your experiment settings; I think I truly understand them now. I randomly sampled the reference images for each source image, following your proposed logic.
Then I redid the experiments as you said, and got relatively more stable results than before:

realism:
L: R: G:
group 1:
0: 25.70 25.66
1: 25.60 25.53
2: 25.48 25.56
3: 25.69 25.58
4: 25.53 25.74
avg: 25.60 25.61 0.01
group 2:
0: 25.61 25.60
1: 25.53 25.61
2: 25.60 25.55
3: 25.65 25.66
4: 25.60 25.63
avg: 25.60 25.61 0.01

disentanglement:
L: R: G:
group 1:
0: 85.71 84.91
1: 86.57 84.96
2: 86.14 85.51
3: 85.50 85.61
4: 86.56 85.51
avg: 86.10 85.30 0.80
group 2:
0: 85.89 85.87
1: 86.41 85.13
2: 86.11 85.91
3: 85.88 84.57
4: 87.12 85.80
avg: 86.28 85.46 0.82

However, the results are still much worse than the paper's. I think something might be wrong with my training stage; maybe I should re-train my model and try again.
Still, considering that I used exactly the same hardware and exactly the same training settings yet got worse results, there is also a possibility that, because the training code has changed, the previous training settings no longer let the model fully converge. (In fact, judging by the loss curves during training, the adversarial losses of the generator and discriminator are quite unstable, though this might just be a characteristic of the GAN structure itself.)

In fact, I don't know much about image translation; I'm just a beginner in this area, and I hope you don't get bored by my ignorance.

imlixinyang commented on June 8, 2024

It's always encouraged to ask in research.
Can you share the qualitative results of your self-trained checkpoint here?

HyZhu39 commented on June 8, 2024

Many thanks for your help. I have packed some qualitative experiment results and the images from my quantitative experiments (in case you need them) in the following Baidu Yun link. Thank you for your willingness to help.
https://pan.baidu.com/s/1r1deZsdbJ4RgFhTXRUKjpQ
Extraction code: HISD
and my checkpoint file (if needed):
https://pan.baidu.com/s/1C6_Pm-gEpwGQFRDaMBDNNg
Extraction code: HISD

imlixinyang commented on June 8, 2024

The qualitative results seem promising.
I calculated FID using StarGANv2's script.
I checked the difference between StarGANv2's script and pytorch-FID and found that there is a preprocessing step in the former, which is:

def get_eval_loader(root, img_size=256, batch_size=32,
                    imagenet_normalize=True, shuffle=True,
                    num_workers=4, drop_last=False):
    print('Preparing DataLoader for the evaluation phase...')
    if imagenet_normalize:
        height, width = 299, 299
        mean = [0.485, 0.456, 0.406]
        std = [0.229, 0.224, 0.225]
    else:
        height, width = img_size, img_size
        mean = [0.5, 0.5, 0.5]
        std = [0.5, 0.5, 0.5]

    transform = transforms.Compose([
        transforms.Resize([img_size, img_size]),
        transforms.Resize([height, width]),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])

    dataset = DefaultDataset(root, transform=transform)
    return data.DataLoader(dataset=dataset,
                           batch_size=batch_size,
                           shuffle=shuffle,
                           num_workers=num_workers,
                           pin_memory=True,
                           drop_last=drop_last)

So there may be a preprocessing step (a simple normalization) that you need to add to your code. Let me know the results; I think we are close to making it work.
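In isolation, the preprocessing that plain pytorch-FID lacks boils down to this (a sketch assuming 128×128 images; the values come from the snippet above):

import torchvision.transforms as transforms

# StarGANv2's evaluation transform: resize to the dataset resolution,
# then to the 299×299 InceptionV3 input size, then ImageNet-normalize.
eval_transform = transforms.Compose([
    transforms.Resize([128, 128]),
    transforms.Resize([299, 299]),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])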

HyZhu39 commented on June 8, 2024

The qualitative results seem promising.
I calculated FID using StarGANv2's script.
I checked the difference between StarGANv2's script and pytorch-FID and found that there is a preprocessing step in the former, which is:

def get_eval_loader(root, img_size=256, batch_size=32,
                    imagenet_normalize=True, shuffle=True,
                    num_workers=4, drop_last=False):
    print('Preparing DataLoader for the evaluation phase...')
    if imagenet_normalize:
        height, width = 299, 299
        mean = [0.485, 0.456, 0.406]
        std = [0.229, 0.224, 0.225]
    else:
        height, width = img_size, img_size
        mean = [0.5, 0.5, 0.5]
        std = [0.5, 0.5, 0.5]

    transform = transforms.Compose([
        transforms.Resize([img_size, img_size]),
        transforms.Resize([height, width]),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std)
    ])

    dataset = DefaultDataset(root, transform=transform)
    return data.DataLoader(dataset=dataset,
                           batch_size=batch_size,
                           shuffle=shuffle,
                           num_workers=num_workers,
                           pin_memory=True,
                           drop_last=drop_last)

So there may be a preprocessing step (a simple normalization) that you need to add to your code. Let me know the results; I think we are close to making it work.

Sorry for the delayed reply. Indeed, the missing preprocessing was what caused the much worse FID results. With StarGANv2's script, the FID results of my latest runs improved to:
group 1:
L R G
realism:
21.27 21.34 0.07
disentanglement:
72.55 72.51 0.04
group 2:
L R G
realism:
21.28 21.24 0.04
disentanglement:
72.31 72.33 0.02
compared to paper's results:
Realism:
L:21.37 R:21.49 G:0.12
Disentanglement:
L:71.85 R:71.48 G:0.37
Though the disentanglement results are still a little worse, I am not sure about the normal range of FID fluctuation, so maybe it's acceptable?

imlixinyang commented on June 8, 2024

I do think this is acceptable.
In the paper, we also discuss the trade-off between Realism and Disentanglement (see Sec. 4.3, about the model without tag-irrelevant conditions). Therefore, achieving better results in both Realism and Disentanglement also surprised me at first.
After all, the differences between the released code and the original one are:

  • the original one does not use ALI in the adversarial loss (which you can turn off by setting all s[:] = 0 in the discriminator forward pass; see the sketch after this list).
  • the original one uses tag-irrelevant conditions containing other tags (i.e., it uses the labels of hair color and bangs for the tag glasses as well).
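In code, the first switch is a one-line change (a sketch; the exact placement inside the discriminator depends on the implementation):

# inside the discriminator forward, before the style code s is used:
s[:] = 0  # zeroing the style code in place disables the ALI term in the adversarial loss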

I've changed the README to clarify the corrected FID script I used for the quantitative results. Thank you for your enthusiastic reproduction!

HyZhu39 commented on June 8, 2024

Thank you for your help again. It is thanks to your selfless help that I could successfully reproduce your experimental results.
We communicated in English here for the convenience of other readers.
Here I would like to thank you again privately:
Thank you for your patient help all along; I sincerely wish your future research goes smoothly~

imlixinyang commented on June 8, 2024

You too~

oldrive commented on June 8, 2024

Many thanks for your help. I have packed some qualitative experiment results and the images from my quantitative experiments (in case you need them) in the following Baidu Yun link. Thank you for your willingness to help. https://pan.baidu.com/s/1r1deZsdbJ4RgFhTXRUKjpQ Extraction code: HISD and my checkpoint file (if needed): https://pan.baidu.com/s/1C6_Pm-gEpwGQFRDaMBDNNg Extraction code: HISD

Could you share the images from your quantitative experiments again? The Baidu Yun link is invalid. I am also reproducing the quantitative results in the paper, following this issue, but I cannot get results close to the paper's.

imlixinyang commented on June 8, 2024

@oldrive What's your detailed setup for your reproduction?

oldrive commented on June 8, 2024

@oldrive What's your detailed setup for your reproduction?

config: celeba-hq.yaml
checkpoint: checkpoint_128_celeba-hq.pt
compute_fid_script: use fid.py from StarGANv2 to compute FID between fake_images and real_images
realism FID of L:
fake_images = [latent_images_0, latent_images_1, latent_images_2, latent_images_3, latent_images_4]
latent_images_i is generated from test_bangs_without according to Test_Bangs_without.txt, using random latent codes as guidance.
real_images = [test_bangs_with images according to Test_Bangs_with.txt]
realism_latent_fid_average = ( fid(fake_images[0], real_images) + ... + fid(fake_images[4], real_images) ) / 5

realism FID of R:
fake_images = [reference_images_0, reference_images_1, reference_images_2, reference_images_3, reference_images_4]
reference_images_i is generated using random references sampled from all_bangs_with, according to Bangs_with.txt and Test_Bangs_with.txt.
real_images = [test_bangs_with images according to Test_Bangs_with.txt]
realism_reference_fid_average = ( fid(fake_images[0], real_images) + ... + fid(fake_images[4], real_images) ) / 5

oldrive commented on June 8, 2024

The results of the realism_fid are as follows:
Group 1:
realism_fid_latent_0: 31.692982996524883
realism_fid_latent_1: 31.671476972145367
realism_fid_latent_2: 31.620433186098698
realism_fid_latent_3: 31.629911284997206
realism_fid_latent_4: 31.73387679777522
realism_fid_reference_0: 32.591734278849
realism_fid_reference_1: 32.215290934387426
realism_fid_reference_2: 32.18949934088806
realism_fid_reference_3: 32.287988762946526
realism_fid_reference_4: 32.304219580808336
realism_fid_latent_average: 31.669736247508276
realism_fid_reference_average: 32.31774657957587

Group 2:
realism_fid_latent_0: 31.642293517652654
realism_fid_latent_1: 31.623934807071
realism_fid_latent_2: 31.68461378392377
realism_fid_latent_3: 31.631847657251797
realism_fid_latent_4: 31.67548435280436
realism_fid_reference_0: 32.29246639585722
realism_fid_reference_1: 32.288538090496914
realism_fid_reference_2: 32.11632434611198
realism_fid_reference_3: 32.15312062309697
realism_fid_reference_4: 32.23484964483734
realism_fid_latent_average: 31.651634823740714
realism_fid_reference_average: 32.21705982008008

imlixinyang commented on June 8, 2024

What's the command you used to calculate the FID?

oldrive commented on June 8, 2024

What's the command you used to calculate the FID?

Just like this:
latent_fid_value = calculate_fid_given_paths([real_path, fake_latent_path[i]], args.img_size, args.batch_size)

imlixinyang commented on June 8, 2024

The "args.img_size" is set to be 128, right?

oldrive commented on June 8, 2024

The "args.img_size" is set to be 128, right?

right.
parser.add_argument('--img_size', type=int, default=128, help='image resolution')

imlixinyang commented on June 8, 2024

What about the qualitative results?

oldrive commented on June 8, 2024

What about the qualitative results?

I have already posted the results in my replies above.

imlixinyang commented on June 8, 2024

I mean the visual results.

oldrive commented on June 8, 2024

I mean the visual results.

Oh, I misunderstood what you meant.

Some results of realism_latent_0 are here:
[image outputs: 0.jpg, 1.jpg, 2.jpg]
Some results of realism_reference_0 are here:
[image outputs: 0.jpg, 1.jpg, 2.jpg]

oldrive commented on June 8, 2024

I mean the visual results.

Every image in a folder has a different style of bangs.

imlixinyang commented on June 8, 2024

The visual results seem normal. Please change the image size used in FID to 256 or 224; I don't quite remember the exact setting here, since the Inception network is trained at a specific resolution.

oldrive commented on June 8, 2024

The visual results seem normal. Please change the image size used in FID to 256 or 224; I don't quite remember the exact setting here, since the Inception network is trained at a specific resolution.

I'll have a try as you said and tell you the results. Thanks for your reply!

oldrive commented on June 8, 2024

The visual results seem normal. Please change the image size used in FID to 256 or 224; I don't quite remember the exact setting here, since the Inception network is trained at a specific resolution.

Sorry for bothering you again. I computed the realism_fid for two groups. Group 1 uses the argument --img_size=256 and computes FID between the fake images (256×256, generated with the 256 config and checkpoint) and the real images; group 2 uses the same argument --img_size=256 and computes FID between the fake images (128×128, generated with the 128 config and checkpoint) and the real images. But it seems the results got worse... That is so weird.

realism_fid(256*256 fake_images and real_images, fid(fake_images, real_images, arg.img_size = 256)):
realism_fid_latent_0: 37.70455934722888
realism_fid_reference_0: 38.05122125169506
realism_fid_latent_1: 37.59272856627348
realism_fid_reference_1: 37.81830888013152
realism_fid_latent_2: 37.698022304952914
realism_fid_reference_2: 38.03778528813959
realism_fid_latent_3: 37.610822585752246
realism_fid_reference_3: 38.0628089612687
realism_fid_latent_4: 37.688711544348806
realism_fid_reference_4: 37.91353803968795
realism_fid_latent_average: 37.65896886971126
realism_fid_reference_average: 37.976732484184566

realism_fid(128*128 fake_images and real_images, fid(fake_images, real_images, arg.img_size = 256)):
realism_fid_latent_0: 69.20908546448136
realism_fid_reference_0: 69.23383364990423
realism_fid_latent_1: 69.11336443484716
realism_fid_reference_1: 69.34028775602908
realism_fid_latent_2: 69.18649394941102
realism_fid_reference_2: 69.52593890927548
realism_fid_latent_3: 69.09191563199727
realism_fid_reference_3: 69.40835510741587
realism_fid_latent_4: 69.0797907168618
realism_fid_reference_4: 69.28953132218695
realism_fid_latent_average: 69.13613003951971
realism_fid_reference_average: 69.35958934896232

imlixinyang commented on June 8, 2024

Could you share some real images in test_bangs_with.txt?

oldrive commented on June 8, 2024

Could you share some real images in test_bangs_with.txt?

Here are the first five images in test_bangs_with: 15, 17, 31, 43, 44.

[Screenshot from 2021-11-16 13-59-09]

oldrive commented on June 8, 2024

@oldrive I don't know if this is the reason. In my experiments, the real images are also resized to a specific resolution first and saved in a folder, just as @HyZhu39 did:

I resized and saved the images that I calculated FID against, as "easy_use.py" does:
transform = transforms.Compose([transforms.Resize(image_size), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
x = transform(Image.open('image_save_path here').convert('RGB')).unsqueeze(0)
vutils.save_image(((x + 1) / 2), save_path, padding=0)
Could you have a try?

Before computing FID with the real images, I did not resize them or save them to a folder. I'll have a try.

oldrive commented on June 8, 2024

@oldrive I don't know if this is the reason. In my experiments, the real images are also resized to a specific resolution first and saved in a folder, just as @HyZhu39 did:

I resized and saved the images that I calculated FID against, as "easy_use.py" does:
transform = transforms.Compose([transforms.Resize(image_size), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
x = transform(Image.open('image_save_path here').convert('RGB')).unsqueeze(0)
vutils.save_image(((x + 1) / 2), save_path, padding=0)
Could you have a try?

Oh, that was exactly the reason, and now I get realism and disentanglement FID results closer to the paper's.
realism_fid:
realism_fid_latent_0: 20.912922731584082
realism_fid_reference_0: 21.046019767649355
realism_fid_latent_1: 20.76848449633095
realism_fid_reference_1: 21.04662247713575
realism_fid_latent_2: 20.800978320503397
realism_fid_reference_2: 21.0600877899802
realism_fid_latent_3: 20.775910991635065
realism_fid_reference_3: 20.92837823926883
realism_fid_latent_4: 20.68396588649034
realism_fid_reference_4: 20.94170026977707
realism_fid_latent_average: 20.788452485308767
realism_fid_reference_average: 21.004561708762242

disentangle_fid:
disentangle_fid_latent_0: 71.39510730377387
disentangle_fid_reference_0: 70.64971902519095
disentangle_fid_latent_1: 71.06008491519601
disentangle_fid_reference_1: 70.88973207966575
disentangle_fid_latent_2: 71.40558227571222
disentangle_fid_reference_2: 71.33517553604398
disentangle_fid_latent_3: 71.2109615470645
disentangle_fid_reference_3: 71.0546303462186
disentangle_fid_latent_4: 71.48734756970637
disentangle_fid_reference_4: 71.08293051285575
disentangle_fid_latent_average: 71.31181672229059
disentangle_fid_reference_average: 71.00243749999501

oldrive commented on June 8, 2024

Thank you again for your patient help and quick replies. Only with your help was I able to reproduce the quantitative experimental results in the paper.
Heartfelt thanks to the author for the enthusiastic help; I wish you smooth sailing in your future research~

imlixinyang commented on June 8, 2024

Ideally, these two resizing routes should give the same result. I think the reason may be the transforms.Resize module: as this link says, when the input is a PIL image, the resize function uses antialiasing by default.
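A quick way to observe this (a sketch; the exact behavior depends on the torchvision version, where older releases antialias PIL inputs but not tensor inputs by default):

from PIL import Image
import torchvision.transforms as transforms

resize = transforms.Resize(128)
img = Image.open('example.jpg')                      # placeholder path

pil_resized = resize(img)                            # PIL input: antialiased resampling
tensor_resized = resize(transforms.ToTensor()(img))  # tensor input: no antialias by default in older torchvision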
You're welcome, and thank you as well for your interest in this work. All the best!

zhushuqi2333 commented on June 8, 2024

@HyZhu39 Hello, how did you get the FID between the images generated by the 5 style codes and the real images? The generated images for the 5 style codes should be put into 5 separate folders, and the average FID calculated between each of them and the real images. Each folder has the same number of images as the original source images. For disentanglement in our experiments, the reference-guided style codes are randomly sampled from all images with bangs.

Sorry to disturb you; I am reproducing the experimental results of this paper. There are 568 real images with bangs and 2432 images without bangs. After translation, I get 2432 images with bangs. May I ask whether I should directly calculate FID between these two image sets, or select 568 of the 2432 images for the calculation? Looking forward to your reply. Thank you!

imlixinyang commented on June 8, 2024

Yes. The FID evaluation calculates the distribution mean and covariance of each folder separately, so you don't need to worry about the different numbers of images. @zhushuqi2333
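For reference, FID fits a Gaussian to the Inception features of each folder independently and then compares the two fits:

$$\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)$$

where $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the feature mean and covariance of the real and generated folders. The folder sizes only affect how reliably these statistics are estimated, not the formula itself.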

zhushuqi2333 commented on June 8, 2024

Thank you for your reply~ I have carefully read all the answers in this issue and run the corresponding experiments. All my experiments are carried out on 128×128 images, because I trained the model using celeba-hq.yaml, whose resolution is 128×128.
The experimental configuration is as follows:

I have one difference from the above, which is the FID calculation: --img_size=128. All real images have resolution 128×128, and all translated images are also 128×128.
My experimental results are as follows:

realism: disentanglement:
L: 22.63 L: 72.46
R: 21.17 R: 71.63
G: 1.46 G: 0.83

compared to the paper's results:
Realism: Disentanglement:
L:21.37 L:71.85
R:21.49 R:71.48
G:0.12 G:0.37

L is a little larger than the paper's, and G is too large. Can you give some suggestions? Looking forward to your reply~

imlixinyang commented on June 8, 2024

I think the difference between these two results is acceptable if you only calculate it once. You can try:

  1. calculating the average FID over 5 random runs (different seeds for L and G).
  2. using a different checkpoint. The latest checkpoint is not always the best.

@zhushuqi2333

zhushuqi2333 commented on June 8, 2024

Thank you for your reply~ The results above are already averages; I will try several different seeds and different checkpoints.
