
Comments (11)

One-sixth commented on July 22, 2024

@veritas9872 Thanks for your advice. I think so too. I just tested the expand operation, but found that performance decreased slightly. It seems that PyTorch JIT is not optimizing this the way we expected. This is the modified branch; the time statistics below have been updated.
https://github.com/One-sixth/ms_ssim_pytorch/tree/flexible_multi_channel

I also thought about CPU-to-GPU overhead. I didn't write any CPU-to-GPU data movement operations. Before calling this module with a CUDA tensor, you need to use SSIM().cuda() to place it on the GPU; otherwise you will get an error. This avoids all CPU-to-GPU overhead.
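For reference, a minimal usage sketch of that pattern (the constructor arguments and import path are assumptions, not the exact API of the branch):

```python
import torch
from ssim import SSIM  # import path assumed; use the module from the linked branch

# Build the loss module and move it to the GPU once, up front.
# Its internal buffers (e.g. the Gaussian window) move with it, so no
# CPU-to-GPU copies happen inside the forward pass.
ssim = SSIM().cuda()

x = torch.rand(4, 3, 256, 256, device='cuda', requires_grad=True)
y = torch.rand(4, 3, 256, 256, device='cuda')

loss = 1 - ssim(x, y)  # called with CUDA tensors; errors out if .cuda() was skipped
loss.backward()
```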

old ssim

Performance Testing SSIM

testing losser2
cuda time 88990.0703125
perf_counter time 86.80163019999999

testing losser3
cuda time 36119.06640625
perf_counter time 36.057978399999996

testing losser4
cuda time 34708.8359375
perf_counter time 33.916086199999995

new ssim

Performance Testing SSIM

testing losser2
cuda time 88976.3203125
perf_counter time 86.79177510000001

testing losser3
cuda time 36157.3203125
perf_counter time 36.096230100000014

testing losser4
cuda time 35632.9140625
perf_counter time 34.966091500000005

old ms_ssim

Performance Testing MS_SSIM

testing losser1
cuda time 134115.96875
perf_counter time 134.0006031

testing losser3
cuda time 61760.56640625
perf_counter time 61.71994470000001

testing losser4
cuda time 52888.03125
perf_counter time 52.848280500000016

new ms_ssim

Performance Testing MS_SSIM

testing losser1
cuda time 134462.515625
perf_counter time 134.3518838

testing losser3
cuda time 62653.0546875
perf_counter time 62.61409040000001

testing losser4
cuda time 55489.34375
perf_counter time 55.450284599999975


veritas9872 commented on July 22, 2024

@One-sixth Hello. Thank you for the information on performance. I am glad to know this.
As for the small increase in time, I think that expanding the kernel in the ssim() and ms_ssim() functions might reduce the overhead somewhat. The current implementation has the .expand() inside gaussian_kernel(), which is called several times in each function.
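Roughly, the suggestion is something like the following (function names and shapes are illustrative, not the actual code in either branch):

```python
import torch

def gaussian_kernel_1d(size: int, sigma: float) -> torch.Tensor:
    # Plain 1-D Gaussian of shape (1, 1, 1, size); no channel expansion here.
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2.0
    g = torch.exp(-(coords ** 2) / (2.0 * sigma ** 2))
    return (g / g.sum()).reshape(1, 1, 1, size)

def make_window(size: int, sigma: float, channel: int) -> torch.Tensor:
    # Expand to the channel count once, when the window is built, rather than
    # calling .expand() inside gaussian_kernel() every time ssim()/ms_ssim()
    # needs the kernel.
    return gaussian_kernel_1d(size, sigma).expand(channel, 1, 1, size).contiguous()
```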


One-sixth commented on July 22, 2024

I have modified the code and updated the branch, but the strange thing is that the time spent does not seem to change. PyTorch JIT optimization may behave oddly here; it may already have handled this situation and optimized it.
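For what it's worth, one way to check what the JIT actually compiled (assuming SSIM is a torch.jit.ScriptModule with a scripted forward, as in the repo) is to print the generated code and graph:

```python
from ssim import SSIM  # import path assumed

mod = SSIM().cuda()
# TorchScript exposes the compiled source and graph of the scripted forward,
# which shows whether the .expand() call was hoisted or folded away.
print(mod.code)
print(mod.graph)
```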

ssim
old

Performance Testing SSIM

testing losser2
cuda time 88984.921875
perf_counter time 86.7958464

testing losser3
cuda time 36126.84375
perf_counter time 36.0654904

testing losser4
cuda time 34692.1953125
perf_counter time 33.896039900000005

new

Performance Testing SSIM

testing losser2
cuda time 88994.9140625
perf_counter time 86.8065547

testing losser3
cuda time 36120.90625
perf_counter time 36.0597451

testing losser4
cuda time 35769.17578125
perf_counter time 35.1033664


veritas9872 commented on July 22, 2024

@One-sixth Does the new implementation use dynamic value scaling by any chance?
In my implementation, if the value range is not specified, it defaults to the range of values in the target.
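In other words, something along these lines (a sketch of the fallback just described, not the exact code):

```python
import torch

def resolve_data_range(target: torch.Tensor, data_range=None) -> torch.Tensor:
    # Dynamic value scaling: if no range was specified, fall back to the
    # dynamic range of the target tensor; otherwise use the given constant.
    if data_range is None:
        return target.max() - target.min()
    return torch.as_tensor(data_range, dtype=target.dtype, device=target.device)
```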


One-sixth commented on July 22, 2024

@veritas9872 No, not in my implementation. I think users should set these values manually instead of relying on automatic scaling.


One-sixth commented on July 22, 2024

I sorted out the code. The new dynamic channel version is available at https://github.com/One-sixth/ms_ssim_pytorch/tree/dynamic_channel_num .


veritas9872 commented on July 22, 2024

@One-sixth Thanks for the update!
I have one more question. Do both versions use the 'compensation' variable?
Transferring data from CPU to GPU is actually a very expensive operation and I think that this might be the cause of the difference in speed.


One-sixth commented on July 22, 2024

@veritas9872 Sorry, I don't really understand the meaning of "compensation". My English is not good.
Transferring data from the CPU to the GPU is of course a very expensive operation. But after

mod = SSIM()
mod = mod.cuda()

all parameters are moved to the GPU. You can run

print(mod.window.device)

to confirm that the variable is on the GPU. Because of the use of ScriptModule, Python attributes such as self.data_range become JIT constants and are folded into the compiled script. I can't think of any remaining CPU-to-GPU overhead; maybe you can point it out.
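A stripped-down sketch of that setup (illustrative only; the real module builds a Gaussian window and defines a scripted forward):

```python
import torch

class SSIM(torch.jit.ScriptModule):
    # Plain Python attributes listed in __constants__ are folded into the
    # compiled script as constants, e.g. self.data_range below.
    __constants__ = ['data_range']

    def __init__(self, window_size: int = 11, channel: int = 3,
                 data_range: float = 255.0):
        super().__init__()
        self.data_range = data_range
        # register_buffer ties the window to the module, so SSIM().cuda()
        # moves it to the GPU along with everything else. The uniform window
        # here is just a placeholder for the real Gaussian kernel.
        window = torch.full((channel, 1, window_size, window_size),
                            1.0 / window_size ** 2)
        self.register_buffer('window', window)

mod = SSIM().cuda()
print(mod.window.device)  # -> cuda:0
```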


veritas9872 commented on July 22, 2024

@One-sixth I was referring to the variable compensation = 1.0 in ssim.py.
However, I checked and both versions have the variable.


One-sixth commented on July 22, 2024

@veritas9872 I don't know... You need to ask @VainF.


VainF commented on July 22, 2024

@One-sixth @veritas9872 The variable comes from the SSIM implementation in TensorFlow.

