
Blocky output (hat issue, OPEN)

xpixelgroup commented on September 3, 2024
Blocky output

from hat.

Comments (8)

chxy95 commented on September 3, 2024

@shreykshah This looks like an out-of-distribution case, since the released models were trained on natural images. If possible, please send me the input image and I will check it.

shreykshah commented on September 3, 2024

@chxy95 This is a zoomed-in crop of the image to highlight the issue, and it is just one example. The same thing has happened on many images, including multiple natural images. The blocks appear across the entire image, in both foreground and background.

chxy95 commented on September 3, 2024

@shreykshah Which model did you use to generate these results? Could you send me some input samples that reproduce the problem, by email or any other means? Window-based self-attention (SA) models can indeed produce blocking artifacts in some image restoration tasks, but I haven't observed such severe cases in image SR.
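
To make the window-based SA point concrete, here is a minimal sketch of how a fixed window partition assigns pixels to windows, and where the resulting seams would land in an upscaled output. The window size of 16 and the scale of 4 are assumptions chosen for illustration, both function names are hypothetical, and the sketch ignores any window shifting or overlapping that the real network uses to soften these boundaries.

```python
def window_index(row, col, window_size=16):
    """Return the (window_row, window_col) that an input pixel falls into."""
    return row // window_size, col // window_size


def output_seam_positions(length, window_size=16, scale=4):
    """Positions along one output axis where two input windows meet."""
    return [i * scale for i in range(window_size, length, window_size)]


# Two horizontally adjacent pixels can land in different windows,
# so they never attend to each other inside a plain window partition.
print(window_index(0, 15))  # (0, 0)
print(window_index(0, 16))  # (0, 1)

# For a 64-pixel-wide input upscaled x4, the window boundaries map to
# these output columns.
print(output_seam_positions(64))  # [64, 128, 192]
```

If the blocks in an output image line up with a regular grid like this, that would be consistent with the window partition being the cause, although it does not prove it.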

shreykshah commented on September 3, 2024

@chxy95 I tried two different images, one black-and-white and one full-color with people in it. I don't feel comfortable sharing the photos, but I tried them on HAT_SRx2, HAT_SRx3, HAT_SRx4, HAT-L_SRx2_ImageNet-pretrain, HAT-L_SRx3_ImageNet-pretrain, and HAT-L_SRx4_ImageNet-pretrain, all of which produce blocky results (albeit to varying degrees) on both images.

shahargadshriki commented on September 3, 2024

I have the same issue.

[image attachment]

chxy95 commented on September 3, 2024

This does seem to be a flaw in our approach, caused by the fixed window size used for the self-attention calculation. It seems difficult to solve within the existing framework. I think it might help to first downsample the input image appropriately.
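
As a rough illustration of lowering the input resolution first, the sketch below averages non-overlapping blocks of a grayscale image stored as nested lists. The function name `box_downsample` and the factor of 2 are hypothetical choices for this example. In practice you would more likely use an image library's antialiased resize, and the right factor has to be found by experiment.

```python
def box_downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2D grayscale image.

    `image` is a list of rows of numbers. Trailing rows or columns that do
    not fill a whole block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    out_h = len(image) // factor
    out_w = len(image[0]) // factor if image else 0
    result = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            total = 0
            for dy in range(factor):
                for dx in range(factor):
                    total += image[by * factor + dy][bx * factor + dx]
            row.append(total / (factor * factor))
        result.append(row)
    return result


image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]
small = box_downsample(image, 2)
print(small)  # [[2.0, 6.0], [10.0, 20.0]]
```

The downsampled image, not the original, would then be passed to the SR model.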

shahargadshriki commented on September 3, 2024

Could you please explain why tile mode doesn't solve it? Every tile is a low-resolution input.

chxy95 commented on September 3, 2024

I think the blocky artifacts are caused by the low information density of the input image. In other words, the fixed-size window used for self-attention does not contain enough valid information for SR. Tile mode changes the resolution of what the model sees but not the information density. Appropriate downsampling may alleviate the artifacts by changing the information density.

This is only my guess, though. What I can confirm is that the artifacts are indeed caused by the window-based self-attention mechanism in HAT. The blocky problem seems difficult to deal with, although we have tried to alleviate it in our network design.
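
The tile-versus-downsampling argument can be sketched with a toy image in which every pixel stores its own coordinates. The window size of 16, the tile aligned to the window grid, and the crude every-second-pixel downsampling are all assumptions made for illustration, and the helper names are hypothetical. This only illustrates the reasoning above and does not show that downsampling removes the artifacts.

```python
WINDOW = 16  # assumed attention window size


def crop(image, top, left, size):
    """Return a size x size square cut out of a nested-list image."""
    return [row[left:left + size] for row in image[top:top + size]]


def window_at(image, win_row, win_col, window=WINDOW):
    """Return the pixels that fall into one attention window."""
    return crop(image, win_row * window, win_col * window, window)


def original_span(window_pixels):
    """Number of original-image rows covered by a window, first to last."""
    return window_pixels[-1][-1][0] - window_pixels[0][0][0] + 1


# Synthetic 64x64 image: each pixel records its (row, col) in the original.
image = [[(y, x) for x in range(64)] for y in range(64)]

# Tile mode: a 32x32 tile is just a crop, so its first window holds exactly
# the same 16x16 pixels as window (2, 2) of the full image.
tile = crop(image, 32, 32, 32)
assert window_at(tile, 0, 0) == window_at(image, 2, 2)
tile_span = original_span(window_at(tile, 0, 0))

# Downsampling by 2 (keeping every second pixel, for illustration only):
# the same 16x16 window now reaches across about twice as much of the scene.
small = [row[::2] for row in image[::2]]
small_span = original_span(window_at(small, 0, 0))

print(tile_span)   # 16
print(small_span)  # 31
```

A window inside a tile sees exactly what it saw in the full image, while a window over the downsampled image sees a wider part of the scene.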
