op_rbf's Issues

image testing bugs!!!

When I use this image for testing, the program crashes. Here is the image. Is this a problem with its size?
[image: skysmall]

Bugfix for odd line-number height (segment)

if (thread_index + 1 == RBF_MAX_THREADS)
    height_segment = height - thread_index * height_segment;

should be changed to:

if (thread_index + 1 == RBF_MAX_THREADS)
    height_segment = height - thread_index * height_segment - 1;

Please note that I removed the variable for number of threads in the class for my implementation.
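For readers following the arithmetic, here is a minimal sketch of the row split that the original expression appears to perform. It assumes that height_segment starts out as height divided by the thread count (rounded down), which the issue does not show; splitRows and num_threads are hypothetical names used only for illustration, not part of the project.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not the project's code: splits `height` rows across
// `num_threads` workers. Assumes each worker gets height / num_threads rows
// and the last worker absorbs the remainder, mirroring
//   height_segment = height - thread_index * height_segment;
std::vector<int> splitRows(int height, int num_threads)
{
    assert(num_threads > 0);
    const int height_segment = height / num_threads;  // rows per worker, rounded down
    std::vector<int> rows(num_threads, height_segment);
    // Last worker: every row not covered by the earlier workers.
    rows[num_threads - 1] = height - (num_threads - 1) * height_segment;
    return rows;
}
```

Under these assumptions, splitRows(721, 4) gives 180, 180, 180 and 181 rows, which add up to the full height. Whether the extra "- 1" proposed above is needed depends on how the surrounding code consumes the segment (for example, if it reads one row beyond it), which is not visible in this issue.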

Calculated image with filtered edges

Using the author's method (optimized SSE2 with optional multithreading, pipelined 2 stages), the left edge of the image appears filtered after the calculation. Why is that?

bugs and solutions

Hi, Fig1024, your optimization is awesome, and it means a lot to me. Thanks!
However, I found some bugs. Below are my proposed solutions.

First, according to the comments in https://github.com/nothings/stb.git, we should probably add
#define STB_IMAGE_IMPLEMENTATION
#define STB_IMAGE_WRITE_IMPLEMENTATION
at the beginning of RecursiveBilateralFilter.cpp. In some situations the code does not compile without them.

Second, the program crashes in RBFilter_SSE2.cpp when I run your code. After debugging, I found that this is because out_pix4 is used before it is defined in the function verticalFilter:
pixB = _mm_shuffle_epi8(pixB, mask_pack);

out_pix4 = _mm_srli_si128(out_pix4, 4); // shift
out_pix4 = _mm_or_si128(out_pix4, pixB);
I tried changing the code to:
if (i == 0) // first iteration: initialize out_pix4 instead of shifting it
{
    out_pix4 = _mm_shuffle_epi8(pixB, mask_pack);
}
else
{
    pixB = _mm_shuffle_epi8(pixB, mask_pack);

    out_pix4 = _mm_srli_si128(out_pix4, 4); // shift
    out_pix4 = _mm_or_si128(out_pix4, pixB);
}
It seems to work, but I'm not sure whether this change is right.

Could you check this, please?
Thank you.
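As a possible alternative to branching on i == 0, the accumulator could be zero-initialized before the loop. The sketch below illustrates the shift-and-OR packing pattern using SSE2 only. It assumes that the shuffled pixB carries one packed pixel in its top four bytes with zeros elsewhere, which the issue does not confirm; packFourPixels is a hypothetical helper, and the _mm_slli_si128 line is only a stand-in for the project's _mm_shuffle_epi8 step.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper, not the project's code: packs four 32-bit pixels
// into one 128-bit register with the same shift-and-OR pattern.
inline __m128i packFourPixels(const std::uint32_t* px)
{
    assert(px != nullptr);
    __m128i out_pix4 = _mm_setzero_si128();  // defined before its first use
    for (int i = 0; i < 4; ++i)
    {
        // Stand-in for the shuffled pixB: pixel in the top 4 bytes, zeros below.
        __m128i pixB = _mm_slli_si128(_mm_cvtsi32_si128(static_cast<int>(px[i])), 12);
        out_pix4 = _mm_srli_si128(out_pix4, 4);   // shift older pixels down by 4 bytes
        out_pix4 = _mm_or_si128(out_pix4, pixB);  // insert the newest pixel on top
    }
    return out_pix4;
}
```

After four iterations the first pixel has moved down to the lowest four bytes and the last pixel sits in the highest four, so storing the register reproduces the input order. Under the stated assumption this should give the same result as the i == 0 branch, because shifting an all-zero register leaves it zero.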

Performance

Original Recursive Bilateral Filter implementation
Image: lw_cross.jpg, size: 1280 x 720, time ms: 70.2
Image: Thefarmhouse.jpg, size: 1440 x 1080, time ms: 119.3
Image: testGirl.jpg, size: 448 x 626, time ms: 23.6

Optimized SSE2 single threaded, single stage (non-pipelined)
Image: lw_cross.jpg, size: 1280 x 720, time ms: 86.6
Image: Thefarmhouse.jpg, size: 1440 x 1080, time ms: 142.0
Image: testGirl.jpg, size: 448 x 626, time ms: 25.6

Optimized SSE2 2x multithreading, single stage (non-pipelined)
Image: lw_cross.jpg, size: 1280 x 720, time ms: 44.3
Image: Thefarmhouse.jpg, size: 1440 x 1080, time ms: 78.4
Image: testGirl.jpg, size: 448 x 626, time ms: 13.4

Optimized SSE2 4x multithreading, single stage (non-pipelined)
Image: lw_cross.jpg, size: 1280 x 720, time ms: 24.9
Image: Thefarmhouse.jpg, size: 1440 x 1080, time ms: 41.5
Image: testGirl.jpg, size: 448 x 626, time ms: 7.5

Optimized SSE2 4x2 thread pipelined 2 stages
Image: lw_cross.jpg, size: 1280 x 720, time ms: 18.6
Image: Thefarmhouse.jpg, size: 1440 x 1080, time ms: 32.0
Image: testGirl.jpg, size: 448 x 626, time ms: 5.9
Finish

Hello, my CPU is an i7-8700 [email protected], but these reproduced results differ from yours by more than a factor of ten. What could be the reason? Thank you.
