Need suggestion to optimize realtime BitBlt->StretchBlt (scaling takes too much CPU)

26 comments, last by Krypt0n 9 years, 10 months ago
What exactly are your requirements on the quality?

You wrote that "color on color", which is nearest-neighbor downsampling, is not good enough. What about downsizing the image to exactly half its original size with a box filter, compressing and sending that to the target device, and linearly upsampling there?
You can simulate that in GIMP/Photoshop to see what it looks like.
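
For reference, the half-size box filter described here could look roughly like the sketch below. It assumes a tightly packed 32-bit buffer (4 bytes per pixel, no row padding) with even width and height; `halveBox` is a made-up name for illustration, not a GDI function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Downscale a tightly packed 4-bytes-per-pixel image to exactly half size.
// Each output channel is the rounded average of a 2x2 block (a box filter).
// Assumption: width and height are even and rows carry no padding.
std::vector<uint8_t> halveBox(const std::vector<uint8_t>& src, int w, int h)
{
    assert(w > 0 && h > 0 && w % 2 == 0 && h % 2 == 0);
    assert(src.size() == static_cast<std::size_t>(w) * h * 4);

    const int ow = w / 2, oh = h / 2;
    std::vector<uint8_t> dst(static_cast<std::size_t>(ow) * oh * 4);

    for (int y = 0; y < oh; ++y) {
        const uint8_t* r0 = &src[static_cast<std::size_t>(2 * y) * w * 4]; // upper source row
        const uint8_t* r1 = r0 + static_cast<std::size_t>(w) * 4;          // lower source row
        uint8_t* out = &dst[static_cast<std::size_t>(y) * ow * 4];
        for (int x = 0; x < ow; ++x) {
            for (int c = 0; c < 4; ++c) {
                const int sum = r0[8 * x + c] + r0[8 * x + 4 + c]
                              + r1[8 * x + c] + r1[8 * x + 4 + c];
                out[4 * x + c] = static_cast<uint8_t>((sum + 2) >> 2); // +2 rounds to nearest
            }
        }
    }
    return dst;
}
```

The inner loop only adds and shifts, so it should sit somewhere between nearest-neighbor and a general-purpose filter in cost; whether it is fast enough for 30 fps on a given CPU would have to be measured.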

Take my advice and do it this way:

1) Write your own resizing code (something in between raw scaling and halftone). This shouldn't be hard. You will not be as fast as a simple resizing blitter when doing something more complex than raw scaling, but you also should not be as slow as halftone when writing something simpler than halftone.

2) Find some library. The same applies: if you use something in between raw scaling and halftone, it should be slower than raw scaling and faster than halftone.

That's all.
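
One candidate for "something in between" is a plain bilinear scaler. The sketch below is only an illustration under assumptions: a tightly packed 4-bytes-per-pixel buffer, 16.16 fixed-point sample positions and 8-bit weights. `resizeBilinear` is a hypothetical name, and nothing here is claimed to match what GDI does internally in either stretch mode.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bilinear resize of a tightly packed 4-bytes-per-pixel image.
// Positions use 16.16 fixed point, so the inner loop is integer only.
std::vector<uint8_t> resizeBilinear(const std::vector<uint8_t>& src,
                                    int sw, int sh, int dw, int dh)
{
    assert(sw > 0 && sh > 0 && dw > 0 && dh > 0);
    assert(src.size() == static_cast<std::size_t>(sw) * sh * 4);
    std::vector<uint8_t> dst(static_cast<std::size_t>(dw) * dh * 4);

    // Source step per destination pixel in 16.16 fixed point.
    const int64_t stepX = (static_cast<int64_t>(sw) << 16) / dw;
    const int64_t stepY = (static_cast<int64_t>(sh) << 16) / dh;

    for (int y = 0; y < dh; ++y) {
        // Centre-aligned mapping (y + 0.5) * sh / dh - 0.5, clamped at 0.
        const int64_t fy = std::max<int64_t>(0, y * stepY + stepY / 2 - 32768);
        const int y0 = static_cast<int>(fy >> 16);
        const int y1 = std::min(y0 + 1, sh - 1);
        const int wy = static_cast<int>(fy & 0xFFFF) >> 8;   // vertical weight, 0..255
        const uint8_t* r0 = &src[static_cast<std::size_t>(y0) * sw * 4];
        const uint8_t* r1 = &src[static_cast<std::size_t>(y1) * sw * 4];
        uint8_t* out = &dst[static_cast<std::size_t>(y) * dw * 4];

        for (int x = 0; x < dw; ++x) {
            const int64_t fx = std::max<int64_t>(0, x * stepX + stepX / 2 - 32768);
            const int x0 = static_cast<int>(fx >> 16);
            const int x1 = std::min(x0 + 1, sw - 1);
            const int wx = static_cast<int>(fx & 0xFFFF) >> 8; // horizontal weight, 0..255
            for (int c = 0; c < 4; ++c) {
                const int top = r0[4 * x0 + c] * (256 - wx) + r0[4 * x1 + c] * wx;
                const int bot = r1[4 * x0 + c] * (256 - wx) + r1[4 * x1 + c] * wx;
                out[4 * x + c] =
                    static_cast<uint8_t>((top * (256 - wy) + bot * wy + 32768) >> 16);
            }
        }
    }
    return dst;
}
```

Bilinear reads only four source pixels per output pixel, so for strong downscaling it still skips pixels; that is the price for being cheaper than an averaging filter.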

I already tried a few pro image libraries - all gave me worse FPS than StretchBlt.

I don't know how much time per day you've been spending on this, but in the time since you started this thread you could have already written a GPU solution. I know you say it's not what you want to do, but seriously, it wouldn't be that hard and you'll have excellent performance.

what cpu are you using for benchmarking?

some possible solutions:
1. scale to half and then upscale to the resolution you wanted, e.g.
1440*900 -> 720*450 -> 1200*700

2. write a custom scaler for the solution you need.

3. maybe your source is in some YCbCr format or something? you could scale the source

4. interlace scaling (scale just every 2nd line per frame -> double fps)

5. scale to half and use a letterbox

6. don't interpolate, just fetch the closest pixel to show.
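
As an illustration of option 6, here is a minimal nearest-pixel scaler. It assumes 32-bit pixels stored as one `uint32_t` each with no row padding; `resizeNearest` and `colMap` are names invented for this sketch. The source column for each destination column is computed once and reused for every row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Nearest-pixel resize: every destination pixel copies the closest source pixel.
// Assumption: one uint32_t per pixel, rows tightly packed.
std::vector<uint32_t> resizeNearest(const std::vector<uint32_t>& src,
                                    int sw, int sh, int dw, int dh)
{
    assert(sw > 0 && sh > 0 && dw > 0 && dh > 0);
    assert(src.size() == static_cast<std::size_t>(sw) * sh);

    // Source column for each destination column: floor((x + 0.5) * sw / dw).
    std::vector<int> colMap(dw);
    for (int x = 0; x < dw; ++x)
        colMap[x] = static_cast<int>((2LL * x + 1) * sw / (2LL * dw));

    std::vector<uint32_t> dst(static_cast<std::size_t>(dw) * dh);
    for (int y = 0; y < dh; ++y) {
        // Source row for this destination row, same centre-aligned mapping.
        const int sy = static_cast<int>((2LL * y + 1) * sh / (2LL * dh));
        const uint32_t* row = &src[static_cast<std::size_t>(sy) * sw];
        uint32_t* out = &dst[static_cast<std::size_t>(y) * dw];
        for (int x = 0; x < dw; ++x)
            out[x] = row[colMap[x]];
    }
    return dst;
}
```

If the source and destination sizes stay the same from frame to frame, `colMap` could also be built once and kept, which is the kind of specialization a generic library cannot assume.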

Hi Krypt0n

what cpu are you using for benchmarking?

CPU is X9000 (2 cores, 2.8 GHz each)

some possible solutions:
1. scale to half and then upscale to the resolution you wanted, e.g.
I think this will kill quality but I will try.

2. write a custom scaler for the solution you need.
Already tried that. I even tried other pro libs - performance was slow.

3. maybe your source is in some YCbCr format or something? you could scale the source
Source is 24- or 32-bit RGB (as I want); I scale and then convert to YUV 420.

Maybe scaling the YUV 420 would be faster than scaling the RGB, but I could not find any function that scales YUV 420 without first converting it to RGB.

4. interlace scaling (scale just every 2nd line per frame -> double fps)
Sorry, but I'm not sure I understand what you mean here ...

I don't know how much time per day you've been spending on this, but in the time since you started this thread you could have already written a GPU solution. I know you say it's not what you want to do, but seriously, it wouldn't be that hard and you'll have excellent performance.

As far as I know, GPU means OpenGL (many problems) or DirectX. As for DirectX, I really do not want to start distributing DirectX with the software and handling all DirectX versions.

GDI is simpler.

Hi Krypt0n
what cpu are you using for benchmarking?
CPU is X9000 (2 cores, 2.8 GHz each)

should be quite memory bandwidth bound I guess.

1. scale to half and then upscale to the resolution you wanted, e.g.
I think this will kill quality but I will try.

yeah, but for slow cpus this might still be a solution to reach 30Hz as long as you don't want to use a gpu solution.

2. write a custom scaler for the solution you need.
Already tried that. I even tried other pro libs - performance was slow.

I think those libs are quite generic, writing a solution for your case might open specializations that would give you a boost.

3. maybe your source is in some YCbCr format or something? you could scale the source
Source is 24- or 32-bit RGB (as I want); I scale and then convert to YUV 420.
Maybe scaling the YUV 420 would be faster than scaling the RGB, but I could not find any function that scales YUV 420 without first converting it to RGB.

YUV 420 should be faster indeed, you could even use a quality scaling for Y and something fast/cheap for UV, it's not critical for quality.
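
A rough sketch of that idea, scaling planar YUV 4:2:0 directly with a better filter on Y and the cheapest possible one on U and V. The `I420` struct, the function names and the even-dimension requirement are all assumptions made for this example; a real capture pipeline may use a different plane layout or row stride.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar YUV 4:2:0 frame: full-resolution Y, half-width/half-height U and V.
struct I420 {
    int w = 0, h = 0;                 // luma size, assumed even
    std::vector<uint8_t> y, u, v;     // tightly packed planes
};

// Cheap path for chroma: copy the nearest source sample.
static std::vector<uint8_t> planeNearest(const std::vector<uint8_t>& s,
                                         int sw, int sh, int dw, int dh)
{
    std::vector<uint8_t> d(static_cast<std::size_t>(dw) * dh);
    for (int y = 0; y < dh; ++y) {
        const int sy = static_cast<int>((2LL * y + 1) * sh / (2LL * dh));
        for (int x = 0; x < dw; ++x) {
            const int sx = static_cast<int>((2LL * x + 1) * sw / (2LL * dw));
            d[static_cast<std::size_t>(y) * dw + x] = s[static_cast<std::size_t>(sy) * sw + sx];
        }
    }
    return d;
}

// Better path for luma: bilinear with fixed-point positions and 8-bit weights.
static std::vector<uint8_t> planeBilinear(const std::vector<uint8_t>& s,
                                          int sw, int sh, int dw, int dh)
{
    std::vector<uint8_t> d(static_cast<std::size_t>(dw) * dh);
    const int64_t stepX = (static_cast<int64_t>(sw) << 16) / dw;
    const int64_t stepY = (static_cast<int64_t>(sh) << 16) / dh;
    for (int y = 0; y < dh; ++y) {
        const int64_t fy = std::max<int64_t>(0, y * stepY + stepY / 2 - 32768);
        const int y0 = static_cast<int>(fy >> 16), y1 = std::min(y0 + 1, sh - 1);
        const int wy = static_cast<int>(fy & 0xFFFF) >> 8;
        const uint8_t* r0 = &s[static_cast<std::size_t>(y0) * sw];
        const uint8_t* r1 = &s[static_cast<std::size_t>(y1) * sw];
        for (int x = 0; x < dw; ++x) {
            const int64_t fx = std::max<int64_t>(0, x * stepX + stepX / 2 - 32768);
            const int x0 = static_cast<int>(fx >> 16), x1 = std::min(x0 + 1, sw - 1);
            const int wx = static_cast<int>(fx & 0xFFFF) >> 8;
            const int top = r0[x0] * (256 - wx) + r0[x1] * wx;
            const int bot = r1[x0] * (256 - wx) + r1[x1] * wx;
            d[static_cast<std::size_t>(y) * dw + x] =
                static_cast<uint8_t>((top * (256 - wy) + bot * wy + 32768) >> 16);
        }
    }
    return d;
}

// Scale a whole frame: bilinear for Y, nearest for U and V.
I420 scaleI420(const I420& in, int dw, int dh)
{
    assert(in.w > 0 && in.h > 0 && dw > 0 && dh > 0);
    assert(in.w % 2 == 0 && in.h % 2 == 0 && dw % 2 == 0 && dh % 2 == 0);
    I420 out;
    out.w = dw;
    out.h = dh;
    out.y = planeBilinear(in.y, in.w, in.h, dw, dh);
    out.u = planeNearest(in.u, in.w / 2, in.h / 2, dw / 2, dh / 2);
    out.v = planeNearest(in.v, in.w / 2, in.h / 2, dw / 2, dh / 2);
    return out;
}
```

The saving comes from touching 1.5 bytes per pixel instead of 3 or 4, and from the chroma planes being a quarter of the luma size. It only pays off if the RGB to YUV conversion of the full-size source is cheaper than scaling in RGB, which needs measuring.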

4. interlace scaling (scale just every 2nd line per frame -> double fps)
Sorry, but I'm not sure I understand what you mean here ...

it's a common method in bandwidth limited cases (e.g. first 1080 support was interlaced as 1080p would be beyond the HDMI spec at that time).
http://en.wikipedia.org/wiki/Interlaced_video
you could do the same if you figure out the HW does not manage to support full resolution at 30Hz.
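
Applied to the scaler, interlacing could look something like the sketch below: each frame refreshes only the even or only the odd destination rows, and the other rows keep whatever the previous frame wrote. This assumes 32-bit pixels as `uint32_t`, a destination buffer that persists between frames, and nearest-pixel sampling to keep the example short; `scaleField` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Refresh one field of the destination: field 0 = even rows, field 1 = odd rows.
// Rows belonging to the other field are left untouched (previous frame's data).
void scaleField(const std::vector<uint32_t>& src, int sw, int sh,
                std::vector<uint32_t>& dst, int dw, int dh, int field)
{
    assert(field == 0 || field == 1);
    assert(sw > 0 && sh > 0 && dw > 0 && dh > 0);
    assert(src.size() == static_cast<std::size_t>(sw) * sh);
    assert(dst.size() == static_cast<std::size_t>(dw) * dh);

    for (int y = field; y < dh; y += 2) {          // only every 2nd destination row
        const int sy = static_cast<int>((2LL * y + 1) * sh / (2LL * dh));
        const uint32_t* row = &src[static_cast<std::size_t>(sy) * sw];
        uint32_t* out = &dst[static_cast<std::size_t>(y) * dw];
        for (int x = 0; x < dw; ++x) {
            const int sx = static_cast<int>((2LL * x + 1) * sw / (2LL * dw));
            out[x] = row[sx];
        }
    }
}
```

A caller would alternate the field per frame, for example `scaleField(src, sw, sh, dst, dw, dh, frameIndex & 1)`. Per-frame scaling work is halved, at the cost of comb artifacts on fast motion.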

This topic is closed to new replies.
