jack_1313

A whole 1/2 frames per second!

Ok - here it is. I am using DirectDraw (only because D3D screws my computer) and I've been working with locking surfaces to perform alpha blending and a sort of back-buffer blending thing to make a nice smooth picture every frame (kind of like anti-aliasing). My problem is, I can't seem to get pixel plotting to work at an acceptable speed. Why? If I want to alpha-blit a 640x480 bitmap pixel by pixel, my framerate drops to about 1/2 fps. I have seen alpha blending of large bitmaps in DDraw before and it does work; I just can't seem to do it. Am I missing something?

Have you updated your graphics card driver to support the DX-version you are using?

____________________ ____ ___ __ _
Enselic's Corner - My site. Go test my game Spatra and see if you can beat it onto the Official Spatra Top 10. (source available)
CodeSampler.com - Great site with source for specific tasks in DirectX and OpenGL.

Guest Anonymous Poster
If you are locking and unlocking the surface more than once every frame, then this is your problem... Locking surfaces is incredibly slow, and I mean incredibly slow! You can try this for yourself: just create one surface, fill it with black, and flip it. Now lock it once every frame before you flip it and you'll notice how slow it goes; if you lock it twice before flipping, you will see the slowdown very clearly...

One more thing (and this has been the culprit for me in the past). MAKE SURE ALL OF YOUR BUFFERS (two sources and destination) ARE IN RAM, NOT VIDEO MEMORY. Otherwise you will kill your framerate right there, as reads/writes to video memory are much slower (writes are sometimes acceptable but reads are hell).

Things to check:
1) Locking once per frame. This can be once per blit if necessary. Also to note: on 3dfx and ATI cards you can get away with locking the whole screen sometimes dozens of times per frame and still sit up in hundreds of fps. On NVIDIA cards, in my experience, and for some reason that I don't understand, locking takes a MASSIVE hit... I had to reprogram a large section of a game once to lock just once per frame... and that actually killed a pile of features, but still better than eliminating half the audience!
2) Buffers in RAM. Do this with the flags when you create your surfaces.
3) Using pointer addition and not working out the co-ordinates of every pixel every step of the way. You should have a running pointer that you just increment for pixels (and lines). Don't call a general plot pixel (x, y) function for every sequential point...

Anyways hope that helps.
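
Point 3 can be sketched in portable C++ roughly as follows. This is an illustration under stated assumptions, not DirectDraw code: the `Surface` struct and the `blend_channel` and `blend_surface` names are hypothetical, pixels are assumed to be 32-bit XRGB, and `pitch` stands in for the byte pitch a locked surface reports (assumed here to be a multiple of 4).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a locked surface: a pixel pointer plus a pitch
// (bytes from one row start to the next), which may exceed width * 4.
struct Surface {
    std::uint32_t* pixels;
    int width;
    int height;
    int pitch;  // bytes per row, assumed to be a multiple of 4
};

// Blend one 8-bit channel: alpha = 0 keeps dst, alpha = 255 gives src.
inline std::uint32_t blend_channel(std::uint32_t s, std::uint32_t d, std::uint32_t alpha) {
    return (s * alpha + d * (255 - alpha) + 127) / 255;
}

// Blend src over dst with a constant alpha. The struct is passed by const
// reference, but the pixels dst points to are written. Row pointers advance
// by the pitch and pixel pointers by one, so there is no per-pixel
// (x, y) -> address multiplication.
void blend_surface(const Surface& src, const Surface& dst, std::uint32_t alpha) {
    assert(src.width == dst.width && src.height == dst.height);
    assert(src.pitch % 4 == 0 && dst.pitch % 4 == 0);
    assert(alpha <= 255);

    const std::uint32_t* srcRow = src.pixels;
    std::uint32_t* dstRow = dst.pixels;
    for (int y = 0; y < dst.height; ++y) {
        const std::uint32_t* s = srcRow;
        std::uint32_t* d = dstRow;
        for (int x = 0; x < dst.width; ++x, ++s, ++d) {
            const std::uint32_t r = blend_channel((*s >> 16) & 0xFF, (*d >> 16) & 0xFF, alpha);
            const std::uint32_t g = blend_channel((*s >> 8) & 0xFF, (*d >> 8) & 0xFF, alpha);
            const std::uint32_t b = blend_channel(*s & 0xFF, *d & 0xFF, alpha);
            *d = (r << 16) | (g << 8) | b;
        }
        srcRow += src.pitch / 4;  // step to the next row, skipping any padding
        dstRow += dst.pitch / 4;
    }
}
```

Note that the inner loop reads every destination pixel, which is why the destination should live in system memory for this kind of work.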

Yes, I only lock ONCE per step. But the backbuffer is in video memory. I noticed that problem before but decided that there wasn't anything I could do about it. So what can I do? I most definitely cannot store my surfaces in System memory - that would cause every blit to slow down and framerate would suffer like nothing on earth.
Here's what I was thinking:

Backbuffer in Video Memory.
Another surface in System Memory (or RAM, as you may call it).

Every loop:
Lock both these surfaces.
Copy the backbuffer onto the System Memory surface with a single memcpy.
Plot pixels to the System Memory surface.
Memcpy the System Memory surface onto the Video Memory surface.

Is that the way to go about it?
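
The copy steps in this plan can be sketched with `memcpy`. One detail worth noting: a locked surface's pitch (bytes from one row start to the next) can be larger than the visible row width, so a single call over the whole buffer is only safe when neither buffer has row padding. The sketch below is illustrative only; `copy_rows` and its parameters are hypothetical names, and plain byte buffers stand in for the locked surfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy `rows` rows of `rowBytes` bytes each between two buffers whose
// pitches may differ from each other and from rowBytes.
void copy_rows(std::uint8_t* dst, std::size_t dstPitch,
               const std::uint8_t* src, std::size_t srcPitch,
               std::size_t rowBytes, std::size_t rows) {
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);

    if (dstPitch == rowBytes && srcPitch == rowBytes) {
        // No padding in either buffer: the image is contiguous, one call suffices.
        std::memcpy(dst, src, rowBytes * rows);
        return;
    }

    // Otherwise copy row by row and leave the padding bytes untouched.
    for (std::size_t y = 0; y < rows; ++y) {
        std::memcpy(dst, src, rowBytes);
        dst += dstPitch;
        src += srcPitch;
    }
}
```

This only addresses the correctness of the copy. It does nothing about the cost of reading from video memory.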

I doubt that will help much. It is reading from video memory that is killing your speed. Writing to video memory is blazingly fast, reading blazingly slow. Just as a test, create a surface in system memory and do all your stuff on that surface, blit it to the backbuffer and flip it. Just try it, see if there is an improvement.



First make it work,
then make it fast.

--Brian Kernighan

"I’m happy to share what I can, because I’m in it for the love of programming. The Ferraris are just gravy, honest!" --John Carmack: Foreword to Graphics Programming Black Book

Nope, not gonna work... I know, I've come across the same barriers before in DirectDraw. In fact, with my current project...

The problem:
1) If you blit FROM video memory TO RAM, you kill your frame rate.
2) If you keep everything in RAM, including the back buffer, if you have anything more than a few blits, you kill your frame rate.
3) If you alpha blend using any surfaces in video memory you kill your frame rate.

Thus, you can only use alpha blending for specialized operations. For example, in my game I use alpha blending ONTO objects before they are blitted onto the back buffer. Very limited compared to being able to alpha blend anything, but it can still get some cool looking effects (like shields, etc). Just say goodbye to glows and nice anti-aliased sprites though.

Sorry... I just don't think that there's any way around this one... if someone comes up with one, I'd love to hear it!

The problem seems to be this - DirectDraw is a piece of crap. Locking surfaces and slow-slow-slow pixel plotting? What the hell is that? This is a 2D engine - and it is Microsoft, for God's sake. One of the primary features of a 2D thingy should be fast pixel manipulation, but instead all we get is fast bitmap blitting.
I guess this is why Microsoft decided to ditch DDraw.

No... it has to do with the focus of what you are doing.

If you just keep your buffers in RAM and do pixel manipulation it will be super-fast... as fast as your memory can go in fact! There's no way to do better than this.

The problem is that system memory and access is not optimized for the kind of buffer copying and operations that these things make heavy use of in blits. Thus by putting the memory locally on the video card and simply instructing it what to do (via a fixed-function pipeline... using Blt etc) you can get the video card to do what it's optimized to do best!

However, as soon as you want to deviate from the fixed-function nature (manually edit pixel data), you have to copy all the data across the AGP BUS, and you get the processor involved.

Just stick to these rules:
1) If you need lots of blitting at super-fast speeds, use video memory.
2) If you need to manually edit pixel data, use RAM.

The faults here do NOT lie with Microsoft, no matter how much you want to blame them. Go ahead... write your own API or find another one that does it faster.
- Locking surfaces is slow: VIDEO CARD. Like I said, ATI and 3dfx based ones are FAST with this, while NVIDIA ones are slow in my experience.
- Slow pixel plotting: only if you are trying to plot directly to video memory! Otherwise it's super-fast (as long as your code is fairly decent). You can plot millions of pixels per frame with even halfway decent code.

The solution to the problems that you describe is to go to a programmable pipeline in which you instruct the graphics card with CODE how it should manipulate its data => shaders!

PS: One further note. RAM->RAM blits and RAM->VRAM ARE fast... just not as blazingly fast as VRAM->VRAM. In fact, if you have good memory (I've tested on RDRAM... but I can assume Dual-DDR might be similar), you can get speeds of *almost* as fast as VRAM->VRAM. Eg, on my system with PC800 RDRAM (not even the fastest anymore):
VRAM->VRAM 640x480 blits: ~800fps
RAM->RAM 640x480 blits: ~700fps
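
Numbers like these depend heavily on the machine. A rough way to take a comparable RAM->RAM measurement is sketched below in portable C++. `measure_copy_fps` is a hypothetical helper, and plain `memcpy` stands in for a DirectDraw blit, so the results are not directly comparable to the figures above.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstring>
#include <vector>

// Time `frames` full-buffer copies from src to dst and return the average
// number of copies per second. An optimizing compiler may shorten repeated
// identical copies, so treat the result as a rough indication only.
double measure_copy_fps(const std::vector<std::uint8_t>& src,
                        std::vector<std::uint8_t>& dst, int frames) {
    assert(!src.empty() && src.size() == dst.size() && frames > 0);

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < frames; ++i) {
        std::memcpy(dst.data(), src.data(), src.size());
    }
    const auto stop = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(stop - start).count();
    // Guard against a zero reading on a very coarse clock.
    return seconds > 0.0 ? frames / seconds : 0.0;
}
```

For a 640x480 frame at an assumed 32 bits per pixel, both buffers would hold 640 * 480 * 4 bytes; running a few hundred frames gives a steadier average than a handful.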

[edited by - AndyTX on July 4, 2003 9:54:24 AM]

Jack_1313,
Put everything in system RAM, absolutely everything. I was down the same road a while back.
I thought that if I used HW blits in VRAM and then copied the back buffer to system RAM, I could do faster alpha blending. This is not the case. It is much faster to do everything in software, in system RAM, and then copy everything to a VRAM surface.

Either skip alpha blending, or skip Video Ram. There is nothing in between.

Well summarized. Anyways, this seems to be one of those things that everyone just figures out themselves... I wonder why documentation is so sparse on this particular one, both in the SDK and in books?

I wrote a VERY optimised alpha blending function in MMX asm, using the on-site tutorial(s) as a guide. And by making some safe assumptions, it got pretty fast.
But I thought that I'd still do all the non-translucent sprites and all tiles in VRAM, with surface->blt, so it would be blazing fast with hardware acceleration. And then I wrote a 64-bit mem copy function that had the screen size hardcoded and loops unrolled to match, just to copy the VRAM back buffer to system RAM. And I dare say you'd be hard pressed to write a faster memcopy for that specific purpose.
But it still wasn't as fast as I had expected, so I tried moving everything to system RAM, still using DirectDraw surface->blt, and it was way faster.
The best way is to do everything in system memory, and then copy everything TO a video RAM surface and flip.

I thought I'd post this just to back up my rather short previous post.
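
The MMX routine mentioned in this post is not shown in the thread. As a rough illustration of the same general idea (handling more than one colour channel per operation), here is a portable C++ sketch that blends the red and blue channels of a 32-bit XRGB pixel with a single multiply each for source and destination. The function name, the pixel format, and the 0..256 alpha range are assumptions made for this example, and it divides by 256 with a shift rather than by 255.

```cpp
#include <cassert>
#include <cstdint>

// Blend two 32-bit XRGB pixels. alpha runs from 0 (all dst) to 256 (all src).
// Red (bits 16-23) and blue (bits 0-7) are far enough apart that their
// products do not overlap, so they share one multiply. Green is done on its own.
inline std::uint32_t blend_packed(std::uint32_t src, std::uint32_t dst, std::uint32_t alpha) {
    assert(alpha <= 256);
    const std::uint32_t inv = 256 - alpha;

    const std::uint32_t rb =
        (((src & 0x00FF00FFu) * alpha + (dst & 0x00FF00FFu) * inv) >> 8) & 0x00FF00FFu;
    const std::uint32_t g =
        (((src & 0x0000FF00u) * alpha + (dst & 0x0000FF00u) * inv) >> 8) & 0x0000FF00u;

    return rb | g;  // the unused top byte is left as zero
}
```

Because the two weights always sum to 256, each channel's weighted sum stays below 65536, so nothing carries from blue into red and the whole expression fits in 32 bits.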

[edit]
Just thought I'd add something.
This solves the slow lock problem too: IF you write your own blitters, you lock the surfaces once at the start of rendering, keep them locked through all the blitting, and then unlock them before you copy to video RAM.
[/edit]

[edited by - Bad Maniac on July 4, 2003 11:28:09 PM]
