Graphics speed problems...

Started by
9 comments, last by JY 20 years, 8 months ago
I'm not sure whether this should be posted here or in DirectX, but I'm posting it here... I've been working on a software renderer for some time and have now decided to use Direct3D (v.8) for the full-screen stuff. I have successfully created my devices, surfaces etc., but I am getting mega slow frame rates just doing simple stuff. I know that LockRect can be slow, but I am simply locking the buffer and setting all the pixels to a colour (a simple x/y loop), and I'm only getting 20 frames per second!? Is it just that LockRect is slow, or am I setting something up wrong?

	while (GetMessage(&msg, NULL, 0, 0)) {
		if (!TranslateAccelerator(msg.hwnd, hAccelTable, &msg)) {
			TranslateMessage(&msg);
			DispatchMessage(&msg);
		}

		hr = g_pd3dSurface->LockRect(&lr, NULL, 0);
		DWORD* pdw = (DWORD *)lr.pBits;
		int pitch = lr.Pitch / 4;
		for (int x = 0; x < d3ddm.Width; x++) {
			for (int y = 0; y < d3ddm.Height; y++) {
				pdw[y * pitch + x] = col;
			}
		}

		hr = g_pd3dSurface->UnlockRect();
		hr = g_pd3dDevice->Present(NULL, NULL, NULL, NULL);
		frame++;
		col++;

		if (GetAsyncKeyState(VK_ESCAPE) & 0x8000) break;

		if (GetTickCount() - dwStart >= 1000) {
			Output("Frames", frame);
			frame = 0;
			dwStart = GetTickCount();
		}
		PostMessage(g_hWnd, WM_NULL, 0, 0);
	}
 
... This is my main loop. I can post the initialisation code as well if it helps, but I'm doing it all as per the docs. Is there a quicker way to get access to the memory?
"Absorb what is useful, reject what is useless, and add what is specifically your own." - Lee Jun Fan
First of all, your x loop should be inside the y loop; otherwise you're accessing the buffer in a very non-sequential manner. This may not give you a significant speedup, but it just hurts me to see this.
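
A minimal sketch of the loop order suggested above, using a plain `std::vector` as a stand-in for the locked surface (no Direct3D calls are made). The names `fillRowMajor`, `widthPixels`, `heightPixels` and `pitchPixels` are hypothetical; `pitchPixels` plays the role of `lr.Pitch / 4` from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fills a widthPixels x heightPixels region of a pitched 32-bit buffer.
// pitchPixels is the row stride in pixels and must be >= widthPixels.
void fillRowMajor(std::vector<std::uint32_t>& buffer,
                  std::size_t widthPixels, std::size_t heightPixels,
                  std::size_t pitchPixels, std::uint32_t colour)
{
    assert(pitchPixels >= widthPixels);
    assert(buffer.size() >= pitchPixels * heightPixels);

    for (std::size_t y = 0; y < heightPixels; ++y) {        // rows in the outer loop
        std::uint32_t* row = buffer.data() + y * pitchPixels;
        for (std::size_t x = 0; x < widthPixels; ++x) {     // consecutive addresses
            row[x] = colour;
        }
    }
}
```

With y in the outer loop, each inner pass writes one row from left to right, so memory is touched in increasing address order instead of jumping a full row between writes.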

Now, to find out whether the loop really is the problem, comment it out and see if you get any significant speedup. After that, remove the buffer lock/unlock and see what additional speedup you get. This should give you some idea as to where the problem is... I'm far from an expert on Direct3D, but chances are that the problem is in the buffer lock (it is strongly discouraged, from what I've heard). Hope this helps...
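
One way to put numbers on each stage, rather than relying only on the frames-per-second counter, is to time the stages separately. The helper below is a sketch under that assumption; `averageMilliseconds` is a hypothetical name, and it uses only the standard `<chrono>` clock, not any Direct3D facility.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `work` `iterations` times and returns the average wall-clock time
// per run, in milliseconds.
double averageMilliseconds(const std::function<void()>& work, int iterations)
{
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

Passing the fill loop alone, then the lock/unlock pair alone, should show which of the two dominates the frame time.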

Michael K.,
Co-designer and Graphics Programmer of "The Keepers"


We come in peace... surrender or die!
I had the same problem with DirectDraw. Here is how I solved it: I created a GDI DIB section and wrote directly to it. At buffer swap I blit it onto the back surface and then flip the chain. The result was a 200% speed increase. With the HAL drivers it's 150%.

"Tonight we strike, there is thunder in the sky, together we'll fight, some of us will die, but they'll always remember that we've made a stand and many will die by hand!" - ManOwaR
Reading from and writing to VRAM is very, very slow. Create a sysmem surface as the back buffer, render to that, then blit it to the screen, and you'll find an enormous speed gain.
HardDrop - hard link shell extension. "Tread softly because you tread on my dreams" - Yeats
I'm not sure that you are indexing the array correctly. You are not accounting for the width, which (ignoring pitch for now) should be done as follows:
pdw[y * d3ddm.Width + x]

Sorry to sound a bit harsh, but the use of pitch is also incorrect. lr.Pitch / 4 will result in: 0 <= Pitch <= 1 (which, as a side note, is declared as an int). This cannot be used in the indexing of the array. A better solution would be to use a BYTE* to lr.pBits, as in the following code:

(Note that incrementing a pointer should also be a little faster than using an array index)

hr = g_pd3dSurface->LockRect(&lr, NULL, 0);
BYTE* Dw = (BYTE*)lr.pBits;
for(int y = 0; y < d3ddm.Height; y++)
{
  for(int x = 0; x < d3ddm.Width; x++, Dw += lr.Pitch)
  {
    *(DWORD*)Dw = col;
  }
}




[edited by - Safely Anonymous on August 14, 2003 3:58:59 AM]
Name callers are idiots.
quote:Original post by Safely Anonymous
I'm not sure that you are indexing the array correctly. You are not accounting for the width, which (ignoring pitch for now) should be done as follows:
pdw[y * d3ddm.Width + x]

Sorry to sound a bit harsh, but the use of pitch is also incorrect. lr.Pitch / 4 will result in: 0 <= Pitch <= 1 (which, as a side note, is declared as an int). This cannot be used in the indexing of the array. A better solution would be to use a BYTE* to lr.pBits, as in the following code:

(Note that incrementing a pointer should also be a little faster than using an array index)


hr = g_pd3dSurface->LockRect(&lr, NULL, 0);
BYTE* Dw = (BYTE*)lr.pBits;

for(int y = 0; y < d3ddm.Height; y++)
{
  for(int x = 0; x < d3ddm.Width; x++, Dw += lr.Pitch)
  {
    *(DWORD*)Dw = col;
  }
}


Not 100% sure, but I think he actually is doing that right. IIRC, the pitch parameter of a buffer in Direct3D represents the width of the buffer plus its per-line padding, in bytes. Since he used DWords, he needs to divide this by four.

Michael K.,
Co-designer and Graphics Programmer of "The Keepers"


quote:Original post by technobot
The pitch parameter of a buffer in Direct3D represents the width of the buffer plus its per-line padding, in bytes.


Oops... Sorry, I mistook the parameter for the bit depth. Yes, his original code was right. That'll teach me to read the remarks section of the documentation.

However, incrementing a pointer instead of indexing the array would be slightly faster:

int pitch = lr.Pitch / 4;
int pad = pitch - d3ddm.Width;  // padding at the end of each row, in DWORDs
for(int y = 0; y < d3ddm.Height; y++, pdw += pad)
{
  for(int x = 0; x < d3ddm.Width; x++, pdw++)
  {
    *pdw = col;
  }
}


[edited by - Safely Anonymous on August 14, 2003 4:23:47 AM]
quote:Original post by DigitalDelusion
Reading from and writing to VRAM is very, very slow. Create a sysmem surface as the back buffer, render to that, then blit it to the screen, and you'll find an enormous speed gain.



I have also heard this, that writing to VRAM is slow, but at some point my own buffer has to be written to VRAM. What's the quickest way of doing this, if not what I am doing already?

Any more pointers would be gratefully received!




quote:Original post by Safely Anonymous
I'm not sure that you are indexing the array correctly. You are not accounting for the width, which (ignoring pitch for now) should be done as follows:
pdw[y * d3ddm.Width + x]



This is precisely what should not be done. One should always use the pitch when traversing a bitmap vertically because the pitch can easily be greater than the width of the bitmap. When this is the case, the bitmap will have staggered lines:

|<-           pitch           ->|
|<-        width        ->|
...........................::::::


x + width would not give you the pixel below x.
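
To make the point concrete, here is a small sketch of row addressing that uses the pitch in bytes, as discussed earlier in the thread. `rowStart` is a hypothetical helper; `bits` and `pitchBytes` stand in for `lr.pBits` and `lr.Pitch`, and a 32-bit pixel format is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns a pointer to the first pixel of row y in a 32-bit surface.
// The pitch is in bytes and may exceed width * 4 because of per-row padding,
// so the row offset is computed on a byte pointer before casting to pixels.
std::uint32_t* rowStart(void* bits, std::ptrdiff_t pitchBytes, std::size_t y)
{
    assert(bits != nullptr);
    auto* base = static_cast<std::uint8_t*>(bits);
    return reinterpret_cast<std::uint32_t*>(
        base + static_cast<std::ptrdiff_t>(y) * pitchBytes);
}
```

With a width of 6 pixels and a pitch of 32 bytes (8 pixels), row 1 starts at pixel index 8, whereas `y * width` would point at index 6, which lies in the padding of row 0.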


quote:I have also heard this, that writing to VRAM is slow, but at some point my own buffer has to be written to VRAM. What's the quickest way of doing this, if not what I am doing already?


As mentioned, the fastest way will be to bitblit your system memory surface onto the framebuffer.
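
For illustration only, the sketch below shows the CPU-side shape of such a copy: a tightly packed system-memory image is written into a pitched destination one whole row at a time. It is not the Direct3D blit itself, which the API would perform; `copyToPitched` and its parameters are hypothetical names, and a 32-bit pixel format is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies a tightly packed 32-bit image (width * height pixels) into a
// destination whose rows are dstPitchBytes apart.
void copyToPitched(const std::uint32_t* srcPixels,
                   std::size_t width, std::size_t height,
                   void* dstBits, std::size_t dstPitchBytes)
{
    const std::size_t rowBytes = width * sizeof(std::uint32_t);
    assert(srcPixels != nullptr && dstBits != nullptr);
    assert(dstPitchBytes >= rowBytes);

    auto* dst = static_cast<std::uint8_t*>(dstBits);
    for (std::size_t y = 0; y < height; ++y) {
        // One contiguous write per row; the padding bytes are left untouched.
        std::memcpy(dst + y * dstPitchBytes, srcPixels + y * width, rowBytes);
    }
}
```

The design choice is that all rendering reads and writes happen in ordinary system memory, and the destination is touched only once per frame with sequential, write-only row copies.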
"When you die, if you get a choice between going to regular heaven or pie heaven, choose pie heaven. It might be a trick, but if it's not, mmmmmmm, boy."
How to Ask Questions the Smart Way.
