It's known that lPitch may not be equal to the surface width, because DirectX aligns the width to a "good" number. But if I create a surface whose width is already good (512 or 256), can I be sure that lPitch will equal the width on any computer, or can DirectX set it to 511 or 513, 514...?

It MIGHT, but it is never safe to assume anything. Using lPitch isn't that hard: I just have a loop that increments the Y value and then processes the X (the scanline) with pointer incrementation. The pointer is recreated for each scanline.
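The loop described above might look roughly like the following. This is only a sketch: `fill16`, `base` and `pitchBytes` are hypothetical names, with `base` and `pitchBytes` standing in for the lpSurface and lPitch values read back from a locked 16-bit surface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fill a width x height region of 16-bit pixels with one color.
// 'base' and 'pitchBytes' are stand-ins for lpSurface and lPitch.
void fill16(void* base, long pitchBytes, int width, int height, std::uint16_t color)
{
    // The pitch must be able to hold at least one full row.
    assert(pitchBytes >= static_cast<long>(width * sizeof(std::uint16_t)));

    std::uint8_t* row = static_cast<std::uint8_t*>(base);
    for (int y = 0; y < height; ++y) {
        // Recreate the pixel pointer at the start of every scanline.
        std::uint16_t* p = reinterpret_cast<std::uint16_t*>(row);
        for (int x = 0; x < width; ++x)
            *p++ = color;
        row += pitchBytes;   // step by the pitch in bytes, not by the width
    }
}
```

Note that the row pointer advances in bytes while the pixel pointer advances in pixels, which is why the two are kept as separate types.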

Cybertron is right. Using lPitch isn't that hard. But if you must know, I don't believe that DirectX will make the lPitch something odd, like 511 or 513... there's no optimization in doing that... having it aligned at 256, 512, or 1024 has its benefits when it comes to blitting, i.e. DirectX can then use bit shifts to calculate the beginning of a line instead of straight multiplication... just my take on it, though...

I agree, but in some cases, when running time becomes very important, additional operations like setting up a loop and extra increments are not acceptable. If lPitch = Width and Width is known, then the surface is linear memory, and I can fill it in one loop (not two nested ones) or use 'rep stosw' in assembler to write data into it.
The real lPitch is calculated by DirectX, and this calculation is done by some algorithm which is unknown. If we knew how it decides what value lPitch must have, we could say exactly whether lPitch = Width always holds when Width = 2^N or not.
To my mind there is no reason to give lPitch any other value when Width = 2^N, but how it really is, I'm not sure.

I think I can see the benefit of what you're trying to do, if you were writing your own blitter. Personally, I think blitting is best left to the hardware.

However, my way of thinking is this: if the width == 2^N, then lPitch 'should' always = width (no benefit or reason for doing otherwise).

The only way to find out for sure is to try it, though. Try creating a random number of surfaces, with random widths. Then compare the widths to the lPitches. In the end, it doesn't matter how anyone 'thinks' DirectX or video hardware will behave. The only thing that matters is how it 'truly' behaves, under real-world circumstances, and the only way to find that out is to see firsthand. Also, I'm not sure if the video hardware is responsible for the 'lPitch factor' or if it's a DirectX thing, so behavior may change on different hardware/DirectX versions...

But ultimately, my opinion is to bite the bullet and assume that lpitch never equals width, and use lpitch accordingly. That's just me, though...

You read pitch from the surface description and use that to calculate offset to next scanline. Period.

It's not, as some people suggested, that it's "faster" for the blitter to blit if the scanline widths are powers of two. That's mostly irrelevant for performance.

What is more relevant: think of nVidia, they process multiple pixels at once. Think of architectures with tiling. If the framebuffer is accessed in blocks of, for example, 4x1, 16x8 or some other size at a time, it would take extra silicon to implement a special case where the framebuffer has an odd number of pixels in the last block on the right side of the buffer.

It's so much simpler to allocate whole blocks and just leave the few odd pixels unused/undefined. When resizing a window, also, the operation will be quicker when we don't have to reallocate for each pixel-size resize, but only physically change size when we cross a certain block size.. if we logically use 64x64 blocks, we need to resize only every 64 pixels. Users will still see smooth resizing taking place. Most of the time a window *does* just logically map a piece of physical video memory and no memory re-allocation per se is taking place, but that's still how it goes. Imagine a rendering buffer for DirectX or OpenGL where rendering takes place, and which is BLIT into the visible (or soon-to-be-visible) part of video memory. It's quite certain that the driver doesn't want to resize the memory all the time (or waste memory, though it can, if that fits).
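The "allocate whole blocks" idea can be sketched as rounding the width up to a multiple of the block width. This is an illustration only: `paddedWidth` and `blockWidth` are hypothetical names, and real hardware may use a different block size or layout altogether.

```cpp
#include <cassert>

// Round a width in pixels up to a whole number of blocks.
// blockWidth is a hypothetical tile width (e.g. 4, 16 or 64).
inline int paddedWidth(int width, int blockWidth)
{
    assert(width > 0 && blockWidth > 0);
    return (width + blockWidth - 1) / blockWidth * blockWidth;
}
```

With 64-pixel blocks, a 500-pixel-wide buffer would be allocated as 512 pixels and only grows again once the width passes 512, which is one way a pitch can end up larger than the visible width.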

Point being.. a nice power-of-two width for the framebuffer is hardly the primary reason. Heck, most devices use pitch == width*bpp anyway. It's mostly some odd Matrox and ATI cards where you might experience otherwise.

Then there used to be drivers where pitch was set to the width in visible pixels (not even multiplied by the pixel width in bytes), which gave a lot of gray hair to developers (who did care) back then. Usually applications which (incorrectly) calculated pitch manually got away with it and only Matrox folks experienced difficulty. This happened with VESA VBE as well.. too bad.. the specs are clear about it, but driver authors still make errors. But as application writers it's OUR responsibility to write CORRECT code, and inform the hardware manufacturer of their driver errors. This way things work out smoothly in the end.

So write correct code. Don't assume anything about pitch. Read the value back, and use it as it should be used. Thank you.

quote:

Point being.. a nice power-of-two width for the framebuffer is hardly the primary reason. Heck, most devices use pitch == width*bpp anyway. It's mostly some odd Matrox and ATI cards where you might experience otherwise.

Ummm, no. Going back quite a ways in DX versions, you could see in the SDK docs how pitch was calculated. The SDK also allowed you to allocate buffers yourself for use as surfaces, as long as you used padding to keep the alignment on specific boundaries (page or K boundaries - can't remember which). And, if a driver remains compliant, it will still use that same method, although drivers for newer versions of DX (8+) may try to optimize memory a bit by using smaller pitch values, which the driver will compensate for when you're trying to access it. Typically, you'll see this when dealing with depth buffer surfaces.

quote:

When resizing a window, also, the operation will be quicker when we don't have to reallocate for each pixel-size resize, but only physically change size when we cross a certain block size..

Any resize of the window means you have to reconstruct the buffer anyway to match, otherwise you get stretching, which uses the same resolution as when constructed. The only reason memory will change is when surfaces are paged in/out or moved to faster memory.

quote:

Then there used to be drivers where pitch was set to the width in visible pixels (not even multiplied by the pixel width in bytes), which gave a lot of gray hair to developers (who did care) back then.

More than likely, that was at the beginning of DX, when video card manufacturers were reluctant to get going. Still, very few drivers had problems like that.

quote:

So write correct code. Don't assume anything about pitch. Read the value back, and use it as it should be used. Thank you.

Correct. If you are using pitch, then you should always lock the surface and grab the current pitch (never assume it's fixed), as drivers can relocate surface memory at any time due to paging or any other reason. The speed hit is very small, as you have to lock the surface to ensure it's in memory anyway.

Of course I said *drivers*: they were free to set pitch to anything they thought appropriate - thus - the user shouldn't count on it being anything specific. Read it, use it, right?

I just gave plenty of reasons why pitch != width*bpp, and I still think that power-of-two dimensions are the least likely reason in practice to be of consideration.

For the client it's marginally useful.. if you know your trade, raster routines don't have to calculate the address of pixels too frequently anyway (*1). If you're doing a triangle filler, once per triangle is sufficient (for example for the topmost vertex). If it's a 2D sprite, same thing, one of the corners is sufficient, etc. etc.

So IMHO, the power-of-two doesn't weigh much on the scale, based on my experience.

(*1) Going right? ++address; Going down? address += pitch; As you can agree, this makes the power-of-two not much of a factor.
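As a rough illustration of the footnote, a sprite copy needs the full address calculation only once, for the top-left corner. The sketch below assumes an 8-bit destination and a tightly packed 8-bit source; `blit8` and its parameters are hypothetical names, not anything from DirectX.

```cpp
#include <cassert>
#include <cstdint>

// Copy a tightly packed w x h 8-bit sprite into an 8-bit destination
// buffer whose rows are dstPitch bytes apart.
void blit8(std::uint8_t* dst, long dstPitch, int dx, int dy,
           const std::uint8_t* src, int w, int h)
{
    assert(dstPitch >= dx + w);

    // The only multiplication: locate the top-left corner once.
    std::uint8_t* row = dst + dy * dstPitch + dx;
    for (int y = 0; y < h; ++y) {
        std::uint8_t* p = row;
        for (int x = 0; x < w; ++x)
            *p++ = *src++;      // going right: ++address
        row += dstPitch;        // going down: address += pitch
    }
}
```

Nothing in the inner loops depends on whether the pitch is a power of two, which is the point being made.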

Cool, Jim Adams posted a message! If you are still cruising around here, I'm reading your book Programming Role-Playing Games With DirectX and am loving it!

quote:
Original post by silentrob66
The only way to find out for sure is to try it, though. Try creating a random number of surfaces, with random widths. Then compare the widths to the lPitches. In the end, it doesn't matter how anyone 'thinks' DirectX or video hardware will behave. The only thing that matters is how it 'truly' behaves, under real-world circumstances, and the only way to find that out is to see firsthand.

Please don't do this. This is the sort of thing that leads to developers releasing products which crash on other people's machines, and in defense the developer responds with "Well, it worked on my machine". "It worked on my machine" only matters if that's the only place you're going to run it. If you're releasing it to the public, it isn't acceptable.

Stay Casual,

Ken
Drunken Hyena

quote:

Of course I said *drivers*: they were free to set pitch to anything they thought appropriate - thus - the user shouldn't count on it being anything specific. Read it, use it, right?

Wrong, drivers mirror the hardware, so a driver can't just pick anything - it has to be based on the hardware specs. If a memory card pages memory, it has to do so in big chunks, hence the use of page or K boundaries that make it easier on the hardware to move memory.

quote:

I just gave plenty of reasons why pitch != width*bpp, and I still think that power-of-two dimensions are the least likely reason in practice to be of consideration.

quote:

Heck, most devices use pitch == width*bpp anyway

You might want to decide on pitch!=width*bpp or not in your discussion - you're switching back and forth.

quote:

For the client it's marginally useful.. if you know your trade, raster routines don't have to calculate the address of pixels too frequently anyway (*1). If you're doing a triangle filler, once per triangle is sufficient (for example for the topmost vertex). If it's a 2D sprite, same thing, one of the corners is sufficient, etc. etc.

You only need to calculate pitch once per lock for any following operation, be it drawing bitmaps, polygons, etc. From there, all objects use the pitch to calculate one position to start from, so this statement is kind of a given - you don't have to 'know the trade' to understand this.

quote:

So IMHO, the power-of-two doesn't weigh much on the scale, based on my experience.

It's not power of two we're talking about, it's page or K boundaries, which are needed for memory paging, not blitting issues. See above about hardware.

quote:

(*1) Going right? ++address; Going down? address += pitch; As you can agree, this makes the power-of-two not much of a factor.

Again, this makes no difference - you're talking about rendering while the real issue is paging and memory management.

Even talking about blitting, memory works faster when accessed in larger chunks, typically DWORDs, so ++address is slow when using bytes or short values. Hence, the power of two rule of thumb.

quote:

The only way to find out for sure is to try it, though. Try creating a random number of surfaces, with random widths. Then compare the widths to the lPitches. In the end, it doesn't matter how anyone 'thinks' DirectX or video hardware will behave. The only thing that matters is how it 'truly' behaves, under real-world circumstances, and the only way to find that out is to see firsthand.

Doing this will most definitely demonstrate the paging process if you create enough surfaces. It also slows down the whole deal as DirectX will continuously page in surface memory when it runs out of working space.

*NEVER* assume that pitch == width*bpp. Some drivers might want to include extra information at the end of each scanline, e.g. a checksum or something.

If you absolutely must, then write two versions of the code:

  if(lPitch == nWidth*nBpp)
  {
     // Do whatever you want, use pointer++
  }
  else
  {
     // The safe route: use pointer += pitch
  }

This way you can accommodate both cases, although be sure to test both properly.
Using the pitch won't just make your programs more portable. What happens if a new graphics card comes out that does some really cool special effects, but it needs some data at the end of each scanline, and you decide to buy this card? All your code that doesn't use the pitch parameter won't work. You'll be tearing your hair out trying to figure out why it won't work.
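A concrete version of the two-path idea might look like this for an 8-bit surface. It is a sketch under stated assumptions: `fill8` is a hypothetical helper, `pitch` is the value read back from the locked surface, and the return value exists only so that both branches can be tested.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Fill an 8-bit surface with one byte value.
// Returns true if the single-pass fast path was taken.
bool fill8(std::uint8_t* base, long pitch, int width, int height, std::uint8_t color)
{
    assert(pitch >= width);
    if (pitch == width) {
        // Rows are contiguous: one linear fill covers the whole surface.
        std::memset(base, color, static_cast<std::size_t>(width) * height);
        return true;
    }
    // The safe route: fill each row, then step by the pitch.
    for (int y = 0; y < height; ++y, base += pitch)
        std::memset(base, color, static_cast<std::size_t>(width));
    return false;
}
```

The fast path is chosen at run time from the pitch actually reported, so nothing is assumed in advance about any particular driver.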

2p, Steve

/*
Wrong, drivers mirror the hardware, so a driver can't just pick anything - it has to be based on the hardware specs.
*/

Of course it's wrong if you think I meant that the pitch can be a random value - OF COURSE it needs to be a value that actually works with the hardware. Hello? I'm talking to TOO intelligent person because seems I have to explain everything from ground-up so that you won't misunderstand!

How can you say I am wrong to say that the driver can choose anything it finds appropriate? If it finds it appropriate to match the underlying hardware, how can this be "Wrong"?

Like I said before, I say it again, this time EMPHASIS on the real message: "user should calculate offset to next scanline with pitch instead of his own custom width*bpp calculation".

/*
If a memory card pages memory, it has to do so in big chunks, hence the use of page or K boundaries that make it easier on the hardware to move memory.
*/

This is also possible, that's the beauty of the DirectX interfaces. All that is left for everything to work out is users actually writing correct code and not promoting unsound practices on the driver side.

/*
> Heck, most devices use pitch == width*bpp anyway

You might want to decide on pitch!=width*bpp or not in your discussion - you're switching back and forth.
*/

I am not switching back and forth ; I have been VERY consistent in promoting the use of pitch *correctly*.

What you are quoting is just personal observation put out-of-context, if I used wording "Heck, ALL devices use ...", then your comment would be justified.

/*
You only need to calculate pitch once per lock for any following operation, be it drawing bitmaps, polygons, etc. From there, all objects use the pitch to calculate one position to start from, so this statement is kind of a given - you don't have to 'know the trade' to understand this.
*/

Sure you do. All beginners who DON'T "know the trade" seem to think writing raster routines means calls to a putpixel() function. This leads to the thought that it's faster to calculate the address with a bit shift than with a multiplication, for example:

void putpixel(int x, int y, char color)
{
    buffer[pitch*y+x] = color;
}

Of course the next step is that he reads a "tut" from the web, where someone explains that if the pitch is known to be 320, for example, he can do..

y*320 == (y<<8) + (y<<6)
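The identity holds because 320 = 256 + 64, which the following small sketch spells out (the function names are made up for illustration). It also shows why the trick is brittle: the shift amounts hard-code one specific pitch.

```cpp
#include <cassert>

// Offset of the start of row y when the pitch is known to be exactly 320.
inline int rowOffsetMul(int y)
{
    return y * 320;
}

// Same offset using shifts: 256*y + 64*y. Only valid for a pitch of 320.
inline int rowOffsetShift(int y)
{
    assert(y >= 0);   // shifting negative values is not portable
    return (y << 8) + (y << 6);
}
```

If the driver reports any other pitch, the shift version silently computes the wrong address, while the multiplication only needs the constant replaced by the value read from the surface.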

You surely are familiar with all this beginner nonsense, right? This is why it's logical to some people that a power-of-two pitch is somehow more "efficient" for the client.

Or even the extreme case where people falsely believe in a lookup table like this:

char* buffer = ylookup[y];
buffer[x] = color;

So I thought "going right? ++address" would make my point clear without too much extra explaining. I was assuming we are all on level ground here; not looking up; nor looking down on each other.

Because an efficiently written raster routine doesn't care about the pitch value, it's not relevant for writing efficient code, right? So there's nothing stopping you from writing correct (pitch-aware) code without any of the kludges I mention above and still coming out efficient.

/*
It's not power of two we're talking about, it's page or K boundaries, which are needed for memory paging, not blitting issues. See above about hardware.
*/

To be honest I haven't really encountered a page-boundary issue since the days of banked client access to video memory in VBE 1.2.

If the hardware implementation is such that banking requires the implementation to place scanlines in memory in such a way that it still looks like a linear aperture to the client, it only makes sense.

I won't argue against this since this is what I came to believe as well, long before I ever heard of "DirectX". ;-)

/*
Again, this makes no difference - you're talking about rendering while the real issue is paging and memory management.
*/

That's the real issue for the hardware/driver manufacturer. For the client it's very clear: Use The Pitch, Luke! Indeed; I am talking about rendering - that's what we clients do, and as such I wanted to demonstrate that the actual physical memory layout is completely irrelevant to me as long as they can expose it through the DirectX interfaces *correctly*.

For me it's completely irrelevant what the pitch is, as long as it works: I don't need it to be 512, 1024, 2048 or some other "convenient" value.

/*
Even talking about blitting, memory works faster when accessed in larger chunks, typically DWORDs, so ++address is slow when using bytes or short values. Hence, the power of two rule of thumb.
*/

That's *cough* not a power-of-two rule of thumb, but an alignment rule of thumb. An address aligned by, say, four doesn't need to be a power of two. Is 12 a power of two? No. Is it aligned by four? Yes.
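The distinction can be checked directly. The two helpers below are illustrative names only; they test the two properties separately using the usual bit tricks.

```cpp
#include <cassert>
#include <cstdint>

// True if n is a power of two (1, 2, 4, 8, ...).
inline bool isPowerOfTwo(std::uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

// True if n is a multiple of 'alignment'.
// The alignment itself must be a power of two for the mask to work.
inline bool isAligned(std::uint32_t n, std::uint32_t alignment)
{
    assert(isPowerOfTwo(alignment));
    return (n & (alignment - 1)) == 0;
}
```

Every power of two from 4 upward is 4-aligned, but the reverse does not hold: 12, 20 and 24 are all 4-aligned without being powers of two.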

I know about alignment, believe me, but the next pixel's address is still trivial to calculate, that's the point.

8bpp:
uint8* buffer = ...;
++buffer;

16bpp:
uint16* buffer = ...;
++buffer;

32bpp:
uint32* buffer = ...;
++buffer;

Of course we all know that if the pointer is to a 32-bit type, incrementing it actually adds sizeof(uint32) bytes to the address, right? That's the way C/C++ pointer arithmetic works, don't look at me like that. ;-)
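A small check of that claim, using the standard fixed-width types in place of the `uint8`/`uint16`/`uint32` typedefs above (`bytesPerIncrement` is a hypothetical helper written for this illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of bytes the address moves when a T* is incremented once.
template <typename T>
std::ptrdiff_t bytesPerIncrement()
{
    T pixels[2] = {};
    T* p = pixels;
    const char* before = reinterpret_cast<const char*>(p);
    ++p;   // advances by one element, i.e. sizeof(T) bytes
    const char* after = reinterpret_cast<const char*>(p);
    assert(after > before);
    return after - before;
}
```

The same `++buffer` therefore moves 1, 2 or 4 bytes depending on the pixel type, with no explicit multiplication in the source.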

I'm sure you also know that misaligned memory access is actually illegal on many platforms, and even on x86 for the aligned forms of later extensions like SSE. So aligning is not only beneficial for writing efficient code, but also MANDATORY unless it's a hardware exception you are aiming for.

/*
Doing this will most definitely demonstrate the paging process if you create enough surfaces. It also slows down the whole deal as DirectX will continuously page in surface memory when it runs out of working space.
*/

Can't disagree with that, it's true. But I disagree with your style of saying "Wrong" as reply to about anything I say - this misinterpretation skill of yours put me into explaining spree, must be fun to pull the strings like that. #¤#&¤#%""& >B)

EvilBill,

I think we have a communication problem here.. I was suggesting something more along these lines..

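int pitch = ...;
char* ybuffer = screen + sx + sy*pitch;   // top-left corner, calculated once

for ( int y=0; y<height; ++y )
{
    char* buffer = ybuffer;               // restart at the beginning of this scanline

    int count = width;                    // assumes width > 0
    do { *buffer++ = color; }             // going right: ++address
    while (--count);

    ybuffer += pitch;                     // going down: address += pitch
}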

See? Going to the next pixel horizontally or vertically is cheap, and we still take the pitch into account correctly.

Sorry if I confused anyone earlier, the above should be clear enough. I don't see any special need for two cases like you suggested. I still think that the pitch value is IRRELEVANT for an efficient and correct implementation of most raster operations you can think of.

Okay, regardless of the discussion, you do need to calm down. Anybody can read what you've been saying without you trying to talk down to somebody like they are stupid.

quote:

Hello? I'm talking to TOO intelligent person because seems I have to explain everything from ground-up so that you won't misunderstand!

??? You're talking to too intelligent person? I can't make out if you mean two intelligent people or something else.

quote:

Like I said before, I say it again, this time EMPHASIS on the real message: "user should calculate offset to next scanline with pitch instead of his own custom width*bpp calculation".

I could be wrong, but that's what everybody (including myself) has been saying since the first reply.

quote:

I am not switching back and forth ; I have been VERY consistent in promoting the use of pitch *correctly*.

What you are quoting is just personal observation put out-of-context, if I used wording "Heck, ALL devices use ...", then your comment would be justified.

quote:

That's *cough* not a power-of-two rule of thumb, but an alignment rule of thumb. An address aligned by, say, four doesn't need to be a power of two. Is 12 a power of two? No. Is it aligned by four? Yes.

I need to clear this up, as I'm mixing up my old and new. Surfaces and bitmaps are padded to 8 bytes. Textures should be power of 2. And so, no surface would be 14 bytes.

quote:

Can't disagree with that, it's true. But I disagree with your style of saying "Wrong" as reply to about anything I say - this misinterpretation skill of yours put me into explaining spree, must be fun to pull the strings like that. #¤#&¤#%""& >B)

I'm only pointing out those instances where I believe you are incorrect. Just so I can get this right from now on, what word should I say, or should I also resort to the 'you are too stupid, Luke' tactics as well?

As I've said, you just need to calm down - a lot of what we are saying is the same, but for some reason, you seem to be getting really upset when somebody tells you that you might be wrong. Also, drop the anony posts - if you can't put your name to anything, then some people might not take you seriously.

Just to point something out, we both agree on the pitch and using some pointers to access, but you are assuming newbies are going to do something strange. Let's drop the thing about calculating pitch, since there is no argument about it. In fact, I'll leave this open until tonight for any further valid replies that are on topic (bad Jimmy, went off topic )

I do say both, that's true:

(1) only pitch is correct
(2) width*bpp is pitch very often in practice

But the two are not mutually exclusive, and (2) is a subset of (1), so I'm not flipping back and forth.

Also, you said that my claim that the driver can choose the pitch is wrong. This I cannot accept as being wrong either. It's true that the driver must choose the pitch so that it is correct. There may even be only one possible pitch, but it's still the driver who chooses it: there is just very little choice for the driver to pick from.

You say I am upset. That is irrelevant. I am merely standing up to claims that some things I said were incorrect; I feel they were not incorrect. If I did feel that I *was* incorrect, I would thank you and know a bit more, but that is not the case this time. I would like to steer the discussion away from personal issues and have you respond to the two points I made above or forever hold your peace (I am not interested in personal analysis, you are not interested in personal analysis and other people likely even less).

That's all.

I was talking to Pzh. He was saying that if he knows that the next scanline is immediately after the end of the first line, he can do some assembler stuff like 'rep stosw'.

Steve

quote:

Also, you said that my claim that the driver can choose the pitch is wrong. This I cannot accept as being wrong either. It's true that the driver must choose the pitch so that it is correct. There may even be only one possible pitch, but it's still the driver who chooses it: there is just very little choice for the driver to pick from.

Please reread my replies. I said the driver mirrors the hardware, which in turn determines the pitch - I didn't say that drivers cannot choose the pitch. With DirectX, you have to deal with the drivers, there's no arguing that.

quote:

(1) only pitch is correct
(2) width*bpp is pitch very often in practise

(1) is nothing, nobody argued that you shouldn't use the pitch (reread the posts). If you want to discuss (2), I would like to see your proof. I can directly quote the DX SDK that tells you that surfaces should be padded to 4 bytes, hence not width*bpp.

quote:

lPitch
Distance, in bytes, to the start of next line. When used with the IDirectDrawSurface5::GetSurfaceDesc method, this is a return value. When used with the IDirectDrawSurface5::SetSurfaceDesc method, this is an input value that must be a DWORD multiple. See remarks for more information.

As for the power of 2, quote:

quote:

Storage Efficiency and Texture Compression
All texture compression formats are powers of 2. While this does not mean that a texture is necessarily square, it does mean that both X and Y are powers of 2. For example, if a texture is originally 512×128 bytes, then the next mipmapping would be 256×64 and so on, with each level decreasing by a power of 2. At lower levels, where the texture is filtered to 16×2 and 8×1, there will be wasted bits because the compression block is always a 4×4 block of texels. Unused portions of the block are padded. Although there are wasted bits at the lowest levels, the overall gain is still significant. The worst case is, in theory, a 2K×1 texture (2^0 power). Here, only a single row of pixels is encoded per block, while the rest of the block is unused.
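The waste described in the quoted passage can be worked out with a little arithmetic. The sketch below assumes only what the quote states, that every compression block covers 4×4 texels; the function names are hypothetical.

```cpp
#include <cassert>

// Number of 4x4 compression blocks needed to cover a width x height texture.
inline int blockCount(int width, int height)
{
    assert(width > 0 && height > 0);
    return ((width + 3) / 4) * ((height + 3) / 4);
}

// Texels stored in those blocks that lie outside the texture (padding).
inline int paddedTexels(int width, int height)
{
    return blockCount(width, height) * 16 - width * height;
}
```

For the 2K×1 case, 512 blocks hold 8192 texels, of which only 2048 are used, so three quarters of every block is padding. A 512×128 level wastes nothing.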

So the bottom line is that you can calculate pitch using the 4-byte padding (as per the SDK), but it's better to always use the pitch when locking the surface, as I've been saying all along.
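The 4-byte padding rule mentioned here amounts to rounding the row size in bytes up to the next DWORD multiple. The following is a sketch of that calculation only (`dwordAlignedPitch` is a hypothetical name); it gives the smallest pitch satisfying the rule, and a driver may still report something larger, so the value read back after locking remains the one to use.

```cpp
#include <cassert>

// Smallest DWORD-aligned pitch, in bytes, that can hold one row.
inline long dwordAlignedPitch(int width, int bytesPerPixel)
{
    assert(width > 0 && bytesPerPixel > 0);
    long rowBytes = static_cast<long>(width) * bytesPerPixel;
    return (rowBytes + 3) & ~3L;   // round up to a multiple of 4
}
```

For example, a 101-pixel row at 1, 2 and 3 bytes per pixel gives 104, 204 and 304 bytes, while any 32-bit row is already a DWORD multiple.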

Now, as I've promised, I'll be locking this thread. We've both had our say and it's now up to the readers to determine and research this further using what has been said.

Jim

