QNAN

HLSL (SM2/3): How do I transfer less-than-32 bit variables?

6 posts in this topic

I want to transfer an instancing vertex that, in addition to the matrix, carries four very small integers. Two of them are indexes into a texture map shared by several other objects, a texture map holding many different kinds of grass and flowers. I call each of these a "frame" and the texture a "frame map". The other two define the dimensions (x/y) of the frame map that the indexes point into.

From the dimensions and the indexes, the shader can calculate the offset into the frame map. I figured this should be cheaper to transfer than the float offsets themselves, provided I could use smaller data types.
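To make the offset calculation concrete, here is a minimal sketch of the arithmetic, written as plain C++ rather than shader code. The names `FrameUV` and `frameUV` are made up for illustration, and the sketch assumes zero-based frame indexes and a frame map divided into equally sized frames.

```cpp
// Hypothetical helper: names and layout are assumptions, not from the thread.
struct FrameUV {
    float offsetU, offsetV;  // top-left corner of the frame in [0,1) texture space
    float scaleU, scaleV;    // size of one frame in texture space
};

// indexX/indexY: zero-based frame position.
// framesX/framesY: number of frames along each axis (1..16).
constexpr FrameUV frameUV(int indexX, int indexY, int framesX, int framesY) {
    return FrameUV{
        static_cast<float>(indexX) / static_cast<float>(framesX),
        static_cast<float>(indexY) / static_cast<float>(framesY),
        1.0f / static_cast<float>(framesX),
        1.0f / static_cast<float>(framesY)};
}
```

A final texture coordinate would then be the frame offset plus the quad's local UV multiplied by the scale.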

 

The integers are so small that I can get away with only 4 bits each, since I allow the frame map a maximum of 16x16 frames. Combined, the index and dimension variables will occupy 4x4 = 16 bits.
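As a rough CPU-side sketch of that 16-bit layout (function names and bit order are assumptions for illustration): note that a dimension of 16 does not fit into 4 bits, which only hold 0..15, so this sketch stores the dimension minus one.

```cpp
#include <cstdint>

// Hypothetical helpers. Each field occupies 4 bits (0..15).
// Dimensions run from 1 to 16, so they are stored as (dimension - 1).
constexpr std::uint16_t packFrame(unsigned indexX, unsigned indexY,
                                  unsigned framesX, unsigned framesY) {
    return static_cast<std::uint16_t>(
        (indexX & 0xFu) |
        ((indexY & 0xFu) << 4) |
        (((framesX - 1u) & 0xFu) << 8) |
        (((framesY - 1u) & 0xFu) << 12));
}

// Matching unpack functions; the +1 restores the real dimension.
constexpr unsigned unpackIndexX(std::uint16_t p) { return p & 0xFu; }
constexpr unsigned unpackIndexY(std::uint16_t p) { return (p >> 4) & 0xFu; }
constexpr unsigned unpackFramesX(std::uint16_t p) { return ((p >> 8) & 0xFu) + 1u; }
constexpr unsigned unpackFramesY(std::uint16_t p) { return ((p >> 12) & 0xFu) + 1u; }
```

For example, frame (3, 7) in a 16x8 frame map packs to 0x7F73 with this layout.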

 

However, I cannot find a data type for the transfer (http://msdn.microsoft.com/en-us/library/windows/desktop/bb172533%28v=vs.85%29.aspx) that takes only 4 bits. In fact, I cannot find anything that takes less than 32 bits. Is that really so?

I could live with packing variables together and unpacking them on the other end, but if I am stuck with a minimum unit of 32 bits, then I'm not sure it is worth the price.

 

Is there a solution to this? Is there any way I can transfer variables of less than 32 bits?


You could go up to 32 bits exactly and transfer them as D3DDECLTYPE_UBYTE4 but this seriously does smell of premature optimization.  Try just transferring the full normal texcoords as floats anyway - you'll probably find that you're not really bound at this stage of the pipeline at all and that any attempt to reduce the data size doesn't make a blind bit of difference.


Bandwidth is usually a problem when rendering massive numbers of objects (which foliage can easily be), so I assumed that it would be here too, although I have not tested it yet.

If there is no elegant way to do it, I guess I will bump the variables (indexes/dimensions) to 8 bits each and use the UBYTE4 structure. I'm just a bit disappointed that no solution exists for transferring custom-sized pieces of data, as that can be a problem when transferring millions of data packets.
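A sketch of what the 8-bit variant might look like on the CPU side. The struct names and the full 4x4 float matrix are assumptions, since the real instance layout is not shown here. The four bytes are the kind of field a D3DDECLTYPE_UBYTE4 element describes.

```cpp
#include <cstdint>

// Hypothetical instance layout using four bytes for the frame data.
struct InstanceData {
    float world[16];        // 4x4 world matrix: 64 bytes (assumed)
    std::uint8_t frame[4];  // indexX, indexY, framesX, framesY: 4 bytes
};

// For comparison: the same data with the frame offset and scale as floats.
struct InstanceDataFloat {
    float world[16];  // 64 bytes
    float frameUV[4]; // offsetU, offsetV, scaleU, scaleV: 16 bytes
};

// Sizes hold on common platforms where float is 4 bytes.
static_assert(sizeof(InstanceData) == 68, "expected 68 bytes per instance");
static_assert(sizeof(InstanceDataFloat) == 80, "expected 80 bytes per instance");
```

Under these assumptions the byte version saves 12 bytes per instance, or 15% of the float layout.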

 

Even if this is premature optimization, I thought I would benefit from understanding the transfer to the card in detail. And I think it is an interesting problem.


I assume the context to be PC games. It might look different when you're running on a console.

 

On DX9-level hardware, UBYTE4 is indeed the best you can get. You can pack two 4-bit values into one of these 4 bytes and do some bit swizzling in the shader to separate them again, but you can't specify anything smaller than 32 bits. Thus this compression scheme is only useful if you can put the remaining 3 bytes to other use.
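Since SM2/3 shaders have no integer bitwise operators, the "swizzling" ends up being float arithmetic on the byte value, which a UBYTE4 element delivers to the shader as a whole number in 0..255. Below is a minimal C++ emulation of that math; the names are made up for illustration. In HLSL the same idea would presumably be written with `floor(value / 16)` for the high part.

```cpp
// Emulates the float-only math available to an SM2/3 shader.
struct Nibbles {
    float low;   // lower 4 bits as a float in 0..15
    float high;  // upper 4 bits as a float in 0..15
};

// value: one byte as the shader sees it, a whole number in 0..255.
// static_cast<int> truncates, which equals floor() for non-negative input.
constexpr Nibbles splitByte(float value) {
    return Nibbles{
        value - 16.0f * static_cast<float>(static_cast<int>(value / 16.0f)),
        static_cast<float>(static_cast<int>(value / 16.0f))};
}

// CPU side: combine two 4-bit values into the byte that gets uploaded.
constexpr unsigned char joinNibbles(unsigned low, unsigned high) {
    return static_cast<unsigned char>((low & 0xFu) | ((high & 0xFu) << 4));
}
```

For instance, joining 5 and 10 gives the byte 165, and splitting 165.0 yields 5 and 10 again.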

 

A warning from experience: it won't help you much.

 

- It can save a lot of GPU memory, but this is only useful if you're handling literally millions of instances. If you're doing just a few tens of thousands of instances, I wouldn't waste my time on it. 

- It can save quite a bit of memory bandwidth, but I never found this to be the limiting factor. The best I got from compressing my instancing vertex structure from 56 bytes down to 20 bytes was a 30% performance gain.

- It can save transfer bandwidth if you're updating the instancing data every frame. In that case I'd say it's worth the hassle, but I'd wager you have other problems then.

- It won't help you in any other case. 
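To put rough numbers on the memory point, here is a back-of-the-envelope sketch using the 56-byte and 20-byte structure sizes mentioned above. The helper name is made up, and the instance counts in the usage note are illustrative picks, not measurements.

```cpp
#include <cstdint>

// Hypothetical helper: total size of an instance buffer in bytes.
constexpr std::uint64_t bufferBytes(std::uint64_t instances,
                                    std::uint64_t bytesPerInstance) {
    return instances * bytesPerInstance;
}
```

With 4 million instances, 56 bytes each comes to 224,000,000 bytes versus 80,000,000 bytes at 20 bytes each. With 20,000 instances the difference is only 1,120,000 versus 400,000 bytes.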

 

A few months ago I wrote a voxel renderer that splatted millions of textured quads. I first tried to use instancing, but it was slow as hell. 4 million quads resulted in ~15fps on my GeForce GTX 460. When trying to find the bottleneck I noticed that all counters of NVPerfHUD together only accounted for 30% of the frame time, and 70% went to "somewhere". Then I tried Visual NSight, which was buggy as fuck but at least could show me the real cause: the Input Assembler. Then I removed all instancing and stored four unique vertex structures per quad, with a total of 80 bytes per quad, and I got to 55fps. That was for the very same geometry, with four times the GPU memory bandwidth. Something is happening on those modern cards that I can't explain. An ATI GPU showed the same behaviour.


It's also the case that for huge numbers of objects you're more likely to bottleneck on fillrate (and potentially overdraw, depending on the type of object) than on vertices.  This can be observed with particle systems and would be true of foliage too.


"I'm just a bit disappointed that no solution exists for transferring custom-sized pieces of data, as that can be a problem when transferring millions of data packets."

Well, graphics cards can't deal too well with data smaller than 32 bits, especially unaligned data, so what you'd save in memory bandwidth, you would lose in processing efficiency. Really, you should get everything working first, and then benchmark (if you still notice a slowdown once everything is in place).

Thanks for the excellent input, guys. Knowing that types smaller than 32 bits are impossible is a big plus: at least I don't have to bang my head against an unbreakable wall :). It was also nice to hear people's performance stories, as from them it sounds like I'm not gonna run into a bandwidth problem as the first thing.

Thanks for the feedback, guys.
