
Send more data to the GPU vs more operations


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

3 replies to this topic

#1 thecheeselover   Members   -  Reputation: 308


Posted 16 April 2012 - 05:32 PM

Hi there,

I noticed that I'm sending a lot of data to my GPU, and I was wondering whether sending that much data (into a vertex buffer) could be slow. For each vertex, I send 48 bytes (position, normal, texture coordinates and color). I also wanted to send one more byte to the GPU, but because HLSL doesn't support bytes, I have to send a short (2 bytes). I found a way to store my normal, my color and my new byte in one float (they all share the same variable), so that it would use only 4 bytes instead of 18. Is it worth doing bitwise operations on both the CPU and the GPU to compress and decompress my normal, my color and my new byte into one float, or is it better to send more data? Which choice is best for my game's performance?

Thank you,
Thecheeselover
Hide yo cheese! Hide yo wife!


#2 MJP   Moderators   -  Reputation: 10231


Posted 16 April 2012 - 11:22 PM

In general, with GPUs you want to prefer math over memory access. Dedicated GPUs usually have far more ALU capacity than they have bandwidth and memory hardware to feed it. Of course, in reality it depends on the hardware and the workload, so be careful not to jump to any conclusions without profiling.

However, my bigger concern would be precision. How are you going to store a normal, a color, and another byte in just 4 bytes? Typically you'll want at least 16 bits per component for normals.
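To put rough numbers on the precision concern, here is a minimal C++ sketch that quantises one normal component to a signed-normalised 16-bit and 8-bit integer and back. The helper names (`packSnorm16`, `unpackSnorm8`, and so on) are made up for illustration, and the scale-and-round rule is an assumption about how such a format is typically encoded, not something taken from this thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helpers: quantise a value in [-1, 1] to a signed integer
// by scaling to the largest positive value and rounding to nearest.
inline std::int16_t packSnorm16(float v) {
    assert(v >= -1.0f && v <= 1.0f);
    return static_cast<std::int16_t>(std::lround(v * 32767.0f));
}

inline float unpackSnorm16(std::int16_t q) {
    float f = static_cast<float>(q) / 32767.0f;
    return f < -1.0f ? -1.0f : f;  // -32768 is clamped to -1.0
}

inline std::int8_t packSnorm8(float v) {
    assert(v >= -1.0f && v <= 1.0f);
    return static_cast<std::int8_t>(std::lround(v * 127.0f));
}

inline float unpackSnorm8(std::int8_t q) {
    float f = static_cast<float>(q) / 127.0f;
    return f < -1.0f ? -1.0f : f;  // -128 is clamped to -1.0
}
```

With this scheme the worst-case round-trip error per component is about half a quantisation step: roughly 0.004 at 8 bits versus roughly 0.000015 at 16 bits. Whether the coarser step is visible depends on the lighting and the content, so it is worth checking on real assets.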

#3 Katie   Members   -  Reputation: 1282


Posted 16 April 2012 - 11:52 PM

In general, the memory transactions will happen in the background. They're also very fast, don't pollute the CPU cache, don't cause cache-line contention, and can happen while your 3D card is busy doing something else, and while the CPU is too.

The CPU is pretty fast at running tight loops, but there will still be tons of branches, cache misses, hyperbus communications and other assorted friction. In addition, it can't parallelise the work. Mapping more of the GPU memory, doing simple writes into it, and then leaving both devices to get on with other work while other (specialised and faster) parts of the hardware deal with moving the memory around is definitely the way to go on desktop systems.

If some of the data is constant or not updated often, consider using two buffers, one that changes frequently and one that changes infrequently, and use two accesses in the shaders to combine the data. This will reduce the amount of memory that needs to be moved between the devices across the memory buses[1]. The extra access is usually less of a problem in the shaders because the shader cores will each have memory controllers (often one for each buffer), and you're still processing the vertices in linear order, so you'll still get good cache read-ahead on them.



[1] An example would be a skinned model. The texture coordinates and colour at each vertex typically don't change; only the vertex position gets updated, so by using two buffers, the amount of memory transferred can easily be reduced by 2/3.
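The 2/3 figure in the footnote can be checked with a small C++ sketch of the two-buffer split. The struct names and the exact layout are assumptions chosen for illustration: a three-float position in the frequently updated buffer, and two-float texture coordinates plus a four-float colour in the buffer that is uploaded once.

```cpp
#include <cassert>
#include <cstddef>

// Assumed layout, for illustration only.
// Buffer 0: rewritten whenever the positions change.
struct DynamicVertex {
    float position[3];   // 12 bytes
};

// Buffer 1: uploaded once and then left alone.
struct StaticVertex {
    float texcoord[2];   //  8 bytes
    float colour[4];     // 16 bytes
};

static_assert(sizeof(DynamicVertex) == 12, "unexpected padding");
static_assert(sizeof(StaticVertex) == 24, "unexpected padding");

// Bytes re-sent per update if everything lives in one interleaved buffer.
constexpr std::size_t bytesPerUpdateInterleaved(std::size_t vertexCount) {
    return vertexCount * (sizeof(DynamicVertex) + sizeof(StaticVertex));
}

// Bytes re-sent per update if only the dynamic buffer is rewritten.
constexpr std::size_t bytesPerUpdateSplit(std::size_t vertexCount) {
    return vertexCount * sizeof(DynamicVertex);
}

// Fraction of the per-update traffic avoided by splitting.
inline double fractionSaved(std::size_t vertexCount) {
    assert(vertexCount > 0);
    return 1.0 - static_cast<double>(bytesPerUpdateSplit(vertexCount)) /
                 static_cast<double>(bytesPerUpdateInterleaved(vertexCount));
}
```

Under these assumed sizes, each update moves 12 bytes per vertex instead of 36, which is where the two-thirds reduction comes from. A different attribute mix would give a different ratio.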

#4 thecheeselover   Members   -  Reputation: 308


Posted 17 April 2012 - 04:09 PM

Thank you both for your answers. So, I will use more memory.

To MJP: Sorry, I meant 6 bytes ^^ My normal only takes 1 byte because, in my game, there are only 6 possible normals for this technique.
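For reference, a 6-byte layout along those lines might look like the C++ sketch below. Everything in it is an assumption made for illustration: the struct and function names, the 4-byte RGBA colour, and in particular the guess that the six normals are the axis-aligned directions in the order +X, -X, +Y, -Y, +Z, -Z. The decode step would live in the vertex shader in practice; it is written in C++ here only to show the logic.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical packed per-vertex attributes: 6 bytes in total.
struct PackedAttribs {
    std::uint8_t colour[4];    // RGBA, 8 bits per channel (assumed)
    std::uint8_t normalIndex;  // 0..5, selects one of six normals
    std::uint8_t extra;        // the additional per-vertex byte
};

static_assert(sizeof(PackedAttribs) == 6, "expected a tightly packed 6-byte struct");

// Expand the index into a unit normal.
// Assumed order: 0:+X, 1:-X, 2:+Y, 3:-Y, 4:+Z, 5:-Z.
inline std::array<float, 3> decodeNormal(std::uint8_t index) {
    assert(index < 6);
    std::array<float, 3> n = {0.0f, 0.0f, 0.0f};
    n[index / 2] = (index % 2 == 0) ? 1.0f : -1.0f;
    return n;
}
```

Because the normal is an exact table lookup rather than a quantised vector, the precision concern raised earlier does not apply to it in this case.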
Hide yo cheese! Hide yo wife!



