
Member Since 15 Aug 2009
Offline Last Active Mar 24 2015 06:40 PM

Posts I've Made

In Topic: Cross platform GPU computation for real time ray tracing rendering engine!

24 March 2015 - 06:37 PM

I'm sorry, but

11 replies, and 3 of them are your posts about OpenCL and SPIR-V, although that's really not relevant to the request for a cross-platform compute solution targeting Xbox and PS4.
I think people don't vote you down because they dislike you; it's rather because you're off topic. I agree the down-vote is sometimes mean, because you don't know why people do it, but the point is not to get used to it and keep posting. It's probably more to your advantage to work out why it happens.

I wish people were man/woman enough to always say why they voted down; random punishment does not help or lead to anything.

In Topic: Array of structs vs struct of arrays, and cache friendliness

13 March 2015 - 04:07 PM

You can also use hybrids, particularly for position data. For example the following layout is a cache friendly SoA:


xxxx yyyy zzzz xxxx yyyy zzzz xxxx yyyy zzzz xxxx yyyy zzzz


Assuming they're all 32-bit floats, the Y component of a vector is 16 bytes away from its X component and the Z component is 32 bytes away. This means that when you load the X component of a vector, you'll usually be loading its Y & Z in the same cache line (most x86 CPUs use 64-byte cache lines; some rare ARM devices use 32-byte cache lines though).


The approach doesn't scale well to wider SIMD (e.g. AVX-512) unless the standard cache line size increases as well (which, AFAIK, it doesn't); however, it's still an improvement over the original SoA, which will always need 3 lines per Vector3.

It's friendly to 4-wide SIMD registers, but not very cache friendly. A tuple of 4 vectors is 48 bytes in size (4 vectors x 3 floats x 4 bytes), which does not divide evenly into a 64-byte cache line, so sometimes the x, y and z components cross cache line boundaries, and in that case you could just as well use pure SoA.
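To make the arithmetic concrete, here is a rough sketch that only computes byte offsets; it doesn't measure anything. It assumes 64-byte cache lines, 32-bit floats, and blocks packed back to back starting at a line-aligned address. The names (`lines_touched`, `hybrid_block_lines`, `straddling_blocks`) are made up for this illustration.

```python
CACHE_LINE = 64                        # bytes; assumed, typical for x86
FLOAT_SIZE = 4                         # 32-bit float
LANES = 4                              # vectors per SIMD tuple (xxxx yyyy zzzz)
BLOCK_SIZE = 3 * LANES * FLOAT_SIZE    # 48 bytes per tuple of 4 vectors


def lines_touched(start, size, line=CACHE_LINE):
    """Return the set of cache-line indices covered by [start, start + size)."""
    first = start // line
    last = (start + size - 1) // line
    return set(range(first, last + 1))


def hybrid_block_lines(block_index, base=0):
    """Cache lines touched when loading one xxxx yyyy zzzz block."""
    return lines_touched(base + block_index * BLOCK_SIZE, BLOCK_SIZE)


def straddling_blocks(num_blocks, base=0):
    """Indices of blocks that span more than one cache line."""
    return [i for i in range(num_blocks)
            if len(hybrid_block_lines(i, base)) > 1]


if __name__ == "__main__":
    for i in range(4):
        print(i, sorted(hybrid_block_lines(i)))
    print("straddling:", straddling_blocks(4))
```

Under those assumptions the pattern repeats every 4 blocks (192 bytes, i.e. 3 cache lines), and 2 of every 4 blocks straddle a line boundary. Padding each block to 64 bytes would avoid that at the cost of a third more memory, which is a trade-off you'd have to measure.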

But I agree that using hybrids makes sense. The whole AoS vs SoA discussion is not a guide to how you have to do it. It should rather open your eyes to the fact that you can lay out data completely differently than a real-world logical view of the data would suggest. The way to start should not be "how do I lay out the data", but "how am I going to use this data", and then the "hybrid" comes into play, because you'll organize data in a way that makes sense. "Sense" doesn't strictly mean performance.

-If you have some complex structures and they're not critical to performance, then it makes sense to organize them in the way that is easiest for you to maintain; this way you will save time that you can spend on optimizing the critical parts.

-If you now have a piece of code that is critical, try to figure out what access pattern you will have. Try to figure out what range the data you use will have and what precision you need. E.g. if you do all your heavy math on colors, those might not need to be floats. You could keep them in memory as 8 bits per channel or as halfs. You effectively trim unneeded bits from your variables and thus become friendlier to the cache, to memory bandwidth, etc.
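As a small illustration of trimming bits, this sketch packs color channels from 32-bit floats into 8 bits per channel and back. It assumes the values are already in the range [0, 1] and that an error of about half a step (0.5/255) is acceptable; whether that holds depends on your data. The function names are hypothetical.

```python
from array import array


def quantize_unorm8(values):
    """Pack floats in [0, 1] into one unsigned byte per channel."""
    return array("B", (min(255, max(0, round(v * 255.0))) for v in values))


def dequantize_unorm8(packed):
    """Expand bytes back to floats in [0, 1] for the actual math."""
    return [b / 255.0 for b in packed]


if __name__ == "__main__":
    rgba = [0.0, 0.25, 0.5, 1.0]
    as_float = array("f", rgba)        # 4 bytes per channel
    as_bytes = quantize_unorm8(rgba)   # 1 byte per channel
    print(len(as_float.tobytes()), "bytes as float32")
    print(len(as_bytes.tobytes()), "bytes as 8 bit/channel")
    print(dequantize_unorm8(as_bytes))
```

The storage shrinks to a quarter, so four times as many colors fit in a cache line. The cost is the conversion on load and store, which is exactly the kind of trade-off that needs profiling.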


And most importantly, especially if you are a beginner, don't assume what is slow and what is fast. Implement a working solution and profile it; you will be surprised how often the things you thought would be slow are not the bottleneck, and how often parts of the binary are slow that you hadn't expected. As a next step, try to analyse why it is slow. Don't fall into a trap like "there is a division, divisions are slow"; it might be that the division first has to fetch data from memory and stalls waiting for it, which might take way more cycles than the division itself. The same goes the other way around: some fetches from random memory might be hidden by the CPU pipeline, so don't immediately assume that's the problem. Your compiler might also generate weird opcodes for innocent-looking code.
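A minimal sketch of "measure, don't assume": time a few candidate implementations and rank them by the best observed run. The two workloads (`with_division`, `with_multiplication`) are placeholder examples, not a claim about which one is faster on your machine; for native code you would use a real profiler instead of wall-clock timing.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time of several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def rank_candidates(candidates, repeats=5):
    """Return (name, seconds) pairs sorted from fastest to slowest."""
    timings = {name: best_time(fn, repeats) for name, fn in candidates.items()}
    return sorted(timings.items(), key=lambda item: item[1])


def with_division(n=100_000):
    return sum(i / 3.0 for i in range(n))


def with_multiplication(n=100_000):
    inv = 1.0 / 3.0
    return sum(i * inv for i in range(n))


if __name__ == "__main__":
    ranked = rank_candidates({
        "division": with_division,
        "multiplication": with_multiplication,
    })
    for name, seconds in ranked:
        print(f"{name}: {seconds * 1000:.3f} ms")
```

Taking the best of several runs reduces noise from other processes, but it still only tells you which variant is slower, not why; that's the next step.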

And don't hesitate to ask senior programmers; you will see that each of them will give you a different reason why your code is probably slow and a different solution for it. This is simple proof that profiling is the proper way to decide... that's also the case for AoS vs SoA vs hybrid solutions.

In Topic: spare time project IP

06 August 2014 - 03:46 AM

While you are certainly correct, that's not really the topic. I'd rather like to know how the law handles this in various countries.

I can Google that for my country, and it states that those parts of contracts are not valid, so you don't even have to bother complaining about them.
I thought gamedev.net has people from all over the world and someone could tell me about, or refer me to, their law.

In Topic: spare time project IP

05 August 2014 - 08:11 AM

Thanks Tom. Sadly there isn't much information, rather opinions. I'd really appreciate it if someone could share some knowledge or point to sources,
and also how it is handled in the UK, France, Germany or Sweden.

In Topic: What would it take to update an older game with better graphics/physics/parti...

02 April 2014 - 12:51 PM

Titanfall was done with the Half-Life engine, and that is based on the Quake engine. That's the scale you can achieve; it's just a matter of investment (time, money, know-how).

It also depends on which side you want to invest in. You could redo a lot of the art (most of it) and make it look way better. You could also redo all the tech. Adding vehicles etc. would be more of a challenge on the gameplay and network side.