
Krohm

Member Since 27 Aug 2002

#5126641 Physics and huge game worlds

Posted by Krohm on 26 January 2014 - 11:28 PM

Performance is bad because the algorithm cannot make any assumptions about the collision data. That's not the case for the other collision shapes, which have a fully parametric representation.

I am currently using hand-made proxy hulls for everything. This is pretty much standard practice by now, although the amount of automation involved is open to discussion.
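For illustration only, a minimal sketch of building a convex proxy hull in Bullet, assuming you already have the hand-authored proxy vertices (the function and variable names here are hypothetical):

// Build a convex proxy hull from hand-authored points instead of feeding
// the full render mesh to something like btBvhTriangleMeshShape.
#include <btBulletDynamicsCommon.h>
#include <vector>

btConvexHullShape* makeProxyHull(const std::vector<btVector3>& proxyPoints)
{
    btConvexHullShape* hull = new btConvexHullShape();
    for (const btVector3& p : proxyPoints)
        hull->addPoint(p);        // recalculates the local AABB as points are added
    hull->setMargin(0.04f);       // the usual small collision margin
    return hull;
}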




#5125789 Physics and huge game worlds

Posted by Krohm on 22 January 2014 - 11:12 PM

As an aside, Havok deals with that kind of problem with their concept of Islands.  I'm guessing Bullet doesn't have a similar built in structure? 

It does, but sadly it doesn't expose them at the library level.

 


Also, btBvhTriangleMeshShape takes some time to build the hierarchy, so to serialize or not to serialize?
Do not serialize it; better yet, do not use it at all. Its performance is terrible to start with, and I cannot understand why so many people go for it.


#5124087 I'm having doubt about my map loading method and its performance.

Posted by Krohm on 16 January 2014 - 02:17 AM


For starters, switch from text to binary. It will be much faster, since you don't have to go through hundreds of characters; you just read a piece of data and there you are.
I have to point out that considering binary files to just be "the data, right there" is a good way to get into trouble. Soon. TheUnnamable points out a first example: dealing with endianness. While a real problem, its relevance is overstated in my opinion.

What a binary file does guarantee is that each chunk of data has a known footprint, easily inferred from the current parser state. It does not guarantee that the value itself is coherent with the previous state. Input sanitization is still required - what you gain is a much, much more compact parser, which can ideally be 5 LOC for a pure data blob.
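As a rough sketch of what that compact-but-still-sanitized reading can look like (the chunk layout and the sanity limit below are made up purely for illustration):

#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical chunk: a 32-bit count followed by that many 32-bit tile indices.
bool readTileChunk(std::FILE* f, std::vector<uint32_t>& out)
{
    uint32_t count = 0;
    if (std::fread(&count, sizeof(count), 1, f) != 1) return false;
    if (count > (1u << 20)) return false;   // sanity limit: the footprint is known, the value still isn't trusted
    out.resize(count);
    if (count && std::fread(out.data(), sizeof(uint32_t), count, f) != count) return false;
    return true;
}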

 


With small modifications by me

Binary formats are not necessarily faster and they have a bazillion other drawbacks.

  1. Disk seek time will completely swamp any CPU processing time.
  2. Text files can in many cases compress better than binary equivalents and so (using a ZIP asset file or zlib to compress your source assets)...
  3. you can actually get faster loading time with text than binary.
  4. To be sure, measure, measure, measure.

I would like to know what those drawbacks are supposed to be as...

  1. that is not a binary-file problem. Seek time is always a problem, whether the file is binary or text. But with text you have the additional complexity of parsing, especially if the format is designed to be human-usable;
  2. compress better to a smaller file, or by a higher percentage? Information is information: both files store the same amount of it, with the data favoring one representation or the other depending on the values themselves;
  3. sure, you "can" in some circumstances. I don't recall it ever happening to me, however;
  4. please stop with that "to be sure..." thing. Time is not free. Either you think something is worth doing or you don't.

Now, back to the original problem.

Text files are suitable for small amounts of data, such as the level configuration of a tower defense game, or a set of tile indices for parts of the levels in a tile-based game.

If you can guarantee the syntax is simple, loading text has the inconvenience of variable-length tokens, but loading complexity will still stay low. If you care about performance, you might cheat by temporarily terminating each token with a null and removing it after processing. Or, more nicely, you might switch to a pointer-plus-length approach where string termination is not assumed (see the sketch below). This way, memory allocations go down and performance goes up.
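A minimal sketch of the pointer-plus-length idea, assuming the whole file is already in memory (the token struct and the splitter are illustrative, not taken from any particular library):

#include <cstddef>
#include <vector>

// A token is just a view into the loaded file buffer: no copy, no null terminator.
struct Token { const char* ptr; std::size_t len; };

std::vector<Token> splitOnWhitespace(const char* data, std::size_t size)
{
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < size) {
        // skip whitespace
        while (i < size && (data[i] == ' ' || data[i] == '\n' || data[i] == '\r' || data[i] == '\t')) ++i;
        std::size_t start = i;
        // consume the token
        while (i < size && data[i] != ' ' && data[i] != '\n' && data[i] != '\r' && data[i] != '\t') ++i;
        if (i > start) tokens.push_back({ data + start, i - start });
    }
    return tokens;
}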

If you need even more performance, it's probably time to drop text. An offline filter to binary can cook the data for you in awesome ways.

Also consider JSON.




#5121713 How efficient are current GPU schedulers?

Posted by Krohm on 06 January 2014 - 02:42 PM

I am very well aware of the latency-hiding strategies involving block read/write.

There's no heavy math in the memory-heavy section. I don't understand what you're saying.


while you're doing extra work ... the kernel actually runs faster while doing the heavy math at peak rates

I don't understand what kind of extra work you are referring to: changing the layout or decompressing/transforming the data appears to be something I'd have to do, not the HW.


if you wonder about some optimization you come up with, it's indeed best if you just try and profile it.

Are you reading my posts? We don't have easy access to either GCN or Kepler devices. I'm sorry to write this, but so far I haven't read anything I didn't already know, and I'm starting to suspect I'm unable to express my thoughts clearly.


#5121585 How efficient are current GPU schedulers?

Posted by Krohm on 06 January 2014 - 03:08 AM

While I've had a quick look at GPU releases in the last few years, since I've focused on development (as opposed to research) I haven't had the time to deeply inspect GPU performance patterns.

On this forum I see that a lot of people dealing with graphics are still optimizing heavily in terms of data packing and such. It seems very little has changed so far; then again, on this forum we care about a widespread installed base.

 

In an attempt to bring my knowledge up-to-date, I've joined a friend of mine in learning CL, and we're studying various publicly available kernels.

In particular, there's one having the following "shape":

[Attached image: MemoryUsage.png]

The kernel is extremely linear up to a certain point, where it starts using a lot of temporary registers. After a while, that massive amount of values is only read and then becomes irrelevant.

 

What I expect to happen is that the various kernel instances will

  1. Be instanced in a number that fits the execution clusters, according to the amount of memory consumed.
    What happens to the other ALUs in the same cluster?
  2. Happily churn along until memory starts to be used. At that point, they will starve one after the other due to the low arithmetic intensity.
  3. The scheduler will therefore swap the "threads" massively every time they starve on bandwidth.
  4. When a "thread" is near the final, compact phase, the swapping will presumably stop.

It is unclear to me if the compiler/scheduler is currently smart enough to figure out the kernel is in fact made of three phases with different performance behavior.

 

However, back when CL was not even in the works and GPGPU was the way to do this, the goal was to make the "threads" somehow regular. The whole point was that scheduling was very weak and the ALUs were supposed to work conceptually "in lockstep". This spurred discussions about the "pixel batch size" back in the day.

 

Now, I am wondering whether simplifying the scheduling could improve performance on modern architectures such as GCN (or Kepler).

The real bet I'm making with him is that the slowdown introduced by the increased communication (which is highly coherent) will be smaller than the benefit given by the improved execution flow.

 

Unfortunately, we don't have easy access to either GCN or Kepler systems, so all this is pure speculation. Do you think it still makes sense to think in those terms?

 

Edit: punctuation.




#5117266 Rendering with only 8 lights active

Posted by Krohm on 16 December 2013 - 02:04 AM

Personally, if I had a realistic test case in which I needed more than 100x the number of supported lights, I'd start considering other approaches, such as deferred shading.




#5116440 Snake Gone Wild

Posted by Krohm on 12 December 2013 - 02:01 AM

Well done, but seconding Jay, I'd just make everything bigger (say, a 40x40 map with each tile being twice as big).




#5116436 So, I want to make a game engine...

Posted by Krohm on 12 December 2013 - 01:30 AM

Having a "taste" of different languages is indeed an excellent idea.

I'd personally stay away from Java. It's just too verbose, and HTML5 can do many, perhaps even most, of the things Java excels at. The standard library is quite verbose and system integration is, in my opinion, still lacking.

However, no matter what you do, you will never be able to build an engine (in the sense of multi-game shared platform) without writing a few games first, possibly from different genres. Your resulting design would just not interact well with the gameplay constructs or the data flow involved.

So, the next step in your path to engine design is: write an engine in the sense of the logic for a single game.




#5115590 Bullet physics undefined behavior with a dummy simulation

Posted by Krohm on 09 December 2013 - 02:06 AM


Isn't it generally considered a bad idea to explicitly set an object's velocity in a physics engine? I've always read that you should apply an appropriate force instead.
I'd agree with that. Setting velocities directly has always had repercussions on my systems. Dynamic objects are supposed to be completely simulated by the library; if you want to change them, at least make sure you wake them up from sleeping. Try calling ->activate() - see the sketch below.
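For reference, a minimal sketch of what that looks like with Bullet (the body pointer and the impulse value are hypothetical; the usual caveat about setting velocities directly still applies):

#include <btBulletDynamicsCommon.h>

void nudgeBody(btRigidBody* body)
{
    // Wake the body so the solver doesn't ignore the change while it sleeps.
    body->activate();

    // Prefer applying a force/impulse over writing the velocity outright.
    body->applyCentralImpulse(btVector3(0.0f, 5.0f, 0.0f));

    // If you really must set the velocity, do it after activate():
    // body->setLinearVelocity(btVector3(0.0f, 5.0f, 0.0f));
}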


#5115584 [Terrain-RTS map] Advanced shadings for better detail!

Posted by Krohm on 09 December 2013 - 01:45 AM


And the result is not bad, but it still can't compare to current games (like StarCraft 2, Civilization V, the Shogun 2 campaign map...).
You will never be able to compare. To do so, you first need an artist, and you probably need a budget of at least 4 digits.

What I'm trying to say is that you don't make a game with technical feats alone. You need artwork, you need an aesthetic vision, a look and feel. Those concepts are not currently present in this thread, and that's terrible. It's not just a matter of doing HDR or bloom (odd to list it separately, it's part of HDR) or DX11 lighting (whatever that is supposed to be) or bump mapping.




#5113679 Forcing Code To Work !

Posted by Krohm on 02 December 2013 - 01:55 AM

Mph. That made me uncomfortable but I guess it's just PHP? :P




#5112650 Object interactions in a multithreaded game

Posted by Krohm on 28 November 2013 - 12:10 AM

Maybe it's just me, but I see something greatly overlooked in those two statements: goal and level of abstraction. Threading is for performance. A computation like this is not performance oriented; threading is also a low-level concept, while you're talking about an ultra-high-level gameplay concept. Do not mix those. The example above is a proof of concept, but I recommend that nobody try to pull a mountain out of it. Gameplay concepts like this could be scripted, and there's often no control at all over scripts. It's virtual function calls... overcharged.

 

FYI, a relationship like this in my system would take no time at all (running the scripts generally takes less than 2%)!

Now you'll be saying: "I'm only using those as examples". No. You're building a fictional problem.

 

It's slightly different for physics: you don't do physics yourself. And as a side note, you don't give up simulation determinism.

Let me stress this: you don't do physics yourself.




#5111778 system requirements

Posted by Krohm on 25 November 2013 - 02:10 AM

My suggestion for starters is: just don't worry about it. You probably won't even need half a GiB.

More involved answer.

RAM estimation depends on your system. For a start, you need to load all the textures and all the models at least once.

Maybe some models will be CPU-vertex-skinned. Then you need an extra model copy for each interpolated version of each animated model.

Or maybe you have particle systems: each will have its own particle buffer, with each particle being (for example) a float3 plus a float4, so for N particles you have a total buffer size of N*sizeof(float)*7. Maybe those live only in GPU memory.
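As a back-of-the-envelope sketch of that kind of estimate (the particle layout is just the example above, and the particle count is a hypothetical budget):

#include <cstddef>
#include <cstdio>

int main()
{
    // Example layout from above: float3 position + float4 color per particle.
    const std::size_t floatsPerParticle = 3 + 4;
    const std::size_t particleCount     = 100000;   // hypothetical emitter budget
    const std::size_t bytes = particleCount * sizeof(float) * floatsPerParticle;
    std::printf("particle buffer: %zu bytes (~%.1f MiB)\n",
                bytes, bytes / (1024.0 * 1024.0));
    return 0;
}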

And then you have physics. In my experience it's a fairly compact representation; I'm pretty sure it takes less than 256 KiB for me (including only the level-data representation).

 

In general, the requirements are set in the design document when the game is designed. It is not best practice to cut them later, but it still happens. Sometimes this can be done with no quality loss - for example, one might halve the resolution of a texture that is only ever seen from a distance.

 

As a first step, you might elaborate on your definitions of "tons of detail in textures and detailed environment". I see this got an "if", so it's only a supposition. You want to care about the dataset you're going to use right now, or... at worst... a year from now. Not the AAA assets from the game you'll be producing 10 years from now (if you're extremely lucky).




#5111776 When to play impact sounds based on physics collisions?

Posted by Krohm on 25 November 2013 - 02:00 AM


I tried playing it on every collision being reported from Bullet physics engine ... but that doesn't seem to be the way to do it.
And... what problem are you observing with that approach?

Keep in mind that Bullet does not report collisions. It reports collision manifolds, which are a different thing.
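A minimal sketch of walking the manifolds after a step, for illustration (the impulse threshold below is an arbitrary placeholder you'd tune, and the sound triggering itself is left out):

#include <btBulletDynamicsCommon.h>

// After stepSimulation(), inspect the contact manifolds rather than treating
// every reported pair as a fresh "collision" worth a sound.
void checkImpacts(btDynamicsWorld* world)
{
    btDispatcher* dispatcher = world->getDispatcher();
    const int numManifolds = dispatcher->getNumManifolds();
    for (int i = 0; i < numManifolds; ++i) {
        btPersistentManifold* manifold = dispatcher->getManifoldByIndexInternal(i);
        for (int j = 0; j < manifold->getNumContacts(); ++j) {
            const btManifoldPoint& pt = manifold->getContactPoint(j);
            if (pt.getAppliedImpulse() > 1.0f) {   // placeholder threshold
                // trigger / throttle an impact sound here
            }
        }
    }
}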




#5110122 What can you do with a map (strategy)

Posted by Krohm on 18 November 2013 - 02:52 AM

Both "Agricola" and "Stone age" are focused on gathering resources. They don't really have a map, as all slots are the same in terms of movement. I think collection of resources is very good concept if we want to not use combat and it sure allows to be expanded with movements.

 

Settlers of Catan features a procedurally built map. There's no real movement but rather "growth" from the cities you control. It is my understanding it has been praised for some mathematical properties I cannot fully appreciate.





