Member Since 14 Feb 2007
Offline Last Active Today, 07:42 PM

#5308114 Some HDR, tonemapping and bloom questions

Posted by on Today, 04:44 PM

1) Pretty much. Having a gamma-correct pipeline is also a prerequisite -- the HDR data is linear, whereas the post-tonemap data is sRGB.

2) You don't need to use average luminance when tonemapping. Tonemapping is going from HDR to "normal" images. In a digital camera, it goes from the real world of unbounded light intensity to 8bit sRGB JPEGs, using exposure (basically a simple multiplier), a gamma curve, and maybe a tonemapping curve if you're lucky.
If you use nothing but a multiplier (followed by the linear->sRGB gamma curve) we'd say that you're using a linear tonemapping function. The better tonemapping functions (Reinhard, Hable, Hejl, etc) are basically another curve that gets applied after exposure and before gamma.
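The chain above (exposure multiplier, then an optional tonemapping curve, then the linear->sRGB gamma curve) can be sketched like this. This is a minimal illustration, not any particular engine's code: the function names are made up, Reinhard is just one choice of curve, and pow(x, 1/2.2) is only a cheap approximation of the real sRGB transfer function.

```cpp
#include <cmath>

// Reinhard curve: maps [0, inf) into [0, 1). Applied after exposure, before gamma.
float Reinhard(float x)     { return x / (1.0f + x); }

// Cheap approximation of the linear -> sRGB gamma curve.
float LinearToSrgb(float x) { return std::pow(x, 1.0f / 2.2f); }

float Tonemap(float hdrLuminance, float exposure)
{
    float exposed = hdrLuminance * exposure; // exposure is basically a simple multiplier
    float curved  = Reinhard(exposed);       // skip this line for a "linear" tonemapper
    return LinearToSrgb(curved);
}
```

With the Reinhard line removed you get the "linear tonemapping function" described above: nothing but a multiplier followed by the gamma curve.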

If your camera is set to 'auto' mode, then yeah, it will sample the average scene luminance and use this to pick an exposure value.
However, if you set the camera to 'manual', you can arbitrarily pick an exposure value.
Some games give artists complete camera-like control, which means in some levels they might use manual exposure :)

3) Bloom represents the fact that the lens isn't perfect and will blur the image slightly, similar to lens flare (they'd be computed at the same time).
This means just running a subtle blur filter over your HDR data before tonemapping. The blur kernel should have a very high peak but a wide tail, so that bloom is only noticeable around very bright areas.
Before HDR, we used to do bloom by either having artists manually specify which surfaces were "bright", or by using a threshold - e.g. anything over 0.9 brightness gets blurred.

4) Profile and test the performance differences :)
It's a "parallel reduction" algorithm that maps well to both pixel and compute shaders.
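For reference, here's the serial equivalent of that reduction. Average luminance is commonly computed as a geometric mean (average of logs) so a few very bright pixels don't dominate; on the GPU the same sum is done as a parallel reduction (repeated downsampling in a pixel shader, or a tree reduction in compute). This is a sketch for clarity, with an illustrative epsilon to avoid log(0):

```cpp
#include <cmath>
#include <vector>

// Geometric-mean luminance: exp of the average of log(luminance).
float AverageLuminance(const std::vector<float>& luminances)
{
    const float tiny = 1e-4f; // avoids log(0) on pure black pixels
    double sumLogs = 0.0;
    for (float l : luminances)
        sumLogs += std::log(tiny + l);
    return std::exp(float(sumLogs / luminances.size()));
}
```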

5) As above, sometimes in the past we'd use "bloom mask" textures, but hopefully not any more :)
Great HDR these days goes hand in hand with a good shading model / BRDF, and good textures (e.g. PBR stuff).

#5308035 Threadpool with abortable jobs and then-function

Posted by on Today, 08:22 AM

1) Probably. But your example code is non-deterministic (output ordering and whether the Abort call does anything).
2) Of course; you're using boost :wink:
Instead of Handle having an abort member of type std::shared_ptr<std::atomic<bool>>, could it just be std::atomic<bool>? A shared_ptr says that you don't know what the lifetime of your abort flag is, which seems like overkill. Instead of AddJob returning a Handle by value (and done taking a Handle by value), AddJob could take a Handle by reference, done could take it by const reference, and Handle could become non-copyable.
3) Design-wise, why do you need to abort jobs in the first place? Typically in games, jobs are very small (sub-millisecond), so I guess this is for something more general, such as tools / non-game GUI stuff too?
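The suggestion in 2) could look roughly like this. All names here (Handle, ThreadPool, AddJob) are illustrative, not the poster's actual code, and the "pool" runs the job inline rather than on a worker thread just to keep the sketch short:

```cpp
#include <atomic>
#include <functional>

// Handle owns its abort flag directly (no shared_ptr) and is non-copyable.
// The caller is now responsible for keeping the Handle alive until the job completes.
struct Handle
{
    std::atomic<bool> abort{false};
    Handle() = default;
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
};

struct ThreadPool
{
    // Takes the Handle by reference; the job polls the flag and bails out early.
    void AddJob(Handle& handle, const std::function<void(const std::atomic<bool>&)>& job)
    {
        job(handle.abort); // a real pool would enqueue this on a worker thread
    }
};
```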

#5308025 Ph.D. C.S. Research Topics?

Posted by on Today, 07:38 AM

There are a lot of fields within games. Is there one that interests you the most? e.g. Graphics (as above), audio, AI, animation, multi-threading, scripting languages, build systems, source control and asset management, physics, user interfaces, user input, VR, etc.

#5307980 cascaded shadow map correction

Posted by on Today, 12:41 AM

* Try to have a smaller difference between cascade sizes.
* Use a larger filter on close cascades, and a smaller filter on far cascades.
* Blend between cascades at the boundary, instead of having a hard line where you switch from one to the next.
* Or dither between cascades at the boundary, as a cheaper alternative to blending.

#5307972 Style preferences in for loop structure

Posted by on Today, 12:11 AM

i --> 0
I've never seen anyone write it like this, and in fact, without my morning coffee it took me a few seconds to realize this is actually valid code, and extra effort to understand that it iterates over the same range. It's a neat trick, but I'm not sure it's worth confusing other people who may read the code.
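For anyone else missing their coffee: the "goes to" operator is just `i-- > 0` with unusual spacing (post-decrement, then compare the old value against zero), so it counts down over the same indices the usual loop counts up over. A small demonstration, with an illustrative helper name:

```cpp
#include <vector>

// Collects the indices visited by the while(i --> 0) form.
std::vector<int> Visited(int max)
{
    std::vector<int> order;
    int i = max;
    while (i --> 0)   // parses as: while (i-- > 0)
        order.push_back(i);
    return order;
}
```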


#5307965 trek game

Posted by on Yesterday, 11:23 PM

star trek game

Sure. It's a game dev forum after all.

Except that copyright infringement is frowned upon and the site itself can't condone it.

#5307963 Occlusion culling:Complement CHC++ in webgl

Posted by on Yesterday, 11:10 PM

IMHO, CHC++ is irreparably flawed, in that it requires low-latency (<1 frame) CPU->GPU->CPU communication. This will always result in CPU and GPU idle time, and can't really be fixed in general.


If you want to use a depth-buffer based dynamic occlusion system, then your options are to either do it on the CPU, e.g. https://github.com/GameTechDev/MaskedOcclusionCulling ... but these techniques require SIMD intrinsics to be fast, which aren't available in JavaScript.

Or do it on the GPU, moving all of the draw submission to the GPU as well using Draw-Indirect, which I don't think is available in WebGL.


There are a lot of other kinds of occlusion culling techniques to look into, though.

#5307948 Style preferences in for loop structure

Posted by on Yesterday, 09:05 PM

Forget the body, what about the first line? :wink:
for(int i = 0; i < MAX; i ++)
VS idiomatic iterator style:
for(int i = 0, end = MAX; i != end; ++i)

#5307842 Low level serialisation strategies

Posted by on Yesterday, 08:02 AM

I guess I find it hard to imagine a situation where I could [use POD] and still find the code convenient to work with. Some objects are intrinsically hierarchical, and some have varying length contents; trying to flatten all that just seems to move the complexity out of the serialisation and into everywhere else in the engine.
Sure, but it's easier said than done, especially when you have pointers and you're going for a memcpy/fwrite type of approach.

In the example I gave, I deliberately used hierarchy and variable-length data :)
The MyBlob struct is of varying size - List<Foo> is a u32 count, followed by that many Foo instances. Foo contains an Offset<Bar> (a u32 that acts like a pointer to a Bar) - a "has a"/"owns a" hierarchical relationship.
In my example, all the complexity is in my manually written C# serialization routine - the C++ data structures themselves and the algorithms that operate on them are extremely clean.
Sure, you can't put a mutable, complex, under-specified std::map into one of these structures, but we can put our own immutable map class into one just fine.

Re: the stringoffset stuff - what I understand of it looks much like some of the pointer serialisation stuff I've dealt with in the past, but the code is essentially back into writing out each field one by one, right? Which isn't a bad thing, just that I thought you were trying to avoid that. And the reading code would appear to have to do the same.

There is no reading code -- you declare those structures and then you use them without a deserialization step. If they're in memory, they're usable - just cast the pointer to their data to the right type and you're done. You can memory map an asset file and start using them immediately, without even having loaded them into RAM first - the OS would page fault and load them on demand in that case!
 [edit] See Offset, List, Array, StringOffset, Address in this file.

Yes, the serialization code in my C# tool writes out each field one by one, though that's because I don't have a serialization framework - it's just a plain old binary writer. I planned on writing one that would automatically go from C# structs to bytes in a file, just as the C++ "reading" side doesn't need to implement any deserialization code. This would mean that all I'd have to do is declare the same structure layout in my C++ engine and my C# tools and that would be that... but I've actually found that I like using the plain old binary writer so far, so it hasn't been a priority :D
i.e. If you're not a fan of my messy C# serialization code, it is a solvable problem.
The C# code that I posted will work across all our platforms, and does have "pointers" embedded in it (as offsets that are never deserialized). It is easy and done! We don't just do this to get better loading times, but also because it actually is good for our productivity.

#5307825 Low level serialisation strategies

Posted by on Yesterday, 05:56 AM

1. Because 99% of the objects in the engines I've worked with are not Plain Old Data
2. Because the data is padded or aligned differently on different platforms

You only use this for plain old data. If you're building an engine system and want to be able to load it from disc with a memcpy, then you're able to make the choice to design it as POD.
The thing about KISS, is that it's hindered by complex engineering, so you end up with a war in your code-base when you try to mix the brutally simple with the deceivingly complex :)
For 2, your build system needs to generate binaries for each platform - meaning you can't load your PS3 assets on a PS4. Your serializer can know some things about the platform it's generating for, such as its endianness and the padding rules of its compiler.
This is why we only use it for the kind of static assets that you ship on a disc, and not more portable/flexible/changing things, such as save-games or user-facing editable content.

I can appreciate the hypothetical speed benefits of this but given how error-prone they are, I wonder whether there is any real benefit.

Well it depends what you're comparing it to. There's a lot of games that peg a CPU core at 100% usage for 30 seconds when loading a level :D
We handle data structures of any complexity, and references between different assets (a model loads a material, which loads a texture, etc), and do next to no CPU work during loading -- it's mostly just waiting for the OS to map our data into the address space. For graphics assets, we support loading an asset as several "blobs" which can be allocated in different ways, which allows us to stream vertex/pixel data directly into GPU memory instead of, e.g., loading an image file into memory, deserializing it, creating the GPU resource, and then copying the data across.

Often we don't even fix up pointers on load. I've a template library containing pointer-as-relative-offset, pointer-as-absolute-offset, fixed array, fixed hash table, fixed string (with optional length prepended to the chars, and the hash appearing next to the offset to the chars), etc. Instead of doing a pass over the data to deserialize these, they're just left as is, and pointers are computed on the fly in operator->, etc. This can be a big win where you know a 2byte offset is enough, but a pointer would be 8bytes.
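A minimal sketch of such a pointer-as-relative-offset, assuming a layout where the target lives at a known byte distance from the offset itself. This is illustrative, not the author's actual template library:

```cpp
#include <cstdint>

// A "pointer" stored as a byte offset relative to this object's own address.
// No fix-up pass on load: the real pointer is computed on the fly in operator->.
// A 4-byte (or even 2-byte) offset can replace an 8-byte pointer on disc.
template<class T>
struct Offset
{
    uint32_t offset; // byte distance from this Offset object to the T

    const T* Get() const
    {
        return reinterpret_cast<const T*>(
            reinterpret_cast<const char*>(this) + offset);
    }
    const T* operator->() const { return Get(); }
    const T& operator* () const { return *Get(); }
};
```

Because the offset is relative, the blob containing both the Offset and its target can be memcpy'd or memory-mapped anywhere and still work unchanged.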

This sounds fascinating but I have no idea how that would work. Say you have this object:
class Xyz {
    std::string name;
    u32 value1;
};
What does the writing code look like? Do you write each field out individually, telling it that you want 'name' to be written as a fixed string? Because I'm encountering serialisation strategies that basically mandate removal of that string from the class, replacement with a string hash or some other POD type, then an fwrite (or equivalent).
(Or, worse, I've seen a system that expected to read objects back in a single chunk, but required each field to be written individually, so that it could mangle pointers on the way out. This kind of gives you the worst of both worlds. But hey, fast loads and you can still use pointers!)

If I'm using this deserialization technique, I'm almost certainly using it for some kind of immutable asset data. It's very rare to have mutable assets. That means that std::string is overkill. Though it's hard to think of a good use for std::string in any part of a game engine IMHO  :wink:

For an example system, I'd write some C++ code of the data that I need, and the algorithms I need to consume it:
struct Bar {
  u32 value1;
  u32 value2;
};
struct Foo {
  StringOffset name;
  Offset<Bar> bar;
};
struct MyBlob {
  List<Foo> foos;
};

void Test( const MyBlob& blob )
{
  for( uint i=0, end=blob.foos.count; i!=end; ++i )
  {
    const Foo& foo = blob.foos[i];
    const Bar& bar = *foo.bar;
    const char* name = foo.name;
    printf( "name %s, %d, %d", name, bar.value1, foo.bar->value2 );
  }
}
I could write a better serialization system for the C# tool / generator side, but it's all explicit ATM:
class FooBar {
  public string name;
  public int v1, v2;
}
List<FooBar> data = ...
using(var chunk0 = new MemoryStream())
using(var w = new BinaryWriter(chunk0))
{
  StringTable strings = new StringTable(StringTable.Encoding.Pascal, true);

  //struct MyBlob
  w.Write32(data.Count); // MyBlob.foos.count
  long[] tocBars = new long[data.Count];
  for( int i=0, end=data.Count; i!=end; ++i )
  {
    //struct Foo
    strings.WriteOffset(w, data[i].name); // Foo.name
    tocBars[i] = w.WriteTemp32();         // Foo.bar
  }
  for( int i=0, end=data.Count; i!=end; ++i )
  {
    w.OverwriteTemp32(tocBars[i], w.RelativeOffset(tocBars[i])); // Fix up Foo.bar
    //struct Bar
    w.Write32(data[i].v1); // Bar.value1
    w.Write32(data[i].v2); // Bar.value2
  }
}

#5307703 Low level serialisation strategies

Posted by on 24 August 2016 - 03:22 PM

I've my own engine for my Indie game, and I contract on another game engine at the moment. Both make use of this strategy extensively for data files that are compiled by the asset pipeline (so are automatically rebuilt if the format changes).

Often we don't even fix up pointers on load. I've a template library containing pointer-as-relative-offset, pointer-as-absolute-offset, fixed array, fixed hash table, fixed string (with optional length prepended to the chars, and the hash appearing next to the offset to the chars), etc.
Instead of doing a pass over the data to deserialize these, they're just left as is, and pointers are computed on the fly in operator->, etc. This can be a big win where you know a 2byte offset is enough, but a pointer would be 8bytes.

As above, I find that this KISS solution is often less stress and time than the more over-engineered solutions.
Also as above, I don't spend too much time debugging this stuff at all, and it either works fine or breaks spectacularly. Leaving a few unnecessary offsets to strings in the data can be useful if you do have to debug something.
I usually generate my data with just a C# BinaryWriter (and extension classes to make writing things like offsets and fixed-size primitives clearer), and use assertions when writing structures that the number of bytes written equals some hard-coded magic number. The C++ code also contains static assertions that sizeof(struct) equals a magic number. If you upgrade a structure and forget to update these assertions, the compiler reminds you very quickly.
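The C++ side of that size check is a one-liner; the struct and the magic number 8 here are illustrative:

```cpp
#include <cstdint>

// If anyone changes this layout without updating the writer (and this number),
// the build breaks immediately instead of producing corrupt data.
struct Bar
{
    uint32_t value1;
    uint32_t value2;
};
static_assert(sizeof(Bar) == 8, "Bar layout changed - update the C# writer too!");
```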

Save game files, user generated content, and online data tend to use more heavyweight serialisation systems/databases that can deal with version/schema changes, as these don't go through the asset compiler.

#5307609 Bound but unused vertex buffers affecting the input layout?

Posted by on 24 August 2016 - 07:35 AM

1. Nope, the Input Layout says which attributes to load, and which 'slots' to load them from. If a buffer happens to be bound to an unused slot, no big deal.

Note that this isn't true in other API's. In certain other API's, it's a big performance pitfall to accidentally have extra attributes bound to the pipeline like this :)

2. There's no need to unbind buffers, no. As a debugging tool I unbind buffers in debug builds, but not in shipping builds.

However... when you bind a resource to the device, the device holds a reference to that resource until it is unbound. If you release a buffer but it happens to still be bound to the infrequently used slot #15, then the memory won't actually be released until you unbind it. I generally call ID3D11DeviceContext::ClearState at the beginning of every frame to avoid long-term bindings like this.

#5307511 I have an Idea..........

Posted by on 23 August 2016 - 07:21 PM

The game is supposed to an open world game like skyrim My question is are there any easier engines that i can use to make this game
You could make it as a mod for Skyrim. 

#5307505 Horizon:zero Dawn Cloud System

Posted by on 23 August 2016 - 05:56 PM

@Hodgman you planning on sharing an application with your findings ?

Sure. The quick prototype that that picture is from is a shadertoy :D 
https://www.shadertoy.com/view/Mt33zj The code is terrible and terribly inefficient though!
I started by stealing this atmospheric scattering shader by valentingalea (which is based on this article), stealing some noise functions from iq, and adding a cloud layer based on what I could remember from reading this thread.

#5307362 Game architecture

Posted by on 23 August 2016 - 05:07 AM

Determine which procedures in which systems need to load media, and pass a reference to the media-loading system into those procedures.
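In other words, make the dependency explicit rather than global. A minimal sketch, with all names (MediaLoader, AudioSystem) illustrative:

```cpp
#include <string>

// The media-loading system. Only systems that actually load media get a
// reference to it; nothing reaches for a global.
struct MediaLoader
{
    std::string Load(const std::string& path) { return "loaded:" + path; }
};

struct AudioSystem
{
    MediaLoader& media; // explicit dependency, injected at construction
    explicit AudioSystem(MediaLoader& m) : media(m) {}

    std::string PlayMusic(const std::string& path) { return media.Load(path); }
};
```

This keeps it obvious from each system's interface whether it can load media, and makes the loader easy to swap out in tests or tools.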