Member Since 04 Apr 2007
Offline Last Active Oct 22 2016 12:10 PM

#5283150 Per Triangle Culling (GDC Frostbite)

Posted by on 24 March 2016 - 08:56 AM

By a lot unfortunately, and Nvidia's release this year doesn't seem likely to change support for async. Still, it's not going to be a loss generally, so it's not like you'd even have to disable it in an Nvidia-specific package.

Currently nVidia's hardware straight up cannot support async compute, at least in the sense most people think of the term. Cool guide here, and the tl;dr is that the nVidia compute queue implementation doesn't support resource barriers and as such cannot implement the current DX12/Vulkan spec.

#5283146 [Vulkan] Descriptor binding point confusion / Uniform buffer memory barriers...

Posted by on 24 March 2016 - 08:38 AM

Based on my understanding of the spec, it's actually because you're mapping into an image created with VK_IMAGE_TILING_OPTIMAL tiling, which does not necessarily store its texels in linear scanlines. The fix is to create a staging resource with identical format and VK_IMAGE_TILING_LINEAR tiling instead, and then copy from the LINEAR image to the OPTIMAL one by way of a vkCmdCopyImage.

#5283133 Object Space Lightning

Posted by on 24 March 2016 - 08:09 AM

Same as anything else, put it into the texture, sample for the 'real' draw.


EDIT: If you're having trouble visualizing what's going on, think of it like having a really fast artist who can whip up a color map (hesitate to say 'diffuse' here because there's going to be specular in it too, but I guess in the name of artistic fudgery here we'll go with it) with all the fancypants shading baked into it at 30Hz.


Vlj is actually incorrect here; each object is drawn a minimum of twice. The first time around, though, it's not *quite* going to look the way you're used to with traditional forward rendering: you use the UV coordinates for screen position instead of the actual, you know, position. The end result is that you're going to have a cool single color texture (probably HDR, too) that you can then just sample from and filter like a normal boring color map. This should be super duper familiar if you were around for Eugene d'Eon's seminal work on texture space diffusion, as has been mentioned. See here in section 16.4 if you want some helpful visualizations.

#5204220 Multiple Lights on game map with forward rendering

Posted by on 14 January 2015 - 09:38 AM

Another interesting approach used by UE3 was to composite all lights CPU-side into a set of spherical harmonics coefficients, then send these to the GPU for shading. This is an awesome tradeoff for mobile where the extra detail afforded by calculating things like falloff per-pixel is harder to see and the performance benefits are huge-- lighting time is completely independent of the number of lights influencing the object!
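
To make the idea concrete, here's a rough sketch of the CPU-side compositing step. This is not UE3's actual code; the function names, the choice of only two SH bands (4 coefficients) and the treatment of every light as directional are all assumptions made to keep it short.

```python
import math


def sh_basis(direction):
    """Real spherical harmonics basis for bands 0 and 1 (4 values)."""
    x, y, z = direction
    return (0.282095, 0.488603 * y, 0.488603 * z, 0.488603 * x)


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def composite_lights(lights):
    """CPU side: fold any number of lights into 4 RGB coefficients.

    lights is a list of (direction_to_light, (r, g, b)) pairs.
    Treating every light as directional is an assumption of this sketch.
    """
    coeffs = [[0.0, 0.0, 0.0] for _ in range(4)]
    for direction, color in lights:
        basis = sh_basis(normalize(direction))
        for i in range(4):
            for ch in range(3):
                coeffs[i][ch] += basis[i] * color[ch]
    return coeffs


# Cosine-lobe convolution weights: pi for band 0, 2*pi/3 for band 1.
COSINE_WEIGHTS = (math.pi,) + (2.0 * math.pi / 3.0,) * 3


def shade(coeffs, normal):
    """What the GPU would do per vertex/pixel: one fixed-cost evaluation."""
    basis = sh_basis(normalize(normal))
    return tuple(
        max(0.0, sum(COSINE_WEIGHTS[i] * basis[i] * coeffs[i][ch] for i in range(4)))
        for ch in range(3)
    )
```

However many lights go into composite_lights, shade always reads the same 4 coefficients, which is where the constant lighting cost comes from.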

#5126310 HDR Light Values

Posted by on 25 January 2014 - 07:32 AM

Generally, intensity values like that are unitless unless you specifically work out some scale for them. A fair number of archviz renderers do so in order for them to work nicely with measured IES light profiles. There isn't a *formal* standard with games/the DCC packages used to create their assets, though, as physically-based rendering is just starting to catch on these days.


Incidentally, you very much want to establish some sort of PBR framework so the values you feed into the shader(s) are used in a meaningful context, which should hopefully make sense when you think about it.


Re: quadratic attenuation-- that should again make some intuitive sense considering real-world light follows the inverse square law. You likely will see better results moving over to a simple area light model, though, as point lights are physically impossible. This would also give you a more sensible attenuation model 'for free.'
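
A tiny sketch of why the point light is the awkward one. The clamp-to-radius trick below is only a crude stand-in for a real area light (an assumption of this example, not a proper area-light integral), but it shows the 'sensible attenuation' part: the falloff stays inverse-square at range and stops blowing up as you approach the light.

```python
def point_light_falloff(distance):
    """Pure inverse square law: unbounded as distance approaches zero."""
    return 1.0 / (distance * distance)


def sphere_light_falloff(distance, radius):
    """Inverse square, but you can't get closer than the light's surface.

    Clamping to the radius is a simplification assumed for this sketch.
    """
    d = max(distance, radius)
    return 1.0 / (d * d)
```

Away from the light both functions agree; only the near-field behaviour changes.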


Lastly, tone mapping is pretty much entirely an artistic process: you fiddle with it until it subjectively 'looks good,' and that's that.

#5126307 Is it normal for 4x MSAA (with custom resolve) to cost ~3ms vs no MSAA?

Posted by on 25 January 2014 - 07:17 AM

If your platform offers the ability to configure graphics settings, sure. I think a lot of folks don't mind the *option* to burn GPU horsepower should their hardware afford them the opportunity. For lower-spec systems, post-process-based antialiasing systems could offer a reasonable, low-cost(!) alternative.

#5108707 Programming a "TV"

Posted by on 12 November 2013 - 09:28 AM

I would generally not use the term 'bump map' to describe it as that's a reference to an entirely different, very specific technique that just so happens to use a similar-looking texture as an input.


In simpler terms, you're using a distortion shader that adds or subtracts a value to the texture coordinate (distinct from the value sampled from a texture at that texture coordinate, though the actual offset would be taken from a second texture) depending on the location onscreen. tonemgub's solution is basically to paint a texture that contains offsets gradually pointing towards the middle of the screen, with decreasing magnitude as you move away from one of the corners. It would work, but is likely more computationally expensive than it could be.


Fortunately, folks have already worked out how to do most of this with math. If you're interested in a really detailed CRT monitor simulation, give this neat site/shader a gander.
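
For a feel of the 'with math' route, here's a minimal sketch that computes the texture coordinate offset analytically instead of reading it from a painted offset texture. It is not the linked shader; the simple quadratic curve and the strength parameter are assumptions for illustration.

```python
def distort_uv(u, v, strength=0.1):
    """Push a [0, 1] texture coordinate away from the screen center.

    The offset grows with squared distance from the center, so the middle
    of the screen is untouched and the corners move the most.
    """
    # Center the coordinates so (0, 0) is the middle of the screen.
    cx, cy = u - 0.5, v - 0.5
    r2 = cx * cx + cy * cy
    scale = 1.0 + strength * r2
    return 0.5 + cx * scale, 0.5 + cy * scale
```

In a real pixel shader you'd sample the screen texture at the returned coordinate and treat anything that lands outside [0, 1] as the black border of the tube.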

#5107729 Feature chat: Type/member annotations

Posted by on 07 November 2013 - 01:52 PM

While the compiler metadata feature has been around for some time, I've really been impressed by how powerful and elegant C#'s type/member annotation feature is. I feel we could cook up something pretty similar for AS using mostly pre-existing features (namely initializer lists and built-in arrays of interfaces) but for best results I really think this needs a solid, 'standards-compliant' official integration.




The actual creation/retrieval of metadata is pretty trivial. You'd probably want to have a basic (application-defined?) metadata interface stored in a built-in array. At a low level, the compiler could stow away anything in the square brackets, then parse the contents as an initializer list for your array. What is interesting is how the process would interact with bytecode precompilation, and specifically serialization. As of right now, there's no official AS object serialization framework, which means we can either A) develop something from scratch, B) create some sort of application interface (promising but obviously fragmenting/error-prone) or C) keep type metadata in plaintext, then parse it on bytecode deserialization.


As of right now, my vote goes for B; the differences between application interfaces mean that the loss of direct bytecode portability between AS implementations is essentially a non-issue. This also means that work/code could in theory be shared between the metadata engine and actual game functionalities like networking and/or game saves and thus be tailored to suit the needs of the specific use case(s).


Storing the metadata with the property/type is also going to be kind of interesting. I think keeping this as a script object makes the most sense, and I also think keeping the user data as a separate field is valuable. Therefore, I propose adding an extra void* member to store a pointer to the array. The pointer could then be cast as appropriate for use.


The type would be fixed, and specifically the built-in array type with the subtype set to be references to the application metadata type.


Example application metadata type:

interface Annotation {};

Example AS storage type:


or alternately


For convenience, I also propose adding additional methods to asIScriptEngine with the following signatures

asIObjectType* GetMetadataBaseType() const;
asIObjectType* GetMetadataContainerType() const;

It may also be valuable to have different types of annotations for functions, fields and types, but that's what I'd like to interact with the library community on.

#5104108 Post processing pipeline - pass count makes sense?

Posted by on 24 October 2013 - 08:16 AM


I just finished implementing a post-processing pipeline in my renderer. Its initial performance is better than I expected, but I still need to optimize it a bit.

One of the things that worries me is the number of passes (by pass I mean single full-screen-quad draw call, so single gaussian blur has 2 passes - BlurU and BlurV). This is how the pipeline looks:

- Screen Space Subsurface Scattering - 6 passes (3 gaussian blurs). I use stencil and depth test to avoid unnecessary pixel blurs.

- Bloom - 7 passes (1 bright pass filter, 3 gaussian blurs).

- HDR Tone Mapping - 2 passes - create luminance texture, GenerateMips for average, then tone map. I read MJP's post about using CS instead of GenerateMips, it's on my todo list.

- DOF - 3 passes (generate CoC map, 1 gaussian blur).

- Film Grain - 1 pass. This is the easiest one to remove, which I tried, but performance stayed the same.


So my starting point is 19 passes, most of them blurs so heavy on the TXS. I've tried removing some, but it affects the visual quality. What I'm trying to do is improve performance while preserving the visual quality and, if possible, the pipeline flexibility.


I have some ideas, mainly:

- Use CS for better sampling efficiency.

- Widen the blur kernels while reducing the number of blurs. This will reduce the total amount of TXS ops, but will probably reduce the visual quality.

- Merge passes. Not sure how that will work.



Any advice will be appreciated.


- No reason you can't use the bloom texture as input for average luminance. Sample the lowest MIP level, do a luminance calculation on that. It doesn't work out to the same thing mathematically, but you're generally not after that; the core 'darken the scene if it's really bright, lighten if the reverse' will still behave as normal.


- Merging passes generally requires extra work on your part. You can merge depth of field and motion blur calculations somewhat by using a skewed disk sampling pattern as demonstrated by LittleBigPlanet and (I think) Crysis 2. Tonemapping can trivially be slapped on the end of the pass immediately preceding it, as can film grain.
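
A rough per-pixel sketch of both suggestions, with plenty of assumptions: the Rec. 709 luminance weights, the 0.18 key value, the Reinhard curve and the uniform-noise grain are all stand-ins, not anything from the original pipeline. The point is only that exposure can come from one texel of the bloom chain, and that tone mapping and grain fit into a single function (read: single pass).

```python
import random

LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)  # Rec. 709 weights (assumed)


def luminance(rgb):
    return sum(w * c for w, c in zip(LUMA_WEIGHTS, rgb))


def exposure_from_bloom_mip(lowest_mip_texel, key=0.18):
    """Estimate exposure from the 1x1 mip of the bloom texture.

    Not the true scene average, but bright scenes still darken and
    dark scenes still lighten.
    """
    average = max(luminance(lowest_mip_texel), 1e-4)
    return key / average


def final_pass(hdr_rgb, exposure, grain_amount=0.02, rng=random):
    """Tone mapping and film grain merged into one pass."""
    noise = (rng.random() - 0.5) * grain_amount
    out = []
    for c in hdr_rgb:
        exposed = c * exposure
        mapped = exposed / (1.0 + exposed)  # Reinhard operator (assumed)
        out.append(min(1.0, max(0.0, mapped + noise)))
    return tuple(out)
```

Folded into the post-process chain, that would drop the standalone film grain pass and the separate luminance texture creation.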

#5091301 skeletal animation on peices of a model

Posted by on 03 September 2013 - 08:27 AM

Sounds like it could work pretty well since I'm making a PC game with deferred shading.


I haven't tried writing data to textures CPU side every frame yet.  It's not too expensive?  I have to upload a matrix array somehow anyway so I imagine there's some overhead either way.

Slightly more so than shader constants, but not by a wide margin. The real problem is the GPU stall associated with a CPU lock, but so long as you have extra work that the GPU can be chewing on while you tinker with the matrix buffer/texture there's nothing to be worried about. If you're concerned with PCI Express bandwidth, you can also use a more compact transform representation such as vector + quaternion instead of a full 3x4 matrix. You lose scaling, but there are plenty of engines out there that do without. It's also possible to pack a uniform scale into, say, the translation vector's w component (or into the quaternion, but reconstruction is more complex and it's less of a straight win.)
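
Here's a small sketch of what the vector + quaternion representation looks like on the receiving end, with a uniform scale riding along in the translation's w slot. The layout (8 floats per bone instead of 12) and the function names are assumptions for illustration; the vertex shader version would be the same math.

```python
def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def transform_point(point, translation_scale, rotation):
    """Apply scale, then rotation, then translation to a 3D point.

    translation_scale is (tx, ty, tz, uniform_scale) and rotation is a
    unit quaternion stored as (x, y, z, w).
    """
    tx, ty, tz, scale = translation_scale
    qx, qy, qz, qw = rotation
    qv = (qx, qy, qz)
    p = tuple(c * scale for c in point)
    # Rotate by a unit quaternion: v' = v + 2 * cross(q.xyz, cross(q.xyz, v) + w * v)
    inner = tuple(i + qw * c for i, c in zip(cross(qv, p), p))
    outer = cross(qv, inner)
    rotated = tuple(c + 2.0 * o for c, o in zip(p, outer))
    return (rotated[0] + tx, rotated[1] + ty, rotated[2] + tz)
```

Non-uniform scale and shear can't be represented this way, which is the 'you lose scaling' tradeoff.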

#5077839 Energy conservation of diffuse term

Posted by on 15 July 2013 - 06:38 AM

Just throwing this out there, I think the problem is more that most of the classical BRDFs used in rendering are designed to be 'one-shot' in the sense they don't offer a breakdown for diffuse and specular terms. When a theoretical graphics programmer is using said BRDFs as intended, there's no energy competition between the two terms, and our 'diffuse Fresnel' problem goes away. In fact, the reason we have these models more has to do with the rather crappy indirect lighting situation most games find themselves in-- we have to get a little more bang for our buck from existing point lights, etc. so we have a sort of multilayered BRDF approach that's designed to show detail not immediately in the specular area of influence.


EDIT: Yeah, I wrote this on reduced sleep and in a rush to get to work; I'm not sure where I was going with the whole GI/multilayer BRDF thing. See MJP's post below for a nice explanation of what I think I was originally going for.

#5065907 how do you apply a brush colour to an image professionally in an art program

Posted by on 29 May 2013 - 01:17 PM

You, uh, linearly interpolate between the two colors using the specified alpha?
layer1 + alpha * (layer2 - layer1)

EDIT: And, as you've probably seen in the FFP,
(layer1 * (1-alpha)) + (layer2 * alpha)
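
Spelled out per channel, as a minimal sketch. The convention here (layer1 is what's already on the canvas, layer2 is the brush color, alpha is the brush opacity) and the function names are my assumptions.

```python
def blend(layer1, layer2, alpha):
    """Linear interpolation: alpha = 0 keeps layer1, alpha = 1 gives layer2."""
    return tuple(a + alpha * (b - a) for a, b in zip(layer1, layer2))


def blend_weighted(layer1, layer2, alpha):
    """The same blend written as a weighted sum of the two layers."""
    return tuple(a * (1.0 - alpha) + b * alpha for a, b in zip(layer1, layer2))
```

Both functions give the same color up to floating-point rounding; the first form just saves a multiply.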

#5065305 Need some tips on improving water shader

Posted by on 27 May 2013 - 11:14 AM

Firstly, on the subject of bright water:


Backing up a bit, I think it's helpful to understand where that 'water color' comes from. Physically, you're doing a very simple (but workable) approximation to multiple scattering-- basically light is entering the water all over the surface, bouncing around a little bit *inside* the water volume, then leaving towards the viewer... among other directions. You can get all crazy and simulate this numerically, but it turns out just having a single, flat color does a pretty good job of capturing the core 'light enters somewhere, leaves elsewhere' effect.


Intuitively, you just need to control that simulated incoming amount of light. Since we're throwing correctness out the window already, I'd suggest doing the simple, controllable thing and just have a sort of 'day water color' and 'night water color' and then blend between those (preferably on the CPU!) before handing it off to the shader. In order to save artist work, it seems like you could analytically figure out what the difference between night light and day light is and then just apply that directly to the 'day' color.


Re: streaking--


That's a problem with your eye vector resulting from your water surface approximation (I'd bet good money you're just using a single flat plane + normal map for the water surface); it's entirely physically accurate. The issue arises because water (and surfaces in general) don't really change orientation without some intuitive change in 'position' (and hence eye vector) to go along. Ideally you'd do away with the normal map entirely and just have a nicely tessellated/displaced water volume, but if that's off the table then you may be able to fake a displacement in the eye vector by adding a small vector scaled by a height value in the normal map alpha or some other such unused channel.

#5061548 Why do most tutorials on making an API-agnostic renderer suggest that it'...

Posted by on 13 May 2013 - 10:49 AM

The answer is pretty simple, actually. Most people that write tutorials on the Internet, while certainly well-meaning, have fairly little idea what it is they're doing/talking about. Doubly so for how to explain it. I'll get back to this in a moment.

You're on the right track, though, good to see you've managed to pick the proper level of abstraction.

EDIT: From personal experience, having mesh/particle, light and camera primitives (while exposing some things like shader parameters) seems a good jumping-off point. This is generally flexible enough that you can create most any rendering effect with minimal overhead. I also suggest designing your system to be both very content-oriented and minimally retained; this puts the power in the hands of the artist(s) and also makes debugging/multithreading easier, as all information is usually on-hand.


I don't mean to pick on you, AgentC, but a rebuttal:


A few reasons I can think of:


1) it requires (at least superficially) the least amount of thinking / planning to just wrap the low-level API objects

2) by keeping the abstraction at low level you should, theoretically, be able to construct complex API-agnostic higher level rendering code, and not have to duplicate eg. the functionality of Mesh for different APIs

3) A tutorial stays more generally applicable if it doesn't impose its own higher-level constructs


It has potential to get messy, though. For an example here's a list, from my engine, of D3D / OpenGL differences that "leak" from the low level abstraction to the higher. It's not unmanageable, but not pretty either. https://code.google.com/p/urho3d/wiki/APIDifferences


1) So, in essence, you're making code harder to follow by splitting it up at a very fine level, adding runtime overhead by adding superfluous virtual function calls, etc. just so that you can be typing code into an IDE *right this instant*? That seems a *very* poor tradeoff. This seems to be the thought process behind a lot of tutorials, actually.

EDIT 3: This kind of thinking is also what gave us JavaScript, FWIW.


2) Maybe in theory. By your own admission, though, there are often fundamental differences in how the API works that render this 'abstraction' meaningless anyway-- you still need to add more of them at different levels, which will in turn make code harder to follow.


3) Isn't the point of a tutorial to demonstrate how to take the low-level API and map it to higher-level constructs *anyway*?

#5059776 DX11 InstanceDataStepRate

Posted by on 06 May 2013 - 11:35 AM

I actually came up with a nifty way of using this-- particle LOD. Instead of having one giant honking texture with multiple particles packed onto it (think a cloud of sparks), you can actually create a few (geometrically) distinct particles that are simmed/drawn as one unit. In the vertex shader, you can add some random offsets to each sub-particle and get something visually identical with less fillrate consumption.