Member Since 19 Jun 2007
Offline Last Active Jul 31 2015 05:35 AM

#5208861 Shoreline waves?

Posted by Reitano on 05 February 2015 - 06:19 AM

In my opinion, the simulation and rendering of water waves near the shoreline is one of the biggest challenges in real-time water rendering. As far as I know, no game engine or middleware on the market supports it. Offline renderers are quite capable of producing convincing results, as demonstrated by the many animated movies with water scenes in recent years (e.g. Surf's Up). Naturally they do not have the tight constraints that games have and can afford the cost of highly sophisticated physical models and continuous artistic refinement.

Games have traditionally employed cheap effects such as foam near the shoreline, modulated by the distance from it, as seen in Far Cry, Assassin's Creed, etc. Outerra uses procedural waves for the shoreline whose parameters (amplitude and phase) depend on the distance to the shoreline. That's a step in the right direction.

In general, the proper simulation of shallow-water waves, characterized by changes in propagation speed, direction and shape profile among other phenomena, would require a completely new technology and a significant investment in R&D that studios can hardly justify.

I have ideas of my own that I intend to prototype at some point in the future. In rough terms, I would calculate local wave solutions near the shoreline, characterized by functions that locally approximate the curvature of, and distance from, the shoreline. For LOD, waves would have a variable granularity based on the distance from a reference point (e.g. the main camera position). For rendering, the local wave solutions would be splatted onto offscreen accumulation buffers encoding the final wave displacement, gradients, foam level, variance, etc. Physics would be handled entirely on the CPU side.
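To make this a little more concrete, here is a very rough sketch of the data and per-frame flow I have in mind; every name and field below is hypothetical and only meant to illustrate the splatting idea, not working code from my engine.

// Hypothetical local wave primitive, attached to a stretch of shoreline
struct LocalWave
{
	vec2  origin;         // anchor point on (or near) the shoreline
	vec2  direction;      // propagation direction, roughly towards the shore
	float amplitude;
	float wavelength;
	float phase;
	float shoreDistance;  // distance from the shoreline; drives speed and shape profile
};

// Per frame, on the CPU:
// 1. Update the set of active LocalWave instances around the reference point,
//    with coarser granularity further away (LOD).
// 2. Evaluate each wave's displacement, gradients and foam contribution and
//    splat them into offscreen accumulation buffers that the water shader
//    later samples to displace and shade the surface.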

I'd be happy to go into more detail if there's enough interest, and also to share ideas about other approaches.

#5160489 Screenshot of your biggest success/ tech demo

Posted by Reitano on 14 June 2014 - 06:58 AM

Here are some screenshots from a demo of my Typhoon engine:


[Attached screenshots: Main 2014-06-14 13-38-44-67.jpg, 13-38-51-94.jpg, 13-39-00-81.jpg, 13-39-10-70.jpg, 13-39-43-44.jpg, 13-40-21-16.jpg, 13-41-44-98.jpg]

#5129274 Baking a Local Thickness Map

Posted by Reitano on 06 February 2014 - 04:37 AM

I did some research on this topic recently and apparently the latest version of xNormal supports baking translucency maps. I haven't tried it yet and I'd be curious to see what results you achieve in your implementation.

#5124417 Shading in Unreal Engine 4

Posted by Reitano on 17 January 2014 - 09:37 AM

As promised here's some code. I'm still not sure about some details (in particular those related to importance sampling) and it'd be great if other people could spot errors and add more variants to the code, like different distribution functions.

// Forward declaration: defined further down
void CalculateFGCoeff(float NoV, float roughness, uint numSamples, float& a, float& b);

void GenerateFGLookupTexture(Graphics::Texture* texture, RenderingDevice& device)
{
	Graphics::TextureDesc textureDesc;
	device.GetDesc(texture, textureDesc);

	assert(textureDesc.format == Graphics::Format::G16R16);
	assert(textureDesc.levels == 1);

	const float deltaNdotV = 1.f / static_cast<float>(textureDesc.width);
	const float deltaRoughness = 1.f / static_cast<float>(textureDesc.height);

	// Lock texture
	const Graphics::MappedData mappedData = device.Map(texture, 0, Graphics::MapType::Write, 0);
	uint8* const RESTRICT dataPtr = reinterpret_cast<uint8*>(mappedData.data);

	const uint numSamples = 512;
	float roughness = 0.f;
	for (uint v = 0; v < textureDesc.height; v++)
	{
		float NdotV = 0.f;
		uint32* dst = reinterpret_cast<uint32*>(dataPtr + v * mappedData.rowPitch);
		for (uint u = 0; u < textureDesc.width; u++, ++dst)
		{
			float a = 0;
			float b = 0;
			CalculateFGCoeff(NdotV, roughness, numSamples, a, b);
			assert(a <= 1.f);
			assert(b <= 1.f);
			*dst = (PACKINTOSHORT_0TO1(b) << 16) | PACKINTOSHORT_0TO1(a);

			NdotV += deltaNdotV;
		}
		roughness += deltaRoughness;
	}

	device.Unmap(texture, 0);
}

// Maps sample index k in [0, n) to a point of the Hammersley sequence in [0,1)^2
void PlaneHammersley(float& x, float& y, int k, int n)
{
	float u = 0;
	float p = 0.5f;
	// FIXME Optimize by removing conditional
	for (int kk = k; kk; p *= 0.5f, kk /= 2)
	{
		if (kk & 1)
			u += p;
	}
	x = u;
	y = (k + 0.5f) / n;
}

vec3 ImportanceSampleGGX(float x, float y, float a4)
{
	const float PI = 3.1415926535897932384626433832795028841971693993751f;
	// Convert uniform random variables x, y to a sample direction
	const float phi = 2 * PI * x;
	const float cosTheta = std::sqrt( (1 - y) / ( 1 + (a4 - 1) * y) );
	const float sinTheta = std::sqrt(1 - cosTheta * cosTheta);
	// Convert direction to cartesian coordinates
	const vec3 H(sinTheta * std::cos(phi), sinTheta * std::sin(phi), cosTheta);

	//D = a2 / (PI * std::pow(cosTheta * (a2 - 1) + 1, 2));
	return H;
}

// Reference: GPU Gems 3 - GPU-Based Importance Sampling
vec3 ImportanceSampleBlinn(float x, float y, float specularPower)
{
	const float PI = 3.1415926535897932384626433832795028841971693993751f;
	// Convert uniform random variables x, y to a sample direction
	const float phi = 2 * PI * x;
	const float cosTheta = std::pow(y, 1.f / (specularPower + 1));
	const float sinTheta = std::sqrt(1 - cosTheta * cosTheta);
	// Convert direction to cartesian coordinates
	const vec3 H(sinTheta * std::cos(phi), sinTheta * std::sin(phi), cosTheta);

	//D = (specularPower + 2) / (2 * PI) * std::pow(cosTheta, specularPower);
	return H;
}

float G_Schlick(float k, float NdotV, float NdotL)
{
	return (NdotV * NdotL) / ( (NdotV * (1 - k) + k) * (NdotL * (1 - k) + k) );
}

// Select the distribution and geometry term used for the pre-integration
#define GGX 0
#define BLINN 1
#define G_SCHLICK 0

void CalculateFGCoeff(float NoV, float roughness, uint numSamples, float& a, float& b)
{
	// Work in a coordinate system where normal = vec3(0,0,1)

	// Build view vector
	const vec3 V(std::sqrt(1.0f - NoV * NoV), 0, NoV);

#if BLINN
	const float blinnSpecularPower = std::pow(2.f, 13.f * roughness);
#elif GGX
	const float GGX_a4 = std::pow(roughness, 4);
#endif
	const float G_k = std::pow(roughness + 1, 2.f) / 8.f;

	a = 0;
	b = 0;
	for (uint i = 0; i < numSamples; ++i)
	{
		float x, y;
		PlaneHammersley(x, y, i, numSamples);

		// Microfacet specular model:
		// f = D*G*F / (4*NoL*NoV)
		// Vis = G / (NoL*NoV)
		// Importance-based sampling: accumulate f / pdf

		// Calculate random half vector based on roughness
#if BLINN
		const vec3 H = ImportanceSampleBlinn(x, y, blinnSpecularPower);
		// D and pdfH cancel each other so just set both to 1
		const float D = 1;
		const float pdfH = D;
#elif GGX
		const vec3 H = ImportanceSampleGGX(x, y, GGX_a4);
		// D and pdfH cancel each other so just set both to 1
		const float D = 1;
		const float pdfH = D;
#endif

		// Calculate light direction
		const vec3 L = 2 * dot( V, H ) * H - V;

		const float NoL = saturate( L.z );
		const float VoH = saturate( dot( V, H ) );
		const float NoH = saturate( H.z );
		const float LoH = saturate( dot( L, H ) ); // = VoH

		// Convert pdf(H) to pdf(L)
		// Reference: Torrance and Sparrow
		// http://graphics.stanford.edu/~boulos/papers/brdftog.pdf
		const float pdfL = pdfH / (4 * LoH);
		if (NoL > 0)
		{
#if G_SCHLICK
			// FIXME NoV cancels out
			const float G = G_Schlick(G_k, NoV, NoL);
			const float Vis = G / (NoL * NoV);
#else
			const float Vis = 1;
#endif
			const float G_Vis = D * Vis / 4 / pdfL;
			const float Fc = std::pow(1 - VoH, 5.f);
			// FIXME: NoL? Part of the lighting equation, but it gives dark reflections at grazing angles. Need a better BRDF probably
			a += (1 - Fc) * G_Vis * NoL;
			b += Fc * G_Vis * NoL;
		}
	}
	a /= numSamples;
	b /= numSamples;
}
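For completeness, this is roughly how I expect the baked (a, b) pair to be used at shade time, following the split-sum idea in the Epic notes. SampleEnvMapLod, SampleFGLookup and numEnvMipLevels are hypothetical helpers, so treat this as a sketch rather than actual engine code:

vec3 ApproximateSpecularIBL(const vec3& specularColor, float roughness, float NoV, const vec3& R)
{
	// First sum: pre-filtered environment lighting, with the mip level selected by roughness
	const vec3 prefiltered = SampleEnvMapLod(R, roughness * numEnvMipLevels);
	// Second sum: the pre-integrated BRDF baked by GenerateFGLookupTexture
	float a, b;
	SampleFGLookup(NoV, roughness, a, b);
	return prefiltered * (specularColor * a + vec3(b, b, b));
}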


#5123546 Shading in Unreal Engine 4

Posted by Reitano on 14 January 2014 - 06:08 AM

I read that presentation a second time and found the answer to my question. The guys at Epic pre-convolve cubemaps offline and weight each sample by the aforementioned dot product. As my cubemaps are dynamically rendered and mip-mapped automatically by the hardware, I do not have any control over their convolution and will have to move the dot product to the pre-integrated BRDF instead.

If anybody's interested I could post here the C++ and shader code of my implementation.

#5123254 Shading in Unreal Engine 4

Posted by Reitano on 13 January 2014 - 05:58 AM


Over the weekend I read the presentation on physically-based shading in Unreal Engine 4 (http://www.unrealengine.com/files/downloads/2013SiggraphPresentationsNotes.pdf). I have a question about the integration of environment maps.


As described in the paper, this is accomplished by splitting the integration into two parts: the average of the environment lighting (a mip-mapped cubemap) and a pre-convolved BRDF, parametrized by the dot product (normal · view) and the material roughness.


For the BRDF, we calculate many random directions around the normal based on the roughness, then calculate the corresponding reflected vector and use it to evaluate the BRDF. My question is: should we weight each sample by the dot product between the reflected vector and the normal? That makes sense to me as it's part of the lighting equation, but it gives very dark results at glancing angles and for low roughness values because, in that case, the majority of reflected vectors are almost perpendicular to the normal. The sample code in the paper does not consider this factor, which is a little surprising.




#5105672 Spherical Harmonics: Need Help :-(

Posted by Reitano on 30 October 2013 - 08:57 AM

I am not familiar with RSMs but, judging from the code you posted, I believe SH would suit your needs.


Remember that SH represent a signal defined over a spherical domain. In this case the signal is the total incoming light from all the secondary sources. An SH probe has an associated position in world space. For a <light, probe> pair, the signal is (cosThetaI * lR * Flux) in the direction defined by R (normalized). The cosThetaJ term depends on the receiver's normal and will be taken into account later, at runtime, when shading a surface.

Some pseudo-code:


for each probe
    SH = 0
    for each light
        signal = R / |R| * (cosThetaI * lR * Flux)
        SH += signalToSH(signal)


signalToSH is a function that encodes a directional light in the SH basis. See D3DXSHEvalDirectionalLight for an implementation.
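As a rough illustration of the loop above using the D3DX helpers (the probe/light structures and the cosThetaI and attenuation terms are placeholders, so treat this as a sketch):

const UINT order = 3; // 3rd-order SH: 9 coefficients per channel
float probeR[9] = {0}, probeG[9] = {0}, probeB[9] = {0};
for (size_t i = 0; i < secondaryLights.size(); ++i)
{
	const SecondaryLight& light = secondaryLights[i];
	// R: direction from the probe towards the secondary source, normalized
	D3DXVECTOR3 dir = light.position - probe.position;
	D3DXVec3Normalize(&dir, &dir);
	// cosThetaI and the distance attenuation folded into a scalar weight (placeholder)
	const float w = CosThetaI(light, probe) * Attenuation(light, probe);
	float r[9], g[9], b[9];
	D3DXSHEvalDirectionalLight(order, &dir, w * light.flux.x, w * light.flux.y, w * light.flux.z, r, g, b);
	D3DXSHAdd(probeR, order, probeR, r);
	D3DXSHAdd(probeG, order, probeG, g);
	D3DXSHAdd(probeB, order, probeB, b);
}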



At runtime, when shading a pixel, pick the probe spatially closest to it. The pixel has an associated normal. Assuming that the BRDF is purely diffuse, you have to convolve the cosine lobe around the pixel's normal with the signal encoded in the probe, which is achieved with an SH rotation and a dot product. See the paper http://www.cs.berkeley.edu/~ravir/papers/envmap/envmap.pdf


I suggest you read more material on SH in order to have a better understanding of their properties and use cases. As for myself, I will read the RSM paper asap and edit my reply if I've made wrong assumptions.


Please let me know if you have more questions.





#5105640 Square <-> hemisphere mapping

Posted by Reitano on 30 October 2013 - 06:19 AM

An update: a paraboloid mapping solves this problem brilliantly. In order to bias the sample distribution near the horizon, I apply an exponent to the sampling direction Z component. For a typical sky, the sampled representation closely matches the original one, and objects far from the camera blend well with the sky. If anyone is interested I can post here some formulas and pseudo-code.
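Since the mapping itself is short, here is a rough sketch of what I have in mind; the vec2 type, the helper names and the bias exponent are illustrative assumptions rather than my exact code:

// Forward mapping: hemisphere direction (d.z >= 0, |d| = 1) to paraboloid UV in [-1, 1]^2
vec2 HemisphereToParaboloid(const vec3& d)
{
	return vec2(d.x, d.y) * (1.f / (1.f + d.z));
}

// Inverse mapping: paraboloid UV back to a unit direction on the hemisphere
vec3 ParaboloidToHemisphere(const vec2& uv)
{
	const float r2 = uv.x * uv.x + uv.y * uv.y;
	return vec3(2.f * uv.x, 2.f * uv.y, 1.f - r2) * (1.f / (1.f + r2));
}

// Horizon bias (assumption): warp the direction before mapping, e.g.
// d.z = pow(d.z, biasExponent) followed by renormalization, so that more
// texels end up near the horizon where the sky varies the most.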

#5104384 Spherical Harmonics: Need Help :-(

Posted by Reitano on 25 October 2013 - 10:18 AM

Directional, spot and point lights are ideal lights. From the point of view of a probe, they emit radiance within an infinitesimal solid angle. In signal processing terms, you would call them impulses. You can only describe them analytically, not with a cubemap, unless you approximate them with an area light somehow. The way to go is to project them onto the SH basis with analytical formulas, as discussed in the Stupid Spherical Harmonics Tricks paper. So no, you don't need to render cubemaps in this case.

#5104357 Square <-> hemisphere mapping

Posted by Reitano on 25 October 2013 - 08:19 AM

Lots of ideas to experiment with. Another option that I should have considered right from the start is the dual paraboloid mapping:


The mapping and its reverse are trivial. I might be able to control the sampling distribution by manipulating the UV space, as suggested by Hodgman. About that, a fixed odd exponent (e.g. 3) would simplify the ALU by removing the sign and abs instructions.


@tonemgub: I should have mentioned that I am going to update the map every frame (either on CPU or GPU) and a cubemap likely has a higher cost than a single texture although I haven't tested it. And, apart from this use case, I am interested in this problem from a mathematical point of view.

#5098243 Component based architecture and messaging

Posted by Reitano on 02 October 2013 - 04:59 AM

My advice is to decompose the engine into libraries designed with the principles of middleware: fully decoupled, configurable, robust, reusable, testable. Then, build an entity/component framework at a higher level with a common API for creating, destroying, updating and synchronizing components of any type.


With that in mind, you would first develop an AI library with support for steering behaviours, waypoints, pathfinding, etc. Have a look at commercial or open-source AI libraries for inspiration. Among other things, the AI library is responsible for allocating, destroying, loading, saving and updating AI primitives. AI data and code, being fully decoupled from the rest of the engine, can be kept optimal and do not suffer from the usual pitfalls of overly generic code.

Then, as part of the entity framework, you would have one component system per AI primitive: an AIAgentComponentSystem, an AIWaypointComponentSystem etc. These systems internally communicate with the AI library. They also synchronize AI components with other components like the TransformationComponent; coupling is at the entity level, not the library level.


In your case, an entity moving around the scene following a path could be configured as follows:

<entity name="player">
  <component class="Transformation"/>
  <component class="AIAgent"/>
  <component class="Mesh"/>
  <component class="Script"/>

The AIAgent component internally owns an AI agent managed by the AI library. It syncs the agent position with the Transformation component, which is then synchronized with the Mesh component for visualization. The Script component contains game logic and allows you to configure the pathfinding at run-time based on game events (e.g. switch pathfinding on/off or pass control to physics).

Then, if desirable, you would have an entity per AI waypoint:

<entity name="waypoint 0">
  <component class="AIWayPoint">

Note that waypoints do not necessarily have to have an associated component if you do not need to manipulate them with the component API.

Please let me know if you need further clarifications as I've written this in a rush.

#5096698 Self-shadowing terrain idea - asking for feedback

Posted by Reitano on 25 September 2013 - 10:12 AM

This idea reminds me of:




which I had in my Typhoon engine about 10 years ago :)


The paper describes an elegant algorithm for computing the horizon of a heightmap but my guess is that a brute-force approach could work reasonably fast on today's machines, even for large datasets.
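As a rough illustration of that brute force (the HeightMap interface below is hypothetical): for each texel and each azimuth direction, march along the heightfield and keep the maximum elevation angle.

// Horizon elevation angle at texel (x0, y0) along the azimuth (dirX, dirY)
float ComputeHorizonAngle(const HeightMap& hm, int x0, int y0, float dirX, float dirY)
{
	const float h0 = hm.Height(x0, y0);
	float maxTan = 0.f;
	// March away from the texel until we leave the heightmap
	for (int step = 1; step < hm.MaxSteps(); ++step)
	{
		const float sx = x0 + dirX * step;
		const float sy = y0 + dirY * step;
		if (!hm.Inside(sx, sy))
			break;
		const float dh = hm.SampleHeight(sx, sy) - h0;  // height difference
		const float dist = step * hm.TexelSize();       // horizontal distance
		maxTan = std::max(maxTan, dh / dist);
	}
	return std::atan(maxTan); // 0 = unoccluded flat horizon; larger = more occluded
}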


Note that you'll need more than two points to represent the horizon unless you constrain the sun to follow a fixed arc (what is this arc called, by the way?). You can use the horizon information to approximate AO too (or you could approximate the visibility with a cone in this case, as described in a paper by ATI).

#5096607 Component based architecture and messaging

Posted by Reitano on 25 September 2013 - 05:14 AM

In my engine, components are fully autonomous classes and do not share data with other components. For example, both the Transformation component and the RigidBody component have a vec3d position member variable. The problem is how to synchronize the components so that the position in the Transformation component is always an up-to-date copy of the RigidBody position. In turn, other components such as the MeshRenderer and Light, which all possess a position, need a copy of the Transformation component's position.

Often, message queues or callbacks are suggested. Instead, I use a simpler approach: all my component systems know their dependencies and explicitly synchronize components once per frame. An example:

struct RigidBodySystem::SyncRigidBodyTransform
{
    SyncRigidBodyTransform(const RigidBody* rigidBody, Transform* transform)
        : rigidBody(rigidBody), transform(transform) {}
    const RigidBody* rigidBody;
    Transform*       transform;
};

void RigidBodySystem::InitializeComponent(RigidBody* rigidBody, Entity* entity, Scene* scene)
{
    Transform* transform;
    if (scene->Query(entity, &transform))
    {
        // Add synchronization primitive
        this->syncPrimitives.push_back( SyncRigidBodyTransform(rigidBody, transform) );
    }
}

void RigidBodySystem::DestroyComponent(RigidBody* rigidBody)
{
    // Remove the synchronization primitives that reference this component
}

// Called once per frame
void RigidBodySystem::Update(float dt)
{
    // Step the physics library, which updates the rigid bodies
}

// Called every frame, after Update
void RigidBodySystem::SynchronizeComponents()
{
    foreach (sync, this->syncPrimitives)
    {
        sync->transform->position = sync->rigidBody->position;
        // Orientation too...
    }
}

The synchronization phase is very efficient and cache-friendly because the source and destination components reside in contiguous memory and the loop does not thrash the instruction cache.

An objection might be that this approach introduces coupling between classes. That is true, but in my case the coupling happens at a high level. In fact, component systems like the RigidBodySystem are simple frameworks that dispatch calls to low-level libraries, and those libraries have no dependencies on other component libraries.

#5095423 Hiding private methods and variables in a class

Posted by Reitano on 20 September 2013 - 05:24 AM

Thank you all for your opinions.


I was going to reply to RobTheBloke, but Hodgman beat me to it with a well-written clarification of the advantages of this approach compared to others. I like the suggestion of using macros to beautify the implementation code. The only thing I dislike is the use of a static Create function in the Interface class. I'd let the application take care of creating the actual implementation (based on #ifdefs) at a higher level. That would be the only code aware of the actual implementation, while the rest of the code would just see the interface API.

#5095392 Hiding private methods and variables in a class

Posted by Reitano on 20 September 2013 - 03:42 AM

I am working on a multiplatform rendering API with implementations for DirectX 9 and DirectX 11.



Requirements:

- efficiency: no virtual calls
- minimal code duplication: declare the public API only once
- compile-time selection of the implementation


Nice to have:

- exposing only the public API to client code, and not the private methods and member variables of the actual implementation.


I have been exploring different approaches (e.g. http://www.altdevblogaday.com/2011/06/27/platform-abstraction-with-cpp-templates/ , http://www.altdevblogaday.com/2011/02/01/the-virtual-and-no-virtual/) and then stumbled upon this solution which seems to fit my requirements pretty well:




I quite like this solution and I am considering using it for other systems. I strongly suspect FMOD uses it too, as the FMOD API classes contain only public methods. This solution puts some constraints on class usage (it seals the class and allows manipulation through pointers only). Moreover, the actual .cpp implementation file feels a little hacky and less pretty with all the additional downcasts. I kinda wish C++ supported partial class declarations as part of the language, for cases like mine.
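For reference, here is a minimal sketch of the kind of pattern I'm describing (my own illustration under assumptions, not FMOD's actual internals): the header declares only public methods, and the .cpp downcasts this to the implementation class.

// Texture.h - the only header clients ever see; no virtuals, no data members
class Texture
{
public:
	void Bind(uint slot);
	uint Width() const;
protected:
	Texture() {} // instances are created by the rendering device, never by clients
};

// TextureDX11.cpp - the DirectX 11 implementation, selected at compile time
class TextureDX11 : public Texture
{
public:
	void* srv;   // e.g. an ID3D11ShaderResourceView*
	uint  width;
};

// The public methods downcast 'this' to the real type
void Texture::Bind(uint slot)
{
	TextureDX11* self = static_cast<TextureDX11*>(this);
	// ... bind self->srv to the given slot ...
}

uint Texture::Width() const
{
	return static_cast<const TextureDX11*>(this)->width;
}

The "sealing" mentioned above follows from this: client code only ever manipulates Texture objects through pointers handed out by the device, which in reality always point to implementation objects.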


What is your opinion about this approach ?