
Member Since 24 Feb 2008
Offline Last Active Yesterday, 07:46 PM

Topics I've Started

[D3D12] About CommandList, CommandQueue and CommandAllocator

19 April 2016 - 11:30 PM

I read the D3D12 documentation, and I am currently reading Frank Luna's book about DirectX 12.


One thing I do not understand is the following:


When you record commands into an ID3D12CommandList, you are really writing those commands into memory owned by an ID3D12CommandAllocator. This memory should live in CPU memory (somewhere on the heap, in RAM) for performance reasons.


When you execute an ID3D12CommandList, the commands stored in the ID3D12CommandAllocator's memory are sent from the CPU to the GPU (into the command queue's memory, I think), so the GPU can consume them once it reaches your commands in the queue.


The question is: if the ID3D12CommandList writes commands into an ID3D12CommandAllocator that resides in CPU memory before the command list is executed, why can't we record new commands into the same ID3D12CommandList, backed by the same ID3D12CommandAllocator, given that we are actually writing commands into CPU memory?


Are any of my assumptions incorrect? Am I missing some architectural decision?


Thanks in advance!
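For what it's worth, the documented contract is that ExecuteCommandLists does not detach the commands from their allocator: the allocator's memory remains in use until the GPU has finished executing the submission, and you must not call Reset on the allocator (or record over its memory) until a fence tells you the GPU is done. The command list itself can be Reset and re-recorded immediately after ExecuteCommandLists, but only against an allocator that is not still in flight. A sketch of the usual fenced per-frame pattern, assuming one allocator per frame in flight (all g-prefixed names are hypothetical; error handling omitted):

```cpp
// Sketch: one command allocator per frame in flight.
void RenderFrame()
{
    // Wait until the GPU has finished the submission that last used
    // this frame's allocator; only then is its memory safe to recycle.
    if (gFence->GetCompletedValue() < gFenceValue[gFrameIndex])
    {
        gFence->SetEventOnCompletion(gFenceValue[gFrameIndex], gFenceEvent);
        WaitForSingleObject(gFenceEvent, INFINITE);
    }

    gAllocator[gFrameIndex]->Reset();                      // reclaim memory
    gCommandList->Reset(gAllocator[gFrameIndex], nullptr); // reopen for recording

    // ... record commands ...

    gCommandList->Close();
    ID3D12CommandList* lists[] = { gCommandList };
    gQueue->ExecuteCommandLists(1, lists);

    // Signal so a later frame knows when this allocator is free again.
    gQueue->Signal(gFence, ++gCurrentFence);
    gFenceValue[gFrameIndex] = gCurrentFence;
}
```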

[D3D12] Direct3D 12 Documentation in PDF :)

29 August 2015 - 03:11 PM

Hi community


I converted the MSDN DirectX 12 documentation into a PDF. I attached the PDF to this thread and also uploaded it.


I hope you find it useful!

[DirectX] Particle Systems - Compute Shader

02 June 2014 - 10:43 PM

Hi community

I want to share some demos of particle systems I have been working on.
I implemented these systems with DirectX 11 and compute shaders.
I developed them with Visual C++ 2013 on Windows 8.1 (Phenom II x4 965, 8 GB RAM, Radeon HD 7850). They run at 60 FPS (I capped the frame rate at 60 FPS).
Please watch the demos in 1080p HD.
Video 1:
There are 1,000,000 particles in this demo, spread across 5 different areas.
As the demo progresses, you can see the particles in each area begin to organize.
Video 2:
There are 640,000 particles forming a sphere. The particles move slowly toward the center of the sphere.
Video 3:
There are 640,000 particles organized in 100 layers of 6,400 particles each. Particles move at random speeds and directions within the plane of their layer.
I tested these demos with 10,000,000 particles and they run at roughly 30 FPS. Alpha blending and depth testing appear to be the bottleneck, because when the particles are spread apart, the frame rate increases.
Future work
Find the bottlenecks and learn how to make this run faster. I know that on AMD video cards, the number of threads per group should be a multiple of 64.
Implement systems where particles interact physically with each other, or have more complex AI. I used steering behaviors in demo 1.
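To illustrate the kind of work each compute-shader thread does, here is a CPU sketch of the per-particle update from demo 2 (particles accelerating toward the center of a sphere). This is my own illustration, not the actual shader; on the GPU, each loop iteration would be one thread, and on AMD hardware the thread group size should be a multiple of 64 (the wavefront size).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Particle { float x, y, z; float vx, vy, vz; };

// CPU stand-in for the compute shader: each particle accelerates toward
// the attractor at (cx, cy, cz), then integrates position (Euler step).
void UpdateParticles(std::vector<Particle>& ps, float cx, float cy, float cz,
                     float accel, float dt)
{
    for (Particle& p : ps)  // each iteration = one GPU thread
    {
        float dx = cx - p.x, dy = cy - p.y, dz = cz - p.z;
        float len = std::sqrt(dx * dx + dy * dy + dz * dz);
        if (len > 1e-6f)    // avoid dividing by zero at the center
        {
            p.vx += accel * dt * dx / len;
            p.vy += accel * dt * dy / len;
            p.vz += accel * dt * dz / len;
        }
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.z += p.vz * dt;
    }
}
```

In the HLSL version, the dispatch would be sized so that `[numthreads(64, 1, 1)]` groups cover the whole particle buffer.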

[D3D11] Terrain LOD using Compute Shader

18 April 2013 - 06:38 PM

Hi community. I finished a new mini-project using DirectX 11.

Simplified Terrain Using Interlocking Tiles

The Algorithm
In Game Programming Gems 2, Greg Snook introduced an algorithm that mapped very nicely to early GPU architectures. The algorithm was attractive because it did not require modifying the vertex or index buffers; instead, it used multiple index buffers to provide a LOD mechanism. By simply rendering with a different index buffer, the algorithm could alter the LOD and balance performance against image quality.

The algorithm breaks the 2D height-map texture into a number of smaller tiles, each tile covering a (2^n + 1) x (2^n + 1) area of vertices. This vertex data is fixed at load time and simply stores the height at each corresponding point on the height map, which also corresponds to the highest possible detail that can be rendered. For each level of detail there are 5 index buffers: one for the central area of the tile, plus one for each of the four edges.
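To make the numbers concrete, here is a small sketch (my own illustration, not code from Snook's chapter) of how each LOD level halves the grid resolution of a (2^n + 1)-vertex tile, which is what the per-LOD index buffers encode:

```cpp
#include <cassert>

// A tile has (2^n + 1) x (2^n + 1) vertices, i.e. 2^n quads per side at
// full detail. Each LOD level skips every other row/column of the level
// below, so level k uses 2^n / 2^k quads per side (clamped to at least 1).
int QuadsPerSide(int n, int lod)
{
    int quads = (1 << n) >> lod;
    return quads < 1 ? 1 : quads;
}

int TrianglesInTile(int n, int lod)
{
    int q = QuadsPerSide(n, lod);
    return q * q * 2;              // two triangles per quad
}
```

For example, a 17x17 tile (n = 4) renders 512 triangles at LOD 0 but only 32 at LOD 2, purely by binding a different index buffer.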

LOD Calculation
A jump between any two levels of detail is possible, but it may cause noticeable popping. The simplest way to avoid this popping is to transition only between neighboring LOD levels, or, in severe cases, to consider vertex blending or other morphing enhancements.

Typically, the biggest and most noticeable pops are transitions at the lower end of the LOD scale, such as from 0 to 1, where the geometric difference between the two states is greatest. Transitioning at the top end, such as from 4 to 5, will not be as noticeable: the mesh is already quite detailed and the triangles are relatively small, so the geometric difference is small. Large changes to the silhouette of the terrain tend to be particularly noticeable due to their prominence in the final image, and should ideally be avoided. Unfortunately, the simplest metrics for deciding the level of detail will typically set the geometry farthest from the camera to the lowest level of detail.

We will show 2 examples:

(1) Naive implementation: we compute the midpoints of 5 tiles (the tile being rendered plus its 4 neighboring tiles). We then compute the distance from each midpoint to the camera to generate a raw level of detail for each of the 5 patches, and finally these LOD values are assigned to the 6 output values that Direct3D expects from the tessellation constant function: 2 inner factors and 4 edge factors.
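The distance-to-LOD mapping at the heart of the naive implementation can be sketched like this (a CPU illustration with made-up constants; in the real pipeline this runs in the hull shader's patch constant function, with the 4 edge factors computed from the neighbors' midpoints so adjacent patches agree along shared edges):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Map a patch midpoint's camera distance to a tessellation factor:
// close patches get the maximum factor, far patches the minimum, with a
// linear falloff in between. minDist/maxDist/minTess/maxTess are tuning
// constants chosen for illustration only.
float DistanceToTessFactor(float dist)
{
    const float minDist = 20.0f, maxDist = 500.0f;
    const float minTess = 1.0f,  maxTess = 64.0f;
    float t = (dist - minDist) / (maxDist - minDist);
    t = std::clamp(t, 0.0f, 1.0f);
    return minTess + (1.0f - t) * (maxTess - minTess); // nearer = more detail
}
```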

(2) Height-map pre-pass: it is preferable to focus tessellation on the areas where it is most noticeable. The naive implementation above clearly generates neither high-quality nor high-performance results; the areas it tessellates most are not necessarily the areas that need the most detail. We will divide the input height map into kernels; the pixels of each kernel area are read in, four values are generated from the raw data, and these are stored in a 2D output texture. This texture is then used to compute the LOD. A good objective for this pre-pass is to find a statistical measure of coplanarity, i.e. to what extent the samples lie on the same plane in 3D space.
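One cheap proxy for such a coplanarity measure (my own illustration, not the metric from the pre-pass itself) is to interpolate the kernel's four corner heights and score the kernel by how far its interior samples deviate from that surface. Flat kernels score near zero and need little tessellation; bumpy ones score high:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// h holds a w x w block of height samples, row-major. Bilinearly
// interpolate the four corner heights across the kernel and return the
// maximum absolute deviation of the samples from that base surface.
float KernelRoughness(const std::vector<float>& h, int w)
{
    float h00 = h[0],           h10 = h[w - 1];
    float h01 = h[(w - 1) * w], h11 = h[w * w - 1];
    float maxDev = 0.0f;
    for (int y = 0; y < w; ++y)
        for (int x = 0; x < w; ++x)
        {
            float u = x / float(w - 1), v = y / float(w - 1);
            float base = (1 - v) * ((1 - u) * h00 + u * h10)
                       +      v  * ((1 - u) * h01 + u * h11);
            maxDev = std::max(maxDev, std::fabs(h[y * w + x] - base));
        }
    return maxDev;
}
```

In the compute-shader pre-pass, one thread group would reduce one kernel and write its score into the 2D output texture that the LOD calculation samples.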


We do not want to be limited to a single texture. We would like to create terrains depicting sand, grass, dirt, rock, snow, etc., all at the same time. We cannot use one large texture that contains the sand, grass, dirt, etc. and stretch it over the terrain, because we would have a resolution problem: the terrain geometry is so large that we would require an impractically large texture to have enough color samples for decent resolution. Instead, we take a multitexturing approach that works like transparency alpha blending.

The idea is to have a separate texture for each terrain layer (e.g. one for grass, one for dirt, one for rock). These textures are tiled over the terrain for high resolution. We also need transparency alpha blending: a blend map stores the source alpha of the layer being written, which indicates the opacity of the source layer and thereby controls how much of the source layer overwrites the existing terrain color. This enables us to color some parts of the terrain with grass, some parts with dirt, and others with snow, or with various blends of all three.
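The per-pixel blend described above can be sketched as repeated lerps, exactly like back-to-front transparency blending: result = lerp(result, layer, weight). A CPU illustration (names and layer setup are my own, not from the demo):

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Color { float r, g, b; };

Color Lerp(Color a, Color b, float t)
{
    return { a.r + t * (b.r - a.r),
             a.g + t * (b.g - a.g),
             a.b + t * (b.b - a.b) };
}

// layers[0] is the base layer (implicitly fully opaque); weights[i] is the
// blend-map sample controlling how much layers[i + 1] overwrites the
// accumulated color underneath it.
template <size_t N>
Color SplatTerrain(const std::array<Color, N>& layers,
                   const std::array<float, N - 1>& weights)
{
    Color c = layers[0];
    for (size_t i = 1; i < N; ++i)
        c = Lerp(c, layers[i], weights[i - 1]);
    return c;
}
```

In the pixel shader, each layer color comes from its own tiled texture and each weight from one channel of the blend map.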


One particular limitation of this implementation is that it still retains an element of distance in the LOD calculation. Geometry further from the camera is always assumed to be less important, and to require less detail.

It is worth extending the terrain algorithm to weight features on the horizon higher than distant objects that are less pronounced. The silhouette of a model is one of the biggest visual cues to its true level of detail, and is one of the more obvious places for the human eye to pick up on approximations or other visual artifacts.

Interacting with the terrain is another problem. The core application running on the CPU has no knowledge of the final geometry, which is generated entirely on the GPU. This is problematic for picking, collision detection, etc. One solution is to move picking or ray casting to the GPU; another is to perform those tests on the CPU against the highest level of detail.

Game Programming Gems 2
Introduction to 3D Game Programming with DirectX 11
Practical Rendering and Computation with DirectX 11
Real Time Rendering

Video Demonstration

DirectCompute - CUDA - OpenCL are they used?

05 March 2013 - 07:11 PM

Hi community.


My question is very simple: are these technologies used in real-time games?


I have seen them used to implement blur effects, particle systems, fluid simulation, physics, etc.


I want to know whether they are used in AAA games in the game industry, and to implement what kinds of things.


Thanks in advance! I would appreciate any information.