Intel sponsors gamedev.net search:   
Journal of YsaneyaBy Ysaneya      
Main project:

Infinity, a space-based MMOG

Forums

Wednesday, April 16, 2008
Today's a short tip of the day aimed at graphics programmers.

Most people don't realize that bilinear filtering an RGBA8 ( fixed point, 8 bits per channel ) texture results in a fixed point value.

What I mean is extremely simple. RGB8/RGBA8 textures are a standard used commonly everywhere. In shaders, you usually sample a texture like this (GLSL):

vec4 color = texture2D(tex, uv);


If filtering has been enabled on the texture, the values in 'color' are not full fp32 precision. They are still 8 bits.

I blame this on legacy hardware. I have confirmed this behavior on both NVidia and ATI cards, including the modern ones.

Of course, it only becomes visible when you sum up a lot of textures together, or when you scale a value by a huge factor.

I always have a shudder when I think of all the people using lookup tables or scaled textures in RGBA8, and who don't realize what they are doing.

Here is an example. First a picture of a relatively dark area of a texture, heavily magnified ( needed to demonstrate filtering after all ). There are blocky square artifacts coming from the jpeg compression, ignore them, and just verify in photoshop or your preferred image editor that the pixel values have a smooth gradiant, decreasing 1 by 1 (actually 1/255) between each adjacent:

No scaling

Now the pixel shader simply multiplies this pixel by a constant value of 5. Notice how the pixel values jump by 5 by 5:

Scaling x5

To fix this problem, there are two ways that I know:

1. Perform bilinear filtering yourself: the shader pipeline is fully 32 bits, so while you're sampling 4 texels in 8 bits, the bilinear interpolation will be in full precision. This has a high performance cost.

2. Use a fp16 or fp32 internal format ( at the cost of additional video memory ).

Those two solutions both have a serious cost, either in performance or memory. Why NVidia and ATI haven't implemented bilinear filtering of RGBA8 textures in full precision in hardware yet is beyond my understanding.


Comments: 11 - Leave a Comment

Link



Thursday, April 10, 2008
Many people have been asking me how the terrain texturing is implemented, so I'll make a special dev journal about it.

Sub-tiling

The whole technique is based on sub-tiling. The idea is to create a texture pack that contains N images ( also called layers or tiles ), for example for grass, rock, snow, etc..

Let's say that each image is 512x512. You can pack 4x4 = 16 of them in a single 2048x2048. Here is an example of a pack with 13 tiles ( the top-right 3 are unused and stay black ):



Mipmapping the texture pack

Each image / tile was originally seamless: its right pixels column matches it left, and its top matches its bottom. This constraint must be enforced when you're generating mipmaps. The standard way of generating mipmaps ( by downsampling and applying a box filter ) doesn't work anymore, so you must construct the mipmaps chain yourself, and copy the border columns/rows so that it's seamless for all levels.

When you're generating the mipmaps chain, you will arrive at a point where each tile is 1x1 pixel in the pack ( so the whole pack will be 4x4 pixels ). Of course, from there, there is no way to complete the mipmaps chain in a coherent way. But it doesn't matter, because in the pixel shader, you can specific a maximum lod level when samplying the mipmap. So you can complete it by downsampling with a box filter, or fill garbage; it doesn't really matter.

Texture ID lookup

To each vertex of the terrain is associated a slope and altitude. The slope is the dot product between the up vector and the vertex normal and normalized to [0-1]. The altitude is normalized to [0-1].

On the cpu, a lookup table is generated. Each layer / tile has a set of constraints ( for example, grass must only grow when the slope is lower than 20° and the altitude is between 50m and 3000m ). There are many ways to create this table, but it's beyond the point of this article. For our use, it is sufficient to know that the lookup table indexes the dot-product of the slope on the horizontal / U axis, and the altitude on the vertical / V axis.

The lookup table ( LUT ) is a RGBA texture, but at the moment I'm only using the red channel. It contains the ID of the layer / tile for the corresponding slope / altitude. Here's an example:



Once the texture pack and the LUT are uploaded to the gpu, the shader is ready to do its job. The first step is easy:

vec4 terrainType = texture2D(terrainLUT, vec2(slope, altitude));


.. and we get in terrainType.x the ID of the tile (0-15) we need to use for the current pixel.

Here's the result in 3D. Since the ID is a small value (0-15), I have multiplied it by 16 to see it better in grayscale:



Getting the mipmap level

So, for each pixel you've got an UV to sample the tile. The problem is that you can't sample the pack directly, as it contains many tiles. You need to sample the tile within the pack, but with mipmapping and wrapping. How to do that ?

The first natural idea is to perform those two operations in the shader:

u = fract(u)
v = fract(v)
u = tile_offset.x + u * 0.25
v = tile_offset.y + v * 0.25


( remember that there are 4x4 tiles in the pack. Since UVs are always normalized, each tile is 1/4 th of the pack, hence the 0.25 ).

This doesn't work with mipmapping, because the hardware uses the 2x2 neighbooring pixels to determine the mipmap level. The fract() instructions kill the coherency between the tiles, and 1-pixel-width seams appear (which are viewer dependent, so extremely visible and annoying).

The solution is to calculate the mipmap level manually. Here is the function I'm using to do that:

/// This function evaluates the mipmap LOD level for a 2D texture using the given texture coordinates
/// and texture size (in pixels)
float mipmapLevel(vec2 uv, vec2 textureSize)
{
    vec2 dx = dFdx(uv * textureSize.x);
    vec2 dy = dFdy(uv * textureSize.y);
    float d = max(dot(dx, dx), dot(dy, dy));
    return 0.5 * log2(d);
}


Note that it makes use of the dFdx/dFdy instructions ( also called ddx/ddy ), the derivative of the input function. This pretty much ups the system requirements to a shader model 3.0+ video card.

This function must be called with a texture size that matches the size of the tile. So if the pack is 2048x2048 and each tile is 512x512, you must use a textureSize of 512.

Once you have the lod level, clamp it to the max mipmap level, ie. the 4x4 one.

Sampling the sub-tile with wrapping

The next problem is that the lod level isn't an integer, but a float. So this means that the current pixel can be in a transition between 2 mipmaps. So when calculating the UVs inside the pack to sample the pixel, it has to be taken into account. There's a bit of "magic" here, but I have experimentally found an acceptable solution. The complete code for sampling a pixel of a tile within a pack is the following:

/// This function samples a texture with tiling and mipmapping from within a texture pack of the given
/// attributes
/// - tex is the texture pack from which to sample a tile
/// - uv are the texture coordinates of the pixel *inside the tile*
/// - tile are the coordinates of the tile within the pack (ex.: 2, 1)
/// - packTexFactors are some constants to perform the mipmapping and tiling
/// Texture pack factors:
/// - inverse of the number of horizontal tiles (ex.: 4 tiles -> 0.25)
/// - inverse of the number of vertical tiles (ex.: 2 tiles -> 0.5)
/// - size of a tile in pixels (ex.: 1024)
/// - amount of bits representing the power-of-2 of the size of a tile (ex.: a 1024 tile is 10 bits).
vec4 sampleTexturePackMipWrapped(const in sampler2D tex, in vec2 uv, const in vec2 tile,
	const in vec4 packTexFactors)
{
 	/// estimate mipmap/LOD level
	float lod = mipmapLevel(uv, vec2(packTexFactors.z));
	lod = clamp(lod, 0.0, packTexFactors.w);

	/// get width/height of the whole pack texture for the current lod level
	float size = pow(2.0, packTexFactors.w - lod);
	float sizex = size / packTexFactors.x; // width in pixels
	float sizey = size / packTexFactors.y; // height in pixels

	/// perform tiling
	uv = fract(uv);

	/// tweak pixels for correct bilinear filtering, and add offset for the wanted tile
	uv.x = uv.x * ((sizex * packTexFactors.x - 1.0) / sizex) + 0.5 / sizex + packTexFactors.x * tile.x;
	uv.y = uv.y * ((sizey * packTexFactors.y - 1.0) / sizey) + 0.5 / sizey + packTexFactors.y * tile.y;

    return(texture2DLod(tex, uv, lod));
}


This function is more or less around 25 arithmetic instructions.

Results

The final shader code looks like this:

const int nbTiles = int(1.0 / diffPackFactors.x);

vec3 uvw0 = calculateTexturePackMipWrapped(uv, diffPackFactors);
vec4 terrainType = texture2D(terrainLUT, vec2(slope, altitude));
int id0 = int(terrainType.x * 256.0);
vec2 offset0 = vec2(mod(id0, nbTiles), id0 / nbTiles);

diffuse = texture2DLod(diffusePack, uvw0.xy + diffPackFactors.xy * offset0, uvw0.z);


And here is the final image:



With lighting, shadowing, other effects:



On the importance of noise

The slope and altitude should be modified with many octaves of 2D noise to look more natural. I use a FbM 2D texture that I sample 10 times, with varying frequencies. 10 texture samples sound a lot, but remember that it's for a whole planet: it must look good both at high altitudes, at low altitudes and at ground level. 10 is the minimum I've found to get "acceptable" results.

Without noise, transitions between layers of different altitutes or slope look really bad:



Comments: 4 - Leave a Comment

Link



Wednesday, April 2, 2008
Yesterday's news was obviously an April Fool's.. But like last year, a lot of people thought it was true. I must say, posting it in the 31th didn't help, but in the GMT timezone it was near midnight and I didn't want to wait another 12h to post it, which would have been in the middle of the 1st April day for me.

The werewolves comment was a funny reference to this thread, but you had to follow it to understand it.

Rest assured that I have more backup copies than just 4.

Most of the events that I described had a part of truth ( yes, including the cops chasing the thiefs in the woods at 3 am. This really happened ). The timing was not though. The probability that all of that happened the same day, on April 1, is ridiculously low. I'd rather win at lottery.

Comments: 2 - Leave a Comment

Link


All times are ET (US)

 
S
M
T
W
T
F
S
1
3
4
5
6
7
8
9
11
12
13
14
15
17
18
19
20
21
22
23
24
25
26
27
28
29
30

OPTIONS
Track this Journal

 RSS 

ARCHIVES
October, 2009
August, 2009
July, 2009
May, 2009
April, 2009
March, 2009
February, 2009
January, 2009
November, 2008
October, 2008
July, 2008
June, 2008
May, 2008
April, 2008
March, 2008
January, 2008
December, 2007
November, 2007
October, 2007
September, 2007
August, 2007
July, 2007
June, 2007
May, 2007
April, 2007
March, 2007
February, 2007
January, 2007
December, 2006
November, 2006
October, 2006
September, 2006
August, 2006
July, 2006
June, 2006
May, 2006
April, 2006
March, 2006
February, 2006
January, 2006
December, 2005
November, 2005
October, 2005
September, 2005
August, 2005
July, 2005
June, 2005
May, 2005
April, 2005
March, 2005
February, 2005
January, 2005
December, 2004
October, 2004
September, 2004
August, 2004