| Sunday, February 25, 2007 |
 Gas giants part II |
Posted - 2/25/2007 11:59:07 AM | I am continuing my work on gas giants.
I have finished the color table generation code. My first try used a lookup table that maps a density ( in the range [0-1] ) to a color by interpolating between 4 to 7 "color keys". Each color key was chosen randomly.. and as expected, it didn't look very good. I decided to use another approach: generate a start color for density 0; generate an end color for density 1; then use the mid-point displacement algorithm in 1D to generate the intermediate values.
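As an illustration of the 1D mid-point displacement idea, here is a minimal C++ sketch. It is not the engine's code: the names (Color, Displace, MakeColorTable), the roughness parameter, the starting amplitude of 0.5 and the 257-entry size ( 2^8 + 1, so every segment has an exact midpoint ) are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>

struct Color { float r, g, b; };

// Hypothetical helper: averages two colors and adds a random offset of at most
// +/- amplitude per channel, clamped to [0, 1].
static Color Displace(const Color& a, const Color& b, float amplitude, std::mt19937& rng)
{
    std::uniform_real_distribution<float> offset(-amplitude, amplitude);
    auto channel = [&](float x, float y) {
        const float v = 0.5f * (x + y) + offset(rng);
        return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
    };
    return { channel(a.r, b.r), channel(a.g, b.g), channel(a.b, b.b) };
}

// Fills a 257-entry table: entry 0 is the start color (density 0), entry 256
// is the end color (density 1), everything in between is displaced midpoints.
std::array<Color, 257> MakeColorTable(Color start, Color end, float roughness, unsigned seed)
{
    assert(roughness > 0.0f && roughness < 1.0f);
    std::array<Color, 257> table{};
    std::mt19937 rng(seed);
    table.front() = start;
    table.back() = end;

    float amplitude = 0.5f; // assumed starting displacement
    for (std::size_t step = 256; step > 1; step /= 2)
    {
        // Displace the midpoint of every segment of the current size.
        for (std::size_t i = 0; i + step <= 256; i += step)
            table[i + step / 2] = Displace(table[i], table[i + step], amplitude, rng);
        amplitude *= roughness; // smaller displacements at finer scales
    }
    return table;
}
```

A lower roughness gives smoother gradients between the two end colors; a value closer to 1 keeps strong color jumps even between neighbouring entries.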
Here's an example of a color table ( stretched in 2D to see the colors better ).
And here's an example of a gas giant texture ( there are 6 per layer, and 6 layers total -> there are 36 of those ) with the color table applied.
I also rewrote the color table lookup code itself to use a simple 256-entry array, with the colors and the alpha pre-computed for each entry. Then, to texture the gas giant, I only need 1 line: get the density, convert it to an integer in the 0-255 range, then look up the RGBA values. It provided a nice performance boost ( 5 times faster than the previous method ). You wouldn't notice a difference though, because this part of the algorithm wasn't a bottleneck to start with.
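A lookup of that kind could look like the sketch below. The types and names (RGBA8, ColorLut, LookupColor) are hypothetical, and the grey ramp is only a placeholder for the generated colors.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

struct RGBA8 { std::uint8_t r, g, b, a; };

using ColorLut = std::array<RGBA8, 256>;

// Builds a placeholder table: a grey ramp whose alpha grows with density.
// A real table would hold the procedurally generated colors instead.
ColorLut MakePlaceholderLut()
{
    ColorLut lut{};
    for (int i = 0; i < 256; ++i)
    {
        const auto v = static_cast<std::uint8_t>(i);
        lut[static_cast<std::size_t>(i)] = RGBA8{ v, v, v, v };
    }
    return lut;
}

// The "one line" lookup: clamp the density to [0, 1], scale it to 0..255
// (rounding to the nearest entry), then index the table.
inline RGBA8 LookupColor(const ColorLut& lut, float density)
{
    assert(!std::isnan(density)); // converting NaN to an index is undefined
    const float clamped = std::min(std::max(density, 0.0f), 1.0f);
    return lut[static_cast<std::size_t>(clamped * 255.0f + 0.5f)];
}
```

The clamp is there so that a density slightly outside [0, 1] ( which noise functions can produce ) never indexes outside the table.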
I rewrote the atmospheric scattering shaders in GLSL. Here's a gas giant with its 6 layers ( each cloud layer is moving at different speeds, although it's subtle ). Notice how the scattering on the top-right makes the atmosphere go yellow-ish:

Last but not least, I've been trying to get good results when entering inside the atmosphere of the gas giant. It's still very experimental, at the moment my main problem is to blend the atmosphere look from space and from inside.. and I haven't succeeded yet. But that's how it looks:
In the upper atmosphere:

In the medium atmosphere:


In the deep atmosphere, you probably won't be able to go deeper in a spaceship anyway:

| |
| Tuesday, February 20, 2007 |
 GLSL, Gas giants, Atmospheric scattering |
Posted - 2/20/2007 9:22:31 AM | Gas giants:
I started to work on the implementation of gas giants. My current approach is to use an outer sphere with atmospheric scattering, similar to the ones used on my Earth-like planets, and a set of layers ( six at the moment ). All of those are generated and displayed with the LOD ( level-of-detail ) algorithm used in the terrain engine, except that the spherical surface is not displaced by a heightmap.
Each layer corresponds to a cloud layer with various colors and transparencies. Each layer is textured with a cube map, so I started to generate a "gas giant style" texture for each layer with random color combinations and turbulence. Here's an example of such a texture:

The upper layers in altitude are the most transparent, while the deeper layers get more and more opaque. The bottom-most layer is fully opaque, a bit like the ground.. but I don't expect anybody to really see it, since so deep in the atmosphere, any spaceship would be long dead..
The 6-layers idea wouldn't look very good if the layers were individually visible. Fortunately, I am hoping that with good fogging, particle/weather effects, and good scattering, everything will blend together quite well.
The gas giant texture is procedurally generated by stretching fBm noise with a large non-uniform factor, like ( 1, 30, 1 ). This generates the straight strips. To break the straightness (english?), I am actually using another fBm which returns the amount of stretching to do. In the end, it looks like this:
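// Pseudocode: (a, b, c) denotes a per-component scale of the ray direction.
float res = noise.fBm(ray * (4, 4, 4) );             // turbulence value for this direction
res = noise.fBm(ray * (1, 20 + res * factor, 1));    // fBm stretched by a noise-driven amount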
The first factor, (4, 4, 4), determines the size of the turbulence "features", while the second factor ( factor ) determines how stretched the atmosphere looks. Gas giant planets rotate very quickly on their axis, which is why their clouds are so stretched. This factor will later be determined by the astrophysical parameters of the planet, like its rotation speed.
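For readers who want to experiment with the two-step lookup, here is a self-contained C++ sketch. Everything in it except the two fBm calls is an assumption: the noise basis is a simple hashed value noise standing in for the engine's noise class, and Vec3, LatticeValue, ValueNoise, Fbm and GasGiantDensity are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

// Hashes an integer lattice point to a value in [-1, 1]. Placeholder basis,
// not Perlin noise.
static float LatticeValue(std::int32_t x, std::int32_t y, std::int32_t z)
{
    std::uint32_t h = (static_cast<std::uint32_t>(x) * 73856093u)
                    ^ (static_cast<std::uint32_t>(y) * 19349663u)
                    ^ (static_cast<std::uint32_t>(z) * 83492791u);
    h ^= h >> 13;
    h *= 0x5bd1e995u;
    h ^= h >> 15;
    return static_cast<float>(h & 0xffffu) / 32767.5f - 1.0f;
}

static float Lerp(float t, float a, float b) { return a + t * (b - a); }
static float Smooth(float t) { return t * t * (3.0f - 2.0f * t); }

// Trilinear interpolation of the eight surrounding lattice values.
static float ValueNoise(const Vec3& p)
{
    const float fx = std::floor(p.x), fy = std::floor(p.y), fz = std::floor(p.z);
    const std::int32_t ix = static_cast<std::int32_t>(fx);
    const std::int32_t iy = static_cast<std::int32_t>(fy);
    const std::int32_t iz = static_cast<std::int32_t>(fz);
    const float u = Smooth(p.x - fx), v = Smooth(p.y - fy), w = Smooth(p.z - fz);
    return Lerp(w,
        Lerp(v, Lerp(u, LatticeValue(ix, iy, iz),     LatticeValue(ix + 1, iy, iz)),
                Lerp(u, LatticeValue(ix, iy + 1, iz), LatticeValue(ix + 1, iy + 1, iz))),
        Lerp(v, Lerp(u, LatticeValue(ix, iy, iz + 1),     LatticeValue(ix + 1, iy, iz + 1)),
                Lerp(u, LatticeValue(ix, iy + 1, iz + 1), LatticeValue(ix + 1, iy + 1, iz + 1))));
}

// Sums octaves of doubling frequency and halving amplitude, normalized to [-1, 1].
float Fbm(Vec3 p, int octaves = 6)
{
    assert(octaves > 0);
    float sum = 0.0f, amplitude = 1.0f, total = 0.0f;
    for (int i = 0; i < octaves; ++i)
    {
        sum += amplitude * ValueNoise(p);
        total += amplitude;
        amplitude *= 0.5f;
        p = Vec3{ p.x * 2.0f, p.y * 2.0f, p.z * 2.0f };
    }
    return sum / total;
}

// First lookup gives the turbulence; second lookup scales Y by 20 + turbulence * factor.
float GasGiantDensity(const Vec3& ray, float factor)
{
    const float warp = Fbm(Vec3{ ray.x * 4.0f, ray.y * 4.0f, ray.z * 4.0f });
    return Fbm(Vec3{ ray.x, ray.y * (20.0f + warp * factor), ray.z });
}
```

With factor set to 0 the bands stay perfectly straight; increasing it lets the first lookup bend them.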
The color table at the moment is quite random. I have to study colors of gas giants and how to generate "interesting" colors. The transparency is also a function of the noise value, plugged in a simple formula. I have to tweak it too.
In the future, I will also have to experiment with storms like the Great Red Spot on Jupiter, and weather effects such as the particles mentioned above, but also lightning, etc..
Atmospheric scattering:
While working on the shaders for gas giants, I realized that my scattering equations were not giving me the expected results.
I don't know if anybody has wondered.. imagine that a gas giant didn't have any clouds - just imagine a huge ball of gas in space. What would it look like ?
Let's say the camera is in the upper atmosphere of the gas giant. The sky would appear like it does on Earth: fading from black to the sky color ( blue on Earth ), with some white haze at the horizon..
But there would be no ground. So what would you see underneath ?
Scattering equations, if I'm not mistaken, say that it should fade back to black. The extinction coefficients depend on how much atmosphere is traversed, and underneath the camera.. there's a lot of atmosphere to traverse.. so it should be black.
Unfortunately, in my previous atmospheric scattering implementation, I saw an ocean of... white. Certainly, I was wrong somewhere.
That's pretty much the point at which I decided to rewrite my atmospheric scattering shaders from scratch, carefully watching the results of all the terms of the equations.. and in GLSL !
GLSL:
Until now, I've been using assembly shaders ( in OpenGL, ARB_vertex_program and ARB_fragment_program ). Usually, it was good enough, as shaders were quite simple ( a few tens of instructions at most ). But scattering shaders are a whole new beast, requiring ray/sphere intersections and the evaluation of complex equations. I was losing too much time debugging my ASM shaders, so I decided that it was time to switch to GLSL..
I expected the GLSL implementation to be straightforward.. how wrong I was. I spent pretty much my whole last week on it. For some reason, the GLSL interface works in a pretty different way than ASM shaders; in particular, shader constants do not belong to a specific shader ( vertex shader, or pixel shader ); but to a program, which is a pair vertex + pixel shader.
It's probably not clear, but let's take an example. You have a vertex shader called VertexShader1 and a pixel shader called PixelShader1. VertexShader1 uses a constant called AtmosphereParams. To assign a value to this constant, you have to query the location of this constant in the program.. the location is an ID, so maybe it'll return 1.
Now, let's say that you use the same vertex shader but a different pixel shader: PixelShader2. Well, here is the problem: if you query the location of the constant AtmosphereParams for this new pair ( VertexShader1 + PixelShader2 ), it might return a different location ID, even though the vertex shader hasn't changed.
Fixing the issue required introducing the concepts of "virtual" constant IDs, and "real" constant IDs, and mapping one to another in real-time depending on which pair "vertex shader + pixel shader" is bound..
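One way such a virtual-to-real mapping could be organized is sketched below in C++. This is a guess at the general shape, not the engine's implementation: ConstantTable, VirtualId, RealLocation and LocationQuery are hypothetical names, and the query callback stands in for a per-program lookup such as glGetUniformLocation so that the example runs without an OpenGL context.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ProgramId = unsigned int;

// Stand-in for a per-program query such as glGetUniformLocation(program, name).
using LocationQuery = std::function<int(ProgramId, const std::string&)>;

class ConstantTable
{
public:
    explicit ConstantTable(LocationQuery query) : m_query(std::move(query)) {}

    // Returns a stable "virtual" ID for a constant name, independent of any program.
    int VirtualId(const std::string& name)
    {
        const auto it = m_virtualIds.find(name);
        if (it != m_virtualIds.end())
            return it->second;
        const int id = static_cast<int>(m_names.size());
        m_names.push_back(name);
        m_virtualIds.emplace(name, id);
        return id;
    }

    // Resolves a virtual ID to the "real" location in the given program,
    // querying once per (program, constant) pair and caching the answer.
    int RealLocation(ProgramId program, int virtualId)
    {
        assert(virtualId >= 0 && static_cast<std::size_t>(virtualId) < m_names.size());
        const auto key = std::make_pair(program, virtualId);
        const auto it = m_real.find(key);
        if (it != m_real.end())
            return it->second;
        const int location = m_query(program, m_names[static_cast<std::size_t>(virtualId)]);
        m_real.emplace(key, location);
        return location;
    }

private:
    LocationQuery m_query;
    std::map<std::string, int> m_virtualIds;
    std::vector<std::string> m_names;
    std::map<std::pair<ProgramId, int>, int> m_real;
};
```

The rest of the engine would only ever hold the virtual ID for AtmosphereParams; the real location is resolved at bind time for whichever vertex + pixel pair is active.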
Once my GLSL implementation was complete, I started to experiment with the GLSL language, and so far no problem with it, it's pretty easy and much easier than coding everything in assembly.
| |
| Wednesday, February 14, 2007 |
 Planetary textures, website, bump maps, SDD.. |
Posted - 2/14/2007 11:56:43 AM | On the website:
As you may have noticed, yesterday the website was moved to a new server with a VPS ( virtual private server ). This will allow me to monitor resource usage ( cpu usage, memory usage, etc.. ) and should hopefully give a slight performance boost ( the website should be more responsive ). Time will tell if it makes a real difference.
On the SDD:
The new Ships Design Doc ( which should really be renamed to something else, as it's no longer specific to ships ) is still in the works. It takes a lot of time, but I usually write 1 or 2 pages every day. With a bit of luck, it should be done "soon".
Bump maps:
A few days ago I released a new version of the ASEToBin tool, to visualize models in the I-Novae engine. The GUI has been changed for a new panel interface; it is now possible to put the viewport in full screen, and it now works with wide fonts.
I noticed something strange when playing with Betelgeuze's planetary textures and the new Istari textures. There were a lot of "blocky" artifacts similar to jpeg artifacts, especially with specular lighting shaders, but the original textures were perfect. I spent a few hours tracking down the "problem", only to realize that the artifacts were coming from the automatic compression of normal maps, which had been left enabled in the shaders. Changing a value from "true" to "false" fixed the quality problem. See this thread to compare screenshots before/after.
Planetary textures:
Betelgeuze is working on improving his textures for the planetary terrain engine. With the "fixed" bump mapping, it's looking pretty nice.. here are some examples:
From top to bottom:
Arean1
Arean2
Cytherean1
Cytherean2
Hephaestian1 (note: there's already a new version of this texture, but I lost it)
Hephaestian2
Gaian1
Gaian2

Just keep in mind that those are the "base" textures, that later get combined with procedural noise by the terrain engine. A single pixel of the terrain can be a combination of 6 of those "base" textures.
| |
| Tuesday, February 13, 2007 |
 Shadowing part II |
Posted - 2/13/2007 5:30:46 AM | Shadow mapping acne:
In the previous thread about Shadowing, i explained the basic implementation of Shadow maps.
I quickly mentioned shadow acne and how it can be solved relatively well. The key point is to apply a z-offset to the vertices in shadow map space, at the shadow-map render-to-texture stage.
I used to multiply pos.w ( pos being the 4D vector containing the vertex position ) by a constant like 0.99; it solved some of those issues, but it was not good enough. In particular, the constant was never perfect in all cases. For example when the light rays were almost parallel with a polygon, acne was horrible.
Reducing the constant to 0.98, 0.97, etc.. fixed the problem of parallel rays, at the cost of offsetting the shadows of orthogonal polygons too far. The shadows did not match the geometry anymore, and the objects seemed to fly a bit above the shadows.
The main trick to solve all those issues is to not use a constant, but to use a variable that is a function of the slope between the light rays and the surface. I'm now simply linearly interpolating between two constants ( say, 0.95 for parallel rays, and 0.999 for orthogonal rays ), and use the interpolated value to multiply pos.w with.
The assembly code looks like this:
# ZFactor blend factor depending on slope compared to light direction:
TEMP lightDirObj; # light direction in object space
DP3 lightDirObj.x, mv[0], lightDir;
DP3 lightDirObj.y, mv[1], lightDir;
DP3 lightDirObj.z, mv[2], lightDir;
DP3 lightDirObj.w, inNormal, lightDirObj;
MAX lightDirObj.w, lightDirObj.w, 0.0;
MAD lightDirObj.w, zfactor.x, lightDirObj.w, zfactor.y;
MUL pos.w, pos.w, lightDirObj.w;
Note: in the above code, mv is the modelview matrix, lightDir is the light direction in world space ( it is assumed to be a directional, infinite light ), and inNormal is the normal in object space. Zfactor is a vector that contains in the X register the ZFactor value for parallel rays, and in the Y register the ZFactor value for orthogonal rays.
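The same slope-dependent factor can be written out on the CPU side as a plain linear interpolation. The C++ sketch below is only an illustration: Vec3, Dot and SlopeZFactor are hypothetical names, both vectors are assumed to be unit length and in the same space, and the sign convention of the light direction is assumed to match the one used by the assembly. To express the interpolation as a single multiply-add of the form a * d + b, one would use a = zOrthogonal - zParallel and b = zParallel; how the engine actually packs its zfactor constant is not shown beyond the note above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float Dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Returns the factor to multiply pos.w by when rendering into the shadow map.
// zParallel is used when the light grazes the surface (dot product of 0),
// zOrthogonal when the light hits it head-on (dot product of 1).
float SlopeZFactor(const Vec3& normal, const Vec3& lightDir,
                   float zParallel, float zOrthogonal)
{
    assert(std::fabs(Dot(normal, normal) - 1.0f) < 1e-3f);
    assert(std::fabs(Dot(lightDir, lightDir) - 1.0f) < 1e-3f);
    const float d = std::max(Dot(normal, lightDir), 0.0f); // clamp back-facing to 0
    return zParallel + d * (zOrthogonal - zParallel);
}
```

For example, with the constants quoted above ( 0.95 and 0.999 ), a surface whose normal makes a 60 degree angle with the light direction ( dot product of 0.5 ) gets a factor of 0.9745.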
Shadow mapping experiments:
As i explained in the first part, standard shadow maps suffer from resolution problems, especially when you want them to cover a large area.
A very obvious improvement is, instead of mapping the shadow map to the whole scene, to map it to a portion of the scene: the one centered on the camera ( that's where you want to see shadows ! ), and with a limited view distance. That's probably good enough for an FPS game; if you limit the distance at which shadows appear, say to 200m away from the camera, you'll get an "okay" resolution.. along the same lines, an RTS, with a top-down / ortho camera, will get a pretty much perfect resolution since you can modify the shadow map area to fit perfectly within the frustum of the camera.
Another improvement is adapting the shadow map area, not to fit a certain area around the camera, but to fit the camera frustum itself. This might help a bit, but because the resolution is no longer constant, you might get a good resolution in some views, and a horrible one in other views. That's also the problem with all the perspective-tweaking techniques ( perspective shadow maps, light-space shadow maps ( LSSM ), trapezoidal shadow maps ).
LSSM is a technique i was very interested in, that didn't seem too hard to implement, until i realized it suffered from the same problems as the other techniques. It's usually good ( hell, even.. excellent! ) but there's always a way to find a view where the quality is *worse* than standard shadow maps. Because the resolution is dependent on the frustum, shadow texels flicker and change every time the camera moves/turns (ie. all the time).
Per-object shadow maps:
Another idea that i had many months ago was to use standard shadow maps, but to assign shadow maps to each object instead of the whole scene. Then, for each frame, the resolution of the shadow maps would be re-calculated based on the distance of the object and the camera. An object that is small ( in dimensions ) and far away would get, say, a 64x64 shadow map; an object that is big ( in dimensions ) and far away would get a 256x256; an object that is small and close would get a 512x512; an object that is big and very close would get a 1024x1024 or more.. you see the idea.
I implemented this algorithm using a cache / pool of shadow maps, re-cycling shadow maps usage to avoid re-creating them every time, but overall the performance was extremely bad. The reason was simply the explosion of the number of passes, and the amount of geometry to draw. As soon as an object's shadow map needed to be re-calculated, all the objects in the frustum had to be re-rendered. Imagine a scene with 50 objects near you, that need to be updated, and 1000 objects a bit further away, but that all fall in the frustum of some of those 50 objects. That's an insane amount of geometry to re-render.. and it killed the framerate.
In addition, if an object was really, really big ( say, 2 Km big ), there was no way it could get a high enough resolution to have high quality shadows, even using a 2048x2048 on it ( which would give 1 texel per meter ).
Cascaded shadow maps:
My last words on the topic will be about the technique i'm currently using ( which is still nowhere near perfect ), cascaded shadow maps. The idea ( and the implementation ) behind cascaded shadow maps is quite easy: instead of using one shadow map centered around the camera, with a range of 200m ( which would give an "okay" resolution, but with a range that is too close ), we can use N shadow maps, all centered on the camera, but each of them using an increasing maximum range: 200m (shadow map #0), 400m (shadow map #1), 800m (shadow map #2), 1600m (shadow map #3).
Of course, increasing the number of shadow maps means rendering the scene N times rather than once. It's even worse than you'd think, because the frustum of each shadow map also gets bigger and bigger, so as N increases, the amount of scene geometry that needs to be rendered to each shadow map increases. The biggest level of the shadow map (the 4th one when N=4) requires rendering almost the whole scene. And here, frustum culling will not help since we're rendering from the light's point of view ( which affects the whole scene ).
An intermediate solution is to use two levels of shadow maps: a high quality, very precise one around the camera ( say a 2048x2048 for a 200m distance ), and a rough one ( say a 2048x2048 for a 2000m distance ). This might work more or less well depending on your game; the larger the step up in range from the previous level, the more trouble you'll have hiding the "transition" between the levels. However, performance will be better than with more levels.
In Infinity, i'm pretty much stuck in a dead-end. The images in this thread use N=4 ( 4 shadow maps ), covering larger and larger areas. Here are the values:
- for level 1, SM is 1024x1024 up to a distance of 150m.
- for level 2, SM is 1024x1024 for distances between 150m and 500m.
- for level 3, SM is 1024x1024 for distances between 500m and 1666m.
- for level 4, SM is 1024x1024 for distances between 1666m and 5555m.
After 5.5 Km, shadows disappear. To hide this, i'm using alpha blending with a factor that is a function of the distance, so that it fades out smoothly.
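The level selection and the distance fade can be sketched in C++ as follows. The four far distances are the values listed above; everything else is an assumption made for the example, in particular the names (kCascadeFar, SelectCascade, ShadowFade), the linear shape of the fade and the fadeStart distance, since the actual fade function isn't given.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Far distance of each shadow map level, in meters (values quoted in the post).
constexpr std::array<float, 4> kCascadeFar = { 150.0f, 500.0f, 1666.0f, 5555.0f };

// Returns the 0-based index of the level covering the given distance from the
// camera, or -1 when the distance is beyond the last level (no shadow).
int SelectCascade(float distance)
{
    assert(distance >= 0.0f);
    for (std::size_t i = 0; i < kCascadeFar.size(); ++i)
        if (distance <= kCascadeFar[i])
            return static_cast<int>(i);
    return -1;
}

// Shadow opacity: 1 near the camera, fading linearly to 0 at the far distance
// of the last level. fadeStart is a hypothetical tuning parameter.
float ShadowFade(float distance, float fadeStart = 4500.0f)
{
    const float fadeEnd = kCascadeFar[kCascadeFar.size() - 1];
    if (distance <= fadeStart)
        return 1.0f;
    if (distance >= fadeEnd)
        return 0.0f;
    return (fadeEnd - distance) / (fadeEnd - fadeStart);
}
```

Note that each level's far distance is roughly 3.3 times the previous one, so shadow resolution drops by about that factor at every transition.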
At this point you're probably wondering.. "okay, but why do you need shadows to appear as far as 5.5 Km" ? The answer is bad: i actually need them to appear at up to 50 Km, and i'm not happy at all with this 5.5 Km limit.
Why is my target 50 Km ? Because i use a planetary engine with an unlimited view distance. It's not impossible to have mountains that are up to 10 Km of altitude, and that cast shadows very far away, especially at sunsets. In addition, some of our spaceships are *huge*. The Flamberge ( the biggest ship in the screenshots ) is 5 Km long. With a view distance limited to 5.5 Km, that means the Flamberge can cast shadows at no more than its own length !
Then, you might ask, why don't you start with a larger range for the first level ? For instance, the first shadow map level could have a range of 1000m instead of 150m. Of course, i can do that, but this would mean losing resolution for small objects. Some vehicles are only a few meters long, and i'd like their shadows to have a nice shape, and not be a horrible blurry blob of pixel(s).
Increasing N to 8 would probably allow me to reach this 50 Km distance target except.. that performance would suffer a lot. And it already does with N = 4, so i'm quite afraid to go that way...
Packed cascaded shadow maps:
To end this (long) topic on shadows, i'll speak quickly about another optimization i added to cascaded shadow maps. For N=4, you can pack 4 1024x1024 shadow maps into a single 2048x2048 one. It's then just a matter of determining per-pixel ( not per-vertex, or you'll get wrong results due to the linear interpolation ) which of the 4 areas to sample. This is very handy, especially when you add softening of shadow samples ( taking 2x2 samples to blur the shadows a bit ).
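The coordinate remapping for the packed texture is simple enough to show in a few lines of C++. The 2x2 tile layout below ( level 0 at the origin, level 1 to its right, levels 2 and 3 on the second row ) is an assumption, as are the names Vec2 and AtlasCoord; in the engine this would run per-pixel in the shader.

```cpp
#include <cassert>

struct Vec2 { float u, v; };

// Maps a coordinate in [0, 1] x [0, 1] inside one shadow map level (0..3) to
// the matching coordinate in a 2x2 packed atlas of twice the resolution.
Vec2 AtlasCoord(int cascade, Vec2 local)
{
    assert(cascade >= 0 && cascade < 4);
    const float offsetU = static_cast<float>(cascade % 2) * 0.5f;
    const float offsetV = static_cast<float>(cascade / 2) * 0.5f;
    return { offsetU + local.u * 0.5f, offsetV + local.v * 0.5f };
}
```

When taking several samples to soften the shadows, the sample positions probably need to be clamped to the tile they belong to, otherwise samples near a tile border would read from a neighbouring level.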
Caching shadow maps does not seem to work well. One of my motivations for going for cascaded shadow maps was the hope that i'd be able to update the first shadow map level every frame, but only every 2 frames for the 2nd level, every 4 frames for the 3rd one, and every 8 frames ( or so ) for the 4th one. This doesn't work at all, simply because you can have moving objects in the scene, which cause all the shadow maps that are not updated every frame to be incorrectly projected onto the geometry; then, objects appear either completely in shadow, or completely in light, depending on the direction the object is traveling versus the camera.. So, forget about caching shadow maps: you'll have to recalculate all the shadow maps every frame anyway, even if you're not using algorithms derived from perspective shadow maps.
Conclusion:
I'm currently using the packed cascaded shadow maps algorithm. Performance is pretty bad ( relatively.. i still get more than 60 fps in the screenshots shown in this thread. But the performance hit over no shadows is like 100-150% ). The range is limited to 5.5 Km and i'm not sure how to increase it ( without killing the performance ). All the other shadow mapping techniques have their own limitations. Maybe a hybrid technique would work better; for example, i'm considering experimenting with a cascaded light space shadow map technique ! By using two levels of shadow maps, each shadow map being rendered with LSSM, i could maybe achieve much better results at a better framerate ! Unfortunately, all those experiments take a lot of time, so who knows what will happen in the future ?



| |
| Thursday, February 1, 2007 |
 Fast Perlin noise |
Posted - 2/1/2007 9:12:34 AM | I'm here going to present a "tweak" to the Perlin noise function that multiplies its speed by a factor of 7.
The original Perlin noise function that is used is Perlin's Improved noise function.
Of course, a speed increase of x7 means that we're going to lose some nice properties / accuracy of the original noise, but everything is a trade-off in life, isn't it ? Whether you want to use the fast version instead of the original version is up to you and depends on whether you need performance over quality or not..
On a Pentium 4 @ 3 GHz, Improved noise as implemented directly in C++ from the code linked above takes 7270 milliseconds to generate 1024x1024 samples of fBm noise ( the basis function being Improved noise ) with 12 octaves. In other words, for each sample, the Improved noise function is called 12 times. I'm using the 3D version of the algorithm. I ran the test 8 times and averaged the results to get something meaningful.
The first thing that can be improved in the Improved noise are the casts from float-to-int:
const TInt xtr = floorf(xyz.x);
const TInt ytr = floorf(xyz.y);
const TInt ztr = floorf(xyz.z);
Those are really, really.. REALLY bad. Instead you can use a bit of assembly:
__forceinline TInt __stdcall MFloatToInt(const TFloat x)
{
    TInt t;
    __asm fld x     // push x onto the x87 FPU stack
    __asm fistp t   // store as an integer using the current rounding mode, then pop
    return t;
}
and replace the floor calls by:
const TInt xtr = MFloatToInt(xyz.x - 0.5f);
const TInt ytr = MFloatToInt(xyz.y - 0.5f);
const TInt ztr = MFloatToInt(xyz.z - 0.5f);
The same performance test ( 1024x1024 @ 12 octaves ) now takes 3324 milliseconds, a 118% improvement.
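For compilers without MSVC-style inline assembly, a similar shortcut can be sketched in standard C++ with std::lrintf, which also converts using the current rounding mode ( round-to-nearest by default ). FastFloor is a hypothetical name, and no claim is made here about how its speed compares to the assembly version. The sketch also makes the trade-off visible: subtracting 0.5 and rounding matches floor for typical inputs, but an exactly integral input lands on a tie.

```cpp
#include <cassert>
#include <cmath>

// Approximates floor(x) by rounding x - 0.5 to the nearest integer.
// Caveat: when x is an exact integer, x - 0.5 is a tie; under the default
// round-to-nearest-even mode 1.0f gives 0 while 2.0f gives 2.
inline long FastFloor(float x)
{
    assert(std::isfinite(x)); // the conversion is not meaningful for inf/NaN
    return std::lrintf(x - 0.5f);
}
```

For noise lookups this imprecision on exact lattice points is part of the accuracy being traded for speed.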
The next trick is an idea from a co-worker, Inigo Quilez, who's heavily involved in the 64 KB-demo coding scene. His idea is simple: instead of computing a gradient, just replace it with a lookup table.
The original code looked like this:
return(_lerp(w, _lerp(v, _lerp(u, _grad(ms_p[8][AA], x, y, z),
                                  _grad(ms_p[8][BA], x - 1, y, z)),
                         _lerp(u, _grad(ms_p[8][AB], x, y - 1, z),
                                  _grad(ms_p[8][BB], x - 1, y - 1, z))),
                _lerp(v, _lerp(u, _grad(ms_p[8][AA + 1], x, y, z - 1),
                                  _grad(ms_p[8][BA + 1], x - 1, y, z - 1)),
                         _lerp(u, _grad(ms_p[8][AB + 1], x, y - 1, z - 1),
                                  _grad(ms_p[8][BB + 1], x - 1, y - 1, z - 1)))));
The "fast" version looks like this:
return(_lerp(w, _lerp(v, _lerp(u, ms_grad4[AA],
                                  ms_grad4[BA]),
                         _lerp(u, ms_grad4[AB],
                                  ms_grad4[BB])),
                _lerp(v, _lerp(u, ms_grad4[AA + 1],
                                  ms_grad4[BA + 1]),
                         _lerp(u, ms_grad4[AB + 1],
                                  ms_grad4[BB + 1]))));
Here, "ms_grad4" is a 512-entry lookup table that contains absolutely random float values in the [-0.7; +0.7] range.
It is initialized like this:
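static TFloat ms_grad4[512];

TFloat kkf[256];
for (TInt i = 0; i < 256; i++)
    kkf[i] = -1.0f + 2.0f * ((TFloat)i / 255.0f);   // 256 evenly spaced values in [-1, +1]
for (TInt i = 0; i < 256; i++)
{
    // Shuffle the values through the permutation table ms_p2 and scale to [-0.7, +0.7].
    // Only entries 0..255 are filled in this snippet; if indices such as AA + 1 can
    // exceed 255, the upper half needs the same values ( ms_grad4[i + 256] = ms_grad4[i] ).
    ms_grad4[i] = kkf[ms_p2[i]] * 0.7f;
}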
At this point you're maybe wondering why 0.7 ? This is not clear to us yet; we're suspecting it has something to do with the average value you can get from a normal Perlin noise basis. We tried a range of [ -1; +1 ] originally, but we found that the fBm was saturating to black or white too often, hurting quality by a lot.
This "fast" version of noise runs in 1027 milliseconds, a 223% improvement compared to the previous tweak, or a 607% improvement compared to the original Improved noise version.
As for quality, well, everything has a price. There are no visible artifacts in the fast noise, but the features don't seem to be as regular and well distributed as in the improved noise. The following image has been generated with the same frequency/scaling values. The features appear different since the lookup table holds different values in the two versions, but that's normal.
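The percentages quoted for the three timings all follow the same convention, ( before / after - 1 ) x 100, which the short C++ check below reproduces. ImprovementPercent is a hypothetical helper written for this example.

```cpp
#include <cassert>

// Percentage improvement when a run drops from beforeMs to afterMs:
// (before / after - 1) * 100.
double ImprovementPercent(double beforeMs, double afterMs)
{
    assert(beforeMs > 0.0 && afterMs > 0.0);
    return (beforeMs / afterMs - 1.0) * 100.0;
}

// ImprovementPercent(7270.0, 3324.0) is about 118.7 (float-to-int tweak)
// ImprovementPercent(3324.0, 1027.0) is about 223.7 (lookup table vs. previous tweak)
// ImprovementPercent(7270.0, 1027.0) is about 607.9 (overall, i.e. roughly 7.08x faster)
```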
http://www.fl-tw.com/Infinity/Media/Misc/fastnoise.jpg
| |