 Clouds generation |
Posted - 7/24/2005 5:22:57 PM | I'm now working on generating textures for planets at a global scale (i.e. the textures you see from space).
I just finished a short experiment with clouds, and i like the result. I am combining two distorted fBm noise functions. A distorted fBm function is like your standard fBm, but instead of using Perlin noise as its basis, it uses distorted Perlin noise. It looks like this:
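// Noise evaluated at a position that is itself offset by a noise vector.
TFloat CNoise::distNoise(const SVec3D& pos, const TFloat distortion)
{
    return noise(pos + distortion * vecNoise(pos + 0.5f));
}

// Three noise samples taken at shifted positions, packed into a vector.
SVec3D CNoise::vecNoise(const SVec3D& pos)
{
    SVec3D res;
    res.x = noise(pos);
    res.y = noise(pos + 3.33f);
    res.z = noise(pos + 7.77f);
    return res;
}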
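For readers who want to experiment, here is a minimal, self-contained sketch of how a distorted fBm could be built on top of distNoise. It is not the engine's code: the Vec3 type, the hash-based value noise (a stand-in for Perlin noise), and the octaves, lacunarity and gain parameters are all assumptions made for illustration. How the two distorted fBm functions are combined for the clouds is not stated above, so the sketch stops at a single fBm.

```cpp
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };  // hypothetical stand-in for SVec3D

static Vec3 operator+(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator+(const Vec3& a, float s) { return {a.x + s, a.y + s, a.z + s}; }
static Vec3 operator*(float s, const Vec3& a) { return {s * a.x, s * a.y, s * a.z}; }

// Hash an integer lattice point to a value in [-1, 1].
// This is plain value noise, used here only as a placeholder basis.
static float latticeValue(int32_t x, int32_t y, int32_t z)
{
    uint32_t h = static_cast<uint32_t>(x) * 73856093u
               ^ static_cast<uint32_t>(y) * 19349663u
               ^ static_cast<uint32_t>(z) * 83492791u;
    h ^= h >> 13; h *= 0x5bd1e995u; h ^= h >> 15;
    return static_cast<float>(h & 0xffffu) / 32767.5f - 1.0f;
}

static float smooth(float t) { return t * t * (3.0f - 2.0f * t); }
static float mix(float a, float b, float t) { return a + t * (b - a); }

// Trilinearly interpolated value noise in [-1, 1].
float noise(const Vec3& p)
{
    const float fx = std::floor(p.x), fy = std::floor(p.y), fz = std::floor(p.z);
    const int32_t ix = static_cast<int32_t>(fx);
    const int32_t iy = static_cast<int32_t>(fy);
    const int32_t iz = static_cast<int32_t>(fz);
    const float tx = smooth(p.x - fx), ty = smooth(p.y - fy), tz = smooth(p.z - fz);
    auto v = [&](int dx, int dy, int dz) { return latticeValue(ix + dx, iy + dy, iz + dz); };
    const float x00 = mix(v(0, 0, 0), v(1, 0, 0), tx);
    const float x10 = mix(v(0, 1, 0), v(1, 1, 0), tx);
    const float x01 = mix(v(0, 0, 1), v(1, 0, 1), tx);
    const float x11 = mix(v(0, 1, 1), v(1, 1, 1), tx);
    return mix(mix(x00, x10, ty), mix(x01, x11, ty), tz);
}

// Same structure as the two functions in the post.
Vec3 vecNoise(const Vec3& p) { return {noise(p), noise(p + 3.33f), noise(p + 7.77f)}; }
float distNoise(const Vec3& p, float distortion) { return noise(p + distortion * vecNoise(p + 0.5f)); }

// fBm that sums octaves of distNoise instead of plain noise.
// The result is normalized by the total amplitude, so it stays in about [-1, 1].
float distortedFbm(Vec3 p, int octaves, float distortion,
                   float lacunarity = 2.0f, float gain = 0.5f)
{
    float sum = 0.0f, amplitude = 1.0f, norm = 0.0f;
    for (int i = 0; i < octaves; ++i) {
        sum += amplitude * distNoise(p, distortion);
        norm += amplitude;
        p = lacunarity * p;   // higher frequency for the next octave
        amplitude *= gain;    // lower contribution for the next octave
    }
    return norm > 0.0f ? sum / norm : 0.0f;
}
```

Calling distortedFbm with a point on the planet's unit sphere, a handful of octaves and a distortion around 1 would be a reasonable starting point for playing with the look.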
Here is an image of the resulting clouds:

| |
 Multithreading issues solved |
Posted - 7/22/2005 7:52:39 AM | This is going to be a long entry: i will talk a bit about multithreading issues in a Windows environment.
A. Heap contention
After i finished my starfield implementation, i tried to move it to a separate thread so that rendering could continue in parallel. It worked well, except that the framerate, although still very high (300 fps, it's only displaying a sky box after all), was... not smooth.
I tracked it down to the lines that temporarily allocate buffers for blurring the starfield (an operation that is done up to 30 times in the starfield generation process). The simple fact of calling malloc with a large number of bytes was enough to "suspend" the rendering thread for a few hundred milliseconds. A pause of a few hundred milliseconds every few seconds or so was causing the stuttering.
I investigated the issue a bit and found that the C runtime malloc uses a single heap. Two threads calling malloc will hence be synchronized by a mutex to avoid corrupting the heap.
After some googling, i found a good library called Hoard, which is an improved memory allocator for multithreading (and which solves the heap contention problem). I tested it and it worked flawlessly (fixing my slowdowns), but it is under the GPL. So i tried an alternative, ptmalloc2, which is under the LGPL (much better). It was a bit tricky to compile under MSVC, but in the end i got a DLL and a LIB to link my engine against, and the problems were gone too. As a bonus, ptmalloc2 is on average 4 to 5 times faster than the standard malloc, even in non-multithreaded applications!
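A different way to attack the same symptom, and not what was done above, is to stop hitting the allocator in the first place by reusing one scratch buffer per thread across the blur passes. The sketch below assumes a hypothetical ScratchBuffer helper and a stand-in blurPass function; neither comes from the engine.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: a buffer that grows on demand and is then reused,
// so repeated passes of the same size touch the heap only once.
class ScratchBuffer {
public:
    float* acquire(std::size_t count)
    {
        if (storage_.size() < count) {
            storage_.resize(count);  // the only place that may allocate
            ++growths_;
        }
        return storage_.data();
    }
    std::size_t growths() const { return growths_; }

private:
    std::vector<float> storage_;
    std::size_t growths_ = 0;
};

// One buffer per thread: no sharing, hence no locking between threads.
ScratchBuffer& threadScratch()
{
    thread_local ScratchBuffer buffer;
    return buffer;
}

// Stand-in for one blur pass that needs temporary storage.
void blurPass(std::size_t texelCount)
{
    float* tmp = threadScratch().acquire(texelCount);
    for (std::size_t i = 0; i < texelCount; ++i)
        tmp[i] = 0.0f;  // a real pass would write blurred texels here
}
```

With this layout, 30 passes over the same texture size grow the buffer once instead of calling malloc 30 times. A replacement allocator such as the ones mentioned above remains the more general fix, since it also covers allocations you do not control.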
B. Task scheduler
The Windows thread scheduler is BAD. I cannot stress it enough. One of the biggest issues in my opinion is its lack of fine-grained control over thread priorities. Ok, you've got the SetThreadPriority function, but if you have tried to use it to control your threads, you know how bad it is.
The problem with SetThreadPriority is this: given two threads X and Y, if you set X's priority to NORMAL and Y's priority to BELOW_NORMAL (one level under), you'll get approximately 90% of the CPU in X and only 10% of the CPU in Y.
Now, let's say you want 70% of your CPU in X and 30% in Y. How do you do it? Answer: you can't with SetThreadPriority. There is no combination of flags that gives you this kind of CPU balancing.
I've been aware of this problem for months, so in my engine i decided to implement my own thread scheduler. And it works surprisingly well!
The idea is the following: i create a scheduler thread which holds an array of all the threads with their associated priorities. This scheduler has a loop which sequentially picks a thread, resumes it, goes to sleep for the amount of time the thread should run, then wakes up and pauses the thread, and moves on to the next thread. For two threads it looks like this:
while (!stopped())
{
    // Let thread 0 run alone for its time slice.
    thread[0].resume();
    thread[1].stop();
    Sleep(thread[0].time);

    // Then swap: thread 1 runs alone for its own slice.
    thread[0].stop();
    thread[1].resume();
    Sleep(thread[1].time);
}
My actual code is obviously generalized. It can work with any number of threads and any number of CPUs. I might publish an article about it some day, since i hadn't found anything similar on the net the last time i checked.
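As an illustration only, the two-thread loop could be extended to N threads on a single CPU along these lines. Everything here is an assumption: the ScheduledThread struct, the RoundRobinScheduler class and the injectable sleep function are hypothetical names, and the resume/stop callbacks stand in for whatever the real thread class does (on Windows they could wrap ResumeThread and SuspendThread). Multi-CPU support is left out.

```cpp
#include <chrono>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

using Milliseconds = std::chrono::milliseconds;

// Hypothetical description of one scheduled thread.
struct ScheduledThread {
    std::function<void()> resume;  // let the thread run
    std::function<void()> stop;    // pause the thread
    Milliseconds slice;            // how long it runs per round
};

class RoundRobinScheduler {
public:
    using SleepFn = std::function<void(Milliseconds)>;

    // The sleep function is injectable so the loop can be tested without waiting.
    explicit RoundRobinScheduler(
        SleepFn sleep = [](Milliseconds d) { std::this_thread::sleep_for(d); })
        : sleep_(std::move(sleep)) {}

    // Threads start paused; only the scheduler lets them run.
    void add(ScheduledThread t)
    {
        t.stop();
        threads_.push_back(std::move(t));
    }

    // One full round: each thread runs alone for its slice, in insertion order.
    void runRound()
    {
        for (ScheduledThread& t : threads_) {
            t.resume();
            sleep_(t.slice);
            t.stop();
        }
    }

private:
    SleepFn sleep_;
    std::vector<ScheduledThread> threads_;
};
```

The scheduler thread would call runRound() in a loop until it is asked to stop. One caveat of any suspend-based design: a thread paused while holding a lock (for instance inside the memory allocator) can block the others, so the scheduled threads need to be written with that in mind.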
I use it like this:
CScheduler *scheduler = new CScheduler();
CThread *thread1 = new CThread();
CThread *thread2 = new CThread();
CThread *thread3 = new CThread();
scheduler->add(thread1, 10);
scheduler->add(thread2, 20);
scheduler->add(thread3, 3);
Assuming a single-CPU machine, thread1 will run for 10 milliseconds, then thread2 for 20 milliseconds, then thread3 for 3 milliseconds. This is equivalent to a CPU usage of roughly 30% vs 60% vs 10%.
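The percentages follow directly from the time slices: each thread gets its slice divided by the length of a full round (10 + 20 + 3 = 33 ms). A small helper, with the hypothetical name cpuShares, makes the arithmetic explicit:

```cpp
#include <numeric>
#include <vector>

// Fraction of one CPU each thread receives, given its time slice in milliseconds.
std::vector<double> cpuShares(const std::vector<int>& slicesMs)
{
    const double total = std::accumulate(slicesMs.begin(), slicesMs.end(), 0.0);
    std::vector<double> shares;
    shares.reserve(slicesMs.size());
    for (int slice : slicesMs)
        shares.push_back(total > 0.0 ? slice / total : 0.0);
    return shares;
}
```

For slices of 10, 20 and 3 ms this gives 10/33, 20/33 and 3/33, i.e. about 30.3%, 60.6% and 9.1%. Getting the 70/30 split from the SetThreadPriority example is then just a matter of picking slices such as 70 ms and 30 ms (or 7 and 3).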
Next i will be working on planet texture generation.
| |
 Starfield video |
Posted - 7/19/2005 10:25:17 AM | Lutz asked for a video of the 360° starfield, so here it is (WMV 9). Note that the exposure in that video is constant, as i haven't coded the automatic exposure yet.
And the traditional screenshot:

| |
 Cube map blurring |
Posted - 7/18/2005 8:18:47 AM | Good news, i've finally got a working version of cube map blurring (required to have seamless edges between the faces of the cube). Initially i implemented something similar to ATI's cube gen, but this was 500 times slower than my original blur (without edge fix). It took a lot of effort, but i successfully added the edge fix to my original blur. This involved drawing lots of diagrams on paper to find all the possible configurations, and adjusting the sampled texels around the borders of each cube image. A real pain to code. But now i've got a nice 360° starfield.
I'm now going to work on a few optimizations. Even if my blur is 500 times faster than ATI's, there's still room for improvement. There are a few "ifs" that can be moved outside a loop, and i can generate the blur in monochromatic space instead of blurring a real RGB image. I am trying to get down to a few tens of seconds at most to generate a single 1024x1024x6 cube starfield (it's still a couple of minutes right now, but better than the 2 or 3 days required with ATI's blur :)).
By the way, if you are wondering, i'm using a 2-pass blur with an O(n^2) loop, while ATI's code seems to be using a brute-force blur with an O(n^4) loop. This is only possible because i'm using a box filter (with a gaussian filter i'd have to use an O(n^3) loop).
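To make the 2-pass idea concrete, here is a minimal sketch of a separable box blur on a single flat image. It is an assumption about the general technique, not the engine's code: it works on one monochrome face, clamps at the borders instead of sampling the neighbouring cube faces, and the function names are made up. A running sum keeps the cost per texel independent of the filter radius, which is what the box filter makes possible.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Blur one line of `count` samples that are `stride` floats apart.
// A running sum is updated as the box slides, so each output costs O(1).
static void boxBlurLine(const float* src, float* dst, int count, int stride, int radius)
{
    // Clamp-to-edge addressing; a cube map version would fetch from adjacent faces.
    auto at = [&](int i) {
        const int clamped = std::max(0, std::min(i, count - 1));
        return src[static_cast<std::ptrdiff_t>(clamped) * stride];
    };

    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i)
        sum += at(i);

    const float inv = 1.0f / static_cast<float>(2 * radius + 1);
    for (int i = 0; i < count; ++i) {
        dst[static_cast<std::ptrdiff_t>(i) * stride] = sum * inv;
        sum += at(i + radius + 1) - at(i - radius);  // slide the box by one texel
    }
}

// Two passes: rows first (image -> tmp), then columns (tmp -> image).
void boxBlur(std::vector<float>& image, int width, int height, int radius)
{
    std::vector<float> tmp(image.size());
    for (int y = 0; y < height; ++y)
        boxBlurLine(&image[y * width], &tmp[y * width], width, 1, radius);
    for (int x = 0; x < width; ++x)
        boxBlurLine(&tmp[x], &image[x], height, width, radius);
}
```

The temporary buffer allocated in boxBlur is the kind of allocation discussed in the heap contention entry, so in a threaded generator it would be worth allocating it once and reusing it.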
| |
 HDRI Starfield part II |
Posted - 7/14/2005 8:42:09 AM | This is more or less the final version of my first starfield:

I'm pretty happy with it now. It is composed of 4 main layers, each similar to the one i posted in my previous entry. I still have to fit in some dark clouds and make the parameters depend on an input table, but the code is pretty much done.
The next step is to create a separate thread to calculate the textures, to use a disk cache (to avoid regenerating them every time), and maybe to add a kind of additional glow/lens effect to the very bright stars.
| |
 HDRI and starfield |
Posted - 7/13/2005 11:52:56 AM | A short update. I've continued to work on HDRI, and i'm trying to produce a starfield in high-dynamic range. So far here is one result:

It's only one layer of the final composite starfield. The final starfield will have a much higher level of complexity. But i have to play with a lot of parameters for it to look good. Choosing good colors, contrast and glow strengths is the key.
| |
 Many things (or i can't think of a good subject) |
Posted - 7/11/2005 5:58:56 AM | As seen on MMORPG.com, this made my day:
Quote:
Having an original and innovative idea merged with a strong storyline and theme is more valuable than having the technical ability to make it happen.
|
Last Saturday, i went to the cinema to watch "War of the Worlds". I wasn't disappointed. Not the film of the century, but a good movie with fantastic CGI and action. A bit less cliché than standard Hollywood movies too. I was surprised by the number of old people who came to see it.
While i was in town, i also went to a bookstore to buy "The Hitchhiker's Guide to the Galaxy". I've started reading the first chapters, and i must say, this book is hilarious :) Can't wait to read more.
| |
 HDRI thoughts |
Posted - 7/10/2005 5:35:12 PM | I've been trying for a few days to implement correct HDRI in my engine.
I'm giving up. Not because of a technical difficulty, but because there are not enough current-generation video cards able to do it.
ATI cards, for example. Yes, you can find nice HDRI demos running on ATI cards... but they are just simple demos. In particular, my engine uses per-pixel lighting/shadowing, and requires multiple additive passes (one per light). Correct HDRI requires floating point buffers. But ATI cards cannot do blending AND floating point buffers at the same time.
I guess the R520 will support it, but i'm not gonna lose sleep now over something that will "potentially" appear in 6 months to a year (depending on more or less good driver support).
So, at the moment i will try to support a fake HDRI version. My plan is the following one:
1. Some textures will be encoded as RGBE, and decoded to floating point in a pixel shader.
2. Some constants (like the light colors) will be passed unclamped to the pixel shader.
3. The pixel shader will perform the per-pixel operations and scale the final color depending on the camera exposure. The result is then clamped into [0-1] (so i'm forgetting tone-mapping here :() and written to a standard fixed point RGBA buffer.
4. Repeat this process with additive blending for all the remaining passes (note: at each step the result will be clamped to [0-1] which is not good.. but no choice).
5. Once all the passes are rendered, get the HDRI buffer, keep any pixel with R, G or B equal to 1, and set all the other ones to black (this avoids having bloom on the whole image, which is ugly in my opinion).
6. Blur the buffer, add it back to the color buffer.
The thing that worries me the most is step 5: because everything is clamped to [0-1], some colors which are logically different will end up with the same bloom color. But i'll see if the results are "good enough".
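The per-pixel parts of this plan (steps 1, 3 and 5) can be sketched on the CPU to see the clamping problem in action. This is only an illustration under assumptions: the Color struct and function names are made up, the RGBE layout is assumed to be the Radiance-style shared exponent, and in the engine this logic would live in pixel shaders rather than C++.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Color { float r, g, b; };  // hypothetical floating point color

// Step 1: decode an RGBE texel (three 8-bit mantissas plus a shared exponent).
// Assumes the Radiance convention: value = mantissa * 2^(e - 136).
Color decodeRGBE(uint8_t r, uint8_t g, uint8_t b, uint8_t e)
{
    if (e == 0)
        return {0.0f, 0.0f, 0.0f};
    const float scale = std::ldexp(1.0f, static_cast<int>(e) - (128 + 8));
    return {r * scale, g * scale, b * scale};
}

// Step 3: scale by the camera exposure, then clamp to [0, 1]
// as a fixed point RGBA buffer would.
Color exposeAndClamp(const Color& c, float exposure)
{
    auto saturate = [](float v) { return std::min(1.0f, std::max(0.0f, v)); };
    return {saturate(c.r * exposure), saturate(c.g * exposure), saturate(c.b * exposure)};
}

// Step 5: keep only pixels where at least one channel hit the clamp.
Color brightPass(const Color& c)
{
    const bool saturated = c.r >= 1.0f || c.g >= 1.0f || c.b >= 1.0f;
    return saturated ? c : Color{0.0f, 0.0f, 0.0f};
}
```

Feeding (2.0, 0.5, 0.0) and (4.0, 0.5, 0.0) through exposeAndClamp with an exposure of 1 yields (1.0, 0.5, 0.0) in both cases, so the two pixels get the same bloom color even though one is twice as bright. That is exactly the loss of information the paragraph above worries about.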
| |
 Space engine: update |
Posted - 7/5/2005 4:40:20 PM | Over the last few days i've been doing some bug-fixing and parameter tweaking for my space engine. I've now stabilized my physics, database and networking interactions. I still have a list of small bugs to fix, which will take a few more days.
I've started studying high-dynamic range rendering to implement it in my engine. I've been fighting against OpenGL and its extension mechanism for many hours.
So here's the time for the OpenGL rant.
First, i consider myself an OpenGL expert. I develop vertex and fragment shaders daily at my job. I've been using a lot of advanced extensions for years, so i'm definitely not a newbie. And OpenGL has been my API of choice for the last 5 years, but it's getting really annoying. The number of extensions is always growing, to maintain backwards compatibility. This means that today, you've got up to a hundred extensions listed in the extension string, both on NVidia and ATI cards. Most of these extensions come in two or three versions, some with subtle differences and restrictions.
While implementing floating point pbuffers today, i discovered that you need to create a floating point texture on the GL_TEXTURE_RECTANGLE target only.. even if its dimensions are square. That's right: using GL_TEXTURE_2D will generate a nice GL error.
So let's see how it affects my engine now: i've got a nice, generic interface, to generate a renderable buffer. One of the constructor arguments is the format for this buffer: it can be a standard unsigned byte RGBA buffer, or a floating point RGBA buffer. Now, depending on the case, it will internally either generate a 2D texture, or a RECTANGLE texture.
But extensions are soooo wonderful in OpenGL, that the texture coordinates are not handled the same way for 2D or for RECT textures: texture coordinates are normalized for 2D ones, but non-normalized for rectangular textures.
As a result, the user has to test whether OpenGL is using a 2D or a rectangular texture, and must adjust his coordinates accordingly. In theory, everything was in place for this to be coherent and hidden from the user.
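One way to keep this detail out of user code is a small conversion helper inside the render buffer wrapper. The sketch below is hypothetical (the TextureTarget enum, TexCoord struct and toTargetCoords function are made-up names, and it makes no GL calls); it only shows the coordinate adjustment: normalized [0, 1] coordinates pass through for 2D textures and are scaled to texel units for rectangle textures.

```cpp
// Hypothetical mirror of the two GL targets the buffer can end up using.
enum class TextureTarget { Texture2D, TextureRectangle };

struct TexCoord { float s, t; };

// Convert normalized [0, 1] coordinates to what the given target expects.
// 2D textures take normalized coordinates; rectangle textures take
// coordinates in texels, i.e. [0, width] x [0, height].
TexCoord toTargetCoords(TexCoord normalized, TextureTarget target, int width, int height)
{
    if (target == TextureTarget::TextureRectangle)
        return {normalized.s * static_cast<float>(width),
                normalized.t * static_cast<float>(height)};
    return normalized;
}
```

The same scaling has to happen wherever coordinates are produced, including in shaders that sample the buffer, so passing the texture size to the shader as a constant is one option.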
PBuffers are close to a nightmare to use, and context switching and texture copies reduce performance for nothing. That's why i'd like to use framebuffer objects, but they are not supported on ATI cards yet. Arg.
OpenGL has become a big mess, and i'm choosing my words carefully.
Sorry for the rant.
| |