
Ysaneya

Patch 0.1.6.0 was released a few weeks ago. This patch introduced improved cockpits with lighting & shadowing, the hauler NPC ship ( static at the moment, but once the game becomes more mature there will be A.I. ), and military bases and factories ( currently placeholders: undetailed and untextured ) on planets. See the attached screenshots.


One of our artists, Dan Hutchings, is making a first pass on the space station modules. In Infinity: Battlescape, we designed our space stations, bases and factories to be modular. This means that we model & texture independent modules, which can be attached together in various configuration layouts. Here's one such layout for a space station:

2016_10_13_stationmodules_004.jpg

But more layouts / shapes are possible to create interesting combinations:

stationscalecomparison_720.jpg

Meanwhile, I've been working on refactoring the client / server ( most of the code was quickly set up for our Kickstarter campaign and still suffers from architectural issues; for example, projectile hit detection is still client authoritative, which is a big no-no ) and improving networking latency, bandwidth and interpolation. This work is expected to take at least a month, if not more, but during this refactoring I'll also add a bunch of new gameplay elements ( teams, resources/credits generation etc.. ).

Work has started on the user interface / HUD too but I'll leave that for a future post.

Here are pics of the cargo ship ( hauler ):


Ysaneya

Hey everybody, long time no see, Ysaneya here ! I haven't posted in the past 6 years, if I'm counting right. Most of you probably don't remember me, but the few of you who do should remember the Infinity project and how it all started back in 2005. It started with a dream, one made of stars and full of procedurally-generated planets to visit. At the time, Elite was a long-forgotten franchise and nobody was working on a procedural universe. I started to work in my spare time on an MMO project called Infinity.


2005 - 2010: procedural dreams

In the first years, I researched procedural planet generation. I also developed an entire engine ( nowadays known as the I-Novae Engine ) to support all the features I'd need for the Infinity project, including:


  • A flexible scene-graph
  • A 3D renderer supporting all the latest-gen features and shaders ( shadow mapping, motion blur, HDR, dynamic lighting.. the usual list.. )
  • A physics engine ( I settled on ODE )
  • An audio engine ( OpenAL )
  • A network engine ( based on UDP )
  • All the procedural planetary & universe generation technology

In 2007 I released a small free game, simply named the "Infinity Combat Prototype". The goal of that game was to integrate the entire engine into a game, to validate that all the components were working together and that a game ( newtonian multiplayer combat in space arenas ) could be produced. The idea was that it'd be the first step eventually leading to the whole MMO.



Unfortunately, it's pretty much at this point that I started to get "lost" in the ambition of the project. I had created the concept of "community contributions", where wannabe artists could submit artwork, 3D models & textures to be used in the game, but it quickly took a dozen hours a week to review all this work and validate or reject it, keeping in mind that 95% of it was at the indie level at best.

I was the only programmer on the team, so progress started to slow down tremendously. We entered a vicious circle where, as the months passed, the cool brand-new technology was getting deprecated / looking obsolete, and catching up took months for a single feature. That was the time when I replaced the old-fashioned renderer with a deferred renderer, implemented dynamic lighting and shadow mapping and all sorts of visually cool stuff.. but meanwhile, gameplay progress was at a standstill. I spent some time working on the client/server architecture and databases, but nothing too fancy, and definitely not to the point it could be used for a full-fledged MMO.

By 2010 it became crystal clear that as the sole programmer of the project, even using procedural technology and an artist community to alleviate the content generation problem, I couldn't keep up. A few programmers offered their help, but they clearly weren't up to the task or gave up after a few months. If you've been an indie relying on external help from volunteers to work on your project, that should ring a bell.

But in early 2010, I met Keith Newton, an ex-developer from Epic Games who worked on the Unreal Engine. He offered to set up an actual company, review our strategy and approach the problem from a professional & business perspective. I was about to give up on the project at that time, so naturally, I listened.


2010 - 2012: Infancy of I-Novae Studios

We formed the company, I-Novae Studios, LLC, in early 2010, and started to look for investors who could be interested in the technology, or companies interested in partnerships or licensing.

Unfortunately it was bad timing, and we didn't realize that immediately. If you recall, this was right after the economic crisis of 2008. All the people we talked to were very interested in the tech, but none were ready to risk their money on a small company with no revenue. We had a few serious opportunities during those years, but for various reasons nothing ever came of them. Another problem was that this period was the boom of the mobile market, and most companies we talked to were more interested in doing mobile stuff than, ahem, a PC game.


During these years we also revamped our technology from the ground up to modernize it. We switched to physically-based rendering ( PBR ) at this time, implemented a powerful node-based material system, added an editor ( one thing I simply never worked on pre-2010, due to lack of resources ) and much more. Keith worked approximately two and a half years full time, out of his own savings, to mature the tech and look for business opportunities. Meanwhile, our other artists and I were still working part time.

On the game side, unfortunately things still weren't looking great. It was our strategy to focus back on the technology and put Infinity on hold. We came to the conclusion that we'd probably need millions to realistically have a shot at producing a MMO at a decent quality and in good conditions, and that it couldn't be our first project as a company. In 2012, Kickstarter started to become a popular thing. It was at this time that we started to play with the idea of doing a Kickstarter for a less ambitious project, but still including our key features: multiplayer components and procedural planetary generation. That was how Infinity: Battlescape was born.


2013 - 2015: Kickstarter, full steam ahead

It took us more than 2 years to prepare our Kickstarter. Yup. At this point Keith was back to working part time, but I left my job to dedicate myself to the Kickstarter, working on it full time out of my own savings.

To produce the Kickstarter we needed a lot of new content, never shown before, at near-professional quality. This included a ship with a fully textured PBR cockpit, multiple smaller ships/props, asteroids, a gigantic space station, multiple planetary texture packs and a larger cargo ship. We decided pretty early to render the Kickstarter video in-engine, to demonstrate our proprietary technology. It shows a seamless take-off from a planet, a pass through an asteroid field, and a flight to a massive space station that comes under attack, with lots of pew-pew, explosions and particle effects. IIRC we iterated over 80 times on this video during the year before the Kickstarter. It's still online, and you can watch it here:

Meanwhile, I was also working on a real-time "concept demo" of Infinity: Battlescape. Our original plan was to send the demo to the media for maximum exposure. It took around 8 months to develop this prototype. It was fully playable, multiplayer, and included the content generated by our artists for the Kickstarter trailer. The player could fly seamlessly between a few planets/moons, in space, around asteroids, or dock at a space station. Fights were also possible, but there were never more than a handful of players on the server, so we could never demonstrate one of the key points of the gameplay: massive space battles involving hundreds of players.


In October 2015, we launched our Kickstarter. It was a success: we gathered more than 6,000 backers and $330,000, a little above the $300,000 we were asking for the game. It was one of the top 20 most successful video game Kickstarters of 2015. Our media campaign was a disappointment and we received very little exposure from the mainstream media; I understandably blame our "vaporware" history. The social media campaign, however, was a success, particularly thanks to a few popular streamers and Twitter users who brought exposure to us, and to Chris Roberts from Star Citizen, who did a shout-out on his website to help us.

But as happy as we were to -finally- have a budget to work with, it was only the beginning..


2016+: Infinity Battlescape

We started full development in February 2016, after a few months of underestimated post-KS delays ( sorting out legal stuff, proper contracts with salaries for our artists, and figuring out who was staying and who was leaving ).

Since then, we've focused on game design, producing placeholders for the game prototype and improving our technology. We're still working on adding proper multithreading to the engine, moving to a modern Entity-Component-System ( ECS ), and figuring out what to do with Vulkan and/or DirectX 12. Meanwhile we're also working on networking improvements and a more robust client/server architecture.

The game is scheduled for release at the end of 2017.

All the pictures in this article come from our current pre-alpha.

https://www.inovaestudios.com/


Ysaneya
It's been many years since the release of the last video showcasing the seamless planetary engine, so I'm happy to release this new video. This is actually a video of the game client, but since there's little gameplay in it, I decided to label it as a "tech demo". It demonstrates an Earth-like planet with a ring, seamless transitions, a little spaceship ( the "Hornet" for those who remember ), a space station and a couple of new effects.

You can view it in the videos section of the gallery.

Making-of the video

Before I get into details of what's actually shown in the video, a few words about the making-of the video itself, which took more time than expected.

What a pain ! First of all, it took many hours to record the video, as each time I forgot to show something. In one case, the framerate was really low due to the heavy stress of dumping a 1280x720 HQ uncompressed video to disk. The raw dataset is around 10 GB for 14 minutes of footage.

14 minutes ? Yep, that video is pretty long. Quite boring too, which is to be expected since there's no action in it. But I hope you'll still find it interesting.

Once the video was recorded, I started the compression process. My initial goal was to upload an HQ version to YouTube and an .FLV for the video player embedded on the website. The second was quite easily done, but the quality after compression was pretty low. The bitrate is capped at 3600 kbps for some reason, and I didn't find a way to increase it. I suspect it's set to this value because it's the standard for flash videos.

I also wanted to upload an HQ version to YouTube to save bandwidth on the main site, but so far it's been disappointing. I tried many times; each time, YouTube refused to recognize the codec I used for the video ( surprisingly, H264 isn't supported ). After a few attempts I finally found one that YouTube accepted, only to discover that the video was then rejected due to its length: YouTube has a policy of not accepting videos longer than 10 minutes. What a waste of time.

So instead I uploaded it to Dailymotion, but it's very low-res and blurry, which I cannot understand since the original resolution is 1280x720; maybe it needs many hours of post-processing, I don't know. There's also now a two-part HQ version uploaded to YouTube: part 1 and part 2. If you're interested in watching it, make sure you switch to full screen :)

Content of the video

The video is basically split in 3 parts:

1. Demonstration of a space station, modelled by WhiteDwarf and using textures from SpAce and Zidane888. It also shows a cockpit made by Zidane888 ( I'll come back to that very soon ) and the Hornet ( textured by Altfuture ).

2. Planetary approach and visit of the ring. Similar to what's already been demonstrated in 2007.

3. Seamless planetary landings.

Cockpit

I was very hesitant about including the cockpit in the video, simply because of the expectations it could generate. So you must understand that it's an experiment, and in no way a guarantee that cockpits will be present for all ships in the game at release time. It's still a very nice feature, especially with the free look around. You will notice that you can still see the hull of your ship outside the canopy, which is excellent for immersion. Note that the cockpit isn't functional yet, so if we do integrate it into the game one day, I would like all instruments to display functional information, buttons to light on/off, etc..

patd7_med.jpg

Background

The backgrounds you see in the video ( starfield, nebula ) are dynamically generated and cached into a cube map. This means that if you were located in a different area of the galaxy, the background would be dynamically refreshed and show the galaxy from the correct point of view.

Each star/dot is a star system that will be explorable in game. In the video, as I fly to the asteroid ring, you will see that I click on a couple of stars to show their information. The spectral class is in brackets, followed by the star's name. At the moment, star names use a unique code based on the star's location in the galaxy. It is a triplet formed of lower/upper case characters and numbers, like q7Z-aH2-85n. This is the shortest representation I could find that uniquely identifies a star. The name is then followed by the distance, in light-years ( "ly" ).
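As an illustration only ( this isn't the actual naming code ), here's a minimal C++ sketch of how such a triplet could be produced, assuming the star already has a unique 64-bit integer identifier derived from its location in the octree:

// Hypothetical sketch: encode a unique 64-bit star id into "xxx-xxx-xxx"
// triplets over the 62-character alphabet [0-9a-zA-Z]. Nine base-62 digits
// give 62^9 (~1.3e16) combinations, plenty for hundreds of billions of stars.
#include <cstdint>
#include <string>

std::string encodeStarName(uint64_t id)
{
    static const char alphabet[] =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    std::string name;
    for (int block = 0; block < 3; ++block)
    {
        if (block > 0)
            name += '-';
        for (int i = 0; i < 3; ++i)    // 3 characters per block
        {
            name += alphabet[id % 62];
            id /= 62;
        }
    }
    return name;    // produces identifiers in the same style as q7Z-aH2-85n
}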

I still have to post a dev-journal about the procedural rendering of the galaxy on the client side, in which I'll come back on all the problems I've had, especially performance related.

patd7_med.jpg

Planet

I'm not totally happy with the look of the planet, so it is likely that in the future I will do at least one more update of the planetary engine. There are various precision artifacts at ground level, as the heightmaps are generated on the GPU in a pixel shader ( so they are limited to 32 bits of floating-point precision ). I've also been forced to disable the clouds, which totally sucks as it completely changes the look & feel of a planet seen from space. The reason is that I implemented the Z-buffer precision enhancement trick that I described in a previous dev journal, and it doesn't work entirely as expected. With clouds enabled, the cloud surface horribly Z-fights with the ground surface, which wasn't acceptable for a public video. At the moment, I use a 32-bit floating-point Z-buffer, reverse the depth test and swap the near/far clipping planes, which is supposed to maximize Z precision.. but something must have gone wrong in my implementation, as I see no difference with a standard 24-bit fixed-point Z-buffer.
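For reference, here's a rough sketch ( not the engine's actual code ) of how that reversed floating-point Z-buffer trick is usually wired up in OpenGL: clear depth to 0, invert the comparison, and build the projection with the near/far planes swapped. It assumes a GL loader such as GLEW and an FBO with a 32-bit float depth attachment:

#include <GL/glew.h>
#include <cmath>

// Column-major OpenGL perspective matrix with the near/far planes swapped.
void reversedZProjection(float fovY, float aspect, float znear, float zfar,
                         float out[16])
{
    const float f        = 1.0f / std::tan(fovY * 0.5f);
    const float nearSwap = zfar;    // swapped on purpose
    const float farSwap  = znear;   // swapped on purpose
    for (int i = 0; i < 16; ++i) out[i] = 0.0f;
    out[0]  = f / aspect;
    out[5]  = f;
    out[10] = (farSwap + nearSwap) / (nearSwap - farSwap);
    out[11] = -1.0f;
    out[14] = (2.0f * farSwap * nearSwap) / (nearSwap - farSwap);
}

void setupReversedZState()
{
    glClearDepth(0.0);        // "far" is now 0.0 instead of 1.0
    glDepthFunc(GL_GEQUAL);   // closer fragments have larger depth values
    glClear(GL_DEPTH_BUFFER_BIT);
}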

The terrain surface still lacks details ( vegetation, rocks, etc.. ). I still have to implement a good instancing system, along with an impostor system, to get acceptable performance while maintaining a high density of ground features.

patd7_med.jpg

patd7_med.jpg

Look & Feel

Don't think for one second that the "look & feel" of the camera and ship behavior in this video is definitive. I'm pretty happy with the internal view and the cockpit look, but the third-person camera still needs a lot of work. It theoretically uses a non-rigid system, unlike the ICP, but it still needs a lot of improvements.

Effects

As you may notice, the ship's thrusters correctly fire depending on the forces acting on the ship and the desired accelerations. Interestingly, at any given point in time almost all thrusters are firing, but for different reasons. First, the thrusters that are facing the planet fire continuously to counteract gravity. It is possible to power down the ship ( as seen at the end of the video ), in which case the thrusters stop working. Secondly, many thrusters fire to artificially simulate the drag generated by the auto-compensation of inertia. For example, when you rotate your ship to the right and stop moving the mouse, the rotation will stop after a while. This is done by firing all the thrusters that would generate a rotation to the left. Of course, some parameters still need to be fine-tuned.
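To give a rough idea of the kind of logic involved ( this is an illustrative sketch, not the game's code ), selecting which thrusters to fire can be as simple as comparing each thruster's force direction with the desired force, for example the gravity counter-force:

#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

struct Thruster
{
    Vec3  forceDir;    // unit direction of the thrust force, in ship space
    float maxForce;    // maximum force in Newtons
    float throttle;    // 0..1, output of the selection pass
};

void selectThrusters(std::vector<Thruster>& thrusters, const Vec3& desiredForce)
{
    const float mag = std::sqrt(dot(desiredForce, desiredForce));
    if (mag < 1e-6f)
    {
        for (Thruster& t : thrusters) t.throttle = 0.0f;
        return;
    }
    const Vec3 dir = { desiredForce.x / mag, desiredForce.y / mag, desiredForce.z / mag };
    for (Thruster& t : thrusters)
    {
        const float alignment = dot(t.forceDir, dir);    // cosine of the angle
        t.throttle = alignment > 0.0f
                   ? std::min(1.0f, alignment * mag / t.maxForce)
                   : 0.0f;                               // opposing thrusters stay off
    }
}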

When the ship enters the atmosphere at a high velocity, there's a friction/burning effect done in shaders. It still lacks smoke particles and trails.

This video will also give you a first idea of how long it takes to land on or take off from a planet. The dimensions and scales are realistic. Speed is limited at ground level for technical reasons, as higher speeds would make the procedural algorithms lag too far behind, generating unacceptable popping. At ground level, I believe you can fly at modern airplane speeds. A consequence of this system is that if you want to fly to a far location on the planet, you first have to fly to low space orbit, then land again around your destination point.

patd7_med.jpg

patd7_med.jpg
Ysaneya

ASEToBin 1.0 release

Finally, the long awaited ASEToBin 1.0 has been released !

ASEToBin is a tool that is part of the I-Novae engine ( Infinity's engine ). It allows contributors and artists to export their models from 3DS Max's .ASE file format and to visualize and prepare the 3D model for integration into the game.

This new release represents more or less 200 hours of work, and is filled with tons of new features, like new shaders with environmental lighting, skyboxes, a low-to-high-poly normal mapper, automatic loading/saving of parameters, etc..

ASEToBin Version 1.0 release, 06/10/2009:

http://www.fl-tw.com/Infinity/Docs/SDK/ASEToBin/ASEToBin_v1.0.zip

Changes from 0.9 to 1.0:

- rewrote the "final" shader in GLSL; 15% performance increase (on a Radeon 4890).
- fixed various problems with normal mapping: artifacts, symmetry, lack of coherency between bump and +Z normal maps, etc.. hopefully the last revision. Note that per-vertex interpolation of the tangent space can still lead to smoothing artifacts, but that should only happen in extreme cases (like a cube with 45° smoothed normals) that should be avoided by artists in the first place.
- removed anisotropic fx in the final shader and replaced it by a fresnel effect. Added a slider bar to control the strength of the fresnel reflection ("Fresnel").
- changed the names of the shaders in the rendering modes listbox to be more explicit on what they do.
- set the final shader (now "Full shading") to be the default shader selected when the program is launched.
- added a shader "Normal lighting" that shows the lighting coming from per-pixel bump/normal mapping.
- added support for detail texturing in "Full Shading" shader. The detail texture must be embedded in the alpha channel of the misc map.
- increased the accuracy of specular lighting by using the real reflection vector instead of the old lower-precision half vector.
- added support for relative paths.
- added support for paths to textures that are outside the model's directory. You can now "share" textures between different folders.
- added automatic saving and reloading of visual settings. ASEToBin stores those settings in an ascii XML file that is located next to the model's .bin file.
- ase2bin will not exit anymore when some textures could not be located on disk. Instead it will dump the name of the missing textures in the log file and use placeholders.
- fixed a crash bug when using the export option "merge all objects into a single one".
- the ambient-occlusion generator now takes into account the interpolated vertex normals instead of the triangle face normal. This makes the AO map look better (non-faceted) on curved surfaces. Example:
Before 1.0: http://www.infinity-universe.com/Infinity/Docs/SDK/ASEToBin/ao_before.jpg
In 1.0: http://www.infinity-universe.com/Infinity/Docs/SDK/ASEToBin/ao_after.jpg
- added edge expansion to AO generator algorithm, this will help to hide dark edges on contours due to bilinear filtering of the AO map, and will also fix 1-pixel-sized black artifacts. It is *highly recommended* to re-generate all AO maps on models that were generated from previous version of ASEToBin, as the quality increase will be tremendous.
- automatic saving/loading of the camera position when loading/converting a model
- press and hold the 'X' key to zoom the camera (ICP style)
- press the 'R' key to reset the camera to the scene origin
- reduced the znear clipping plane distance. Should make it easier to check small objects.
- program now starts maximized
- added a wireframe checkbox, that can overlay wireframe in red on top of any existing shader mode.
- added a new shader "Vertex lighting" that only shows pure per-vertex lighting
- fixed a crash related to multi-threading when generating an AO map or a normal map while viewing a model at the same time.
- added a skybox dropdown that automatically lists all skyboxes existing in ASEToBin's Data/Textures sub-directories. To create your own skybox, create a folder in Data/Textures (the name doesn't matter), create a descr.txt file containing a short description of the skybox, then place your 6 cube map textures in this directory. They'll be automatically loaded and listed the next time ASEToBin is launched.
- the current skybox is now saved/reloaded automatically for each model
- added a default xml settings file for default ASEToBin settings when no model is loaded yet. This file is located at Data/settings.xml
- removed the annoying dialog box that pops up when an object has more than 64K vertices
- fixed a bug for the parameter LCol that only took the blue component into account for lighting
- added support for environment cube map lighting and reflections. Added a slider bar to change the strength of the environment lighting reflections ("EnvMap"). Added a slider bar to control the strength of the environment ambient color ("EnvAmb").
- added experimental support for a greeble editor. This editor allows placing greeble meshes on top of an object. The greeble is only displayed (and so only consumes CPU/video resources) when the camera gets close to it. This may allow kilometer-sized entities to look more complex than they really are.
- added experimental support for joypads/joysticks. They can now be used to move the camera in the scene. Note that there's no configuration file to customize joystick controls, and the default joystick is the one used. If your joystick doesn't work as expected, please report any problem on the forums.
- added a slider bar for self-illumination intensity ("Illum")
- added a slider bar for the diffuse lighting strength ("Diffuse")
- added a Capture Screenshot button
- added a new shader: checkerboard, to review UV mapping problems (distortions, resolution incoherency, etc..)
- added the number of objects in the scene in the window's title bar
- added a button that can list video memory usage for various resources (textures, buffers, shaders) in the viewer tab
- added a Show Light checkbox in the visualization tab. This will display a yellowish sphere in the 3D viewport in the direction the sun is.
- added new shaders to display individual texture maps of a model, without any effect or lighting (Diffuse Map, Specular Map, Normal Map, Ambient Map, Self-illumination Map, Misc Map, Detail Map)
- fixed numerous memory/resources leaks
- added a button in the visualization tab to unload (reset) the scene.
- added an experimental fix for people who don't have any OpenGL hardware acceleration due to a config problem.
- added a button in the visualization tab to reset the camera to the scene origin
- added a checkbox in the visualization tab to show an overlay grid. Each gray square of the grid represents an area of 100m x 100m. Each graduation on the X and Y axis are 10m. Finally, each light gray square is 1 Km.
- added a feature to generate ambient-occlusion in the alpha channel of a normal map when baking a low-poly to a high-poly mesh. Note: the settings in the "converter" tab are used, even if disabled, so be careful!

Note: Spectre's Phantom model is included as an example in the Examples/ directory !

Screenshots (click to enlarge):

man-comet-1-med.jpg

man-comet-2-med.jpg

man-comet-3-med.jpg

man-comet-4-med.jpg

lynx1-med.jpg

lynx2-med.jpg

lynx3-med.jpg

phantom1-med.jpg

phantom2-med.jpg

phantom3-med.jpg
Ysaneya
Logarithmic zbuffer artifacts fix

In cameni's Journal of Lethargic Programmers, I've been very interested by his idea about using a logarithmic zbuffer.

Unfortunately, his idea comes with a couple of very annoying artifacts, due to the linear interpolation of a logarithm-based (non-linear) formula. It particularly shows on thin or huge triangles where one or more vertices fall off the edges of the screen. As cameni explains in his journal, for negative Z values the triangles basically tend to pop in/out randomly.

It was suggested to keep a high tessellation of the scene to avoid the problem, or to use geometry shaders to automatically tessellate the geometry.

I'm proposing a solution that is much simpler and that works on pixel shaders 2.0+: simply generate the correct Z value at the pixel shader level.

In the vertex shader, just use an interpolator to pass the vertex position in clip space (GLSL) (here I'm using tex coord interpolator #6):


void main()
{
    vec4 vertexPosClip = gl_ModelViewProjectionMatrix * gl_Vertex;
    gl_Position = vertexPosClip;
    gl_TexCoord[6] = vertexPosClip;
}

Then you override the depth value in the pixel shader:


void main()
{
    gl_FragColor = ...
    const float C = 1.0;
    const float far = 1000000000.0;
    const float offset = 1.0;
    gl_FragDepth = (log(C * gl_TexCoord[6].z + offset) / log(C * far + offset));
}

Note that, as cameni indicated before, the 1/log(C*far+1.0) term can be optimized into a constant. You're only really paying the price of a mad and a log.

Quality-wise, I've found that solution to work perfectly: no artifacts at all. In fact, I went so far as to test a city with centimeter-to-meter details seen from thousands of kilometers away, using a very, very small field-of-view to simulate zooming. I'm amazed by the quality I got. It's almost magical. Z-buffer precision problems will become a thing of the past, even at the large scales needed for a planetary engine.

There's a performance hit due to the fact that fast-Z is disabled, but to be honest, in my tests I haven't seen a difference in the framerate. Plus, tessellating the scene more or using geometry shaders would very likely cost even more performance than that.

I've also found that to control the znear clipping and reduce/remove it, you simply have to adjust the "offset" constant in the code above. Cameni used a value of 1.0, but with a value of 2.0 in my test scene, it moved the znear clipping to a few centimeters.

Results

Settings of the test:
- znear = 1.0 inch
- zfar = 39370.0 * 100000.0 inches = 100K kilometers
- camera is at 205 kilometers from the scene and uses a field-of-view of 0.01°
- zbuffer = 24 bits

Normal zbuffer:

http://www.infinity-universe.com/Infinity/Media/Misc/zbufflogoff.jpg


Logarithmic zbuffer:
http://www.infinity-universe.com/Infinity/Media/Misc/zbufflogon.jpg

Future works

Could that trick be used to increase the precision of shadow maps ?
Ysaneya
Tip of the day

Anybody who has tried to render to a dynamic cube map has probably encountered the problem of filtering across the cube faces. Current hardware does not support filtering across different cube faces AFAIK, as it treats each cube face as an independent 2D texture ( so when filtering pixels on an edge, it doesn't take into account the texels of the adjacent faces ).

There are various solutions for pre-processing static cube maps, but I've yet to find one for dynamic (renderable) cube maps.

While experimenting, I've found a trick that comes in very handy and is very easy to implement. To render a dynamic cube map, one usually sets up a perspective camera with a field-of-view of 90 degrees and an aspect ratio of 1.0. By wisely adjusting the field-of-view angle, rendering to the cube map will duplicate the edges and ensure that the texel colors match.

The formula assumes that texture sampling is done in the center of texels (ala OpenGL) with a 0.5 offset, so this formula may not work in DirectX.

The field-of-view angle should equal:

fov = 2.0 * atan(s / (s - 0.5))

where 's' is half the resolution of the cube (ex.: for a 512x512x6 cube, s = 256).
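Here's the formula as a tiny C++ helper, just to make the computation explicit ( the function name is mine, not part of any API ):

#include <cmath>

// Widened field-of-view (in radians) for rendering one face of a dynamic
// cube map so that edge texels get duplicated across adjacent faces.
double adjustedCubeFaceFov(int faceResolution)
{
    const double s = faceResolution * 0.5;    // half the face resolution
    return 2.0 * std::atan(s / (s - 0.5));    // slightly more than 90 degrees
}

// Example: for a 512x512 face this gives roughly 90.11 degrees instead of 90.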

Note that it won't solve the mipmapping case, only bilinear filtering across edges.

Results:
Dynamic 8x8x6 cube without the trick:
http://www.infinity-universe.com/Infinity/Media/Misc/dyncube_seam1.jpg

Dynamic 8x8x6 cube with the trick:
http://www.infinity-universe.com/Infinity/Media/Misc/dyncube_seam2.jpg
Ysaneya
In this journal, no nice pictures, sorry :) But there's a lot to say about various "small" tasks ( depending on your definition of small; most of them are on the weekly scale ), including new developments on the audio engine and particle systems.

Audio engine


As Nutritious released a new sound pack ( of excellent quality! ) and made some sample tests, I used the real-time audio engine to perform those same tests and check whether the results were comparable. They were, with a small difference: when a looping sound started or stopped, you heard a small crack. It seems this artifact is generated when the sound volume goes from 100% to 0% ( or vice versa ) in a short amount of time. It isn't related to I-Novae's audio engine in particular, as I could easily replicate the problem in any audio editor ( I use Goldwave ). It also doesn't seem to be hardware specific, since I tested both on a simple AC'97 integrated board and on a dedicated Sound Blaster Audigy, and I heard the crack in both cases.

A solution to that problem is to use transition phases during which the sound volume smoothly goes from 100% to 0%. It required adding new states to the state machine used in the audio engine, and caused many headaches, but it is now fixed. I've found that with a transition of 0.25s the crack has almost completely disappeared.

One problem quickly became apparent: if the framerate was too low, the sound update ( adjusting the volume during transition phases ) wasn't called often enough and the crack became noticeable again. So I moved the sound update into a separate thread ( which will be good for performance too, especially on multi-core machines ) that updates at a constant rate, independently of the framerate.
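Here's a rough sketch of the idea ( illustrative only, not the engine's actual code ): a dedicated thread ramps looping-sound volumes over a fixed transition time, at a constant update rate that doesn't depend on the render framerate:

#include <algorithm>
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

struct FadingSound
{
    std::atomic<float> volume { 0.0f };    // current volume, read by the mixer
    float              target { 0.0f };    // 0.0 = stopping, 1.0 = playing
};

void audioUpdateThread(std::vector<FadingSound*>& sounds, std::atomic<bool>& running)
{
    const float updateHz       = 100.0f;   // constant rate, framerate independent
    const float transitionTime = 0.25f;    // seconds for a full 0% -> 100% ramp
    const float step           = 1.0f / (transitionTime * updateHz);

    while (running.load())
    {
        for (FadingSound* s : sounds)
        {
            float v = s->volume.load();
            if (v < s->target)      v = std::min(s->target, v + step);
            else if (v > s->target) v = std::max(s->target, v - step);
            s->volume.store(v);    // the mixer applies this volume per buffer
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}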

Since I was working on the audio engine, I also took some time to fix various bugs and to add support for adjusting the sound pitch dynamically. I'm not sure yet where it will be used, but it's always good to have more options to choose from.

Particle systems


In parallel I've been working on a massive update ( more accurately, a complete rewrite ) of the particle system. So far I was still using the one from the combat prototype ( ICP ), dating from 2006. It wasn't flexible enough: for example, it didn't support multi-texturing or normal mapping / per-pixel lighting. Normal-mapped particles are a very important feature, especially for later reimplementing volumetric nebulae or volumetric clouds.

Particles are updated in system memory in a huge array and "compacted" at render time into a video-memory vertex buffer. I don't use geometry shaders yet, so I generate 4 vertices per particle quad, each vertex being a copy of the particle data with a displacement parameter ( -1,-1 for the bottom-left corner to +1,+1 for the top-right corner ). The vertices are displaced and rotated like a billboard in a vertex shader.
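For illustration, here's roughly what that CPU-side "compaction" step looks like ( a sketch with made-up structure names, not the engine's actual layout ):

#include <vector>

struct Particle       { float pos[3]; float size; float color[4]; };
struct ParticleVertex { float pos[3]; float size; float color[4]; float corner[2]; };

void buildParticleVertexBuffer(const std::vector<Particle>& particles,
                               std::vector<ParticleVertex>& out)
{
    static const float corners[4][2] = { {-1,-1}, {1,-1}, {1,1}, {-1,1} };
    out.clear();
    out.reserve(particles.size() * 4);
    for (const Particle& p : particles)
    {
        for (int c = 0; c < 4; ++c)               // 4 vertices per particle quad
        {
            ParticleVertex v;
            for (int i = 0; i < 3; ++i) v.pos[i]   = p.pos[i];
            for (int i = 0; i < 4; ++i) v.color[i] = p.color[i];
            v.size      = p.size;
            v.corner[0] = corners[c][0];          // -1..+1 displacement, used by the
            v.corner[1] = corners[c][1];          // vertex shader for billboarding
            out.push_back(v);
        }
    }
}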

Performance is decent: around 65K particles at 60-70 fps on a GF 8800 GTS, but I'm a bit baffled that my new Radeon HD 4890 gets similar framerates, as it's supposed to be much faster than an 8800 GTS. I ran a profiler and most of the time seems to be spent uploading the vertex buffer rather than updating or rendering. I don't know whether I should blame Vista or ATI/AMD...

I still have a few ideas to experiment to better manage the vertex buffer and avoid re-filling it completely every frame, especially when some particles are static ( example: all particles in a nebulae ).

Visual Studio 2008


I migrated all my projects to Visual Studio 2008. While doing so, I switched the C runtime library from dynamic to static, hopefully avoiding future problems with missing dependencies. Unfortunately, most of the external libraries I was using were compiled with the dynamic CRT, so I had to update and recompile every single one of those libraries, which took a lot of time. I also used the occasion to move the automatic linking of INovae's IKernel from the header files to the cpps.

Normal mapping


SpAce reported normal mapping problems in ASEToBin. He generated a cube in 3ds max, duplicated it, applied a UV map to one of them and used it as the "low poly" mesh, while the other version is the "high poly". Then he baked the normal maps from the high-poly to the low-poly into a texture and loaded it in ASEToBin. The results were wrong: in 3ds max the cube rendered as expected, but in ASEToBin there were some strange smoothing/darkening artifacts.

I played with that code for days and was able to improve it, but arrived at the conclusion that the artifacts were caused by vertex interpolation of the tangent space. 3ds max doesn't interpolate the tangent space per vertex; it actually re-calculates the tangent space per pixel. The only way I could do that in ASEToBin ( or more generally in the game ) is to shift this calculation to the pixel shader, but for various reasons it's a bad idea: it'd hurt performance quite a bit, it'd raise the hardware requirements, etc..

So far I haven't seen any real-time engine/tool that takes 3ds max's normal map and renders the cube with good lighting, which reinforces my conclusion that it can only be fixed by performing the calculations per pixel.

Gathering Texture packs


In the past years, many people have made tiling texture packs. Those texture packs are of variable quality; some of the textures inside the packs are excellent, others are "good enough", others aren't so nice. Almost none of them were made with a specific faction in mind - which is partially due to us not providing clear guidelines on the visual style of faction textures. In any case, I think it's time to collect all those textures, filter them by quality, sort them by faction and re-publish them in a single massive pack everybody can use.

It will take a while to sort everything. A few devs are currently working on new textures ( especially SFC textures ), but I think it would be nice if in the coming weeks some contributors could help. We are primarily looking for generic textures, like plating for hulls, greeble, hangar/garages elements, etc.. Also, if you have work-in-progress textures sitting on your hard drive in a decent ( usable ) state, now would be a good time to submit them.
Ysaneya

Galaxy generation

In the past weeks, I've been focusing my efforts on the server side. A lot of things are going on, especially on the cluster architecture. But one particular area of interest is the procedural galaxy generator. In this journal, I will be speaking of the algorithm used to generate the stars and the various performance/memory experiments I made to stress the galaxy generator.

Overview


Note: video available at the end of the article.

Our galaxy, the Milky Way, contains an estimated 100 to 400 billion stars. As you can imagine, generating those in a pre-processing step is impossible. The procedural galaxy generator must be able to generate stars data in specific areas, "regions of interest", usually around the players ( or NPCs, or star systems in which events happen ).

The jumpdrive system will allow a player to select any star and attempt to jump to it. The range doesn't matter. What's important is the mass of the target and the distance to it. Let's start with a simple linear formula where the probability of a successful jump is a function of M / D ( M = target's mass and D = distance ). Of course, the "real" formula is a lot more complicated and isn't linear, but let's forget about that for now.

Under that simple formula, you have the same chance of jumping to a star with a mass of 1.0 located 10 LY ( light-years ) away as you have of jumping to a star with a mass of 10.0 located 100 LY away..

The mass of stars ( for stars that are on their main sequence ) defines their color. Stars that are many times as massive as the Sun are blue; Sun-like stars are white/yellow; low-mass stars appear reddish and are often called red dwarves.

How does all of that relate to the galaxy generator ? Well, it defines a fundamental constraint to it: it must be hierarchical. In other words, massive blue stars must be generated even when they're very far away, while lighter red dwarves only need to be generated in a volume much closer to the player.

If you don't fully understand that previous sentence, read it again and again until you fully realize what it means, because it's really important. Red dwarves that are far away aren't generated. At all. They're not displayed, they're not even in memory, and they do not consume memory. More subtly, it is impossible to "force" them to appear until you "physically" get closer to them. This also implies that you will not be able to search for a star by its name unless it's a "special" star stored in the database.

Generating a point cloud of the galaxy


The algorithm is based on an octree. Remember that stars must be generated hierarchically. The octree is subdivided around the player recursively until the maximum depth ( 12 ) is reached. Each node in the octree has a depth level, starting at 0 for the root node and increasing by 1 at each recursion level ( so the maximum is 12 ). This depth level is important because it determines the type of stars that are generated in that node.

This level is used as an index into a probability table. The table stores probabilities for various star classes at different depths. For the root node ( level #0 ) for example, there may be a 40% chance to generate an O-class ( hot blue ) star, a 40% chance to generate a B-class and a 20% chance to generate an A-class star.

That way, it's possible to drive the algorithm to generate the good proportion of star classes.

The potential number of stars per node is only a function of the depth level. At the root level, there are 50 million stars. At the deepest level ( #12 ) there are 200 stars. Note that the actual number of stars generated will be lower than that, because stars need to pass a decimation test. That's how you shape the galaxy... with a density function.

The density function takes as input some 3D coordinates in the galaxy and returns the probability in [0-1] that a star exists for the given coordinates.

To generate the spherical halo, the distance to the galactic origin is computed and fed into an inverse exponential ( with some parameters to control the shape ).

To generate the spiral arms, the probability is looked up from a "density map" ( similar to a grayscale heightmap ). The 2D coordinates as well as the distance to the galactic plane are then used to determine a density.

To generate globular clusters, the calculation is similar to the spherical halo, except that each cluster has a non-zero origin and a radius on the order of a few dozen light-years.

The final density function is taken as the maximum of all those densities.

To generate stars for a given node, a random 3D coordinate inside the node's bounding box is generated for each potential star. The density is evaluated for this location. Then a random number is generated, and if that number is lower than the density, the star actually gets generated and stored into the node.

When the node gets recursively split into 8 children, all stars from the parent node get distributed into the correct child ( selected based on their coordinates ).

As a note, all nodes are assigned a seed, and when a node gets subdivided, a new seed is generated for each child. That seed is used in various places when random numbers need to be generated. Therefore, if the player goes away and a node gets merged, then comes closer again and the node gets split, the exact same stars will be generated. They will have the exact same location, the same color, the same class, etc..
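Putting the candidate test and the per-node seed together, the per-node generation step looks roughly like this ( an illustrative sketch with a toy density function, not the server's actual code ):

#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

struct Vec3 { double x, y, z; };
struct Box  { Vec3 min, max; };
struct Star { Vec3 position; };

// Toy density in [0,1]: an exponential spherical halo around the origin. The
// real function also samples a spiral-arm density map and globular clusters,
// and takes the maximum of all of them.
double galaxyDensity(const Vec3& p)
{
    const double r = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
    return std::exp(-r / 20000.0);    // 20,000 LY falloff, made-up constant
}

void generateNodeStars(const Box& box, uint32_t nodeSeed, int candidateCount,
                       std::vector<Star>& out)
{
    std::mt19937 rng(nodeSeed);    // same seed -> exact same stars every time
    std::uniform_real_distribution<double> u01(0.0, 1.0);

    for (int i = 0; i < candidateCount; ++i)
    {
        const Vec3 p = { box.min.x + u01(rng) * (box.max.x - box.min.x),
                         box.min.y + u01(rng) * (box.max.y - box.min.y),
                         box.min.z + u01(rng) * (box.max.z - box.min.z) };
        if (u01(rng) < galaxyDensity(p))    // rejection test against the density
            out.push_back(Star{ p });
    }
}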

The drawback of procedural generation is that any change made to any parameter of the algorithm ( like the number of stars per node, or the probability tables ) will result in a completely different galaxy. None of the stars will end up in the same place ( or if they do, it's just a coincidence ). So all the probabilities and parameters had better be correctly adjusted before the game gets released, because afterwards, it will lead to the apocalypse..





Performance considerations


The algorithm as described above suffers from performance problems. The reason is quite simple: if for a given node you have 1000 potential stars, then you need to generate 1000 coordinates and test them against the density function at each coordinate, to see if a real star has been generated.

I quickly noticed that in the terminal nodes, the densities were pretty low. Imagine a cube of 100x100x100 LY located in the halo of the galaxy, far away from the origin: the density function over this volume will be pretty regular, and low ( I'm making this up, but let's say 10% ). This means that for 1000 potential stars, the algorithm will end up generating 1000 coordinates, evaluate the density 1000 times, and 10% of the candidates will pass the test, resulting in 100 final stars. Wouldn't it be better to generate 100 candidates only ? That would be 10 times faster !

Fortunately it's possible to apply a simple trick. Let's assume that the density function is relatively uniform over the volume: 10%. It's statistically equivalent to generate 1000 stars of which 1 out of 10 will succeed, or to generate 100 stars of which 10 out of 10 will succeed. In other words, when the density is uniform, you can simply reduce the number of stars by the correct ratio ( 1 / density ), or said otherwise, multiply the number of stars by the density ! 1000 stars * 10% = 100 stars.

Most of the time, the density isn't uniform. The lower the depth level of the node, the larger its volume, and the less chance the density will be uniform over that volume. But even when the density isn't uniform, you can still use its maximum probability to reduce the number of potential candidates to generate.

Let's take a node of 1000 candidates where you have a 1% density on one corner and 20% on another corner (the maximum in the volume). It's still statistically equivalent to a node of 200 candidates ( 1000 * 20% ) with a density of 5% on the first corner and 100% on the other corner.

As you can see, there's no way around evaluating the density function for each candidate, but the number of candidates has been reduced by a factor of 5 while, at the same time, the probability of the density function has been multiplied by 5. Fewer stars to generate, and for each star, a higher chance to pass the test: a win-win situation !
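In code, the whole trick boils down to scaling the candidate count down and the acceptance probability up by the same factor ( again just a sketch, not the actual implementation ):

#include <cmath>

struct NodeSampling
{
    int    candidateCount;    // how many random positions to draw
    double densityScale;      // multiply the local density by this in the test
};

NodeSampling scaleNodeSampling(int baseCandidates, double maxDensityInNode)
{
    NodeSampling s;
    s.candidateCount = static_cast<int>(std::ceil(baseCandidates * maxDensityInNode));
    s.densityScale   = maxDensityInNode > 0.0 ? 1.0 / maxDensityInNode : 0.0;
    return s;
}

// Example from the text: 1000 candidates with a 20% maximum density become
// 200 candidates, and a 1% corner density is then tested as 5%.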

Memory considerations


Until now, I've explained how to generate the galaxy model and how stars are procedurally distributed on-the-fly without any pre-processing. But keep in mind that the algorithm is primarily used on the server, and that there won't be just one player, but thousands of them. How does the galaxy generation work with N viewpoints ?

To keep it short, I modified the standard octree algorithm to split nodes as soon as needed, but delayed merging nodes together until more memory is needed.

The galaxy manager works as a least-recently-used ( LRU ) cache. Stars data and nodes consume memory. When the target memory budget is reached, a "garbage collector" routine is launched. This routine checks all nodes and determines which nodes have been the least recently used ( that is: the nodes that have been generated long ago, but that aren't in use currently ). Those nodes are then merged and memory is freed.
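A sketch of what such a garbage collection pass could look like ( illustrative structures, not the actual server code ):

#include <algorithm>
#include <cstdint>
#include <vector>

struct GalaxyNode
{
    uint64_t                lastUsedFrame = 0;   // refreshed whenever a viewpoint touches it
    size_t                  memoryBytes   = 0;   // stars + bookkeeping for this node
    std::vector<GalaxyNode> children;            // empty for leaves

    bool isMergeable() const    // true if all children are leaves
    {
        if (children.empty()) return false;
        for (const GalaxyNode& c : children)
            if (!c.children.empty()) return false;
        return true;
    }
    void merge() { children.clear(); }    // drops the children's star data
};

void collectGarbage(std::vector<GalaxyNode*>& mergeableNodes,
                    size_t& usedBytes, size_t budgetBytes)
{
    // Least recently used nodes first.
    std::sort(mergeableNodes.begin(), mergeableNodes.end(),
              [](const GalaxyNode* a, const GalaxyNode* b)
              { return a->lastUsedFrame < b->lastUsedFrame; });

    for (GalaxyNode* node : mergeableNodes)
    {
        if (usedBytes <= budgetBytes) break;    // back under budget, stop merging
        size_t freed = 0;
        for (const GalaxyNode& c : node->children) freed += c.memoryBytes;
        node->merge();
        usedBytes -= freed;
    }
}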

It's a bit tricky to stress-test the galaxy generator for performance and memory with multiple players, simply because it's extremely dependent on where players will be located in the galaxy. The worst case would probably be players randomly distributed in the galaxy, all far from each other. But, making an educated guess, I don't expect this to be the norm in reality: most players will tend to concentrate around the cores, or around each other, forming small groups. But even then, can we say that 90% of the players will be at less than 1000 LY from the cores ? Is it even possible to estimate that before beta starts ?

Galactic map considerations


I've followed with interest the suggestions of players in the galactic map thread about how the galactic map should look. After testing the galaxy generator, I arrived at the conclusion that everybody severely under-estimates the number of stars there can be in a volume close to you. For example, in a radius of 100 LY, in the spiral arms with an average density, it's not uncommon to find 5000 stars.

Remember that the jump-drive is not limited exclusively by range. Or more exactly, while distance is a factor, there's no "maximal range". This means that it's perfectly possible to try to jump to a red dwarf that is 5000 LY away. The probability of success is ridiculously small ( you have a better chance of winning the lottery ), but non-zero. Of course, for the galactic map, this means that even stars that are far away should be displayed ( provided that you don't filter them out ). That's an insane number of dots that may appear on your map...

One of the more effective filters, I think, will be the jump-probability filter. That one is a given: only display stars with a minimum of 50% jump success.

In the following screenshots, you can see a blue wireframe sphere. It defines the range within which stars are displayed. It's just an experiment to make people realize how many stars there are at certain ranges: by no means does it show how the galactic map will work ( plus, it's all on the server, remember ! ).

I can select any star by pressing a key, and it gets highlighted in pink. On the top-left, you can see some information about the selected star: first, a unique number that defines the "address" ( in the galaxy ) of the star. On the line below, the 3 values are the X, Y and Z coordinates of the star relative to the galactic origin. Then come the star class, its distance in light-years, and finally the jumping probability.

In the coming weeks, I will probably move the galaxy algorithm to the client and start to add some volumetric/particle effects to the stars/dust to "beautify" it. The reason the algorithm will also run on the client is to avoid having to transfer a million coordinates from the server to the client each time the player opens his galactic map. That wouldn't be kind to our bandwidth...







Video


I produced a demonstration video.
">Watch it on Youtube in HD ! ( I will also uploaded it later to the website as I convert it to .flv ).
Ysaneya
In the past months, I've been wondering how to approach the problem of lighting inside hangars and on ship hulls. So far, I had only been using a single directional light: the sun. The majority of older games precompute lighting into textures ( called lightmaps ), but clearly this couldn't work well in the context of a procedural game, where content is generated dynamically at run time. Plus, even if it did.. imagine the amount of texture memory needed to store all the lighting information coming from the surfaces of a kilometers-long battleship !

Fortunately, there's a solution to the problem.. enter the fantastic universe of deferred lighting !

Deferred lighting



Traditionally, it is possible to implement dynamic lighting without any precomputation via forward lighting. The algorithm is surprisingly simple: in a first pass, the scene is rendered to the depth buffer and to the color buffer using a constant ambient color. Then, for each light, you render the geometry affected by this light only, with additive blending. This light pass can include many effects, such as normal mapping/per-pixel lighting, shadowing, etc..

This technique, used in games such as Doom 3, does work well, but is very dependent on the granularity of the geometry. Let's take an object of 5K triangles that is partially affected by 4 lights. This means that to light this object, you will need to render 25K triangles over 5 passes total ( ambient pass + 4 light passes, each 5K ). An obvious optimization is, given one light and one object, to only render the triangles of the object that are affected by the light, but this would require precomputations that a game such as Infinity cannot afford, due to its dynamic and procedural nature.

Now let's imagine the following situation: you've got a battleship made of a dozen 5K-to-10K-triangle objects, and you want to place a hundred lights on its hull. How many triangles do you need to render to achieve this effect with forward lighting ? Answer: a lot. Really, a lot. Too many.

Another technique that is getting used more and more often in modern games is deferred lighting. It was a bit impractical before shader model 3.0 video cards, as it required many passes to render the geometry too. But using multiple render targets, it is possible to render all the geometry once, and exactly once, independently of the number of lights in the scene. One light or a hundred lights: you don't need to re-render all the objects affected by the lights. Sounds magical, doesn't it ?

The idea with deferred lighting is that, in a forward pass, geometric information is rendered to a set of buffers, usually called "geometry buffers" ( abbrev: G-buffers ). This information usually includes the diffuse color ( albedo ), the normal of the surface, the depth or linear distance between the pixel and the camera, the specular intensity, self-illumination, etc.. Note that no lighting is calculated yet at this stage.

Once this is done, for each light, a bounding volume ( which can be as simple as a 12-triangle box for a point light ) is rendered with additive blending. In the pixel shader, the G-buffers are accessed to reconstruct the pixel position from the current ray and depth; this position is then used to compute the light color and attenuation, do normal mapping or shadowing, etc..

Implementation



G-Buffers



There are a few tricks and specificities in Infinity. Let's have a quick look at them. First of all, the G-buffers.

I use 4 RGBAF16 buffers. They store the following data:


            R         G         B           A
Buffer 1    FL        FL        FL          Depth
Buffer 2    Diffuse   Diffuse   Diffuse     Self-illum
Buffer 3    Normal    Normal    Normal      Specular
Buffer 4    Velocity  Velocity  Extinction  MatID



'FL' = Forward lighting. That's one of the specificities of Infinity. I still do one forward lighting pass, for the sun and ambient lighting ( with full per-pixel lighting, normal mapping and shadowing ), and store the result in the RGB channels of the first buffer. I could defer it too, but then I'd have a problem related to atmospheric scattering. At pixel level, the scattering equation is very simple: it's simply modulating by an extinction color ( Fex ) and adding an in-scattering color ( Lin ):

Final = Color * Fex + Lin

Fex and Lin are computed per vertex, and require some heavy calculations. Moving those calculations per pixel would kill the framerate.

If I didn't have a forward lighting pass, I'd have to store the scattering values in the G-buffers. This would require 6 channels ( 3 for Fex and 3 for Lin ). Here, I can get away with only 4 and use a grayscale 'Extinction' for the deferred lights ( while sun light really needs an RGB color extinction ).

'Velocity' is the view-space velocity vector used for motion blur ( computed by taking the differences of positions of the pixel between the current frame and the last frame ).

'Normal' is stored in 3 channels. I have plans to store it in 2 channels only and recompute the 3rd in the shader. However this will require to encode the sign bit in one of the two channels, so I haven't implemented it yet. Normals ( and lighting in general ) are computed in view space.

'MatID' is an ID that can be used in the light shader to perform material-dependent calculations.

As you can see, there's no easy way to escape using 4 G-buffers.

As for the format, I use F16. It is necessary both for storing the depth and for encoding values in HDR.
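For reference, here's what setting up 4 F16 render targets on a single FBO typically looks like in OpenGL ( a generic sketch, not the I-Novae renderer's code; it assumes a GL loader such as GLEW ):

#include <GL/glew.h>

void createGBuffers(int width, int height, GLuint outTex[4], GLuint& fbo)
{
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);

    GLenum drawBuffers[4];
    glGenTextures(4, outTex);
    for (int i = 0; i < 4; ++i)
    {
        glBindTexture(GL_TEXTURE_2D, outTex[i]);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, width, height, 0,
                     GL_RGBA, GL_HALF_FLOAT, nullptr);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0 + i,
                               GL_TEXTURE_2D, outTex[i], 0);
        drawBuffers[i] = GL_COLOR_ATTACHMENT0 + i;
    }
    glDrawBuffers(4, drawBuffers);    // write to all 4 targets in the forward pass

    // A depth attachment is also needed; it is shared with the light pass so
    // the light volumes can be depth-tested (omitted here for brevity).
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
}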

Performance



At first, I was a bit disappointed by the performance hit / overhead caused by the G-buffers. There are 4 buffers after all, in F16: that requires a lot of bandwidth. On an ATI X1950 XT, simply setting up the G-buffers and clearing them to a constant color resulted in a framerate of 130 fps at 1280x1024. That's before even sending a single triangle. As expected, changing the screen resolution dramatically changed the framerate, but I found this overhead to be linear with the screen resolution.

I also found yet-another-bug-in-the-ATI-OpenGL-drivers. The performance of clearing the Z-buffer only was dependent on the number of color attachments. Clearing the Z-buffer with 4 color buffers attached ( even with color writes disabled ) took 4 times longer than clearing the Z-buffer with only 1 color buffer attached. As a "fix", I simply detach all color buffers when I need to clear the Z-buffer alone.

Light pass



Once the forward lighting pass is done and all this data is available in the G-buffers, I perform frustum culling on the CPU to find all the lights that are visible in the current camera's frustum. Those lights are then sorted by type: point lights, spot lights, directional lights and ambient point lights ( more on that last category later ).

The forward lighting ( 'FL' ) color is copied to an accumulation buffer. This is the buffer in which all lights will get accumulated. The depth buffer used in the forward lighting pass is also bound to the deferred lighting pass.

For each light, a "pass" is done. The following states are used:

* depth testing is enabled ( that's why the forward lighting's depth buffer is reattached )
* depth writing is disabled
* culling is enabled
* additive blending is enabled
* if the camera is inside the light volume, the depth test function is set to GREATER, else it uses LESS

A threshold is used to determine if the camera is inside the light volume. The value of this threshold is chosen to be at least equal to the camera's znear value. Bigger values can even be used, to reduce the overdraw a bit. For example, for a point light, a bounding box is used and the test looks like this:


// Expand the light's bounding box by the camera's znear (and the light radius)
// before testing whether the camera sits inside the light volume.
const SBox3DD& bbox = pointLight->getBBoxWorld();
SBox3DD bbox2 = bbox;
bbox2.m_min -= SVec3DD(m_camera->getZNear() * 2.0f);
bbox2.m_max += SVec3DD(m_camera->getZNear() * 2.0f);
bbox2.m_min -= SVec3DD(pointLight->getRadius());
bbox2.m_max += SVec3DD(pointLight->getRadius());
TBool isInBox = bbox2.isIn(m_camera->getPositionWorld());

// Inside the volume: render with GREATER; outside: keep the usual LESS test.
m_renderer->setDepthTesting(true, isInBox ? C_COMP_GREATER : C_COMP_LESS);


Inverting the depth test to GREATER when the camera enters the volume makes it possible to discard pixels in the background / skybox very quickly.

I have experimented with a bounding sphere for point lights too, but found that the reduced overdraw was cancelled out by the larger polycount ( a hundred polygons, against 12 triangles for the box ).

I haven't implemented spot lights yet, but I'll probably use a pyramid or a conic shape as their bounding volume.

As an optimization, all lights of the same type are rendered with the same shader and textures. This means less state changes, as I don't have to change the shader or textures between two lights.

Light shader



For each light, a Z range is determined on the CPU. For point lights, it is simply the distance between the camera and the light center, plus or minus the light radius. When the depth is sampled in the shader, the pixel is discarded if the depth is outside this Z range. This is the very first operation done by the shader. Here's a snippet:


vec4 ColDist = texture2DRect(ColDistTex, gl_FragCoord.xy);
if (ColDist.w < LightRange.x || ColDist.w > LightRange.y)
discard;


There isn't much to say about the rest of the shader. A ray is generated from the camera's origin / right / up vectors and the current pixel position. This ray is multiplied by the depth value, which gives a position in view space. The light position is uploaded to the shader as a constant in view space; the normal, already stored in view space, is sampled from the G-buffers. It is very easy to implement a lighting equation after that. Don't forget the attenuation ( the color should go to black at the light radius ), or you'll get seams in the lighting.
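To make that more concrete, here is a rough sketch of such a point light fragment shader; names like FrustumRay, LightPosView, LightRadius, LightColor and NormalTex are placeholders for whatever uniforms / G-buffer samplers the engine actually binds, and ColDist is the sample taken above for the Z-range test:


// rebuild the view-space position from the distance stored in ColDist.w
vec3 viewRay = normalize(FrustumRay);                    // interpolated from the camera's corner rays
vec3 viewPos = viewRay * ColDist.w;
vec3 normal = texture2DRect(NormalTex, gl_FragCoord.xy).xyz;

// standard diffuse point light, evaluated in view space
vec3 toLight = LightPosView.xyz - viewPos;
float dist = length(toLight);
vec3 L = toLight / dist;
float ndotl = max(dot(normal, L), 0.0);

// attenuation that reaches exactly zero at the light radius, to avoid seams
float atten = clamp(1.0 - dist / LightRadius, 0.0, 1.0);
gl_FragColor = vec4(LightColor.rgb * ndotl * atten, 1.0);   // accumulated with additive blending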

Antialiasing



In a final pass, a shader applies antialiasing to the lighting accumulation buffer. Nothing particularly innovative here: I used the technique presented in GPU Gems 3 for Tabula Rasa. An edge filter is used to find edges either in the depth or the normals from the G-buffers, and "blur" pixels in those edges. The parameters had to be adjusted a bit, but overall I got it working in less than an hour. The quality isn't as good as true antialiasing ( which cannot be done by the hardware in a deferred lighting engine ), but it is acceptable, and the performance is excellent ( 5-10% hit from what I measured ). Here's a picture showing the edges on which pixels are blurred for antialiasing:



Instant radiosity



Once I got my deferred lighting working, I was surprised to see how well it scaled with the number of lights. In fact, the thing that matters is pixel overdraw, which is of course logical and expected given the nature of deferred lighting, but I still found it amazing that as long as overdraw remained constant, I could spawn a hundred lights and take less than a 10% framerate hit.

This led me to think about using the power of deferred lighting to add indirect lighting via instant radiosity.

The algorithm is relatively simple: each light is set up and casts N photon rays in random directions. At each intersection of a ray with the scene, a photon is generated and stored in a list. The ray is then either killed ( Russian roulette ) or bounces recursively in a new random direction. The photon color at each hit is the original light color multiplied by the surface color at each successive bounce. I sample the diffuse texture with the current hit's barycentric coordinates to get the surface color.

In my tests, I use N = 2048, which results in a few thousand photons in the final list. This step takes around 150 ms. I have found that I could generate around 20000 photons per second in a moderately complex scene ( 100K triangles ), and it's not even optimized to use multiple CPU cores.

In a second step, a regular grid is created and photons that share the same cell get merged ( their color is simply averaged ). Ambient point lights are then generated for each cell with at least one photon. Depending on N and the granularity of the grid, it can result in a few dozen ambient point lights, up to thousands. This step is very fast: around one millisecond per thousand photons to process.

You can see indirect lighting in the following screenshot. Note how the red wall leaks light on the floor and ceiling. Same for the small green box. Also note that no shadows are used for the main light ( located in the center of the room, near the ceiling ), so some light leaks on the left wall and floor. Finally, note the ambient occlusion that isn't fake: no SSAO or precomputations! There's one direct point light and around 500 ambient point lights in this picture. Around 44 fps on an NVidia 8800 GTX in 1280x1024 with antialiasing.



Results



I have applied deferred lighting and instant radiosity to Wargrim's hangar. I took an hour to texture this hangar with SpAce's texture pack. I applied a yellow color to the diffuse texture of some of the side walls you'll see in those screenshots: note how light bounces off them, creating yellow-ish ambient lighting around that area.

There are 25 direct point lights in the hangar. Different settings are used for the instant lighting, and as the number of ambient point lights increases, their effective radius decreases. Here are the results for different grid sizes on a 8800 GTX in 1280x1024:

 
Cell size    # ambient point lights    Framerate ( fps )
0.2          69                        91
0.1          195                       87
0.05         1496                      46
0.03         5144                      30
0.02         10605                     17
0.01         24159                     8


I think this table is particularly good at illustrating the power of deferred lighting. Five thousand lights running at 30 fps ! And they're all dynamic ( although in this case they're used for ambient lighting, so there would be no point in doing so ): you can delete or move every single one of them in real time without affecting the framerate !

In the following screenshots, a few hundred ambient point lights were used ( sorry, I don't remember the settings exactly ). You'll see some green dots/spheres in some pics: those highlight the position of ambient lights.












Full lighting: direct lighting + ambient lighting



Direct lighting only



Ambient ( indirect ) lighting only
Ysaneya

Detail textures

Many people have been worried by the lack of updates recently. No, we haven't gotten lazy; in fact, quite the contrary :) We've been awfully busy at work.

In this journal I'm going to review some of the recent work, without going too far into details. In a future dedicated update I'll come back more extensively on the recent server side development.

Detail textures

I've added support for detail textures to the game. It was in fact quite easy to implement ( two hours of work ), but that feature was requested by many artists, so I took a bit of time to experiment with it. The test scene is Kickman's shipyard as seen in this screenshot:



Now, this station is huge. Really huge. More than 8 kilometers in height. It was textured using spAce's generic texture pack, but despite using tiling textures, the texture still looks blurry when you move the camera close to the hull:



And now here's the difference using a detail texture:



Since I didn't have any detail texture ready, I simply used mtl_5_d.tga ( from spAce's texture pack ), increased the contrast and converted it to grayscale. I then placed this texture into the alpha channel of the misc map ( mtl_5_m.tga ).

There, I said it: surprise, surprise! Detail textures make use of the ( so far ) unused channel of the misc map. The nice "side effect" is that, like for the other kinds of maps, you can assign a detail texture to each sub-material, which means that a single object can use different detail textures in different areas of the object.

The detail texture does not use a secondary UV map though: the shader takes the diffuse UV map and applies a scale factor ( x8 in this picture ) to increase the frequency. The result is that you see "sub-plating" inside the big plates of the texture.

So what does the shader do exactly ? It acts as a modifier to the shading; please remember that the detail texture is a grayscale image. A small sketch is given after the list below.

1. It is additively added ( with a weight ) to the diffuse color. Note that the intensity 128 / 256 is interpreted as the neutral value: all intensities lower than 128 will subtract, while all intensities over 128 will add. The formula is COL = COL + ( DETAIL - 0.5 ) * weight

2. It is additively added ( with a weight ) to the specular value. The formula is the same as above, with a different weight.

3. It is interpreted as a heightmap and converted to a normal ( by computing the gradient ) on the fly. This normal is then used to displace the original normal coming from the normal map, before the lighting / shading computations are done.
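Here is that sketch, summarizing the three steps; the sampler and weight names ( miscMap, diffuseDetailWeight, specularDetailWeight, texelSize, bumpScale ) are placeholders, not the actual uniforms of the engine:


// the detail value lives in the alpha channel of the misc map, sampled at a
// higher frequency than the diffuse map ( x8 in the screenshots )
vec2 detailUV = diffuseUV * 8.0;
float detail = texture2D(miscMap, detailUV).a;

// 1. modulate the diffuse color around the neutral value 0.5
color.rgb += (detail - 0.5) * diffuseDetailWeight;

// 2. same modifier on the specular intensity, with its own weight
specular += (detail - 0.5) * specularDetailWeight;

// 3. interpret the detail as a heightmap: finite differences give a gradient,
//    turned into a small normal that perturbs the normal-map normal
float hx = texture2D(miscMap, detailUV + vec2(texelSize, 0.0)).a - detail;
float hy = texture2D(miscMap, detailUV + vec2(0.0, texelSize)).a - detail;
vec3 detailNormal = normalize(vec3(-hx * bumpScale, -hy * bumpScale, 1.0));
// ... the normal from the normal map is then displaced by detailNormal before lighting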

If you're going to update existing textures / packs, you should probably think of the detail texture as a detail heightmap added at a higher frequency on top of the surface.

Instead of interpreting the same texture in two different ways ( additively for the diffuse / spec, and as a heightmap for the normal map ), I could have used a new RGBA map storing the detail normal in RGB and the detail color in alpha, and this would have saved the detail normal computation in the shader. However, this would have required one more texture, wasting precious video memory.

It is unlikely that I'll update ASEToBin to support detail textures anytime soon. ASEToBin uses obsolete assembly shaders, so I'd have to port all those shaders to GLSL, which is many days of work.

Recent work

In the past 2 months, I've been working on various tasks. Some of them are secret, so I can't say much about them. They include work on Earth ( which I can't show for obvious reasons ), and work on networking security ( which I can't explain either, to not give precious hints to potential hackers ). I've also done a couple of experiments with spAce, which will hopefully go public once we get nice results.

What I can say, however, is that I've done some nice progress on the infamous terrain engine. No, it's still not finished, but it's getting closer every day. At the moment it's on hold, as I wanted to progress a bit on the gameplay and server side. I'll probably come back on it with a dedicated journal and more details once it's complete.

I've implemented new automatic geometry optimization algorithms in the engine, which are automatically used by ASEToBin. They involve re-ordering vertices and indices for maximum vertex cache efficiency. For interested programmers, I've been using Tom Forsyth's Linear-Speed Vertex Cache Optimization. It increases the framerate by 10-20% in scenes with simple shading. When shading is more of a bottleneck, like on the planetary surfaces, it didn't help at all, but the good news is that it didn't hurt the framerate either.

I added a couple of performance / memory usage fixes to the terrain engine. Some index buffers are now shared in system memory; the maximum depth level in the terrain quadtree is limited, saving texture memory on the last depth levels. I'm storing the normals of each terrain patch in a LA ( luminance/alpha ) normal map texture, in two 8-bit channels, and recompute the missing component in a shader. Unfortunately, texture compression cannot be used, since the textures are rendered dynamically. I've also introduced new types of noise to give more variety to the types of terrain that can be procedurally generated.

I added support for renderable cube maps, and I have some ideas to improve the space backgrounds and nebulae, which aren't in HDR yet.

I've also done some serious progress on the server side. The global architecture ( meta-server, SQL server, cluster server and node servers ) is set up. The various network protocols are on their way. I'm now working on dynamic load balancing, so that star systems ( managed by a node ) can migrate to another node when the cpu is too busy. I'll probably come back on the architecture in details in a future update.

Darkfall Launch

Darkfall Online ( a fantasy MMO ) has launched. Why do I mention it ? Well, it's a very interesting case study for us. Like Infinity, it is produced by an independent company ( although they do have millions of dollars of funding ). Like Infinity, it went for a niche market ( twitch-based combat and full PvP ) which isn't "casual". And like Infinity, it took forever to launch and has been labelled as "vaporware" for years ( although we still have some margin compared to them ).

So, what are the lessons learned from Darkfall's launch ? What can we do to prevent the same problems from happening ?

Unfortunately, I'm a bit pessimistic in that area. Of course that doesn't mean that we won't do our best to have a good launch. But, realistically, we won't have the resources to open more than one server, even if we need a lot more to support all the players trying to connect. This means.. waiting queues. A lot of Darkfall players are, understandably, angry: they paid the full price for the client ( $50, if not more ? ) but can't get into the game, due to waiting queues that are many hours long. The good news is that, for Infinity, the initial download will probably be free ( but don't quote me on that, nothing is 100% set in stone yet ).

Will the server be stable ? Will it crash ? Will it lag ? Nobody can say for sure. As I see it, it depends on three factors:

- the number of players that try to connect, and more accurately, how much stress they cause on the server ( 1000 players connecting within 1 hour causes less stress than 1000 players trying to connect every minute.. ).

- the server ( physical machine ) performance, network quality and bandwidth available.

- the client / server code performance and quality: hopefully, not too many bugs.

Of those three factors, the only one we can directly control is the third one. The machine's performance is mostly a financial problem, and as independent developers, we definitely won't be able to afford a large cluster that can handle tens of thousands of players at the beginning. Finally, how many players try to connect is a double-edged sword: more players means more income, but also more stress on the server, maintenance, support problems, etc..

The last lesson learned from Darkfall, IMO, is to communicate with your player base, especially at launch. I can understand the frustration of many players when the game has launched, but the most recent news on the website is a month old or more. Of course, I can also understand the developers who are busy trying to fix their code, but it only takes a few minutes..
Ysaneya

Some nice concept art

I don't usually post concept art, but this one, made by Dr. CM Wong ( alias Koshime ) is particularly good (click on it for high-res):



It is showing a station blowing up. Another concept, showing the original station, can be seen on our forums.

You can see more of his work on his CG Portfolio
Ysaneya

2008 Retrospective


First of all, Happy new year 2009 !

Looking back at 2008, I can't say I'm particularly happy about how things went. There have been some serious delays on what I intended to achieve, and it's not due to a single reason, but more to various causes that accumulated and became critical in 2008.

First of all, back in early 2008 I had an opportunity to sell a license of the engine. Unfortunately, it wasn't ready at that time ( and still isn't today ): it lacked several features and, more importantly, documentation and tutorials. So I spent two good months reorganizing code, cleaning some modules, commenting the code's interfaces, and starting some documentation and tutorials. That's not "wasted work" of course, but those 2 months weren't directly useful to Infinity.

At the same time, I also decided that it was time to revamp the website and make one that looked more professional and more complete. After studying all the solutions and getting advice from various people, we went for Joomla, and two people were put in charge of implementing the website: one for setting up Joomla and customizing it to our needs, and one for the layout / design. Long story short, things were delayed and people got busy IRL, and progress stopped. In the end it wasn't until inoX and I put our hands into the dirty work that things started to move at a decent rate. All in all, I would say that I spent more or less 3 months on it.

When the new website launched, the web server collapsed: Joomla was consuming a lot more RAM than the old website, and visitors started to get blank pages. As an emergency measure, I rented a dedicated server and started to move files and databases to it. From the crisis to its resolution, I would say I spent 2 weeks on server issues. At that point in time, we were already in September 2008.

Next came management issues ( which in fact started prior to the server move ) within the writing team. Solving them caused a bit of drama, but finally Vileedge was appointed as our new lead storywriter. Reorganization of the writing team was also needed, and all the docs had to be collected and installed on the new wiki. Since I wanted the wiki to have a special organization, I had to do it myself, and it took more or less 3 additional weeks.

It wasn't until October 2008 that I resumed "serious" work on the game.

Hopefully there won't be as many distractions in 2009 and progress will go much faster this year.

GPU Terrain Rendering


Since the last dev journal, I continued to work on generating and rendering terrain on the GPU.

Normal mapping is now fully functional, and I'm quite happy with the results. For each terrain node, a temporary heightmap is generated ( 256^2 for the default quality or 512^2 for the high quality ) and gradients are computed to extract normals. Two pixels are reserved for the left / right / top / bottom boundaries, so that seams don't appear between adjacent nodes.

The next step was to reimplement ( and improve ) diffuse textures. Procedural texturing is still in use as before, with a 256^2 lookup-table texture ( inputs are slope and altitude ), but instead of giving the ID of the texture layer to use, this time it directly gives a blending coefficient for each layer. With a maximum of 16 layers per planet, and one layer consuming one channel, it may appear that 4 RGBA lookup textures are needed, but I simply packed them all together in a 1024x256 texture and sampled it 4 times in the diffuse texturing shader.
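As an illustration, sampling the packed table looks roughly like this; 'layerLUT' is a placeholder name, and the layout shown ( four 256x256 tables side by side along the U axis ) is one plausible way to pack them:


// the same ( slope, altitude ) lookup is done 4 times, with a 0.25 offset on U,
// to fetch the 16 blending weights ( 4 RGBA texels )
vec2 uv = vec2(slope * 0.25, altitude);
vec4 w0 = texture2D(layerLUT, uv);                      // weights of layers 0..3
vec4 w1 = texture2D(layerLUT, uv + vec2(0.25, 0.0));    // layers 4..7
vec4 w2 = texture2D(layerLUT, uv + vec2(0.50, 0.0));    // layers 8..11
vec4 w3 = texture2D(layerLUT, uv + vec2(0.75, 0.0));    // layers 12..15
// each weight then scales its layer's contribution when the node's diffuse texture is generated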

As a result, it is now possible to blend any number of layers per texel, while the previous method only allowed one.

There are other interesting benefits to this technique: first, aliasing is less pronounced since everything is stored in textures instead of being computed per pixel in the final shader. Terrain transitions are less sharp and flicker less in the distance.

The second important benefit is that the final rendering shader is a lot lighter now; as the diffuse texture is only generated once per node, there's no particular work to do per frame, other than sampling the diffuse texture. Previously, for each frame and each pixel, the whole procedural texturing had to be recomputed. It's a bit as if the textures were caching the results between frames. Of course, a lighter shader means a higher framerate; I don't have hard numbers, but I would say the framerate easily doubled. It's not rare for my 8800 GTX to achieve 80-120 fps at ground level, and 150-300 fps in space.

There is a big drawback to the new texturing algorithm: storing a normal map and a diffuse texture of 256^2 for each terrain node consumes video memory. A lot of it. In fact, at ground level there can be up to 1000 nodes in memory. If you do the calculations, you'll find that planet textures can quickly fill the whole memory of a 512 MB card, and that's without ships / buildings / other models. That was clearly unacceptable, so I started to look at ways to reduce the resolution of textures when they're far away, or seen at a low angle. In practice, it works a bit like mipmapping, but manually updates the resolution of nodes as the camera moves. More textures have to be updated in real time, but it's worth it, and a planet at ground level with 256^2 textures now consumes around 150 MB of video memory.

I still have a few optimizations in mind; one of them is to store normal maps in two channels instead of three ( note that using three channels has the same memory cost as four ) and recompute the missing channel in a shader at runtime. This will save 50% of the normal map memory, or 25% of the total video memory for planets, and the visual quality shouldn't be affected noticeably.

Precision issues are still here. For this reason I will start to move the geometry generation back to the CPU. On the GPU, there are little gaps between terrain nodes due to the limitation of 32-bit precision in floating point calculations. GPUs only support fp32, while CPUs support fp64, and fp64 code already exists on the CPU, but I have to clean and rewrite some interfaces to make this code work again. Once done, the gaps between nodes should disappear. Normal mapping will stay on the GPU though, as there's no way it can be done fast enough for 256^2 maps ( or higher ) on the CPU.

Finally, I still have some work to do to minimize the popping and geomorphing bugs.

gputerrainas3_med.jpg

gputerrainas1_med.jpg

Revisiting atmospheric scattering


I'm honestly fed up with atmospheric scattering. I've implemented it two or three times already, but there have always been some problems. Maybe I'm too much of a perfectionist, maybe not. Of course, when I post screenshots I tend to pick the ones that look the best, so while it may appear to you that atmospheric scattering was already good before, in reality.. it still had some annoying problems.

The last-but-one scattering algorithm suffered from two main problems: sunset colors weren't vivid enough ( plus there were some unnatural bands of darkness in the sky ), and the haze on distant mountains wasn't blue-ish enough.

In November I reworked the theory of atmospheric scattering from scratch, to make sure I was correctly understanding all the formulas and algorithms. The base paper I used is the famous "Display of The Earth Taking into Account Atmospheric Scattering" (1993) by Nishita et al. I first blindly implemented the formulas from the paper in a shader, and the results were a bit disappointing. I now had vivid sunsets, but over-saturation on the atmosphere glow, and the blue-ish haze wasn't there either.

As a side note, I wasted almost 2 weeks thanks to ATI drivers. I've hit a high number of driver bugs in the shaders, which made me go mad. As of today, even the latest Catalyst 8.12 still has these bugs, but at least I've rewritten the shaders to work around them via all sorts of nasty tricks you definitely don't want to hear about.

This weekend, I decided to have a go at re-implementing the algorithm, but step by step, checking every result and making sure it looked good, adjusting parameters in the process. It has given much nicer results so far, as I finally have good sunsets, a good blue-ish haze and no saturation to white on the atmosphere glow from space.

As usual, random pictures:

gputerrainas4_med.jpg

gputerrainas2_med.jpg
Ysaneya

Craters and normal maps

Normal maps on the GPU

In the last journal, I was explaining that one of the main benefits of working on the GPU instead of the CPU is the ability to create normal maps. Of course, it would be technically possible to generate them on the CPU too, but it would be far too costly, even with multithreading ( which the engine already supports ).

How does it work ? Well, each time the terrain quadtree gets subdivided ( as the viewer zooms in ), I render the heightmap for the new terrain patch into a temporary F32 2D texture. The shader generates, for each pixel of this heightmap, procedural height values that are a function of the XYZ position on the spherical surface. Then, for each terrain node, I create a unique RGBA8 texture and render the normal map into it by taking the derivatives of the heightmap stored in the previous step's temporary buffer.
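The normal-map pass is essentially a gradient filter; here is a minimal sketch ( heightTex, texelSize and bumpScale are placeholder names, and the rotation of the result into object space by the patch's frame is omitted ):


// central differences on the temporary heightmap give the gradient
float hl = texture2D(heightTex, uv - vec2(texelSize, 0.0)).r;
float hr = texture2D(heightTex, uv + vec2(texelSize, 0.0)).r;
float hb = texture2D(heightTex, uv - vec2(0.0, texelSize)).r;
float ht = texture2D(heightTex, uv + vec2(0.0, texelSize)).r;

// build a normal from the gradient and pack it into the RGBA8 normal map
vec3 n = normalize(vec3((hl - hr) * bumpScale, (hb - ht) * bumpScale, 1.0));
gl_FragColor = vec4(n * 0.5 + 0.5, 1.0);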

The resolution of the per-patch normal map depends on the depth level of the patch in the quadtree. At the top-most level, the map is 1024^2. For levels 1 and 2, it uses 512^2. Finally, for all other levels, the resolution is 256^2.

To maximize precision, I wanted to use lighting in tangent space, but due to various lighting artifacts and the tangent space not matching on the planet's cube faces, I ended up using object-space normal maps / lighting. It looks pretty good except at the very last depth levels ( it looks very noisy ), but I'm not sure yet whether the bad lighting comes from a lack of precision in the XYZ inputs of the planet ( remember: it's using realistic scales, and on the GPU I cannot use fp64 ), a lack of precision in the procedural height generation, or a lack of precision in the object-space normals. Or maybe a combination of the 3...

In the coming week(s) I'll continue to try to increase the precision in the last depth levels, and reimplement diffuse texturing ( still lacking in the following pictures ).

Craters

I've finally been able to produce a decent crater noise function, based on the cell noise ( see previous journal ). Each crater has a perfectly circular shape but is then deformed by a few octaves of fBm, to make it more natural. In the following pics, I'm combining some fractals for the land base and 3 octaves of crater noise. I'm still not 100% happy with it though, as I think the craters' distribution is too uniform, so I will probably continue to play with it a bit.

For the fun, here's the final procedural function that generates all the pics in this journal:


float height(in vec3 world)
{
    // base landscape: an fBm combined with a multifractal
    float land = gpuFbm3D(6, world * 2.0, 2.0, 0.7);
    float land2 = gpuMultiFractal3D(16, world * 5.18, 0.8, 2.0, 0.05) * 0.5 - 2.5;
    land = (land + land2 * 0.5) * 4.0;

    // small fBm used to displace the crater inputs, so craters aren't perfectly circular
    float n0 = gpuFbm3D(10, world * 8.0, 2.5, 0.5) * 0.05;

    // 3 octaves of crater noise at increasing frequencies and decreasing amplitudes
    vec2 c0 = gpuCellCrater3DB((world + n0 * 2.0) * (4.0 + n0 * 16.0) * vec3(1, 1, 1), 3.0, 0.5);
    land += c0.x * 3.0;

    c0 = gpuCellCrater3DB((world + n0 * 1.0) * (16.0 + n0 * 1.0) * vec3(1, 1, 1), 3.0, 0.5);
    land += c0.x * 2.0;

    c0 = gpuCellCrater3DB((world + n0 * 0.5) * (64.0 + n0 * 1.0) * vec3(1, 1, 1), 3.0, 0.5);
    land += c0.x * 1.0;

    return land + 1.0;
}



Pictures

crater6_med.jpg

crater7_med.jpg

crater8_med.jpg

crater9_med.jpg

crater10_med.jpg
Ysaneya
GPU Planetary Generation

Motivation

Until now, the planetary generation algorithm was running on the CPU synchronously. This means that each time the camera zoomed in on the surface of the planet, each terrain node was getting split into 4 children, and a heightmap was generated synchronously for each child.

Synchronous generation means that rendering is paused until the data is generated for each child node. We're talking about 10-20 milliseconds here, so it's not that slow; but since 4 children are generated at a time, those numbers are always multiplied by 4. So the cost is around 40-80 ms per node that gets split. Unfortunately, splits happen in cascade, so it's not rare to have no split at all during one second, and then suddenly 2 or 3 nodes get split, resulting in a pause of hundreds of milliseconds in the rendering.

I've addressed this issue by adding asynchronous CPU terrain generation: a thread is used to generate data at its own rhythm, and the rendering isn't affected too harshly anymore. This required introducing new concepts and new interfaces ( like a data generation interface ) to the engine, which took many weeks.

After that, I prepared a new data generation interface that uses the GPU instead of the CPU. To make it short, I encountered a lot of practical issues with it, like PBOs ( pixel buffer objects ) not behaving as expected on some video cards, or the lack of synchronization extension on ATI cards ( I ended up using occlusion queries with an empty query to know when a texture has been rendered ), but now it's more or less working.

Benefits

There are a lot of advantages to generating data on the GPU instead of the CPU: the main one is that, thanks to the higher performance, I will now be able to generate normal maps for the terrain, which was too slow before. This will increase lighting and texturing accuracy, and make planets ( especially when seen from orbit ) much nicer. Until now, planets seen from space weren't looking too good due to per-vertex texturing; noise and clouds helped to hide the problem a bit, but if you look carefully at the old screenshots, you'll see what I mean.

The second advantage is that I can increase the complexity of the generation algorithm itself, and introduce new noise basis types, in particular the cell noise ( see Voronoi diagrams on wikipedia ).

Another advantage is debug time. Previously, playing with planetary algorithms and parameters was taking a lot of time: changing some parameters, recompiling, launching the client, spawning a planet, moving the camera around the planet to see how it looks, rinse and repeat. Now I can just change the shader code, it gets automatically reloaded by the engine and the planet updates on-the-fly: no need to quit the client and recompile. It's a lot easier to play with new planets, experiment, change parameters, etc..

I'm not generating normal maps yet ( I will probably work on that next week ), and there's no texturing; all the planet pictures you will see below only show the heightmap ( grayscale ) shaded with atmospheric scattering, and set to blue below the water threshold. As incredible as it sounds, normal mapping and diffuse / specular textures are not in yet.

Cell noise

.. aka Voronoi diagrams. The standard implementation on the CPU uses a precomputed table containing N points and, when sampling a 3D coordinate, checks the 1 or 2 closest distances to each of the N points. The brute-force implementation is quite slow, but it's possible to optimize it by adding a lookup grid. Now, doing all of that on the GPU isn't easy, but fortunately there's a simpler alternative: procedurally generating the sample points on-the-fly.

The only thing needed is a 2D texture that contains random values from 0 to 1 in the red/green/blue/alpha channels; nothing else. We can then use a randomization function that takes 3D integer coordinates and returns a 4D random vector:


// hash the integer cell coordinates into the 256x256 random texture, returning
// a 4D random vector associated with that cell
vec4 gpuGetCell3D(const in int x, const in int y, const in int z)
{
    float u = (x + y * 31) / 256.0;
    float v = (z - x * 3) / 256.0;
    return(texture2D(cellRandTex, vec2(u, v)));
}


The cellNoise function then samples the 27 adjacent cells around the sample point, generates a cell position in 3D given the cell coordinates, and computes the distance to the sample point. Note that distances are kept squared until the last moment to save calculations:


vec2 gpuCellNoise3D(const in vec3 xyz)
{
int xi = int(floor(xyz.x));
int yi = int(floor(xyz.y));
int zi = int(floor(xyz.z));

float xf = xyz.x - float(xi);
float yf = xyz.y - float(yi);
float zf = xyz.z - float(zi);

float dist1 = 9999999.0;
float dist2 = 9999999.0;
vec3 cell;

for (int z = -1; z <= 1; z++)
{
for (int y = -1; y <= 1; y++)
{
for (int x = -1; x <= 1; x++)
{
cell = gpuGetCell3D(xi + x, yi + y, zi + z).xyz;
cell.x += (float(x) - xf);
cell.y += (float(y) - yf);
cell.z += (float(z) - zf);
float dist = dot(cell, cell);
if (dist < dist1)
{
dist2 = dist1;
dist1 = dist;
}
else if (dist < dist2)
{
dist2 = dist;
}
}
}
}
return vec2(sqrt(dist1), sqrt(dist2));
}


The two closest distances are returned, so you can build the usual F1 and F2 functions ( e.g. F2 - F1 = value.y - value.x ). It's in 3D, which is perfect for planets, so seams won't be visible between planetary faces:

gpugen4_med.jpg

New planetary features

Using the cell noise and the GPU terrain generation, I'm now able to create new interesting planetary shapes and features. Have a look yourself:

Rivers

"Fake" rivers I'm afraid, as it's only using the ocean-level threshold and they don't flow from high altitudes to low altitudes, but it's better than nothing. When seen from orbit, there is some aliasing, so not all pixels of a river can be seen.

It's simply some cell noise with the input displaced by a fractal ( 4 octaves ):
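Something in this spirit ( this is a guess at the construction for illustration, not the exact function I use; riverFreq, riverDepth and riverWidth are made-up parameters ):


// displace the cell noise input with a small fractal, then carve a channel where
// the two closest distances are nearly equal ( i.e. near the Voronoi cell borders )
float displace = gpuFbm3D(4, world * 2.0, 2.0, 0.5);
vec2 f = gpuCellNoise3D(world * riverFreq + displace * 0.5);
float border = f.y - f.x;                          // ~0 along the cell borders
land -= riverDepth * (1.0 - smoothstep(0.0, riverWidth, border));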

gpugen1_med.jpg

gpugen2_med.jpg

gpugen3_med.jpg

Craters

I've started to experiment with craters. It's a variation of cell noise, with 2 differences: extinction ( a density value is passed to the function and used to kill a certain number of cells ), and the return value: instead of returning the distance, it returns a function of the distance. This function of the distance is modeled to generate a circular, crater-like look.
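Purely as an illustration of what "a function of the distance" can look like ( this is a hypothetical profile, not the actual gpuCellCrater3DB code ):


// d is the distance to the surviving cell point, roughly in [0, 1]
float craterProfile(in float d)
{
    float bowl = smoothstep(0.0, 0.6, d) * 1.4 - 1.0;   // depressed floor rising towards the rim
    float fade = smoothstep(0.6, 1.0, d);               // blend back to the undisturbed terrain
    return mix(bowl, 0.0, fade);
}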

Here's a quick experiment with 90% extinction. The inputs are also displaced with a small fractal:

gpugen6_med.jpg

And here's the result with a stronger displacement:

gpugen5_med.jpg

The next step is to add more octaves of crater noise:

gpugen7_med.jpg

It doesn't look too good yet, mostly because the craters at different octaves are just additively added and not combined properly. More experiments on that later.

Planetary experiments

When adding more octaves and combining different functions together, then adding back atmospheric scattering and the ocean threshold, the results start to look interesting. Keep in mind that all the following pictures are just the grayscale heightmap, and nothing else: no normal mapping, no texturing yet !

gpugen8_med.jpg

gpugen9_med.jpg

gpugen10_med.jpg

gpugen11_med.jpg

gpugen12_med.jpg

gpugen13_med.jpg

gpugen14_med.jpg

gpugen15_med.jpg
Ysaneya

A bit of history

General progress

In the past weeks, I've been concentrating on the new website. Thanks to Amshai and Choub, the layout is pretty much finished, but a lot of components are still not yet integrated ( like the galleries, videos, forums, etc.. ). Choub is in charge of this part, so don't hesitate to cheer him up ! :)

Breslin is filling the new website with content. I'm helping as I can, I've been particularly busy on the integration between Joomla and SMF forums, which is more or less working now. Unfortunately, it doesn't feel "polished" yet, and I'm afraid it'll still take many weeks to bring it to the "professional" level.

I've also been busy on the programming side, on all sorts of various topics. One of them is an experiment to port the procedural generation to the GPU, instead of the CPU. The motivation is that the space-orbit views don't look good enough to me. They seem nice in screenshots, mostly because I choose the best ones to publish, and because there are clouds covering the nasty artifacts.. but the landscape from orbit isn't that great. It is very aliased, and the texturing is too vertex dependent, which means lots of popping when a terrain chunk gets split.

The best way to fix those problems is to generate terrain at the pixel level, but unfortunately that consumes a lot of performance, and it cannot be done practically on the CPU anymore, at least not with the relatively complex heightmaps I need to generate to achieve a good variety for a game such as Infinity. I'm also cleaning the terrain engine for better performance and flexibility, so that 3rd party developers ( once the engine is released ) can use the terrain component in all sorts of projects.

History

There's nothing more frustrating than hearing lies, or at best misinformed people, speaking about Infinity, and how it's been in development "forever", and how it is vaporware.

For example, on "that other forum", somebody mentionned that he had been following Infinity's development since 2002. Wow, I certainly didn't expect that.

I mean, this genius has been following development of a game that didn't even exist in the mind of its creator yet ! How brilliant is that ?

Sometimes I wonder if people consciously lie, or are just terribly confused with some other game, or ... ?

The idea of Infinity was born in 2004. Before 2004, I had been busy on other projects, such as LightSpeed, a high-speed futuristic racer ( ala Wipeout ). Here is a screenshot from it:
LightSpeed

This project died due to the lack of artistic content. I had teamed up with 2 other people, a musician and an artist, but the artist slowly lost interest and/or was too busy with his studies IRL, and so while we achieved a playable prototype ( you could race over a whole lap, but there was no AI or multiplayer ), the project died.

The death of LightSpeed didn't happen overnight. During late 2003 / early 2004, I started to experiment with a spherical terrain engine: the first prototype of Infinity's planetary engine ! But I must point out that I didn't have the idea of making a game yet: it was just a fun experiment, and the engine behind it ( if you can call that an engine ) was DirectX9-based. I didn't work on this very seriously, only a couple of hours here and there.
An image of this first prototype

I learnt a lot of things from this first version of the planetary engine. In fact, the most valuable lesson is that it taught me what to avoid, and confronted me with problems that I didn't suspect ( such as floating-point precision to handle realistic scales ). At the same time, it became obvious that with more work, the technology could work, and that a game could be based on it.. and what better concept for a planetary engine than a space-based game ala Elite ? That day, Infinity was born, and the first line of the engine was written.

The very first backup of Infinity's engine source that I have, which contains only a few tens of files, is dated 24 June 2004. This can effectively be considered the birth date of Infinity's development ( or close to it, by a few weeks ).

During late 2004, I developed the engine basics: the renderer, the scene-graph, the plugins system, etc.. and I continued to experiment with various prototypes ( such as the clouds prototype ).

I started my gamedev.net journal in late 2004, but I only spoke about the engine at that time, I didn't announce the game.

I opened the website in September 2005.
Ysaneya

Lava experiments

General progress

In the past two weeks, I've been working a lot on "polishing" the engine. This means investigating all sorts of bugs and problems that I had noticed, but didn't have the time to fix so far.

For example, I added support for minidumps. When the engine/program crashes, the exception gets intercepted and generates a dump file that contains a stack trace as well as local variable values. It works in release mode too, so there's no performance cost; the only restriction is that I need to keep the .pdb files associated to a release. From a user's point of view, a small dialog box pops up and asks whether a minidump should be saved or not. If agreed, it generates a file called 'minidump.dmp' near the executable. This can be packed together with the log file and sent to me for further investigations.

All future releases of prototypes, ASEToBin, StoryGen etc.. should include this minidump.

Lava experiments

On the "visual" side, I've spent a couple of hours on lava. The effect is "okay" I believe; later on, I plan to add a screen-space heat ( distortion ) effect.

The current shader samples a lava texture and combines 14 octaves at different frequencies. The last layers ( closest to the ground ) are animated, but it's quite subtle. A self-illumination term is calculated and added back additively. However, that alone isn't enough: lava emits light on the surrounding terrain; so to help a bit, I've added some red-ish lighting additively, as a function of the distance above the lava plane.
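A very rough sketch of the idea, with made-up names ( lavaTex, lavaUV, lavaTime, selfIllumWeight ); the real shader differs in its details, but the structure is similar:


// sum 14 octaves of the lava texture; only the last, closest layers scroll slightly
vec3 lava = vec3(0.0);
float freq = 1.0;
float amp = 0.5;
for (int i = 0; i < 14; i++)
{
    vec2 offset = (i > 11) ? lavaTime * vec2(0.01, 0.003) : vec2(0.0);
    lava += texture2D(lavaTex, lavaUV * freq + offset).rgb * amp;
    freq *= 2.0;
    amp *= 0.7;
}

// self-illumination added back additively ( the terrain shader also adds a red-ish
// term that decays with the height above the lava plane )
gl_FragColor = vec4(lava * (1.0 + selfIllumWeight), 1.0);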

With HDRI enabled ( like in those screenshots ), and the glowing effect, it's looking rather nice. I've included day and night shots.

I'm thinking of using a similar shader ( with probably a lot of copy/paste ) for the sun surface effect.

At this point, what is worrying me the most are the z-fighting artifacts between the ocean/lava and the terrain.

I've also created a small terrain editor that allows tweaking the procedural parameters ( those were previously hardcoded ), changing the frequencies, etc..







Ysaneya
In the past weeks, I've written a story / description generator for players, NPCs or even locations. Steve Breslin, our storywriter, has helped design the system, and so will be the one to try to explain it in today's journal ( in better English than I could :) ):

Introduction

As you may have heard, we've been working on a story generator. We're releasing a prototype today, and you can download it here

The purpose of this article is to introduce you to the thing, explain the theory, its intended use, some development notes, and some of the practical stuff -- and then invite you to read the manual and play with it a bit.

As you may know, natural language generation (NLG) is one of those programming concepts around which science fiction has woven circles of mysticism, along with greatly inflated expectations and misplaced hopes. Still, NLG is capable of a lot, if by 'a lot' we mean the NL expression of information according to carefully detailed rules and pre-written natural phrases.

For the purposes of story generation, this means that the computer program does not invent or "make up" the events that happen in the story -- the individual events are pre-written by the human designer. Nor does the program invent phrases to express these events -- those phrases are also pre-written by the designer. The program's job is simply to take these elements, the events and the phrases, and automatically combine them in a way that's presentable -- according to rules defined by none other than, yes, the designer.

So, as you would expect, automatic story generation does not make things easier for the human designer -- this you'll see firsthand if you decide to experiment with it. But once the system is in place, the program can automatically generate thousands of unique stories, by combining pre-defined elements in various ways.

You're probably wondering how the game will use the system, so let me start with that....


Player history

The character creation system for Infinity will give the player basic control over their character's background: which scion they start in, their age and gender, and perhaps some other options to provide a greater level of detail. More specific options have not been finalized, but they may include, for example, a peaceful or tumultuous childhood, social class, perhaps the specific home world, and perhaps the profession which the character pursued up to the beginning of game play.

After selecting the parameters, the player can then "roll" the character, and the history generator designs a story based on the parameters supplied by the player, filling it in with numerous additional historical details. If the player doesn't like the history, he can roll again with the same parameters, or change them however he sees fit, and roll again. The purpose of this setup is to give the player a good level of control over the character's history, supplying a rich back-story, which the system can use to make the history relevant to the game experience.

As a quite separate project, it would be feasible to allow custom stories written by the player, but we would have to give up the idea that the character history might influence game events. In order to make them relevant, we would either have to force the player to write their character's history using a narrow and very picky subset of English, and spend a month writing a basic NL interpreter; or we allow players relatively free rein in describing their character ( while warning them not to expect the game to understand it precisely ), and take a few years to write a powerful NL interpreter. Anyway, for now we're concentrating on automatic generation of a short story for the character.

While the history isn't written by the player, we want it to be a history which the player feels comfortable adopting. So we'll try to incorporate as much variability into the stories as possible. -- Keeping in mind that the player can adjust the parameters and re-roll the story until it is satisfactory.

For example, the character could just have graduated from a university, or is perhaps an up-and-coming pirate, or has had trouble with a criminal organization. The really neat part -- each of these contextual and historical details could possibly influence missions the player engages in later on.

The purpose of this is not, of course, to strongly determine the future of the character, and there aren't character skills (as there would be if the game were an RPG). We want the player to be free to be and do whatever he or she wants, and not limited by character attributes or history. The history will not limit the player's options: the players/characters are always free to cut whatever path they like. Gameplay will be enriched a bit, but the character will not be scarred or pigeonholed by their back-story.

So yes, events that happened in the PC's childhood or history will add interest to the play.

The first effect is a "bonus/penalty" inferred from the history. For example, the character could be born in a rich trader family, so as a "bonus" the player would begin the game with a small hauler (different than another player's starter ship). Of course, for every positive there is a negative, so if you get a better startup ship, you might have a "problem" added to your history, to balance things out -- some bounty hunter after you, or a debt that can only be paid by running a dangerous mission as a "favor" for a creditor.

All of this will also influence your reputation, how other factions see you, etc. Again, this can be easily overturned by in-game action; this is just the initial population of startup parameters. Later in the game, some missions might be triggered and influenced based on your history. So maybe you'll be contacted by an NPC you met in your childhood, who will give you some special missions/jobs: "Ah! I knew your parents!" when doing a mission for a company that your storyline parents worked for; or "We found out who killed your brother..."; you can imagine.

Again, the character will be really determined by what the player does in-game; the history is just a base start, a neat way to encourage player/character empathy early on. If you originate from a dominant Deltan military family, but your heart is in trade and ending up in a Starfold smuggling corp would be great for you, then go for it: no problem. The story does not set limits on player action.


Designing the Algorithm

Figuring out a good NLG program is something like tricking the devil. There are so many pitfalls, so many constraints. You can't just target normal speech, and write an algorithm for that. That's too vague and too wide. The very form of idea and expression must be targeted and designed for automatic generation. It's a delicate marriage of the story concept + the form of expression, balanced with the algorithm we use to make it all happen.

It is technically possible to perform the operation in two passes: first, determine the events of a story; then, follow syntactic rules to give English expression to the events. The second pass requires writing a robust syntax-based NLG engine, which would require many months if not years of work.

Fortunately, it's much easier if each event declares its own NL expression. Thus, we can use much simpler rules to join the pieces of language together. This does not really increase the story designer's work, and it greatly decreases the engine programmer's work.

Our next main question: what is the structural form of the kind of story we'd like to tell, adjusted to the limitations of the algorithm? What we're aiming at is a story form which is going to sound good, and be flexible enough for automation, with multiple story elements which fit well within the form.

We assume we're going to present the story chronologically, so to begin with, the character's history begins in some initial situation -- some preliminary data (say, gender and place of birth). Then, the possible range of "first" events in the story is determined by the initial situation. All subsequent events follow from what comes before, progressing in chronological order (and of course there's an element of fortune).

As a preliminary conclusion, we can say that story elements have preconditions and chronological order.

Thinking about the idea of preconditions, or simply "Conditions," we'll see that it is pretty straightforward. Let's say that the person who is writing events for the story generator has decided to go for an event along the lines of "the player character enters the Deltan military." The designer is probably assuming that the player is born in the Deltan Federation, or at least living in Deltan space by the time of the event. The designer will probably want to arrange things so that this story element does not appear in stories which locate the player character outside of Deltan space. The designer can do this by setting a precondition. The Condition basically says, "this story element is only viable when the player character is in Deltan space." We're constructing the story in order, so the Condition makes sure that each event is logically consistent with the material that came before.

The idea of chronological order is pretty interesting. The simplest version of chronological order would be a system where there's one sentence per chronological event; so for example the first event (and the first sentence in the presentation) tells about the birth planet, the second event/sentence tells about parents and early childhood, the third event/sentence tells about the next highly-specifically defined event-type or concept, and so on. If we did it this way, each 'event' would declare which chronological slot it fits into, and these slots are static -- they're strictly determined ahead of time. The stories would be "randomized," and would be logically coherent (in the sense that, at each point, logical preconditions are not violated), but they would not be particularly variable. This would be acceptable for our purposes, but we decided to kick it up a notch.

The more flexible form we decided upon not only increases the variability of the story, but also makes the algorithm applicable for a wider range of uses (beyond player story generation). Instead of a strictly predetermined form, we allow each element to determine what kinds of elements are permitted to come next. So we assign each story element a type (or "Name"), and then each individual element decides which element types can come next.


NPC history and Planet description

We would like to invite you to use the prototype to write auto-generated stories for NPCs. If, after you play around with the story generator, you are interested in exploring the capabilities of the engine, you could for instance design from scratch a separate 'descriptive' project, such as describing planets.

The prototype includes the main engine and supporting files, an XML datafile which you can use as a template for your story, and a readme file which explains the language of the XML file. There's no reason to get too far into the language in this development journal entry -- please refer to that readme file if you're interested in playing with the generator. However, I would like to mention here the basic concepts at work.

One major point of interest is the scripting language. It is relevant during three phases: the "Operation" phase, which is executed if an element is chosen to be a part of the story; and the "Condition" and "Probability" phases, which determine the possibility and likelihood of any element's inclusion in the story. In the "Operation" phase, you can invent variables for recording events or facts (familyIsWealthy = true) and perform calculations (initialWealth = initialWealth + 40), so that later elements can evaluate these variables in the "Condition" or "Probability" phase.

Another major feature is that you can also assign pieces of text (AKA 'strings') which can be substituted in the pieces of story-text associated with each element. In other words, the story can be parameterized. Say, $birthplanet is a variable string, so you can assign any planet name (birthplanet = "Ourados"), and then as a story element you can write "you were born on $birthplanet"; any time you want to refer back to the birth planet in the story, you can use that variable. (And of course you can evaluate the variable in a "Condition" or "Probability" phase.)

Using precisely the same technique -- but with built-in variables -- you can make your story variable by grammatical person (first, second or third person singular: I, you, or he/she/it).

By way of a second technique, you can enable your story to support automatic pronoun substitution. So, if the same noun (e.g., your father) happens twice within a single sentence or in adjoining sentences, the engine will substitute the second instance with the appropriate pronoun (e.g., he).

The phrases associated with each element are intended to be simple grammatical sentences. This is important because the engine uses special combination rules to conjoin event-phrases in an appropriate way, depending on their "Type." For example, if two elements are both labeled "positive," and the engine conjoins the sentences, the conjunction will be in the 'and' category; if one event is labeled "positive" and the other labeled "negative," then the conjunction will be in the 'but' category. Wherever necessary, you can also override automatic sentence-combination by declaring an element as a mandatory new sentence.


Closing remarks

The system is still in the prototype phase. If you play with it for a while, you will probably come up with some ideas we can use. Please let us know if you discover any capabilities which you think would be particularly useful, or if you find any problems.

In any case, you will soon discover that you need a lot of self-discipline to remember the "flow" of how each element relates to the previous and next, so that it all hangs together when assembled. There's no real cure for that, but don't mind too much; that challenge comes with the territory.
Ysaneya
Meta-server

In the past 3 weeks, I've been busy working on the meta-server. The meta-server is the server on which the client is first connecting. It handles connections to the server clusters ( shards ), patches, authentication, etc.. I won't go deeply in all details, but here are the list of things I've implemented so far:

- authentication
- accounts databases
- registration keys ( for alpha, beta, etc.. )
- statistics
- access lock
- versions checks
- automatic patching ( downloading from mirrors, installing a patch.. )
- warning dialog boxes ( when no connection could be established, or when the drivers are obsolete, etc.. )
- EULA dialog
- security ( signature IDs for transactions / connections )
- bans ( per account or IP address )
- listing shards, getting their description
- logs
- disconnections / reconnections
- busy servers: waiting in a line
- admin rights

Bandwidth optimizations

I've tested and debugged the meta-server by simulating users in different threads, for concurrent access, and at every step I've also verified performance and network bandwidth.

Speaking of network bandwidth, I found a very simple optimization in my RDP ( reliable UDP ) protocol, which allowed me to merge ACK packets into packets that were ready to be sent. I think it saved from 10 to 30% of the total bandwidth, which is excellent for only 5 lines of code :)

Story system

Since last week I've also started working on a procedural story system for players and NPCs. Breslin is helping me formalize the system. It seems to be working quite well so far. More info on it in a future dev journal.

GTA IV

This week, I've also bought GTA IV on the XBox 360. So I've taken the past 4 days as "vacation", more exactly to play GTA IV to death. In addition to being a lot of fun, it's also very interesting and inspirational for the way the missions are handled. I will also come back on this in a future journal.
Ysaneya
Today's a short tip of the day aimed at graphics programmers.

Most people don't realize that bilinear filtering an RGBA8 ( fixed point, 8 bits per channel ) texture results in a fixed point value.

What I mean is extremely simple. RGB8/RGBA8 textures are a standard used commonly everywhere. In shaders, you usually sample a texture like this (GLSL):


vec4 color = texture2D(tex, uv);


If filtering has been enabled on the texture, the values in 'color' are not full fp32 precision. They are still 8 bits.

I blame this on legacy hardware. I have confirmed this behavior on both NVidia and ATI cards, including the modern ones.

Of course, it only becomes visible when you sum up a lot of textures together, or when you scale a value by a huge factor.

I always have a shudder when I think of all the people using lookup tables or scaled textures in RGBA8, and who don't realize what they are doing.

Here is an example. First, a picture of a relatively dark area of a texture, heavily magnified ( needed to demonstrate filtering after all ). There are blocky square artifacts coming from the jpeg compression; ignore them, and just verify in Photoshop or your preferred image editor that the pixel values form a smooth gradient, decreasing by 1 (actually 1/255) between adjacent pixels:

No scaling

Now the pixel shader simply multiplies this pixel by a constant value of 5. Notice how the pixel values now jump in steps of 5:

Scaling x5
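
For reference, the test shader boils down to something like this minimal sketch ( the sampler and varying names are hypothetical ):

uniform sampler2D tex;
varying vec2 uv;

void main()
{
    /// scale the filtered value by 5 so the 8-bit quantization steps become visible
    gl_FragColor = texture2D(tex, uv) * 5.0;
}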

To fix this problem, there are two ways that I know:

1. Perform bilinear filtering yourself: the shader pipeline is fully 32 bits, so while you're sampling 4 texels in 8 bits, the bilinear interpolation will be in full precision. This has a high performance cost.

2. Use a fp16 or fp32 internal format ( at the cost of additional video memory ).

Those two solutions both have a serious cost, either in performance or memory. Why NVidia and ATI haven't implemented bilinear filtering of RGBA8 textures in full precision in hardware yet is beyond my understanding.
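
To make the first fix concrete, here is a minimal sketch of doing the filtering manually in the shader ( all names are hypothetical, and the texture must be set to nearest filtering so you get raw texels ):

/// sample the 4 nearest texels and lerp them in the shader, so the interpolation runs at fp32 precision
uniform sampler2D tex;     // filtering set to GL_NEAREST
uniform vec2 texSize;      // texture size in pixels, e.g. vec2(256.0)

vec4 bilinearSample(vec2 uv)
{
    vec2 texelPos = uv * texSize - 0.5;             // position in texel space
    vec2 f = fract(texelPos);                       // fp32 interpolation weights
    vec2 base = (floor(texelPos) + 0.5) / texSize;  // center of the bottom-left texel
    vec2 texel = 1.0 / texSize;                     // size of one texel in UV space

    vec4 c00 = texture2D(tex, base);
    vec4 c10 = texture2D(tex, base + vec2(texel.x, 0.0));
    vec4 c01 = texture2D(tex, base + vec2(0.0, texel.y));
    vec4 c11 = texture2D(tex, base + texel);

    return mix(mix(c00, c10, f.x), mix(c01, c11, f.x), f.y);
}

The 4 extra samples plus the arithmetic are what make this option expensive compared to a single hardware-filtered fetch.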
Ysaneya
Many people have been asking me how the terrain texturing is implemented, so I'll make a special dev journal about it.

Sub-tiling

The whole technique is based on sub-tiling. The idea is to create a texture pack that contains N images ( also called layers or tiles ), for example for grass, rock, snow, etc..

Let's say that each image is 512x512. You can pack 4x4 = 16 of them in a single 2048x2048. Here is an example of a pack with 13 tiles ( the top-right 3 are unused and stay black ):



Mipmapping the texture pack

Each image / tile was originally seamless: its right pixel column matches its left one, and its top row matches its bottom one. This constraint must be enforced when you're generating mipmaps. The standard way of generating mipmaps ( downsampling with a box filter ) doesn't work anymore, so you must construct the mipmap chain yourself, and copy the border columns/rows so that every level stays seamless.

When you're generating the mipmap chain, you will arrive at a point where each tile is 1x1 pixel in the pack ( so the whole pack will be 4x4 pixels ). Of course, from there, there is no way to complete the mipmap chain in a coherent way. But it doesn't matter, because in the pixel shader you can specify a maximum lod level when sampling the texture. So you can complete the chain by downsampling with a box filter, or fill it with garbage; it doesn't really matter.

Texture ID lookup

Each vertex of the terrain has an associated slope and altitude. The slope is the dot product between the up vector and the vertex normal, normalized to [0,1]. The altitude is also normalized to [0,1].
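
For illustration, computing those two values could look like this minimal sketch ( function and parameter names are hypothetical, not the engine's actual code ):

/// map a vertex to the [0,1] slope / altitude coordinates used to sample the LUT;
/// on a planet, localUp would be the local radial ( up ) direction at that vertex
vec2 lutCoords(vec3 localUp, vec3 vertexNormal, float vertexAltitude,
               float minAltitude, float maxAltitude)
{
    float slope = clamp(dot(localUp, normalize(vertexNormal)), 0.0, 1.0);
    float altitude = clamp((vertexAltitude - minAltitude) / (maxAltitude - minAltitude), 0.0, 1.0);
    return vec2(slope, altitude);
}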

On the cpu, a lookup table is generated. Each layer / tile has a set of constraints ( for example, grass must only grow when the slope is lower than 20° and the altitude is between 50m and 3000m ). There are many ways to create this table, but that's beyond the scope of this article. For our use, it is sufficient to know that the lookup table is indexed by the slope ( the dot product ) on the horizontal / U axis, and the altitude on the vertical / V axis.

The lookup table ( LUT ) is a RGBA texture, but at the moment I'm only using the red channel. It contains the ID of the layer / tile for the corresponding slope / altitude. Here's an example:



Once the texture pack and the LUT are uploaded to the gpu, the shader is ready to do its job. The first step is easy:


vec4 terrainType = texture2D(terrainLUT, vec2(slope, altitude));


.. and we get in terrainType.x the ID of the tile (0-15) we need to use for the current pixel.

Here's the result in 3D. Since the ID is a small value (0-15), I have multiplied it by 16 to see it better in grayscale:



Getting the mipmap level

So, for each pixel you've got a UV to sample the tile with. The problem is that you can't sample the pack directly, as it contains many tiles. You need to sample the tile within the pack, but with mipmapping and wrapping. How to do that ?

The first natural idea is to perform those two operations in the shader:


uv = fract(uv);
uv = tile_offset + uv * 0.25;


( remember that there are 4x4 tiles in the pack. Since UVs are always normalized, each tile is 1/4 th of the pack, hence the 0.25 ).

This doesn't work with mipmapping, because the hardware uses the 2x2 neighboring pixels to determine the mipmap level. The fract() operation kills the coherency between tiles, and 1-pixel-wide seams appear ( which are viewer dependent, so extremely visible and annoying ).

The solution is to calculate the mipmap level manually. Here is the function I'm using to do that:


/// This function evaluates the mipmap LOD level for a 2D texture using the given texture coordinates
/// and texture size (in pixels)
float mipmapLevel(vec2 uv, vec2 textureSize)
{
    vec2 dx = dFdx(uv * textureSize.x);
    vec2 dy = dFdy(uv * textureSize.y);
    float d = max(dot(dx, dx), dot(dy, dy));
    return 0.5 * log2(d);
}


Note that it makes use of the dFdx/dFdy instructions ( also called ddx/ddy ), the derivative of the input function. This pretty much ups the system requirements to a shader model 3.0+ video card.

This function must be called with a texture size that matches the size of the tile. So if the pack is 2048x2048 and each tile is 512x512, you must use a textureSize of 512.

Once you have the lod level, clamp it to the max mipmap level, i.e. the 4x4 one.

Sampling the sub-tile with wrapping

The next problem is that the lod level isn't an integer but a float, which means the current pixel can be in a transition between 2 mipmap levels. This has to be taken into account when calculating the UVs inside the pack. There's a bit of "magic" here, but I have experimentally found an acceptable solution. The complete code for sampling a pixel of a tile within a pack is the following:


/// This function samples a texture with tiling and mipmapping from within a texture pack of the given
/// attributes
/// - tex is the texture pack from which to sample a tile
/// - uv are the texture coordinates of the pixel *inside the tile*
/// - tile are the coordinates of the tile within the pack (ex.: 2, 1)
/// - packTexFactors are some constants to perform the mipmapping and tiling
/// Texture pack factors:
/// - inverse of the number of horizontal tiles (ex.: 4 tiles -> 0.25)
/// - inverse of the number of vertical tiles (ex.: 2 tiles -> 0.5)
/// - size of a tile in pixels (ex.: 1024)
/// - amount of bits representing the power-of-2 of the size of a tile (ex.: a 1024 tile is 10 bits).
vec4 sampleTexturePackMipWrapped(const in sampler2D tex, in vec2 uv, const in vec2 tile,
                                 const in vec4 packTexFactors)
{
    /// estimate mipmap/LOD level
    float lod = mipmapLevel(uv, vec2(packTexFactors.z));
    lod = clamp(lod, 0.0, packTexFactors.w);

    /// get width/height of the whole pack texture for the current lod level
    float size = pow(2.0, packTexFactors.w - lod);
    float sizex = size / packTexFactors.x; // width in pixels
    float sizey = size / packTexFactors.y; // height in pixels

    /// perform tiling
    uv = fract(uv);

    /// tweak pixels for correct bilinear filtering, and add offset for the wanted tile
    uv.x = uv.x * ((sizex * packTexFactors.x - 1.0) / sizex) + 0.5 / sizex + packTexFactors.x * tile.x;
    uv.y = uv.y * ((sizey * packTexFactors.y - 1.0) / sizey) + 0.5 / sizey + packTexFactors.y * tile.y;

    return texture2DLod(tex, uv, lod);
}


This function compiles to roughly 25 arithmetic instructions.

Results

The final shader code looks like this:


const int nbTiles = int(1.0 / diffPackFactors.x);

/// calculateTexturePackMipWrapped returns the wrapped in-tile UV in .xy and the clamped lod in .z;
/// the tile offset is added when sampling below
vec3 uvw0 = calculateTexturePackMipWrapped(uv, diffPackFactors);
vec4 terrainType = texture2D(terrainLUT, vec2(slope, altitude));
int id0 = int(terrainType.x * 256.0);
vec2 offset0 = vec2(mod(id0, nbTiles), id0 / nbTiles);

diffuse = texture2DLod(diffusePack, uvw0.xy + diffPackFactors.xy * offset0, uvw0.z);


And here is the final image:



With lighting, shadowing, other effects:



On the importance of noise

The slope and altitude should be modified with many octaves of 2D noise to look more natural. I use an FbM 2D texture that I sample 10 times, with varying frequencies. 10 texture samples sounds like a lot, but remember that it's for a whole planet: it must look good at high altitudes, at low altitudes and at ground level. 10 is the minimum I've found to get "acceptable" results.
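
As a rough idea of what that looks like, here is a minimal sketch ( the texture name, octave count, frequencies and weights are hypothetical tuning values, not the actual shader ):

uniform sampler2D fbmTex;    // precomputed, tileable FbM noise texture

/// accumulate octaves of noise, each one at a higher frequency and lower amplitude
float fbmNoise(vec2 uv)
{
    float sum = 0.0;
    float amplitude = 0.5;
    for (int i = 0; i < 10; ++i)
    {
        sum += (texture2D(fbmTex, uv).x - 0.5) * amplitude;
        uv *= 2.17;
        amplitude *= 0.5;
    }
    return sum;
}

/// usage: perturb the LUT coordinates before the lookup
/// float n = fbmNoise(terrainUV);
/// vec4 terrainType = texture2D(terrainLUT, vec2(slope + n * 0.1, altitude + n * 0.2));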

Without noise, transitions between layers at different altitudes or slopes look really bad:

Ysaneya

I woke up..

Yesterday's news was obviously an April Fools' joke.. But like last year, a lot of people thought it was true. I must say, posting it on the 31st didn't help, but in the GMT timezone it was near midnight and I didn't want to wait another 12h to post it, which would have been in the middle of April 1st for me.

The werewolves comment was a funny reference to this thread, but you had to follow it to understand it.

Rest assured that I have more backup copies than just 4.

Most of the events that I described were partly true ( yes, including the cops chasing the thieves in the woods at 3 am; that really happened ). The timing was not, though. The probability that all of that happened on the same day, on April 1st, is ridiculously low. I'd have better odds winning the lottery.
Ysaneya
Unfortunately, I have some bad news to announce.

Sometimes, things that you thought would never happen, do happen. Today, I learned it the hard way.

I'm a pretty paranoid developer. I always keep 4 copies of my work, as backups, in case something happens. In the worst case, only a few days or weeks of work can be lost. Unless.. fate decides otherwise.

When I woke up this morning, I had a bad surprise. My home computer's external USB drive, which I'd had for 3 years now, had died. I was storing all the Infinity source code and data files on it. Fortunately, I had a backup. Every few days I transfer that backup to my work computer, which is in a different physical location ( in case my apartment burns down, or a horde of werewolves invades it, or something.. ). My workplace is in Brussels too, but unless a nuclear bomb explodes on Brussels, that should be safe, shouldn't it ? So since I had many backups, I wasn't too worried. Except...

When I arrived at work this morning, I found that everybody was already there, and in a great panic. Apparently, during the night some thieves broke into our offices, and since they weren't too tech-savvy, they didn't steal the most expensive hardware but went directly for our work computers. The bastards! The funniest thing is that, during the night, the office concierge saw some suspicious movements and called the cops. They arrived soon after and chased the thieves into the woods. The thieves probably couldn't escape with all the stolen material, so they threw it away over a fence to run faster. I don't know if they were finally captured or not, but what I do know is that my computer, the one with the backup on it, is badly broken. But fear not! I still have backups. So I still wasn't too worried ( although a bit less confident than the first time; it really smelled fishy ). Except...

I also have a backup on my pocket USB drive. It's only 1 GB, so I don't back up to it as often as to my other computers. A few weeks of lost work: too bad, but not catastrophic either. So I looked in my apartment for the USB key. I was pretty sure I'd had it in hand a few days ago, but couldn't find it.. until I realized it had stayed in my trousers. And I had just washed them yesterday. Guess what ? It didn't survive either.

By now, I was really panicking. But I still had one more chance. One last backup, a .zip archive of the whole project. The one I upload every once in a while to my US server.. in case of a problem ( asteroid fall ? ) in Brussels. So I was saved. The chance that the US server had died during the night was just.. zero.

And I found my file ! Phew. Saved. So I quickly copied it to one of my hard drives ( one that was still alive ), and started to decompress it. A dialog popped up. Please enter the password. Ah, true, it was password protected, of course. But what was the damn password ? No way to remember it; I always choose passwords that are a random mix of weird symbols. But I had noted this password somewhere.. in a file. Where was that file again ?

Then I started to cry when I realized. You see it coming. The file was on my USB drive.

...Tomorrow will be another day, I'm sure it was a nightmare, and I'll wake up..
Ysaneya

Terrain texturing

This journal will cover terrain texturing, and since I haven't taken new screenshots since last time, it's probably going to lack nice images. But I'll explain a bit about the approaches I've taken with terrain texturing so far, the algorithm I finally settled on, and its limitations.

If you usually don't understand the technicalities of my journals, move on, because this one will be particularly tough !


State of the art

Preprocessed textures with detail maps

Textures are generated from real photos, retouched by artists, or pre-processed by other software ( like Terragen ). They are stored in files on disk, usually split into areas, and each area is applied to a part of the terrain. For example, Google Earth, Flight Simulator, etc.. are based on real photos, while other games use artist-made textures.

The size of the world is essentially limited by disk space. Because the resolution isn't good enough for close-up views, most games apply detail textures on top. Old games only had one detail texture ( grayscale ), but more recent ones use a set of colored detail textures.

If I'm not mistaken, this is also the technique used in Far Cry / Crysis.

Tiles

In this approach, artists create small textures, called tiles, that contain all the possible transitions between various terrain types. Many of them are packed together in a bigger texture, a set / pack, for efficient usage. This is quite fast and requires less artistic work, but it's an old technique.

Its main flaws are that it's difficult to apply to a terrain with level of detail, and that it lacks variety because of repetition patterns, although those can be hidden with a large number of tiles.

Think of Warcraft 3.

Texture splatting

With this technique, a set of layers ( grass, rock, snow, etc.. ) are blended together depending on local terrain parameters such as slope / altitude.

Most of the time, the algorithm runs at the vertex level: the cpu computes the parameters, and blending weights are passed to the vertex shader. In the pixel shader, the layers are all sampled and combined based on the weights. For example, with 3 layers (GLSL):


vec4 tex0 = texture2D(grassTex, uv);
vec4 tex1 = texture2D(rockTex, uv);
vec4 tex2 = texture2D(snowTex, uv);
vec4 color = tex0 * weight.x + tex1 * weight.y + tex2 * weight.z;


One huge problem with this technique is its cost: it's not too bad for 3 layers, but the higher the number of layers, the slower it becomes. Imagine 10 layers. Now, imagine that you also need to sample the normal maps; you're now sampling 20 textures.

I will come back to this technique since it is the heart of what I finally chose, with important changes.

Disk space usage is minimal. Since the pixel shading cost depends on the number of layers that must be combined, a good way to optimize this technique is to compute the N most important layers per terrain patch, and only sample those, as sketched below.
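
As an illustration of that optimization ( hypothetical names, not actual engine code ), the pixel shader could receive the 3 dominant layers chosen per patch on the CPU and sample only those:

uniform sampler2D layer0Tex;
uniform sampler2D layer1Tex;
uniform sampler2D layer2Tex;
varying vec3 layerWeights;    // per-vertex weights of the 3 selected layers, summing to 1

/// blend only the 3 most significant layers instead of all of them
vec4 splatTop3(vec2 uv)
{
    vec4 c0 = texture2D(layer0Tex, uv);
    vec4 c1 = texture2D(layer1Tex, uv);
    vec4 c2 = texture2D(layer2Tex, uv);
    return c0 * layerWeights.x + c1 * layerWeights.y + c2 * layerWeights.z;
}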


The old approach

Back in 2005, my first terrain texturing prototype used a derivative of the splatting algorithm. This is, to date, the one still used in the screenshots of the terrain on the website.

The algorithm allocated a unique texture per terrain patch ( let's say 128x128 ), and used the GPU to render to this texture to generate the texturing per slope and altitude, combining N layers.

At this time, I was using 8 layers, and no bump/normal maps. So all it required to work was 8 texture units and pixel shaders 2.0.

There were some real drawbacks though:

- video memory usage: for 1000 patches, a 256x256x4 texture per patch consumes up to 256 MB of video memory. At 128x128 it's only 64 MB, but it started to look more blurry.

- small stalls that depend on how fast the pixel shader is at rendering one texture patch.

- no texture compression is possible, as the textures get computed on the gpu, and the compression is implemented in the drivers on the CPU. Enabling texture compression caused the texture to be downloaded from the gpu to the cpu, compressed, then reuploaded from cpu to gpu, which caused insane freezes.

- because there's a unique texture per patch, when the terrain LOD stops at the maximum depth level ( lowest to the ground ), textures of course cannot get more refined, and you get a blurry mess.

- screen resolutions always increase. At that time, 128x128 might have been okay in 800x600, but it's unacceptable for 1680x1050. 256x256 is better, but still blurry. The next step would be to use 512x512, but this would cost 1 GB of video memory.

- of course, some video memory must remain for ship textures, backgrounds, effects, GUI, effects buffers, etc.. So it isn't realistic to use 100% of the video memory just for terrain texturing. 50% would be a better number.

- no bump/normal mapping. This would require another renderable texture per patch, multiplying the video memory cost by 2 again.

Summary

If I simply reused this technique today, the results would be:
- let's assume a video card with 512 MB of video memory.
- the budget is 256 MB for terrain texturing
- divided by 2 to have bump mapping, so the real budget is 128 MB
- for 1000 patches, this means the highest resolution the patches could be is 128x128.
- clearly this wouldn't look too good in high resolutions (1280x1024, 1680x1050, etc.. )


Experiments

In January 2008, when I reimplemented the terrain texturing, I experimented many ideas.

The basic one is texture splatting, pretty much as everybody implements it.

But, unlike everybody else, I don't have a 10 Km x 10 Km zone to texture. I have a whole planet.

In my early experiments, I found that 10 layers is the absolute minimum per planet to recreate believable variety. The quota is quickly reached for an Earth-like planet, for example: 2 grass, 2 rock, 2 snow, 2 sand, 1 mud, 1 forest. 16 layers would be more comfortable. But let's stick to 10 layers for now.

The first problem is that most video cards below the latest generation ( GF8 ) only have 16 texture units. Working in multiple passes is a sure framerate killer. So how do you do texturing with 10 layers in 1 pass ?

If you want at least some bump mapping, you need 20 texture units. Then you have additional effects that require TMUs too. For example, at the highest quality, shadow maps use 4 TMUs.

So what could be done ? Make a choice:
- goodbye bump mapping
- goodbye special effects (and particularly shadows)
- goodbye variety (reducing the number of layers to 6, which would just fit the 6*2+4 = 16 TMUs).
- goodbye framerate ( going multi-pass and re-rendering 300K triangles per frame; keep in mind that the per-object overhead is higher in I-Novae than in other engines, due to it working in camera space for high precision ).

Clearly, none of this sounds too good either...

Texture sampling explosion

Another problem with the technique described above is that it quickly leads to an explosion of texture sampling instructions. Let's see:

- 10 layers, requires 10 sampling operations.
- with bump mapping, multiply this by 2 -> 20
- UVs must be adjusted to work at any level, from close ground views up to space orbits. For this, you need to sample once for a given UV, adjust the frequency, sample again, and then interpolate ( see the sketch after this list ). The global cost is x2, so we're now at -> 40
- finally, to avoid popping of textures, blending must be done between 2 textures too. The global cost is again x2, so we're now at -> 80.

You read this right: for 10 layers, a proper shader would need to sample our 10 textures 80 times !
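
To make the frequency-adjustment step above more concrete, here is a minimal sketch ( hypothetical names and scale factor ): the same layer is sampled at two UV scales and cross-faded, which is why each layer counts double; the anti-popping blend between two LOD levels doubles the count once more in the same way.

/// sample the same layer at two frequencies and cross-fade between them
vec4 sampleDualFrequency(const in sampler2D layerTex, in vec2 uv, in float freqBlend)
{
    vec4 nearSample = texture2D(layerTex, uv);           // high frequency, for ground-level views
    vec4 farSample = texture2D(layerTex, uv * 0.125);    // low frequency, for high-altitude views
    return mix(nearSample, farSample, freqBlend);        // freqBlend in [0,1], driven by the camera altitude
}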

Volumetric textures

My first problem was how to keep all those features in a single pass within 16 TMUs: The Quest For Reducing The Number Of Texture Units (tm), and, if possible, to reduce the number of sampling instructions.

I quickly thought it would be possible to trick it, by packing all the texture layers together as a "stack" into a 3D ( volumetric ) texture.

It worked.. kinda. The TMU count problem disappeared, but a new one appeared: mipmapping. I thought that by playing with filters, it would be possible to only mipmap in 2D but not in the stack / Z direction, but the hardware doesn't work that way..

Disabling mipmaps ? Say bye bye to your framerate, especially in high altitude views, as all your pixels are high frequency and need to access the volume texture in a random order.

Later I realized that this could be properly implemented ( with good mipmapping and performance ) with texture arrays. But this is a GF8+ extension, so this would mean requiring pixel shaders 4.0. Maybe in a few years..

Texture packs

I was starting to run out of ideas when I realized the layers could be packed together into a huge texture. For example, if all texture layers are 512x512, you can pack 16 of them (4x4) in a single 2048x2048 pack.

Mipmapping wasn't obvious anymore, as mipmaps had to be recreated manually by taking care of the packing and adjacent pixels.

Tiling wasn't obvious anymore, as what you actually need is sub-tiling: tiling UVs within a region of a bigger texture. But this could be faked in the shader..

Mipmapping also required special care, as you need to compute the mipmap level yourself in the shader ( rather than letting the hardware do it ), else you get seams when mipmapping between tiles within the pack.

Fortunately, all of those, while tricky, are relatively inexpensive: a few instructions each. And they are not per-layer but per-pack, so in the end you only need to do them once or twice in the complete shader..

The procedural texturing had to be changed, too. Instead of getting blending weights, I had to sample for a texture ID given a slope/altitude. This ID is then used to compute the UVs for the tile within the pack. This is also a fast operation.

In total, to have mipmapping + bump mapping + 10 or more layers + morphing + UVs at all altitudes, the shader only needs 12 texture samples. Much better than the 80 of the previous approach.

The downside is that it's no longer a pure texturing bottleneck, as all the tricks require arithmetic operations. While each one is cheap, it quickly adds up and becomes expensive.

The final shader is around 300 instructions, and the texturing part only consumes 3 TMUs:
- one TMU for the diffuse texture pack
- one TMU for the bump texture pack
- one TMU for the lookup table (slope/altitude -> layerID).

If you have followed along so far, you will have noticed something else: the LUT gives a single layer ID, so only one layer is used per pixel. This leads to the "sharp edges" in terrain features that many people have noticed and criticized in the latest terrain screenshots.

The LUT can of course be used to store a second layerID, but this multiplies the number of samples by 2. Since the shader is already super slow.. I'm living with this "limitation".
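
If I ever lift that limitation, the blending would look roughly like this minimal sketch ( hypothetical: it assumes the LUT's green channel holds a second layer ID and its blue channel a blend factor, and it reuses the names from the sub-tiling snippet shown earlier in this blog ):

vec4 lut = texture2D(terrainLUT, vec2(slope, altitude));
int id0 = int(lut.x * 256.0);
int id1 = int(lut.y * 256.0);

vec2 offset0 = vec2(mod(float(id0), float(nbTiles)), float(id0 / nbTiles));
vec2 offset1 = vec2(mod(float(id1), float(nbTiles)), float(id1 / nbTiles));

/// two pack samples instead of one, blended by the factor stored in the LUT
vec4 diffuse0 = texture2DLod(diffusePack, uvw0.xy + diffPackFactors.xy * offset0, uvw0.z);
vec4 diffuse1 = texture2DLod(diffusePack, uvw0.xy + diffPackFactors.xy * offset1, uvw0.z);
vec4 diffuse = mix(diffuse0, diffuse1, lut.z);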


The future ?

The terrain shader is by far the most complex and tricky shader I've ever written, but there's still a lot of room for optimization. I already have ideas to save 20-30 instructions in the shader without too much effort. By optimizing even more, I'm confident I can get it down to 250 instructions. That's still a lot, so in high resolutions ( and even not-so-high ones ), the game tends to be heavily pixel-shader limited.

This is of course assuming max quality settings. The shaders all have pre-processing directives, so complete features / effects can be disabled to save performance on slower video cards.

I have no plans to continue working on the terrain shader in the short term. Maybe in 1 or 2 years I will come back to it, especially as more and more video cards become pixel shader 4.0+ capable; the texture arrays approach would allow this shader to be reimplemented in a much more efficient way. Combine that with the rise in video card power, and it is likely that in 2 years, a texture-array shader on a next-gen card would be 2 or 3 times faster than what I currently have on my GF8.

If people are interested in all those tricks for the shader, I can post some snippets.

In a future dev journal, I will also come back to noise, and how I need to combine 10 octaves per pixel ( so 10 additional texture samples ) in order to make the features look more natural. I'll also say a few words about geometry ( procedural heightmap generation ), clouds and water.



Ysaneya
No way in hell I could have done that manually, so I asked OrangyTang from gamedev.net for his Python script that collects all the pictures posted in gamedev.net's journals, and adapted it for the contributions forum.

The result is a mega thread of 25 pages, each containing hundreds of thumbnails of pictures that have been posted in the contributions forum. That's around 12000 images !

The directory with all the pages can be seen here, sorted by time, so the more recent pages are in the 20-30 range:
http://www.infinity-universe.com/Infinity/Contribs/

Clicking on a thumbnail will lead to the thread it has been posted in.

The Python script used to produce those images is also in the above directory. I had to adapt it to parse phpBB forums, ignore inaccessible/deleted threads, ignore avatars/signatures/smileys, etc.. I also made it multi-threaded (20 threads, yay!) so that it processes faster.
Ysaneya

Terrain engine

Nitpicking

- clouds don't have a coriolis effect
- no storm effect
- motion blur / bluriness in starfield
- clouds have patterns
- terrain too blue
- atmosphere too thick
- over saturation to white
- areas too sharp / contrasted
- terrain only based on altitude / too randomized
- texture pack for ship is too low contrast / flat
- jagged / aliased shadows
- too strong stars reflections
- lack of bloom
- star field edges are visible
- only one type of clouds

I'm not complaining, just noting. I'm getting more and more scared of posting updates. The amount of anticipation, hype and expectation is rising, and honestly, while many of those remarks are valid and will be fixed, many of them are just not on my todo list.

Take the comment about jagged shadows for example. I've explained at great length in past dev journals that a technical choice had to be made between shadow resolution / aliasing, shadowing range and performance. If you increase the range and keep the same number of shadow maps, you'll get more aliasing. If you increase the shadow resolution or the number of shadow maps to decrease the aliasing and keep the same shadowing range, you'll hurt performance ( which is already quite low ).

It's a bit annoying as a programmer to say "this can't be fixed" or "I don't have more time to spend improving that", but really, I have to make progress on the game itself.. All I'm saying is: nit-pick as much as you want, but don't expect everything to be perfect at release time.

Screenshots time

Sorry for the lack of anti-aliasing, I forgot to enable it before taking those pictures, and I didn't want to spend another half an hour just to take a new set.

Behold, lots of pictures today !

Terrain texturing, sun shafts / god rays, vegetation ( not procedural, modeled and textured by SpAce ), etc...
