
Member Since 11 Apr 2005
Offline Last Active Apr 18 2016 02:32 PM

#5036314 Writing to Render Target from Itself

Posted by spek on 25 February 2013 - 07:08 AM

You can read while writing (at least on all my nVidia cards), but when you grab neighbor pixels around your current location (pixels that might be processed at the same time), you'll get artifacts (read: weird colors). So for blur effects that typically sample a region around a source pixel, I'd say play ping-pong between two render targets.
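To make the ping-pong idea concrete, here is a minimal CPU-side sketch in Python. It assumes a 1D row of values stands in for the render target, and `blur_into` and `ping_pong_blur` are illustrative names, not part of any graphics API. Each pass reads only from one buffer and writes only to the other, then the two swap roles.

```python
def blur_into(src, dst):
    """3-tap box blur: reads only from src, writes only to dst."""
    n = len(src)
    for i in range(n):
        left = src[max(i - 1, 0)]        # clamp at the borders
        right = src[min(i + 1, n - 1)]
        dst[i] = (left + src[i] + right) / 3.0


def ping_pong_blur(pixels, passes):
    """Alternate between two buffers so a pass never reads what it writes."""
    read_buf = [float(p) for p in pixels]
    write_buf = [0.0] * len(read_buf)
    for _ in range(passes):
        blur_into(read_buf, write_buf)
        read_buf, write_buf = write_buf, read_buf  # swap roles
    return read_buf


print(ping_pong_blur([0, 0, 3, 0, 0], 1))
```

On the GPU the same pattern would bind texture A as input and texture B as the render target, then swap them for the next pass.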

#5035771 Starting with GL Instancing, advice?

Posted by spek on 23 February 2013 - 09:37 AM

Hey, I'm thinking about implementing Instancing, but before doing so, I need some general advice.


So far, most of the rendering is done by looping through (sorted) lists and calling a VBO for each object. In some other cases, like drawing a simple sprite, I still use the good old "glVertex3f(...) <4 times>" method. And then for particles, I use a VBO containing many particles that can be updated with OpenGL Transform Feedback.


It makes sense that using Instancing helps when rendering lots of the same object (but at a different position, or with slight differences). But in practice, most of my objects only come in small numbers. 3 boxes, 5 decals, 1 barrel, et cetera. Does it still make sense to implement instancing, or does it actually slow things down with overhead? Or to put it differently, when rendering a single quad, is instancing equal to or faster than doing it the old 4x glVertex3f way?



Second, some of my objects are animated, and use Transform Feedback to let a Vertex Shader calculate the skinned vertex positions. Each animated model would have its own secondary VBO to write back the transformed vertices. This takes extra memory of course, then again the amount of animated objects is very small compared to the rest. I'm guessing this technique does not co-operate with instancing in any way, right?



Third. My meshes use a LOD system. So when getting further away from the camera, they toggle to a simplified mesh. Is it possible to put all LOD models (usually I have 3 to 5 variants) in a single buffer and let the Instancing somehow pick the right variant depending on the distance? Or would using LODs be less necessary anyway?



Any other things I should keep in mind when working with Instancing?


Merci beaucoup


#5032689 Performance & Struct sizes

Posted by spek on 15 February 2013 - 09:38 AM

All right, seems I'm a bit too concerned then hehe, other than that the struct size should be divisible by 4 on a 32-bit system, to put it simply. So whether a struct is 28 or 32 bytes wouldn't matter in that case.


Yet, I've read (a long time ago) that a programmer chose a specific (small) struct size in complex algorithms such as pathfinding, so it could be easily "cached"... I know why things are cached, but forgot exactly how it worked.



Delphi is not padding all structs in my case, since I specifically tell those structs to be "packed" in some cases. Usually in cases where the struct is also stored in a binary file to keep things compact and easier to predict.
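As a rough illustration of both points (packing and caching), here is a small Python sketch. The record layout is hypothetical, and the 64-byte cache line is an assumption that holds for many current CPUs but not all; the array is also assumed to start on a cache line boundary. It compares a packed layout with a natively aligned one, and counts how many elements of an array cross a cache line for a given struct size.

```python
import struct

# Hypothetical record: a 1-byte flag followed by a 32-bit integer.
packed_size = struct.calcsize("<bi")  # "<" disables padding, like a packed record
native_size = struct.calcsize("@bi")  # "@" uses the platform's native alignment

CACHE_LINE = 64  # assumed cache line size in bytes


def straddling_elements(element_size, count, line_size=CACHE_LINE):
    """Count array elements that cross a cache line boundary.

    Assumes the array itself starts on a cache line boundary.
    """
    crossing = 0
    for i in range(count):
        first_byte = i * element_size
        last_byte = first_byte + element_size - 1
        if first_byte // line_size != last_byte // line_size:
            crossing += 1
    return crossing


print("packed:", packed_size, "native:", native_size)
print("28-byte elements crossing a line:", straddling_elements(28, 16))
print("32-byte elements crossing a line:", straddling_elements(32, 16))
```

Under these assumptions, a size that divides the cache line evenly never splits an element across two lines, which is one plausible reason to pick a specific struct size for tight loops. Whether it matters in practice depends on the access pattern, so it is worth measuring.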



#5028123 Trying to understand Lighting general.

Posted by spek on 02 February 2013 - 10:41 AM

I don't know the DirectX terms exactly, but indeed, find an example that does "MRT" (multi render target) so you can draw an object into multiple buffers. I'm pretty sure that nVidia SDK has a lot of examples on that. Once you have set that up, try to render several pixel attributes into your textures. For example:


* texture1: rgb = pixel diffuse color  a = specular term

* texture2: rgb = pixel world position a = specular glossiness

* texture3: rgb = pixel normal   a = ...?


Note this is just an example. You can compress data such as the normals and positions to get more available for other attributes. But this would be an easy start. Also note that this step does not involve any lighting so far.



>> blending

Yes. Compare it with Photoshop & layers. In the first layer, you have your scenery with its diffuse colors - no lights yet. On a second layer, you draw a red circle that represents a red point light. Set the layer blending mode to "additive" or "light up", or whatever it's called in PS. Then on layer 3, you can make another lamp, and so on. Finally, merge all light layers and multiply the result with the first layer. It's not exactly the same, but pretty close.
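The layer analogy boils down to simple per-pixel arithmetic. A minimal Python sketch for a single pixel, with illustrative function names and a clamp to 1.0 added as an assumption (an HDR pipeline would tone-map instead):

```python
def add_lights(light_layers):
    """Additively blend per-pixel light contributions, each an (r, g, b) tuple."""
    total = [0.0, 0.0, 0.0]
    for layer in light_layers:
        for channel in range(3):
            total[channel] += layer[channel]
    return tuple(total)


def shade(diffuse, light_layers):
    """Multiply the summed light with the diffuse color, clamped to 1.0."""
    light = add_lights(light_layers)
    return tuple(min(d * l, 1.0) for d, l in zip(diffuse, light))


# Gray surface lit by a red light and a dimmer green light.
print(shade((0.5, 0.5, 0.5), [(1.0, 0.0, 0.0), (0.0, 0.5, 0.0)]))
```

In a deferred renderer the diffuse color would come from the first MRT texture and each light would be one additive pass over the screen.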


Blending is not too fast, though additive blending is pretty simple, and I wouldn't worry about the performance unless you want LOTS of lights and/or target lower end hardware. But once you master Deferred Lighting, you could pick up Compute Shaders, which allow you to do all lights in a single pass, applied on smaller tiles on the screen. The Battlefield3 Frostbite engine has a nice paper that explains this "Tiled Deferred lighting". But anyhow, that's a later concern.

#5016673 Quality of a sprite based game

Posted by spek on 02 January 2013 - 07:00 AM

A Tower Defense game. I would just make it 2D. Because you will likely reach your goal faster (= making the game), it likely runs better on mobile devices, and most importantly, drawn top-view walking characters or towers likely look better than (low-poly) 3D models. Just compare old RTS games (Command & Conquer, ...). I found the drawn sprite ones look better than their 3D counterparts. Because sprites can show any complex drawing, while with low-poly models, you are stuck with more simplistic shapes. Same with lighting & animations.


I don't know what you are using right now, but OpenGL can draw transparent quads (sprites) just as well. Nor are you forced to use all kinds of complex shaders or lights. In other words, with OpenGL, you can do what any 2D engine can do as well. So if you like the way GL works, you can still decide to go for GL of course. And since GL works on those mobile devices as well, it's not such a crazy choice.

#5016629 Quality of a sprite based game

Posted by spek on 02 January 2013 - 04:06 AM

A: Yes, it can look good. B: What exactly are you trying to make? And C: what other alternatives are you thinking about? I'm asking so we can compare for your particular situation.


The quality depends, of course, on drawing skills. But also on the chosen style and consistency of the graphical content. When choosing sprites, don't expect to make photo-realistic graphics. For cartoonish graphics though, it can work out really well. In fact, making your world, characters and animations look cartoonish is easier to achieve with sprites. Because the shapes are more curvy / organic compared with polygon objects, and you have more freedom with the animations. Not that it's impossible to do with 3D, but that requires high skills.


Speaking of skills, when choosing 3D, the overall difficulty to make things look *good* is higher. So unless you have plenty of time to create a "next-gen" engine + the artists that can make the content, a 3D engine will quickly look outdated when people compare it to Crysis, Unreal, or whatever we've got. This is less of a problem when going for 2D. At least... if you can pick a specific, consistent style. And since you also aim for mobile devices, you can't make too complex 3D stuff anyway.



As for the technical aspect of making things look good, consider lighting as one of the most important factors. In 3D, there are a lot of options to simulate somewhat realistic lighting. Shadowmaps, geometric shapes, normal/specular mapping, (baked) GI, and so on. With sprites, this is a bit harder to achieve because your scene is literally flat, and thus lacks information you would need for proper lighting. Although a cartoonish style may not need realistic "correct" lighting, you may still need some tricks to get it to look cool instead of flat and dull.


This is probably the reason why platform games are often semi-3D. Using a mixture of 3D shapes and sprites. Don't forget you can still use OpenGL for a 2D looking game. But as said, making a 3D engine requires more work in general. So if you don't really need it...




#5014362 Voxel Cone Tracing, more drama

Posted by spek on 26 December 2012 - 02:49 AM

I think mipmapping is so damn slow because it goes through all pixels, several times, for 6 textures. Replaced it with a manual shader now. It injects the voxels again (as points), so those points perform a simple box filter only at the places where it should be. The results are slightly different than with mipmapping (can't say worse or better, varies a bit), probably because my shader is a bit different and because the plotting & sampling coordinates aren't 100% the same. Getting those right is a bitch with Cg and 3D textures. I haven't tried to skip the voxel injection and perform a simplified mipmapping yet... I can render one long horizontal quad (as a 2D object) to catch all layers at once. Far fewer draw calls, but more useless pixels to filter.
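A minimal CPU-side sketch of that "filter only where there are voxels" idea, in Python. It assumes the voxels are kept in a dictionary keyed by integer grid coordinates (a hypothetical layout, the real version renders points into a 3D texture), and that empty cells count as zero in the 2x2x2 box filter:

```python
from collections import defaultdict


def downsample_sparse(voxels):
    """Build the next coarser mip level from occupied cells only.

    voxels maps (x, y, z) -> value. Each occupied cell adds 1/8 of its
    value to its parent cell, which equals a 2x2x2 box filter where
    empty cells contribute zero.
    """
    parents = defaultdict(float)
    for (x, y, z), value in voxels.items():
        parents[(x // 2, y // 2, z // 2)] += value / 8.0
    return dict(parents)


level0 = {(0, 0, 0): 1.0, (1, 0, 0): 1.0, (2, 0, 0): 0.5}
print(downsample_sparse(level0))
```

The cost scales with the number of occupied voxels instead of the full texture volume, which matches the reasoning above for why re-injection can beat a full mipmap pass.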


Anyhow, the framerate rose from 3 to 15 fps, which is not bad at all for my old nVidia 9800M craptop card! For the info,

- framerate was already pretty low due to lots of other effects (somewhere around ~24)

- GI effect includes an upscale filter that brings the 1/4 GI buffer back to full size, polishing the jagged edges

- Only 1 grid used so far (128^3 texture, each pixel covering 25 cm3)

- 9 diffuse rays, 1 specular ray

- With VCT, the framerate was ~5. Both the construction & the raymarching go a lot faster with simple texturing


And more important, the results finally look sort of satisfying. Maybe I can show a Christmas shot today or tomorrow hehe. Yet I still think a second bounce is needed if you really want to let a single light illuminate a corridor "completely". But more important for now is to implement the second grid first. And to make some baking options so that the produced GI can be stored per vertex or in a lightmap for older videocards that can't run this technique realtime properly. That would also allow to bake a first bounce (with static lights only) and do a second bounce realtime...




Right now some of the corners appear as brighter spots in the result. Probably because they got a double dose of light indeed. But summing & averaging... For example, I have 2 RED voxels being inserted in the same pixel. One faces exactly in the +X direction, another only a little bit. The injection code would look like this:

// Additive blending is enabled for this pass
float3 ambiCube;
ambiCube.x = dot( float3( 1, 0, 0 ), voxelNormal );
ambiCube.y = dot( float3( 0, 1, 0 ), voxelNormal );
ambiCube.z = dot( float3( 0, 0, 1 ), voxelNormal );
ambiCube = abs( ambiCube );
// Insertion: weight the lit voxel color by how much the voxel faces +X
if ( voxelNormal.x > 0 )
    outputColor_PosX.rgba = ambiCube.xxxx * float4( voxelLittenColor.rgb, 1 );
// ...and so on for the 5 other directions

So, the result could be rgba{1,0,0,1} + rgba{ 0.1, 0,0, 0.1 } = rgba{ 1.1, 0,0, 1.1 }


When dividing by an integer count (2 in this case), I get a dark result. If the other voxel had been rotated slightly further (not contributing to the +X axis at all), the result would have been bright red though. Dividing by its own occlusion sum (1.1) would give a correct result in this particular example, but not if I had inserted only the second voxel. In that case it would get too bright: rgba{0.1, 0,0, 0.1} / 0.1 = rgba{1, 0, 0, 1}


That's why I couldn't find a good way yet. I had the same problem with plenty of other similar GI techniques btw (LPV for example). In case the voxels are as big as your cells, I would just use Max filtering instead of averaging. But that doesn't work too well when inserting the voxels in a much coarser grid.
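The numbers from this example can be reproduced with a few lines of Python. This is only a CPU-side illustration of the arithmetic for the +X texture; `inject` and `divide` are made-up helper names, and the weights 1.0 and 0.1 are the two facing factors from the example.

```python
def inject(weights, color=(1.0, 0.0, 0.0)):
    """Additively blend red voxels into one +X texel; returns (r, g, b, a)."""
    r = g = b = a = 0.0
    for w in weights:
        r += w * color[0]
        g += w * color[1]
        b += w * color[2]
        a += w          # alpha accumulates the occlusion weight
    return (r, g, b, a)


def divide(texel, divisor):
    return tuple(c / divisor for c in texel)


both = inject([1.0, 0.1])            # roughly (1.1, 0, 0, 1.1)
by_count = divide(both, 2)           # too dark: red is about 0.55
by_alpha = divide(both, both[3])     # looks right: red is 1.0
only_second = inject([0.1])
second_by_alpha = divide(only_second, only_second[3])  # too bright: red is 1.0

print(both, by_count, by_alpha, second_by_alpha)
```

Neither divisor works for every case, which is exactly the dilemma described above: the count ignores how much each voxel faces the axis, and the alpha sum erases the fact that a single grazing voxel should stay dim.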



Thanks for helping,


#5013972 Voxel Cone Tracing, more drama

Posted by spek on 24 December 2012 - 11:10 AM

I've been making 3D textures with the world directly injected into them as well, using 3 grids. So I can compare with VCT. I did that a few times before, although I didn't really make use of "cone sampling" concept, leading to serious undersampling issues. Instead I just fired some rays on a fine grid, and repeated the whole thing on a more coarse grid and lerped between the results based on distance.

However, I'm running into some old enemies again. Probably you recognize those (and hopefully fixed them as well :) ).

* MipMapping
Probably I should do it manually, because when simply calling "glGenerateMipmap( GL_TEXTURE_3D )", the framerate dies directly. Instead I could loop through all mipmap levels, and re-inject all voxels for each level. Injecting is more costly, but there are way less voxels than pixels in a 128^3 texture (times 6, and 2 or 3 grids).

* Injecting multiple voxels in the same pixel
The voxels are 25 cm3 in my case, so when inserting them in a bigger grid, or when thin walls/objects are close to each other, it happens that multiple voxels inject themselves in the same pixel. Additive Blending leads to too bright values. Max filtering works well for the finest grid, but does not allow a cell to be partially occluded (for example, you want at least 16 voxels to let a 1m3 cell fully occlude).

I should be averaging, eventually by summing up the amount of voxels being inserted in a particular cell (thus additive blend first, then divide through its value). But there is a catch, the values are spread over 6 directional textures, so it could happen you only insert half the occlusion of a voxel into a cell for a particular side. How to average that?

* edit
Still superslow due to my lazy mipmapping approach so far, but the results look much better than what I had with VCT. Indeed no banding except for specular reflections using a very narrow cone. And it seems the light spreads further as well. Yet, fixing the problems stated above will become a bitch. As well as the occlusion problem. Making the walls occlude as they should blocks light in narrow corridors. Reducing the occlusion on the other hand gives leaks. I guess the only true solution for that is using more, narrower rays. I'm curious what the framerate will do. If it's higher than with VCT, I can spend a few more rays maybe, though I'm more interested in adding a bounce eventually.

As for the limited size, right now I'm making 2 grids. One 128^3 texture covering 32 m3 (thus 25cm3 per pixel), and a second grid covering 128 m3 (thus 1m3 per pixel). Far enough for my indoor scenes mostly. Outdoor scenes or really bigass indoor areas should switch over to the coarser grids. Well, having flexible sizes is not impossible to implement, we could eventually fade over to a larger or smaller grid when walking from one area into another. May lead to some weird flickers during transition though...
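The grid numbers follow directly from extent divided by resolution. A small Python sketch of that arithmetic, plus a hypothetical `pick_grid` helper that assumes each grid is a cube centered on the camera (the post does not say how the grids are positioned):

```python
def cell_size_m(grid_extent_m, resolution):
    """Edge length of one texel in meters."""
    return grid_extent_m / resolution


# (extent in meters, texture resolution), ordered fine to coarse
GRIDS = [(32.0, 128), (128.0, 128)]


def pick_grid(distance_m, grids=GRIDS):
    """Return the finest grid that still covers a point at this distance.

    Assumes camera-centered cubes, so a grid covers half its extent in
    each direction. Returns None when no grid reaches that far.
    """
    for extent, resolution in grids:
        if distance_m <= extent / 2.0:
            return (extent, resolution)
    return None


print(cell_size_m(32.0, 128), cell_size_m(128.0, 128))
print(pick_grid(10.0), pick_grid(40.0), pick_grid(100.0))
```

Fading between the two grids near the boundary, instead of a hard switch, would be one way to soften the transition flicker mentioned above.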

Merry Christmas btw!

#5013629 Voxel Cone Tracing, more drama

Posted by spek on 23 December 2012 - 04:15 AM

Thanks again you both.


Indeed, the mipmapping issues are giving the tiled look I think. It will be hard to solve, but eventually I'll find something for that. I've also been thinking about scrapping the whole brick idea (which is already problematic on my somewhat older hardware) and falling back on 3D textures covering the world. Like you described earlier. But, instead of raymarching through the textures (thus sampling 3 textures each step), I could still keep using the VCT octree to see if there is anything useful to sample at a certain point. So, a hybrid solution. I think sampling textures is actually faster than traversing an SVO all the time, but it might be worth a try. Especially if the opacity calculation becomes more complicated (see below).


If I may ask, how do your textures cover the world (texture count, resolution, cubic cm coverage per pixel)? I suppose you use a cascaded approach like they do in LPV, thus having multiple sized textures following the camera. Do you really mipmap anything, or just interpolate between the multiple textures?




Anyhow, you said your solution didn't show banding errors. How did you manage that? Even with more steps and only using the finest mipmap level (which is smooth as you can see in the shots above), banding keeps occurring because of the sampling coordinate offsets. In the shots above, you'll see the banding on the wall. The finest brown bands on the wall are sampled from smoothed bricks (mipmap level 0). For the info:

* the smallest octree nodes are 25 cm3

* the ray takes about 4 steps in each node (slightly less, as the travel distance increases each step, depending on the cone angle)


Of course, there still could be a bug in the sample coordinates, but I'd say bandless sampling is just impossible. At least not in the way how I push the ray forwards.




>> Try to remove the opacity calculatuon

Good idea, and just did it. Didn't do any blocking at all, just to see if those dark T-junction corridor parts would catch light now. To exclude eventual other bugs. And... yes! A lot more light everywhere. As you say, the corridors quickly close in, blocking light on the higher mipmapped levels. But, just removing the opacity also leads to light leaks (and much longer rays = slower) of course.


Unless there is another smart trick for this, the only way to fix this is by providing more info to the voxels. In my case, the environment typically consists of multiple rooms and corridors close to each other. Voxels could tell from which room they are. So if the ray is suddenly sampling values coming from another room while the occlusion factor is already high, you know you probably skipped a wall. But yet, this sounds like one of those half-working solutions.



>> The red carpet receives some reflected light from the object in the center

True, but would that really result in that much light? Maybe my math is wrong, but each pixel in my case launches 9 rays, and the result is the average of them. In this particular scenario, the carpet further away may only hit the object with 1 or 2 rays, while the carpet beneath it hits it many more times. In my case the distant carpet would probably either be too dark, or the carpet below the object too bright. In the shot the light spreads more equally (realistically) though.



@Frenetic Pony

Thanks for the OpenGL demo, though it doesn't run on this computer hehe. I tried to dig out the shaders, but aside from some common and mipmapping shaders, I couldn't find the ray marching part.


Unreal4's GI lighting solution probably isn't purely VCT indeed, but all in all, it seems to be good enough for true realtime graphics. One of the ideas I have is to make a 2 bounce system. Sure, my computer is too slow to even do 1 bounce properly, but I could make a quality setting that toggles between 0, 1 or 2 realtime bounces. In case I only pick 1, the first bounce is baked (using the static lights only) into the geometry. Not 100% realtime then, but then again, none of the solutions in today's games are. A supercomputer could eventually toggle to 2 realtime bounces.


And yes, some extras like SSAO, secondary pointlights or manually overriding the coloring of some parts should stay there. In the horror game I'm making, we don't always necessarily want realistic lights! Horror scenarios often have large contrasts between bright and dark.


Not sure if I get the Pixar method right... You mean they just add some color to all voxels in the scene?



#5013093 Voxel Cone Tracing, more drama

Posted by spek on 21 December 2012 - 05:41 AM

Oh, one easier to answer little side question... I noticed the performance would suffer quite a lot when choosing a different struct size for my Octree nodes (each node is stored in a VBO). Not sure what I did, I believe reducing the size from 128 bytes to 64 or something. I thought that would make things a bit faster, but the performance actually dropped a lot. Maybe that was a bug elsewhere but anyway: what is a desirable struct size for the GPU? Right now my voxels are 64 bytes, octree nodes 128 bytes (using some empty filling to make it 128b).

#5013068 Voxel Cone Tracing, more drama

Posted by spek on 21 December 2012 - 04:01 AM

Sorry to keep hammering on "Voxel Cone Tracing" with a looong post, but since I'm pretty close (I think), I want to finish it once and for all.
It works more or less now. I can sample GI and specular light from a mipmapped octree. But, it just doesn't look good. Some parts do, some absolutely not. The 5 major issues (besides performance):
  • Light doesn't spread that far (1 bounce), especially not in the narrow corridors that I have
  • Light spreads unevenly. Incoming colors & strength vary too much.
  • Messy colors
  • Banding artifacts (see previous post)
  • Grainy result due to low input resolution & jittering blur
However, when looking at Unreal4 or Crassin's results, I believe better results should be possible. So I'm basically curious if someone with experience can point me to the cheats or critical implementation parts. In fact, if you have the interest & time, I would even invite you to help setting up VCT for our game "Tower22". I've spent too many hours on GI the last year(s), the pain has to end!
Let's walk through the issues. But first, there might be bugs in my implementation that contribute to these errors. Then again, ALL raymarch / 3D-texture related techniques I tried so far are showing the same problems, not VCT in particular. So maybe I'm always making the same mistakes.
1- Light doesn't spread that far
In Tower22, the environments are often narrow corridors. If you shine a light there, the opposite wall, floor or ceiling would catch light, making an "area" around a spotlight. But that's pretty much it. Not that I expect too much from a single bounce, but in Unreal4, the area is noticeably affected by incoming light. Even if it's just a narrow beam falling through a ceiling gap. The light gradually fades in or out, it doesn't just pop in a messy way on surfaces that suddenly catch a piece of light.
Or how about this:
Assuming the light only comes from the top-left corner, how do the shadowed parts of the red carpet receive light if they only use 1 bounce?? The first bounce would fire the light back into the air, explaining the ceiling, spheres and walls receiving some "red" from the carpet. But the floor itself should remain mostly black.
Unless they use 2 bounces, or simply another trick (AO / Skylight?) in addition. Going to 2 bounces sounds like a must to me, but first I want to make sure I'm doing everything right in the first bounce. And buy a much faster computer.
2- Unequal light spread
This is killing the quality. It could be code errors, although I don't see bugs in particular when looking at the specular results. Meaning the rays collide at the correct points, also in the higher mipmaps. 
With GI though, it just becomes messy. This has to do with banding errors (see #4), but I suspect more is going on. For the info, I'm using 9 rays for GI. The cone angle is adjustable. When using a different cone, the results are different as well, but not "worse" or "better". Just different. I think the cones shouldn't be too wide (in narrow environments, see pic above), or your rays will quickly stop halfway down the corridor as they already collide. Making them narrow on the other hand gives undersampling issues, so 9 rays become too few.
When you look at some of my shots, you can clearly recognize tiles. I mask that a bit by varying the sample directions randomly a bit, and by blurring afterwards. But still, you can pick them out easily (not good!). This has to do with mipmapping problems, see picture. The higher mipmapped levels aren't always smoothed, making them look MineCrafted. This is because when I sample from the brick corners in their child nodes, those locations may not be filled by geometry at that exact location (thus black pixels). If so, the mipmapper samples from the brick center instead to prevent the result turning black. Well, difficult story.
The badness becomes far more visible when I show AO instead of colors. AO is then simply based on the average distance the 9 rays traveled. But it just varies for each location. I'm guessing some of the rays slip through walls, giving the maximum travel distance. It's nowhere close to the smooth result Crassin showed.
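To show how sensitive a distance-based AO term is to a single leaking ray, here is a tiny Python sketch. The function name and the 10 m maximum distance are assumptions for illustration; the post only says AO is based on the average distance the 9 rays traveled.

```python
def ambient_occlusion(ray_distances, max_distance):
    """Return openness in [0, 1]: 0 = fully occluded, 1 = fully open."""
    clamped = [min(d, max_distance) for d in ray_distances]
    return sum(clamped) / len(clamped) / max_distance


blocked = ambient_occlusion([1.0] * 9, 10.0)            # all 9 rays hit at 1 m
leaking = ambient_occlusion([1.0] * 8 + [10.0], 10.0)   # one ray slips through
print(blocked, leaking)
```

With these numbers one escaped ray doubles the result, so neighboring pixels that differ by a single leak would look noticeably different.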
3- Messy colors
In addition to #2, I wonder if I inject the lighting into the octree the right way. If you look at the shots, you see the octree basically contains a blurry version of the actual scene. You can also recognize the textures in it (see the varying colors of the wood or wallpaper). This variation in color will also have its effect on the result, making it more messy. Plus in this particular case, the room will get very orange/brownish because most of the reflected color comes from that brown wood floor. Maybe it's better to desaturate the colors more towards grayish values, and maybe use 1 single color per texture. To reduce the variation.


4- Banding

My previous post explained it with pictures pretty well, and the conclusion was that you will always have banding errors more or less. Unless doing truly crazy tricks. In fact, user Jcabeleira showed me a picture of Unreal4 banding artifacts. However, those bands looked way less horrible than mine. Is it because they apply very strong blurring afterwards? I think their initial input already looks smoother. Also in the Crassin video, the glossy reflections look pretty smooth. Probably he uses a finer octree, more rays, more steps, and so on. Or is there something else I should know about?
My approach takes 4 steps per octree node. Earlier with only 1 step, I would accidentally skip walls if the ray sampled somewhere at the edge of a node (stored as a brick in a volume texture). By taking multiple smaller steps, this problem was solved more or less, and the bands got finer (but still very noticeable). Maybe I should take way more steps, but obviously, it will kill the performance as the amount of octree traversals and volume texture reads would increase insanely.
Yet another thing I might be doing wrong is quadrilinear sampling (am I spelling that right?). In a mipmapped 3D texture, hardware does this for you. But in our case the 3D texture is sparse, meaning the bricks are scattered everywhere. So instead of really mipmapping the volume texture, I just generate extra bricks for the higher mipmap levels that are placed elsewhere. When traversing the octree, it remembers the target node, but also the previous node from 1 level higher. This gives access to two different bricks, so I can lerp between them. Not very subtle though. Each time I have to read the textures, I actually read 6 times. Positive or negative X value, positive or negative Y value, pos/neg Z value. And that for 2 mipmap levels.
5- Grain
The grain is easy to explain, and probably the easiest to fix. I'm doing the sampling at 1/4 size of the screen. Then upscale it with a jitter blur. This produces big random blurry pixels. Using 1/2 of the screen size would help a lot, but again makes the technique a lot slower. No idea what Unreal4 does.
Apologies for the long post, but I just don't know how to explain it any shorter. Complex stuff!!

#5009569 Voxel Cone Tracing, raymarch collision testing problems

Posted by spek on 11 December 2012 - 04:27 PM

I would swear I attached an image in the first post... Anyway, a pic showing some of the problems. Let me know if it's not clear hehe.

Attached Thumbnails

  • VCT_RayMarchProblem.jpg

#5009565 Voxel Cone Tracing, raymarch collision testing problems

Posted by spek on 11 December 2012 - 04:18 PM

I'm in need of a smart guy/girl again to finish the last bits of "Voxel Cone Tracing". Though this problem is not just for VCT, but for raymarching & using a volume (3D) texture to test collisions in general.

So far I made the octree, the bricks, and mipmapped the whole thing. There are some differences with the original VCT technique. For one thing, I'm making 2x2x2 pixel bricks instead of 3x3x3. I can still smoothly interpolate the bricks over the geometry without getting a tiled view.

However, the actual raymarching is penetrating the walls (thus sampling light behind an obstacle) too often. I know why (see the picture), but how to solve it? In short, when doing linear sampling in a brick, the occlusion value I get is often too small to stop the ray. So the result will partially pick from geometry further behind. Another problem is that my results are too dark in some cases, if the ray samples from a higher mipmapped level that was mixed with black pixels (no geometry = vacuum octree nodes). In the image, you can see that the results really depend on the point at which I sample within a brick. When mipmapping (= blurring with unfilled neighbor nodes), all these problems get worse. One extra little annoying problem is that this also creates "banding" artifacts.

There are 2 simple things I can do: A: sample with "nearest" instead of "linear" filtering, and B: take a lot more (smaller) steps to ensure you sample the same node multiple times. However, solution A will lead to "MineCraft" results, and B makes the already heavy technique even slower. And it still doesn't guarantee that rays won't penetrate unless I take an awful lot of samples.

As for the (compute) shader code, let's illustrate the path of a single ray:
rayPos   = startPoint + smallOffset
occluded = 0
color    = (0,0,0)
radius   = 0.25 // my smallest nodes start at the size of 0.25 m3
while (occluded < 1)
    // Get a node from the octree. The deepest level
    // depends on the current cone radius
    node = traverseOctree( rayPos, radius )

    // Check if there might be geometry (thus nodes)
    // at the current cone size level
    if (node.size <= radius)
        // Sample brick
        // localOffset is the ray position relative to the node
        localOffset = absToLocalPosition( rayPos, node.worldPos )
        texCoord3D = node.brickCoord + localOffset
        colorAndOcclusion = sampleBrick( texCoord3D, rayDirection )
        // Add
        occluded += colorAndOcclusion.w
        color    += colorAndOcclusion.rgb

    // increase cone size so we take bigger steps
    // but also sample from higher (more blurry) mipmapped nodes
    radius += coneAngleFactor

    // March!
    rayPos += rayDirection * radius

So, to put it simply, the ray keeps moving until the occlusion value gets to 1 or higher. When there might be geometry at the ray position, we add the values we sample from a brick, stored as 2x2x2 pixels in a 3D texture. Probably important to know as well: the color and occlusion we sample also depend on the ray direction, and on the way the voxels were facing.
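For experimenting with the loop outside the shader, here is a runnable Python sketch of a single cone march. It differs from the pseudocode in two ways that are my assumptions, not the post's method: each sample is weighted by the remaining visibility (front-to-back accumulation, a common choice in volume ray marching) instead of being added as-is, and a `max_distance` guard stops rays that never hit anything. The `sample` callback stands in for the octree traversal plus brick lookup, and `red_wall` is a toy scene.

```python
def march_cone(sample, start, direction, cone_factor, max_distance, radius=0.25):
    """March one cone and accumulate color front to back.

    sample(position, radius) is a hypothetical lookup that returns
    (r, g, b, occlusion) for the node matching the current cone radius.
    """
    pos = list(start)
    color = [0.0, 0.0, 0.0]
    occluded = 0.0
    travelled = 0.0
    while occluded < 1.0 and travelled < max_distance:
        r, g, b, a = sample(tuple(pos), radius)
        visible = 1.0 - occluded          # share of the cone not yet blocked
        color[0] += visible * r
        color[1] += visible * g
        color[2] += visible * b
        occluded += visible * a
        radius += cone_factor             # wider cone, bigger steps
        for axis in range(3):
            pos[axis] += direction[axis] * radius
        travelled += radius
    return tuple(color), occluded, travelled


def red_wall(position, radius):
    """Toy scene: an opaque red wall filling everything at x >= 2."""
    return (1.0, 0.0, 0.0, 1.0) if position[0] >= 2.0 else (0.0, 0.0, 0.0, 0.0)


color, occlusion, distance = march_cone(
    red_wall, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.0, 10.0)
print(color, occlusion, distance)
```

With the visibility weighting, a half-occluding sample in front can no longer be outshone by bright geometry behind it, which may reduce (but not remove) the leaking described above.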

// Using 6 textures to store bricks.
// Colors & Occlusion get spread over the 6 axis directions (-X, +X, -Y, ...)
float3 dotValues;
dotValues.x = dot( float3( 1,0,0 ), rayDirection );
dotValues.y = dot( float3( 0,1,0 ), rayDirection );
dotValues.z = dot( float3( 0,0,1 ), rayDirection );
dotValues = abs( dotValues );

// A ray travelling towards +X sees the faces that point towards -X, and so on
float4 colorX, colorY, colorZ;
if (rayDirection.x > 0 )
    colorX = tex3D( brickTex_negativeX, texcoord );
else
    colorX = tex3D( brickTex_positiveX, texcoord );
if (rayDirection.y > 0 )
    colorY = tex3D( brickTex_negativeY, texcoord );
else
    colorY = tex3D( brickTex_positiveY, texcoord );
if (rayDirection.z > 0 )
    colorZ = tex3D( brickTex_negativeZ, texcoord );
else
    colorZ = tex3D( brickTex_positiveZ, texcoord );

float4 result = colorX * dotValues.xxxx +
                colorY * dotValues.yyyy +
                colorZ * dotValues.zzzz;
That means when the ray travels almost parallel to a wall, it only gets a bit occluded by the wall (which makes sense I'd say).
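The same weighting can be checked on the CPU with a few lines of Python. This sketch assumes the six directional texels are already fetched and passed in as a dictionary (the key names are made up for illustration):

```python
def sample_directional(faces, ray_dir):
    """Blend six directional texels by how much the ray opposes each face.

    faces maps '+x', '-x', '+y', '-y', '+z', '-z' to (r, g, b, a).
    A ray travelling towards +X reads the texel stored for -X, and so on.
    """
    weights = [abs(component) for component in ray_dir]
    picked = [
        faces['-x'] if ray_dir[0] > 0 else faces['+x'],
        faces['-y'] if ray_dir[1] > 0 else faces['+y'],
        faces['-z'] if ray_dir[2] > 0 else faces['+z'],
    ]
    return tuple(
        sum(w * texel[channel] for w, texel in zip(weights, picked))
        for channel in range(4)
    )


# Toy brick texel: only the face pointing towards -X holds an opaque red value.
faces = {key: (0.0, 0.0, 0.0, 0.0) for key in ('+x', '-x', '+y', '-y', '+z', '-z')}
faces['-x'] = (1.0, 0.0, 0.0, 1.0)

print(sample_directional(faces, (1.0, 0.0, 0.0)))    # head-on
print(sample_directional(faces, (0.1, 0.0, 0.995)))  # grazing
```

The grazing ray only picks up a tenth of the occlusion, so it needs about ten such samples before it stops, which is one way a ray running along a wall can end up reaching geometry behind it.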

Well, does anyone have experience with this?

#4986218 tex3D & raytracing performance

Posted by spek on 02 October 2012 - 04:34 PM

Simple question... or at least... let's start simple: Is a tex3D read (much) more expensive than a tex2D read?

With attempt #3529 on G.I. I tried to store light data (as voxels) in 3D textures and raycast through them in several ways. It "works", but in order to keep it realtime, I need to reduce the raycasting loops as much as possible. Fewer voxels, short-range rays, fewer rays... All in all, the quality suffers and it's still pretty damn slow. To give an idea what's going on:
// for ~25.000 voxels (rendered as points into a 64x64x64  3D texture)
// the fragment shader sends out 9 rays in several directions, to see where they collide with
// the environment (stored as SH values into another 64x64x64  3D texture)
for (int i=0; i < RAYCOUNT; i++)
{
    float3 direction = RAY_DIR[i];
    float3 rayPos    = startPos;
    bool   occluded  = false;
    int    steps     = 0;

    // Check where the ray hits a surface
    while (!occluded && steps++ < 25)
    {
        rayPos += direction * stepsize;
        float4 occl = tex3D( shTexOccl, rayPos );
        // decode Spherical Harmonics, and check if we collided
    }
    // Get GI data from ray end position
}
The raycasts are killing the performance here. But at the same time, other effects that test rays over 2D textures don't seem to be very slow (RLR reflections for example can easily do hundreds of checks per pixel over a fullscreen image)...
- Is tex3D really that slow?
- Or is it just because the rays fly in all directions, making it very hard for the GPU to do things in parallel / texture caching?
- And/or is the loop coded in a dumb way?
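A quick back-of-the-envelope count helps to put the loop in perspective. This Python sketch uses the numbers from the snippet above; the 60 fps target is only an assumption for illustration.

```python
VOXELS = 25_000   # approximate voxel count from the post
RAYS = 9          # rays per voxel
MAX_STEPS = 25    # upper bound of the inner while loop


def worst_case_reads(voxels=VOXELS, rays=RAYS, max_steps=MAX_STEPS):
    """Upper bound on tex3D reads per frame if no ray terminates early."""
    return voxels * rays * max_steps


def reads_per_second(fps, reads_per_frame):
    return fps * reads_per_frame


print(worst_case_reads())                        # 5625000
print(reads_per_second(60, worst_case_reads()))  # 337500000
```

That worst case is modest next to an effect doing hundreds of reads per pixel over a fullscreen image, which would be consistent with the suspicion that the scattered access pattern (poor texture cache coherence and divergent loops) hurts more than the tex3D read itself. Profiling would be needed to confirm it.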

Of course, there are multiple ways to store data. I could splat down everything into a 2D texture, or put it in a buffer for example. I haven't figured out exactly how Crassin stored his Voxel Octrees for techniques like "Voxel Cone Tracing", but it seems to be a mixture of 3D textures and data buffers, and moreover, fast enough to achieve awesome things in realtime... Any hints on effective raytracing on a GPU?


#4928873 Is it possible to prevent players from altering the client side graphics in a...

Posted by spek on 06 April 2012 - 01:12 PM

Not sure how they do it, but one simple way to cheat is tweaking shader code or adjusting images being used in the game. For example, making all the leaves/grass transparent.

Maybe a wild idea, but maybe this helps detecting cheaters (as far as that is possible):
- For each map, before starting, render a special part of the scene in the background. This scene is a compilation of typical content of that map. Grass, a soldier, coverage, and whatsoever.
- Render the scene and take a snapshot
- Compare the snapshot with a prebuilt (hardcoded) bitmap that contains the same image as it *should* look
- If not (almost) equal, it means the player tweaked something
- Shout Al Qaeda and blow up his computer

Basically it's sort of an iris-scan.

Of course, there are some practical issues as images never look 100% the same, especially not if there are many options. So there must be some tolerance in the comparison, otherwise no one will be able to play. But other than that, the comparator should detect invisible grass, blue colored hand grenades, or whatever has been visually changed by adjusting the textures/shaders...
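A tolerant comparison could look roughly like this Python sketch. The function name, the per-channel tolerance and the allowed fraction of mismatching pixels are all made-up values for illustration, and images are plain lists of (r, g, b) tuples with 0-255 channels instead of real framebuffer readbacks.

```python
def images_match(rendered, reference, per_channel_tol=8, max_bad_fraction=0.01):
    """Return True when the snapshot is close enough to the reference.

    A pixel counts as bad when any channel differs by more than
    per_channel_tol; the images match when at most max_bad_fraction
    of the pixels are bad.
    """
    if len(rendered) != len(reference):
        return False
    bad = sum(
        1
        for got, expected in zip(rendered, reference)
        if any(abs(g - e) > per_channel_tol for g, e in zip(got, expected))
    )
    return bad <= max_bad_fraction * len(reference)


reference = [(40, 160, 40)] * 100             # grass as it should look
driver_noise = [(43, 158, 41)] * 100          # small rounding differences
hacked = [(40, 160, 40)] * 50 + [(0, 0, 0)] * 50   # half the grass removed

print(images_match(driver_noise, reference))  # True
print(images_match(hacked, reference))        # False
```

Picking the two thresholds is the hard part: too tight and legitimate driver or quality-setting differences get flagged, too loose and subtle texture tweaks slip through.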

I bet there are still workarounds, as said, it's just a wild idea that popped up suddenly.