


Community Reputation

279 Neutral

About GuyCalledFrank

  1. drawing clouds in a flight sim

    I once worked on a flight sim and also had to do the clouds you can fly through. I started with Wang's (MFS 2004) approach and added some shader tricks later. The results were not super realistic, but they worked: we were able to set any density from clear sky to total overcast and fly through it.

    http://oi50.tinypic.com/fc7f5y.jpg http://oi48.tinypic.com/2njzuis.jpg http://oi50.tinypic.com/5clisk.jpg http://oi45.tinypic.com/2v8onl2.jpg http://oi45.tinypic.com/2yu0pxh.jpg

    Each cloud is made of 5-20 billboards, and there are around 0-200 clouds in the sky. Billboard positions inside a cloud are randomly generated when the game starts. At runtime the clouds are "tiled": I move them according to the wind, and when a cloud drifts beyond a certain limit from the camera, it respawns far away on the other side of the sky and starts moving again. All billboards of all clouds are drawn in one DIP, thanks to instancing. However, due to alpha sorting, the instancing buffer has to be updated every frame (there must be a better way). The shader tricks include making the bottoms of cumulus clouds look somewhat flat, direct lighting, simple fake SSS, and alpha/billboard-rotation changes when the camera is inside a cloud.
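The wrap-around "tiling" described above can be sketched roughly like this. This is a minimal illustration; `wrapAxis`, the `limit` parameter, and the per-axis treatment are my own names and simplifications, not the engine's actual code:

```cpp
#include <cmath>

// Wrap one axis of a cloud's position so that when it drifts more than
// `limit` units from the camera, it reappears on the opposite side.
// Hypothetical helper, not from the original engine.
float wrapAxis(float pos, float cameraPos, float limit)
{
    float d = pos - cameraPos;
    if (d >  limit) d -= 2.0f * limit;
    if (d < -limit) d += 2.0f * limit;
    return cameraPos + d;
}

struct Cloud { float x, z; };

// Advance a cloud by the wind and keep it inside the [-limit, +limit]
// box centred on the camera, so it "respawns" on the far side.
void updateCloud(Cloud& c, float windX, float windZ,
                 float camX, float camZ, float limit, float dt)
{
    c.x = wrapAxis(c.x + windX * dt, camX, limit);
    c.z = wrapAxis(c.z + windZ * dt, camZ, limit);
}
```

Because the wrap is relative to the camera, the sky always looks populated no matter how far the player flies.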
  2. max coord values and fp precision

    I've just read a paper about how related problems were handled in Just Cause 2. May be useful (from page 9): http://www.humus.name/Articles/Persson_CreatingVastGameWorlds.pdf
  3. Area lights (Forward and Deferred)

    That BGE video looks really nice, imho. It seems similar to billboard reflections (quad intersection in the shader plus a mip adjustment for glossiness), with some pretty attenuation on top. I would like to know how to compute such attenuation. The spherical one is a little different. --- I missed Chris_F's link in the 3rd post... will take a look.
  4. Bumped cubemap aliasing

    Well, I actually quoted this link as the "original Toksvig paper" in my first post; it's clickable there ) --- OK, I think then it should be log2(toksvigFactor * p), where p is a kind of specular power for the biggest mip level (as if it contained specular highlights). As I use Modified CubemapGen for cubemap generation, I even know this number exactly (it's called "cosine power" there). But the results are still disappointing: http://oi50.tinypic.com/hv782f.jpg
  5. Bumped cubemap aliasing

    I've noticed a lot of shimmering coming from shading, especially when specular and environment maps are used together with normal maps, and I really want to fix it somehow.

    The best solution I've found on the topic is this (because of its simplicity and low performance hit): http://blog.selfshadow.com/2011/07/22/specular-showdown/

    These Toksvig maps gave me fairly acceptable results with Blinn-Phong. Without: http://oi48.tinypic.com/mlmfia.jpg With: http://oi49.tinypic.com/346vnvd.jpg

    However, I'm still struggling to get them to work with cubemaps. The original Toksvig paper has a brief piece of advice, saying "This effect can be modeled by computing lod for the environment map as a function of log2(some_greek_letter)", but it's not really helpful. We can vary the size of specular highlights, but we are still limited with envmaps, because we only have a discrete number of mips and linear interpolation between them - and I think this limitation may break some of the theory behind Toksvig AA (or maybe I'm wrong). I've tried different ways to use the Toksvig factor in texCUBElod, but with no success: I get either aliasing or everything totally blurred.

    Aliased: http://oi46.tinypic.com/2961zlv.jpg (1-Toksvig)*numMips as lod: http://oi47.tinypic.com/344fm6v.jpg

    And here is the best version, a cubemap brutally supersampled inside the shader (11 fetches): http://oi46.tinypic.com/1679sf7.jpg But it's a huge performance hit!

    Is it possible to get that supersampled quality using only mips, without relying on heavy amounts of fetches?

    Thanks in advance.
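For reference, the Toksvig factor itself is cheap to evaluate. Below is a small CPU sketch of the standard factor, plus one *speculative* way of mapping it to an envmap LOD; the `envMapLod` mapping is my own assumption about how the power ratio could drive mip selection, not a method from the paper:

```cpp
#include <cmath>

// Toksvig factor: given the length of the mip-averaged normal |Na| and
// the Blinn-Phong specular power s, returns the factor that scales the
// power down to account for normal variance.
float toksvigFactor(float normalLen, float specPower)
{
    return normalLen / (normalLen + specPower * (1.0f - normalLen));
}

// Speculative envmap LOD: treat the cubemap's base mip as prefiltered
// for `basePower` and bias mips by the effective-power ratio, roughly
// half a mip per halving of the power. An assumption, not a verified
// technique.
float envMapLod(float normalLen, float basePower)
{
    float ft = toksvigFactor(normalLen, basePower);
    float effectivePower = ft * basePower;
    return 0.5f * std::log2(basePower / effectivePower);
}
```

A unit-length average normal gives a factor of 1 (no blur), while shorter averaged normals push the lookup toward blurrier mips.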
  6. CUDA + D3D9?

    Hi. I'm trying to add some CUDA functionality to my existing game engine. What I want to do is render to a texture, postprocess it with CUDA, and then use the processed texture in shaders when rendering. More exactly: I want to calculate a summed-area table for a texture, because doing it with multiple passes over screen-aligned quads (as usually advised) is terribly slow (we're talking about a 2048x2048 texture). I'm not really sure CUDA will make it much faster, but at least I want to give it a try. I've got the latest CUDA (5.0); all the examples compile and work fine.

    But I'm facing a rather silly error and can't find what causes it. I did everything exactly as in their D3D9 examples:

    - cudaGetDeviceCount - returns 1 device (my card), good;
    - cudaSetDevice - set it to my device 0 (though it seems to be optional), no error returned;
    - cudaD3D9SetDirect3DDevice - given my d3ddevice, which is not null - no error returned;
    - cudaGraphicsD3D9RegisterResource - given a pointer to an uninitialized cudaGraphicsResource*, a 2D texture, and cudaGraphicsRegisterFlagsNone. After that I get: "All CUDA-capable devices are busy or unavailable".

    Why? I can't see any difference between the API calls in my app and their samples. The only cause of such an error I've googled so far is that cudaDeviceProp::computeMode may be set to a weird value, but that's not my case (checked with cudaGetDeviceProperties). Thanks in advance.

    --- Things are getting even weirder: if I create a fresh project, I can type and run any CUDA code there without a problem. For example, cudaMallocPitch runs fine right after main. BUT pasting the same cudaMallocPitch after main in my old engine project makes it show that error!

    --- Ah, stupid me. It turns out that a long time ago I messed with the default heap/stack allocation sizes in the linker settings. Setting them back to their default values fixed the error. I should probably close this topic, but I don't see how.
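For reference, the summed-area table being computed here is just a 2-D inclusive prefix sum. A scalar CPU sketch (the same two passes a CUDA scan-based version would parallelise per row and per column):

```cpp
#include <vector>
#include <cstddef>

// Build a summed-area table: sat[y*w + x] holds the sum of src over
// the rectangle [0..x] x [0..y]. Done as horizontal prefix sums
// followed by vertical prefix sums.
std::vector<float> summedAreaTable(const std::vector<float>& src,
                                   std::size_t w, std::size_t h)
{
    std::vector<float> sat(src);
    for (std::size_t y = 0; y < h; ++y)        // horizontal pass
        for (std::size_t x = 1; x < w; ++x)
            sat[y * w + x] += sat[y * w + x - 1];
    for (std::size_t y = 1; y < h; ++y)        // vertical pass
        for (std::size_t x = 0; x < w; ++x)
            sat[y * w + x] += sat[(y - 1) * w + x];
    return sat;
}

// Sum of src over [x0..x1] x [y0..y1] with at most four SAT lookups.
float areaSum(const std::vector<float>& sat, std::size_t w,
              std::size_t x0, std::size_t y0,
              std::size_t x1, std::size_t y1)
{
    float s = sat[y1 * w + x1];
    if (x0 > 0)           s -= sat[y1 * w + x0 - 1];
    if (y0 > 0)           s -= sat[(y0 - 1) * w + x1];
    if (x0 > 0 && y0 > 0) s += sat[(y0 - 1) * w + x0 - 1];
    return s;
}
```

On the GPU each pass maps naturally to a parallel scan, which is why a CUDA version can beat the quad-per-pass approach.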
  7. How is it possible to make BSP tree traversal run in N threads? More concretely, I have a task described by this pseudocode:

    [CODE]
    struct Node {
        Plane     plane;
        Sphere    boundingSphere;
        Node*     left;
        Node*     right;
        LeafData* leafData;   // non-null only for leaves
    };

    std::vector<Node*> sortedDataToRender;

    void processNode(Node* node) {
        if (!inFrustum(node->boundingSphere))
            return;                       // whole subtree culled for free
        if (node->leafData) {
            sortedDataToRender.push_back(node);
            return;
        }
        if (isBehind(camera.pos, node->plane)) {
            if (node->right) processNode(node->right);
            if (node->left)  processNode(node->left);
        } else {
            if (node->left)  processNode(node->left);
            if (node->right) processNode(node->right);
        }
    }
    [/CODE]

    This code achieves two goals:
    - no need to frustum-test an object if its parent node already failed the test;
    - free back-to-front/front-to-back sorting.

    Potentially thousands of leaf nodes may exist. The splitting plane for each node is the shortest axis-aligned plane dividing the objects' AABB in two; it always goes through the AABB's center. But this code is very branchy and recursive, and I just can't understand how to make it flatter and parallelizable.
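One common way to flatten such a traversal is an explicit stack, which also makes it straightforward to hand whole subtrees to worker threads and concatenate their privately ordered outputs. A minimal sketch under simplifying assumptions (a 1-D "plane" position and integer leaf ids stand in for the real node data):

```cpp
#include <vector>
#include <thread>

// Toy node: `split` stands in for the plane (1-D for brevity),
// leafId >= 0 marks a leaf. Illustrative only.
struct Node {
    float split = 0;
    int   leafId = -1;
    const Node* left = nullptr;
    const Node* right = nullptr;
};

// Iterative front-to-back traversal with an explicit stack: the same
// visit order as the recursive version, but with no call-stack
// recursion. Pushing the far child first means the near one pops first.
void traverse(const Node* n, float cameraPos, std::vector<int>& out)
{
    std::vector<const Node*> stack{n};
    while (!stack.empty()) {
        const Node* cur = stack.back();
        stack.pop_back();
        if (!cur) continue;
        if (cur->leafId >= 0) { out.push_back(cur->leafId); continue; }
        if (cameraPos < cur->split) {     // near side = left (by convention)
            stack.push_back(cur->right);
            stack.push_back(cur->left);
        } else {
            stack.push_back(cur->left);
            stack.push_back(cur->right);
        }
    }
}

// Parallel variant: traverse the two top-level subtrees into private
// lists (one on a worker thread), then concatenate in the near-to-far
// order decided by the root plane. Ordering stays deterministic because
// each sub-list is internally sorted; deeper fan-out works the same way.
std::vector<int> traverseParallel(const Node* root, float cameraPos)
{
    std::vector<int> nearOut, farOut;
    if (!root || root->leafId >= 0) {
        traverse(root, cameraPos, nearOut);
        return nearOut;
    }
    const Node* nearSide = cameraPos < root->split ? root->left : root->right;
    const Node* farSide  = cameraPos < root->split ? root->right : root->left;
    std::thread worker([&]{ traverse(farSide, cameraPos, farOut); });
    traverse(nearSide, cameraPos, nearOut);
    worker.join();
    nearOut.insert(nearOut.end(), farOut.begin(), farOut.end());
    return nearOut;
}
```

Splitting at the top k levels gives 2^k independent subtrees to spread across N threads, with one cheap concatenation at the end.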
  8. Streaming mesh data in background

    Hmm, you're right about the buffers being too large. I think I'll try both techniques and take the most appropriate one for particular scenes.
  9. Streaming mesh data in background

    [quote]would do it in several frames, each frame allows only a limited amount of data (i.e. 1000 vertices ) to be uploaded. This way you have a constant impact on performance. A mesh should be marked as hidden during upload time to prevent any artifacts or crashes while rendering.[/quote] Totally agreed.

    [quote]There is very little difference between creating a texture in vram and then updating it vs creating it when you already have the data.[/quote] Creation time includes additional API calls and a VRAM block allocation; it is better to avoid them if possible (just as we try to avoid dynamic RAM allocation at runtime). And if I want to avoid these dynamic allocations, it becomes harder to follow the first two rules, which would be really simple in the create/release-for-each-mesh scenario you mentioned. I think the problem here is not bandwidth but the additional work needed to find a free block in VRAM; chaotic allocation/deallocation may also cause fragmentation.

    So I came up with a new idea: store one huge VB and IB for all static meshes and treat it as a circular buffer. If we need to load a new chunk of static meshes, there are two cases:
    - the mesh is already in the buffer (found through std::map or similar): just link the RAM representation of the mesh to the appropriate buffer offset and increment the mesh's reference counter;
    - the mesh is not in the buffer: advance the buffer's "tail" by the size of the new VB/IB data and stream the data into that block.
    If we need to unload an old chunk:
    - if no references are linked to the chunk, just advance the "head" of the buffer;
    - if there are references, re-upload the shared data to the tail and update the offsets in all references, then advance the head (deleting the chunk).
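The circular-buffer bookkeeping described above could be sketched as follows. This toy version only tracks offsets and reference counts; head-side reclamation and the shared-data re-upload path are omitted, and names like `MeshRingBuffer` are illustrative:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Sketch of the proposed allocator: one big VB/IB, a moving tail, and
// a name -> offset map with reference counts for shared meshes.
// Sizes are in arbitrary units; "uploading" just reserves space here.
class MeshRingBuffer {
public:
    explicit MeshRingBuffer(std::size_t capacity) : cap_(capacity) {}

    // Returns (via `off`) the offset the mesh lives at, reserving space
    // only if it is not resident yet. Fails if the buffer is full.
    bool acquire(const std::string& name, std::size_t size, std::size_t& off)
    {
        std::map<std::string, Entry>::iterator it = resident_.find(name);
        if (it != resident_.end()) {            // already uploaded: share it
            ++it->second.refs;
            off = it->second.offset;
            return true;
        }
        if (used_ + size > cap_) return false;  // would overwrite live data
        off = tail_;
        tail_ = (tail_ + size) % cap_;
        used_ += size;
        Entry e = { off, size, 1 };
        resident_[name] = e;
        return true;
    }

    // Drop one reference; space is reclaimed when the count hits zero.
    void release(const std::string& name)
    {
        std::map<std::string, Entry>::iterator it = resident_.find(name);
        if (it == resident_.end()) return;
        if (--it->second.refs == 0) {
            used_ -= it->second.size;
            resident_.erase(it);
        }
    }

    std::size_t used() const { return used_; }

private:
    struct Entry { std::size_t offset, size, refs; };
    std::size_t cap_, tail_ = 0, used_ = 0;
    std::map<std::string, Entry> resident_;
};
```

Duplicated objects (streetlights, trees) then cost one upload and one refcount bump per extra chunk instead of a second copy in VRAM.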
  10. Hi. I want to get rid of loading screens and load everything in the background while moving from one sector/chunk/location to another. It is not so difficult with textures, because not too many texture sizes are used, so I can preallocate some number of 256/512/1024 textures and update them on demand. But the main difficulty is VB/IB, because mesh sizes can vary a lot. The obvious way is to divide the world into equal-sized pieces and store them in a VB/IB pool much like the textures, but that is not a really good idea when you have a lot of duplicated objects like streetlights, trashcans, trees, etc.; it is difficult to imagine a game with totally unique geometry. So I somehow need to manage these duplicated objects with rules like:
    - avoid allocating/freeing meshes at runtime;
    - do not store one object multiple times;
    - do not waste too much VRAM (for example, I could go with pools again and store small objects in a pool with a much larger chunk size, wasting memory);
    - do not load more than needed (like loading a new location with 50% of the objects from the old one).
    What is the best way to accomplish this using DX9? --- I know how to load data into RAM in a separate thread; the question is only about updating the VB/IB. The best solution I can come up with is to have one huge VB/IB and upload everything into it; when deleting something, defragment the buffer by re-uploading the existing data to new locations to close the gap left by the deleted object.
  11. Hi there! I'm a graphics programmer, and I've been told to make the game compatible with the eMagin Z800 device WITH some stereo effect. After researching a bit, I found that the nvidia stereo drivers are no longer supported. So the only way to use the device is to just plug it in as a monitor and render a simple flat image at 800x600. What exactly did the stereo drivers do? Render images for the two eyes interlaced, very fast? Render each eye to one half of the output? Maybe I can simulate them if I set up rendering correctly and mess with the pixel shaders a bit?
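If the device ends up treated as a plain 800x600 monitor, one way to approximate what a stereo driver does is to render the scene twice from two eye positions offset along the camera's right vector, each into its own region of the backbuffer. A sketch of just the eye-offset step (the IPD value is illustrative and would need calibration for the Z800's optics):

```cpp
struct Vec3 { float x, y, z; };

// Given the camera position, a unit right vector, and the
// interpupillary distance (IPD), compute the two eye positions the
// scene would be rendered from - once per eye.
void eyePositions(Vec3 cam, Vec3 right, float ipd,
                  Vec3& leftEye, Vec3& rightEye)
{
    float h = 0.5f * ipd;
    leftEye  = { cam.x - right.x * h, cam.y - right.y * h, cam.z - right.z * h };
    rightEye = { cam.x + right.x * h, cam.y + right.y * h, cam.z + right.z * h };
}
```

Each eye's view matrix is then built from its offset position; whether the two images go to interlaced lines or to the two halves of the output depends on what the headset's display mode expects.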
  12. fake alpha to coverage in deferred rendering?

    [quote]That presentation is basically using the same technique proposed in the inferred lighting paper published by Volition.[/quote] Seems like it really is. Thanks for pointing it out. [quote]Like I was saying earlier you can still use the basic premise of A2C, which a screen-space dither pattern used to clip/discard pixels based on their alpha. In fact if you make your own dither pattern and store it in a texture, you can easily make a much better pattern than what's used in the hardware for A2C. The big downside of course is that without MSAA the quality will suffer, since there will be no blended values (each pixel is either on or off).[/quote] I've already tried that - terribly noisy, but it may work if I render the screen at least 2x larger and then downsample. I'll try that.
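The custom dither pattern mentioned above could be as simple as a 4x4 Bayer matrix, which gives 17 distinguishable coverage levels without MSAA. This is a CPU illustration of what would be a `clip(alpha - threshold)` in the pixel shader:

```cpp
#include <cstdint>

// Classic 4x4 Bayer thresholds; actual threshold is (b + 0.5) / 16,
// so every value in (0, 1) selects a different subset of the tile.
static const int kBayer4[4][4] = {
    {  0,  8,  2, 10 },
    { 12,  4, 14,  6 },
    {  3, 11,  1,  9 },
    { 15,  7, 13,  5 },
};

// Returns true if the pixel at (x, y) survives the dithered alpha test.
bool ditherKeep(std::uint32_t x, std::uint32_t y, float alpha)
{
    float threshold = (kBayer4[y & 3][x & 3] + 0.5f) / 16.0f;
    return alpha >= threshold;
}

// Fraction of a 4x4 tile that survives at the given alpha - it tracks
// `alpha`, which is what makes the pattern read as partial coverage.
float coverage(float alpha)
{
    int kept = 0;
    for (int y = 0; y < 4; ++y)
        for (int x = 0; x < 4; ++x)
            kept += ditherKeep(x, y, alpha) ? 1 : 0;
    return kept / 16.0f;
}
```

Rendering 2x larger and downsampling, as suggested above, effectively averages these on/off pixels back into blended values.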
  13. fake alpha to coverage in deferred rendering?

    Yes, I actually meant the problem with the MRTs often used for rendering the G-buffer: they don't support MSAA, and rendering depth/normals/diffuse as separate passes is too slow. But I think there's a key idea in those slides )
  14. Hi. I've been using forward rendering for a long time, specifically with MSAA and alpha-to-coverage, which let me draw a lot of foliage with nice soft edges without sorting hundreds of instances. But depth and shader complexity grew, so I moved to a deferred approach. Everything works much faster now, and FXAA is not too bad compared to MSAA, BUT the loss of alpha-to-coverage is sad. I'm using DX9, btw. Then I found these slides: http://www.slideshare.net/codevania/deferred-rendering-transparency I'm sure it's some trick to render transparent surfaces in a way similar to alpha-to-coverage, but with deferred shading. However, it's in Korean, so I can't get the idea even after running it through Google Translate. Maybe someone can understand what the author meant?