• Announcements

    • khawk

      Download the Game Design and Indie Game Marketing Freebook   07/19/17

      GameDev.net and CRC Press have teamed up to bring a free ebook of content curated from top titles published by CRC Press. The freebook, Practices of Game Design & Indie Game Marketing, includes chapters from The Art of Game Design: A Book of Lenses, A Practical Guide to Indie Game Marketing, and An Architectural Approach to Level Design. The GameDev.net FreeBook is relevant to game designers, developers, and those interested in learning more about the challenges in game development. We know game development can be a tough discipline and business, so we picked several chapters from CRC Press titles that we thought would be of interest to you, the GameDev.net audience, in your journey to design, develop, and market your next game. The free ebook is available through CRC Press by clicking here. The Curated Books The Art of Game Design: A Book of Lenses, Second Edition, by Jesse Schell Presents 100+ sets of questions, or different lenses, for viewing a game’s design, encompassing diverse fields such as psychology, architecture, music, film, software engineering, theme park design, mathematics, anthropology, and more. Written by one of the world's top game designers, this book describes the deepest and most fundamental principles of game design, demonstrating how tactics used in board, card, and athletic games also work in video games. It provides practical instruction on creating world-class games that will be played again and again. View it here. A Practical Guide to Indie Game Marketing, by Joel Dreskin Marketing is an essential but too frequently overlooked or minimized component of the release plan for indie games. A Practical Guide to Indie Game Marketing provides you with the tools needed to build visibility and sell your indie games. With special focus on those developers with small budgets and limited staff and resources, this book is packed with tangible recommendations and techniques that you can put to use immediately. As a seasoned professional of the indie game arena, author Joel Dreskin gives you insight into practical, real-world experiences of marketing numerous successful games and also provides stories of the failures. View it here. An Architectural Approach to Level Design This is one of the first books to integrate architectural and spatial design theory with the field of level design. The book presents architectural techniques and theories for level designers to use in their own work. It connects architecture and level design in different ways that address the practical elements of how designers construct space and the experiential elements of how and why humans interact with this space. Throughout the text, readers learn skills for spatial layout, evoking emotion through gamespaces, and creating better levels through architectural theory. View it here. Learn more and download the ebook by clicking here. Did you know? GameDev.net and CRC Press also recently teamed up to bring GDNet+ Members up to a 20% discount on all CRC Press books. Learn more about this and other benefits here.


  • Content count

  • Joined

  • Last visited

Community Reputation

2549 Excellent

About JoeJ

  • Rank
    Advanced Member
  1. Algorithm

    How do you mean this? Clustering is usually used for bottom up building processes - this gives better trees but is too slow for games, we always use top down build. The building process is basically the same as for octree, you just (can) use the bounding volume center instead grid cells to get the split point. Also, if you use octree with unconstrained bounding boxes as you describe, then you already use BVH with maximum children of 8 (Nothing wrong with that - that's exactly what i suggested.) But to make clear: I don't wanna sound persuading with something. I can neither say BVH is better than octree nor rebuild is better than update.
  2. Algorithm

    If you are right then my understanding of the term 'loose octree' is wrong (and i've never been sure of that). I'll point this out, because i think i have a better solution for this (also i don't think i'm not the first one using it, but no tutorial covers this, so...) There are multiple ways to handle objects intersecting the grid: 1. Put them in the lowest (internal or leaf) node that does not intersect (e.g. https://www.gamedev.net/articles/programming/general-and-gameplay-programming/introduction-to-octrees-r3529/ ) Problem: Intersecting objects cause much more tests than necessary - an object at world center collides with every object in the scene - unsteady performance between frames - very bad. 2. Put intersecting objects in multiple nodes. Requires to store a list of pointers to objects per node instead a single pointer, can't sort objects to tree without duplication, memory requirements unpredictable - still bad. 3. Put objects into internal and leaf nodes depending on their size. Objects are allowed to extend the grid by half cell size in every direction. This solves all above problems at the cost of a small increase of tests. (This is what i called 'loose octree', but i came up with it myself so let me know the proper name for it). It allows more movement as well but defines the rules exactly and more importantly: It allows to get the bounds from the grid, so no need to store a bounding box per node. How does this relate to BVH? BVH avoids those problems and offers the flexibility from ground up. It's only worth to mention that it makes sense to store large objects in interior nodes as well. If we store all objects just in the leafs, a large objects prevent small bounds, so small objects nearby collide with every other small object near the large one. If all your objects have similar size, it's better to put them all in the leafs. (Decisions like that make it hard to give a good genaral example, so you find lots of research about details, but hardly a good introduction that fits your specific need. But a BVH node has just a bounding volume and some children - it's simple. You should not need a tutorial or a library to make it) Personally i've implemented octree with update and neighbour pointers for collision detection. It worked well at that time but it's the hardest route to go and not very cache friendly. I've implemented BVH for rendering purposes only but i'd pick it for collision detection as well nowadays.
  3. Algorithm

    What we mean with "cache friendly linear array" is simply brute force e.g.: BoundingBox myBoxes[200]; for (int i=0; i< objectCount; i++) for (int j=0; i+1< objectCount; j++) if (myBoxes[i].Overlaps(myBoxes[j]) ) potentialPairs[pairCount++] = MakePair (i,j); The inner loop reads memory perfectly cache efficient. A tree can do tis only for the small number of its children (and the final objects in the leafs). After that you need to read memory from a random location to follow the children -> potential cache miss. So you should try brute force first and see if it's too slow. Did you?
  4. Here are details about Conservative Rasterization using GS: http://wwwcg.in.tum.de/fileadmin/user_upload/Lehrstuehle/Lehrstuhl_XV/Research/Publications/2009/eg_area09_shadows.pdf Poabably the same as in the GPUGems article. Less details here: https://www.seas.upenn.edu/~pcozzi/OpenGLInsights/OpenGLInsights-SparseVoxelization.pdf But they also talk about some atomic lock maybe interesting for your point B.
  5. Algorithm

    You can balance this: Use larger branching factor (binary tree is bad, tree with 8, 27 or 64 children is good - if they are in linear memory order so cache friendly). Build less levels and store more objects per leaf, making a linear copy of those objects. My personal experience is that trees can always become a win for performance, but first you need to make sure you actually have a performance problem (win for 200 vs. 200 - but still not worth the effort, loss for 2 vs. 200)
  6. Algorithm

    You did not tell those numbers, i assumed most of the world would move. But first: 200 objects is not much - you wont see much difference. BVH build should take something like 0.1ms on one core (wild guess). You don't need to delete if you rebuild. You did not intend to allocate each node with new, did you? This would be horribly slow. So slow that i did not try this again the last 10 years - maybe memory allocation is less expensive today, but i doupt it. Just preallocate enough memory and write your nodes in cache friendly order. (I also never use recursive function calls to build or traverse a tree - this has not only an additional cost, it also prevents you from understanding what happens. Better to use another preallocated memory for stack, queue or however you call it. But that's optional. I have to say that most tutorials fail to show this things. They tell you how it works but not how to do it efficiently.) For a 200 vs 2 objects ratio you're right and updating the tree should be faster than rebuilding. But you loose the property that you can write all child nodes in order (instead you get fragmented memory), and this will slow traversal down. You could traverse the result once to sort for memory order, but that's the same cost as a complete rebuild. Also you have to consider always the worst case, which is a larger number of moving objects. You would need to implement multiple methods to profile for the fastest, but usually whatever you pick initially is fast enough. For 200 objects you don't need to worry anyways. Rebuild is easier to implement than update, and BVH is easier and more flexible than octree, so it's a good start and probably you'll keep it.
  7. Algorithm

    There was a thread about spatial hashing recently: https://www.gamedev.net/forums/topic/689399-spatial-hashing/ Personally i'd start with a BVH (AFAIK most if not all physics engines use BVH now), and rebuild it every frame. The fastest way to build is to run over all current objects to get the bounding box, then use the box center to sort the objects to the 8 resulting subspaces (child nodes) from that. This gives not the best tree but quality does matter less than building time - usually. Should be almost as easy to implement than a regular grid.
  8. Algorithm

    This depends on how you store a tree in memory (all children in order vs. a pointer to the next child), what kind of tree you use, how your dynamic / static object count ratio is ,etc. Updating trees can be fast and good, e.g. octree for collision detecten if the tree does not change much from frame to frame. (But in general i agree) You can also use two seperate trees for static world / dynamic objects (in case of BVH just put both under the root node and done), so you only rebuild for dynamic objects. Using BVH you can also only rebuild the top levels of the tree and just refit the bottom levels bounding volumes (makes sense if you have large static but moving structures like spaceships). The sentence is not finished, being confused... You mean expected NOT to move? more confusing... Are all cubes the same size? Game like Minecraft? You want collision detection or something else?
  9. Algorithm

    Maybe you could sort by Morton Code? From that nearby objects can be found without the need for a hierarchy by iterating the list in both directions until stuff is too far away.
  10. No. AFAIK there are plans to add interop in upcoming realease but not now. I don't know which APIs will be supportet. (There are also rumors VK and CL will maybe merge.) Why do you want this? There should be no real reason in practice.
  11. The difference is that you have access to LDS memory. This gives you 2 advantages: * Share data between threads in the same workgroup. * Use LDS to cache data from main memory. Blur is agood example to utilize both, (code in GLSL): layout(local_size_x = 64,local_size_y = 1,local_size_z = 1) in; [...] shared float lds[64+2]; void main(void) { uint threadID = gl_LocalInvocationID.x; uint leftmostPixelIndex = gl_WorkGroupID.x * 64; float leftNeighbour = pixelBuffer[leftmostPixelIndex + threadID - 1]; // we store this value in a register lds[threadID] = leftNeighbour; // we put the value also to LDS so other threads have access if (threadID < 2) lds[threadID + 64] = pixelBuffer[leftmostPixelIndex + threadID + 64 - 1]; // read two more values memoryBarrierShared(); barrier(); // now we have loaded 64 pixels plus left and right neighbour float blurredResult = leftNeighbour * 0.25 + // we could load this one from LDS, but it's a lot faster to use the register we have anyways. lds[threadID + 1] * 0.5 + // our pixel lds[threadID + 2] * 0.25; // right neighbour blurredPixelBuffer[leftmostPixelIndex + threadID] = blurredResult; } But blur is too simple to get excited. I found the Chapter in OpenGL Super Bible really good and recommend it even if you're using DX. (Talks about building blocks like prefix sums, shows how to solve N-Body problem, ...)
  12. You can reproject the depth buffer from the previous frame. (There should be an article here on gamedev.net. Quite a lot games use GPU occlusion queries. IIRC Cryengine uses it and there should be a paper.) You can also render occluders for the current frame at first and do frustum and occlusion culling on GPU to avaid a read back. Personally i implented a software approach using low poly occluders: Put occluders, batches of world geometry, character bounding boxes etc. in an octtree and render coarsely front to back. Raster the occluders to a framebuffer made of span lists (so no heavy per pixel processing). Test geometry BBox against framebuffer and append to drawlist if it passes. Advantage: If you are in a room, not only geometry but also occluders behind walls will be rejected quickly because the octree BBox already fails the visibility test and the whole branch gets terminated. Very work efficient. Can cover dynamic scenes or open / closed doors. Disadvantage: Because the whole system relies on early termination parallelization makes no sense. Unfortunately i don't know in which situations my method can beat others because i did no comparisions, but you can think of it. I also remember a paper or article about how an older version of Umbra worked, but can't give any link. They managed to use something like a low resolution framebuffer by rastering portals instead occluders to ensure correctness. The diffulicty is to extract those portals as a preprocess... IIRC For line of sight tests you sould use simple raytracing. A full blown visibility determination is demanding even for a single camera and no good option for NPC vs. Player even if first person. I don't think any recent game still uses the same system for collision detection and visibility - Quake was really a special case here.
  13. Occlusion Culling or Visibilty Determination might be good terms to search. Since the old times there is hardware occlusion tests on GPU which is new: You can render low poly occluders to a low resolution z-Buffer, build a mip-map pyramid and test bounding boxes against that. This approach can work for dynamic stuff and of course you can do it in software on CPU as well. Probably artists need to make the occluders by hand.
  14. OpenGL

    Here are some options how to store directional ambient light, any of them can be called 'probe': One color for each side of a cube, so 6 directions and you interpolate the 3 values fitting a given normal. (Known as Valves ambient cube used in HL2) Spherical Harmonics. The number of bands you use define the detail, 2 or 3 bands (12 / 27 floats) is enough for ambient diffuse. Cube maps. Enough details for reflections (used a lot for IBL / PBR today). Lowest LOD == ambient cube. Dual paraboloid maps (or two sphere maps). Same as cube maps, but needs only 2 textures instead 6. Independent of the data format you choose there remains the question how to apply it to a given sample position. some options: Bounding volume set manually by artist, e.g. a sphere or box with a soft border: You find all affecting volumes per sample and accumulate their contribution. Uniform grid / multiresolution grids: You interpolate the 8 closest probes. E.g. UE4 light cache. Voroni Tetrahedronilaztion: You interpolate closest 4 probes (similar to interpolating 3 triangle vertices by barycentric coords for the 2D case). AFAIK Unity uses this. Also the automatic methods often require manual tuning, e.g. a cut off plane to prevent light leaks through a wall to a neighbouring room. Notice a directionless ambient light used in Quake does not support bump mapping. The 'get ambient from floor' trick works well only for objects near the ground. There are probably hundrets of papers talking about details.
  15. Maybe these: https://github.com/derkreature/IBLBaker https://github.com/dariomanesku/cmftStudio