
Community Reputation: 2929 Excellent

1 Follower

About Aressera

  1. DX11 Clip points hidden by the model

    Even a simple BVH implementation will reduce that time by several orders of magnitude. The fastest ray tracing algorithms can trace millions of rays per second on a single desktop CPU thread.
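
    A minimal sketch of what such a traversal might look like, assuming hypothetical BVHNode/Ray/Hit types and rayIntersectsAABB/rayIntersectsTriangle helpers (none of these come from the original post):

    // Iterative BVH traversal with an explicit stack instead of recursion.
    struct BVHNode
    {
        float boundsMin[3], boundsMax[3]; // node bounding box
        int leftChild;                    // index of first child node; -1 for leaves
        int firstTriangle, triangleCount; // leaf contents
    };

    bool traceRay( const BVHNode* nodes, const Ray& ray, Hit& closestHit )
    {
        int stack[64];            // manually-managed traversal stack of node indices
        int stackSize = 0;
        stack[stackSize++] = 0;   // start at the root node
        bool hitAnything = false;

        while ( stackSize > 0 )
        {
            const BVHNode& node = nodes[stack[--stackSize]];

            // Skip subtrees whose bounding box the ray misses.
            if ( !rayIntersectsAABB( ray, node.boundsMin, node.boundsMax ) )
                continue;

            if ( node.leftChild < 0 )
            {
                // Leaf node: test the triangles it contains.
                for ( int i = 0; i < node.triangleCount; i++ )
                    hitAnything |= rayIntersectsTriangle( ray, node.firstTriangle + i, closestHit );
            }
            else
            {
                // Interior node: push both children for later processing.
                stack[stackSize++] = node.leftChild;
                stack[stackSize++] = node.leftChild + 1;
            }
        }
        return hitAnything;
    }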
  2. Divide and conquer broadphase

    I'm unsure how this would be faster than a well-implemented BVH, since it is essentially a BVH construction algorithm. Do you have a BVH implementation to compare against, and relative performance numbers on a variety of benchmarks? Those would be necessary to draw any real conclusions. I feel like there could only be a performance win here if the scene is highly dynamic and would require the BVH to be rebuilt each frame. In most cases, the BVH can be built every 10 frames and only refitted on the other frames, for much less cost than rebuilding every frame.

    BVH also allows better SIMD utilization using 4-ary or 8-ary trees (intersect a single ray or AABB against 4 or 8 nodes at once), because the data can be pre-packed into the form that is fastest for the intersection query. BVH can also apply heuristics (e.g. SAH) during construction to greatly improve performance by choosing a better splitting plane. In theory you could use the same heuristics in your algorithm, but you would have to apply them every frame rather than every 10+ frames, which would slow things down. You should also try an iterative implementation (with a manually-implemented stack); it would probably be faster, since all fast ray tracing algorithms use a loop+stack instead of recursive function calls.

    The fastest broadphase I have personally implemented used a dynamic octree. Each object stored a pointer to the node it was in, and nodes had pointers to their parents/children. This allowed the tree to be updated very fast when no objects are moving (just check whether the object is still entirely contained in its current node's AABB). If it is not contained, walk up the hierarchy toward the root. If the object is completely contained in any child, push the object down into that subtree. This was old code and didn't use any SIMD, so I don't know how it would fare against a proper BVH implementation, but it was still significantly faster than a spatial hash.

    Consider using a hash set for the output pairs, but only after the narrowphase step; use a pre-allocated vector until then. A few duplicate collision checks are probably less expensive than building a set of broadphase pairs each frame, since there will be many more broadphase pairs than narrowphase colliding pairs. The hash key is an order-invariant hash (e.g. xor) of the objects that are potentially colliding.
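
    A small sketch of the order-invariant pair key idea (the index-based PairKey and the exact hash mix are illustrative; any symmetric or order-normalized combination of the two object IDs works):

    #include <cstddef>
    #include <cstdint>
    #include <functional>
    #include <unordered_set>

    // Key for a potentially-colliding pair of objects, normalized so that
    // (a,b) and (b,a) map to the same entry in the set.
    struct PairKey
    {
        std::uint32_t a, b;
        bool operator==( const PairKey& other ) const { return a == other.a && b == other.b; }
    };

    inline PairKey makePairKey( std::uint32_t objectA, std::uint32_t objectB )
    {
        // Sort the two indices so the key does not depend on argument order.
        return objectA < objectB ? PairKey{ objectA, objectB } : PairKey{ objectB, objectA };
    }

    struct PairKeyHash
    {
        std::size_t operator()( const PairKey& key ) const
        {
            // Combine the (already ordered) IDs into a single 64-bit value.
            return std::hash<std::uint64_t>()( (std::uint64_t(key.a) << 32) ^ key.b );
        }
    };

    // Built only from the pairs that survive the narrowphase, as suggested above.
    using CollidingPairSet = std::unordered_set<PairKey, PairKeyHash>;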
  3. Thanks everyone for the replies, but I don't think anyone has really come close to answering the question I posed twice, so I'll try again: What is people's experience with how engines/editors split tasks between the UI thread and the rendering thread? Particularly, how would rendering to VR on the UI thread impact the ability to consistently hit 90fps? I'm not looking for tips on how to write multithreaded software at a low level, I already know how to do that. I am more looking to see if there are any pitfalls in either of the two possible thread architectures I could use (described in the OP). Which way would be best if you were writing a new engine from scratch?
  4. I'm not talking about how to multithread the engine, that's a separate concern. I will have a job system/thread pool where necessary on the internals. I'm more asking about how people interface the engine/editor with the OS event thread, and especially how that impacts latency-sensitive applications like running VR within the editor.
  5. I am about 60-70% done with my game engine editor and have some questions about how to best design its threading architecture. Currently, I have 2 separate threads: one that only handles the OS events (call this the UI thread), and another that is the "main" thread of the editor, which updates in a loop at 60Hz and handles all of the rendering/simulation updates. My editor uses a custom GUI that is rendered using OpenGL, and all of the updates/rendering of the GUI also happen on the main thread. Events (mouse, keyboard) received on the UI thread are double-buffered and sent to the main thread. The events are then dispatched on the main thread to the hierarchy of GUI widgets.

    This used to work fine until I started adding more features. Recently I implemented drag and drop, which requires handling everything on the UI thread so that it can inform the OS whether a drop can be performed. At the moment, I have mostly ignored thread safety. When I drag objects around the editor (say, from a list into a scene viewport), it can sometimes cause those dropped objects to have their graphics data initialized from the UI thread, which causes a crash since there is no OpenGL context on the UI thread. The same problem occurs when I use native OS menus: selecting a menu item (e.g. creating a new mesh) sends a callback from the UI thread, and if I directly create the object in that callback, it can crash when the graphics data is initialized. I can synchronize these callbacks, but it is quite a lot of work since I have dozens of menus that would need to safely communicate to the main thread which item(s) were selected. Drag and drop is more problematic since I have to immediately return the drag operation from the UI thread callback.

    So, I ask for your advice on how I should proceed to fix these problems. I see 2 possible paths:

    • Merge the UI/main threads into a single thread that does everything. This would be safest of all, but could cause other problems with responsiveness. I am worried about what happens when I add VR support and need precise control over the frame rate. On OS X (my main development platform), I don't have control over the UI thread's event loop, so I'm not sure how to ensure that I get a callback every 60Hz or 90Hz to keep up the frame rate. To do this, I would have to rewrite a lot of the windowing/buffer swapping code to work from a UI thread callback instead of a simple loop.

    • Keep it as is, but synchronize the hell out of it. This is a LOT of work (probably adds 20-40 man-hours, which I'd rather avoid). Every single menu in the very large editor would have to have thread-safe event communication. Drag and drop becomes more difficult to implement (I have to mutex with the main thread, do everything on the UI thread, and then communicate the dropped objects to the main thread when a drop operation occurs). This sync/communicate step would have to be done at every place in the GUI that accepts a drag operation (dozens of locations). BUT - this way gives me more control over the frame rate (e.g. I can drop to 10Hz when the editor is not the foreground app), and VR timing becomes easier. I also don't have to worry about stalling the UI thread if my rendering takes too long.

    How do existing game engine editors (Unity, Unreal, etc.) manage their threads? I believe the Unity editor runs everything on the UI thread (including rendering), but I may be wrong. What is the best overall architecture that is versatile, safe, performant, and future-proof?
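
    For reference, a minimal sketch of the kind of double-buffered event hand-off described above (the Event type is left as a template parameter; this is illustrative, not the actual editor code):

    #include <mutex>
    #include <utility>
    #include <vector>

    // Double-buffered event queue: the UI thread pushes into the back buffer,
    // and the main thread swaps the buffers once per frame and drains the front one.
    template <typename Event>
    class EventQueue
    {
    public:
        // Called from the UI thread for each OS event.
        void push( const Event& e )
        {
            std::lock_guard<std::mutex> lock( mutex );
            backBuffer.push_back( e );
        }

        // Called once per frame on the main thread; returns the events
        // accumulated since the last call.
        std::vector<Event>& swapAndGet()
        {
            {
                std::lock_guard<std::mutex> lock( mutex );
                std::swap( frontBuffer, backBuffer );
                backBuffer.clear();
            }
            return frontBuffer;
        }

    private:
        std::mutex mutex;
        std::vector<Event> frontBuffer; // read only by the main thread
        std::vector<Event> backBuffer;  // written by the UI thread
    };

    In this arrangement the main thread would call swapAndGet() at the top of each 60Hz update and dispatch the returned events to the GUI widget hierarchy.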
  6. While I agree for signals that can be negative (e.g. audio), I'm not sure that's correct for images, since they represent the physical light intensity (which is proportional to energy and can't be negative). I think the energy in an image would be just the sum of the pixel values at each wavelength, since the integral of the light wave's squared amplitude has already been done by the image sensor.
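
    In code form, that interpretation amounts to a plain sum over linear intensity values (a trivial sketch; the flat std::vector<float> layout is just for illustration):

    #include <vector>

    // Total "energy" under the interpretation above: the sensor has already
    // integrated the squared wave amplitude into a non-negative intensity,
    // so just sum the pixel values over all pixels and wavelengths/channels.
    double imageEnergy( const std::vector<float>& linearPixelValues )
    {
        double sum = 0.0;
        for ( float value : linearPixelValues )
            sum += value;
        return sum;
    }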
  7. 3D Transparent Shader problem

    You can get face normals using the derivatives of the interpolated surface position:

    varying vec3 lerpPosition; // surface position, interpolated from vertex positions

    void main( void )
    {
        vec3 faceNormal = normalize( cross( dFdx( lerpPosition ), dFdy( lerpPosition ) ) );
    }
  8. 3D Transparent Shader problem

    That trick to cull backfaces won't work in all cases with vertex normals because the interpolated normal isn't consistent with the planar geometry. I notice similar artifacts when using that trick for 2-sided lighting. You will need to use face normals, since these are consistent with the underlying geometry, or just tolerate the artifacts.
  9. Extracting face/hit data after a GJK step

    You can't really use GJK to get the contact info unless you combine it with some other approach such as EPA (expanding polytope algorithm), or add a margin around the objects. The problem is that GJK only determines whether the objects are intersecting, not the exterior points on each object. If you try to use the final simplex to generate contact info, you will find that objects tend to sink into each other over time, because the simplex does not necessarily lie on the surface of the Minkowski difference.

    In my engine, I use EPA to push the simplex boundary out to the edge of the Minkowski difference. Then, the contacts can be determined from the nearest point on the Minkowski difference to the origin in configuration space. EPA produces a convex hull made up of triangles, and the contact lies on the triangle closest to the origin. Find the barycentric coordinates of the closest point on that triangle, then use those barycentric coordinates to find the point on each object (you need to store {pointOnA, pointOnB, pointOnB-pointOnA} for each convex hull vertex).

    This gives you 1 contact on each frame. You would then need to combine it with contact caching over multiple frames to get the multiple contacts that are needed to achieve stable stacking.
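
    A rough sketch of that last step (recovering the witness points from the closest EPA triangle), assuming a hypothetical Vec3 type with the usual operators; the struct and function names are illustrative:

    // Each vertex of the EPA polytope stores the support point on both objects
    // and their difference (the vertex position in configuration space).
    struct SupportVertex
    {
        Vec3 pointOnA, pointOnB, minkowski; // minkowski = pointOnB - pointOnA
    };

    // Given the triangle (v0,v1,v2) of the EPA hull closest to the origin and the
    // barycentric coordinates (b0,b1,b2) of the closest point on that triangle,
    // interpolate the same coordinates on each object to get one contact point per object.
    void computeContact( const SupportVertex& v0, const SupportVertex& v1, const SupportVertex& v2,
                         float b0, float b1, float b2,
                         Vec3& contactOnA, Vec3& contactOnB )
    {
        contactOnA = v0.pointOnA*b0 + v1.pointOnA*b1 + v2.pointOnA*b2;
        contactOnB = v0.pointOnB*b0 + v1.pointOnB*b1 + v2.pointOnB*b2;
        // The contact normal points along (contactOnB - contactOnA) and the
        // penetration depth is its length.
    }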
  10. Why does GraphicsDevice need to know anything about Game? The game should be responsible for configuring the device. Any parameters, etc. needed by the device should be passed into the device (e.g. pass GraphicsDeviceInfo* instead of Game* to init()). Try to think of your system design as a hierarchy/tree of complexity, with low-level modules at the leaves and high-level modules (e.g. Game) at the top. Each level in the tree should only know about its direct dependencies (its children).

    As for forward declarations, I use them as needed whenever there are circular dependencies, or when I want to hide the implementation (PIMPL idiom). Keeping includes in source files and forward declaring in headers can also reduce build times and public dependencies (e.g. I keep all <windows.h> includes in the source files to avoid polluting the global namespace).
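
    A minimal sketch of the dependency direction being described (GraphicsDeviceInfo and the member names here are illustrative, not from any particular engine):

    // Low-level module: knows nothing about Game, only about its own parameters.
    struct GraphicsDeviceInfo
    {
        int backBufferWidth = 1280;
        int backBufferHeight = 720;
        bool vsync = true;
    };

    class GraphicsDevice
    {
    public:
        bool init( const GraphicsDeviceInfo& info )
        {
            // ... create the device, context, and swap chain from 'info' ...
            return true;
        }
    };

    // High-level module: the game owns and configures the device.
    class Game
    {
    public:
        bool init()
        {
            GraphicsDeviceInfo info;
            info.backBufferWidth = 1920;
            info.backBufferHeight = 1080;
            return device.init( info );
        }
    private:
        GraphicsDevice device;
    };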
  11. In the last few days I have noticed that whenever I visit a post topic the CPU usage suddenly goes up from 1%-5% to nearly 100%. In activity monitor the process is shown as "safari web content". This doesn't happen when viewing a forum or when creating a topic (e.g. right now it's fine), but as soon as I read a post, the insane CPU usage begins. As soon as I close the window, the CPU usage drops to normal. My system specs: Mac OS X 10.8.5, Safari 6.2.3
  12. C++ derived class template D:

    You need a typedef:

    class AComponent : PComponent<...>
    {
    public:
        typedef PComponent<...> Base;
        AComponent() : Base(...) {}
    };
  13. For any sort of acoustic source, I first try the Shure SM81. It has an extremely flat frequency response without the BS "enhanced" high end that most condenser mics have, hence its ubiquitous use for overheads/hi-hat, string instruments, etc. It will capture exactly what you are hearing without any kind of coloration. I'll use it over any high-dollar Neumann for this specific reason.
  14. 3D Shadow Shimmering When Moving Objects

    Maybe you could quantize the positions of objects to texels when rendering the shadow map (but only the shadow map)?
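
    Something along these lines, assuming an orthographic light with a square shadow frustum; the Vec3/Mat4 types and the transformPoint function are placeholders, not real API:

    #include <cmath>

    // Snap a world-space object position to the shadow map's texel grid in light
    // space, so a slowly moving object doesn't shimmer as it crosses texel
    // boundaries. Use the snapped position only when rendering the shadow map.
    Vec3 snapToShadowTexels( const Vec3& worldPosition,
                             const Mat4& worldToLight, const Mat4& lightToWorld,
                             float shadowFrustumSize, int shadowMapResolution )
    {
        const float texelSize = shadowFrustumSize / float(shadowMapResolution);
        Vec3 lightSpace = worldToLight.transformPoint( worldPosition );
        lightSpace.x = std::floor( lightSpace.x / texelSize ) * texelSize;
        lightSpace.y = std::floor( lightSpace.y / texelSize ) * texelSize;
        return lightToWorld.transformPoint( lightSpace );
    }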
  15. Fifth Engine

    Maybe the Retro Encabulator can make some sense of this thread?