About nonoptimalrobot

  1. questions about singletons

      This is misleading -- you may "want" only one, but it's very, very seldom that one cannot imagine a scenario where you actually need more than one, and even more seldom that having more than one would actually be incorrect. Usually when a programmer says "I'll use a singleton because I only want one of these," what he's really saying is "I'll use a singleton because I don't want to bother making my code robust enough to handle more than one of these." If you can justify choosing convenience over correctness you're free to do so, but you've made your bed once you've chosen a Singleton and you alone will have to lie in it.

...yeah, 'correctness' in software design; something that's largely considered undefinable. We are not scientists doing good science or bad science; we are craftsmen, and what one paradigm holds over another is simply the kinds of coding decisions that are discouraged vs. encouraged. The idea is to use designs that encourage use cases that pay productivity dividends down the road and discourage use cases that become time sinks. In this sense Singletons are ALL BAD, but making your life easier down the road shouldn't be a directive that is pathologically pursued until the productivity of the moment slows to a crawl. I guess the point I'm trying to make is that a tipping point exists between writing infinitely sustainable code and getting the job done. Here are some examples:

Singletons make sense for something like a GPU; it is a piece of hardware, there is only one of them and it has an internal state. Of course you can write code such that the GPU appears virtualized and your application can instantiate any number of GPU objects and use them at will. Indeed, I would consider this good practice, as it will force the code that relies on the GPU to be written better and should extend nicely to multi-GPU platforms. The flip side is that implementing all this properly comes at a cost in productivity in the moment, and that cost needs to be weighed against the probability of seeing the benefit.

Another example is collecting telemetry or profiling data. It's nice to have something in place that tells everyone on the team: "Hey, telemetry data and performance measurements should be coalesced through these systems for the purposes of generating comprehensive reports." A Singleton does this while class declarations called Profiler and Telemetry do not. Again, you can put the burden of managing and using instances of Profiler and Telemetry onto the various subsystems of your application, and once again this may lead to better code, but if the project never lives long enough to see that 'better code' pay productivity gains then what was the point?

I don't implement Singletons either personally or professionally (for the reasons outlined by Ravyne and SiCrane) unless explicitly directed to do so, but I have worked on projects that did use them, and overall I was glad they existed as they made me more productive on the whole. In these instances the dangers of using other people's singletons in already singleton-dependent systems never came to fruition, and the time I sank into writing beautiful, self-contained, stateless and singleton-free code never paid off. Academic excellence vs. pragmatism: it's a tradeoff worth considering. Mostly I'm playing devil's advocate here, as I find blanket statements about a design paradigm being all good or all bad misleading. Anyway, this is likely to get longer than it already is and people aren't even disagreeing yet. :) I'm out.
  2. questions about singletons

      Yep, along with the Global Access property SiCrane mentioned. The third property of the Singleton paradigm is control over the relative order of construction of singleton objects, something that's not possible with object instances declared as globals. As with most design paradigms, Singletons are mostly used to show intent to would-be modifiers of the system rather than to prevent bonehead mistakes by making them syntactically illegal.

Mostly agreed in the academic sense, but in practice I would say they have their uses, stemming primarily from convenience. Sometimes you truly only want one instance of something running around (assert tracker, GPU wrapper, etc.), in which case having a single point of access keeps things simple. Avoiding the threading pitfalls can be done by properly constructing the singleton to deal with multi-threaded access and being vigilant about keeping singleton access out of code that doesn't absolutely need it.
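A minimal sketch of where that construction-order property comes from, using hypothetical classes: a lazily constructed (function-local static, or "Meyers") singleton is built the first time its accessor runs, so dependent singletons construct in a well-defined relative order, unlike plain globals across translation units.

```cpp
#include <cassert>

static int g_constructionCounter = 0;    // records construction order for the demo

class Log {
public:
    static Log& instance() {
        static Log s;                    // constructed on first call, exactly once
        return s;
    }
    int order = 0;
private:
    Log() { order = ++g_constructionCounter; }
    Log(const Log&) = delete;
    Log& operator=(const Log&) = delete;
};

class Profiler {
public:
    static Profiler& instance() {
        static Profiler s;
        return s;
    }
    int order = 0;
private:
    Profiler() {
        Log::instance();                 // guarantees Log exists before Profiler
        order = ++g_constructionCounter;
    }
};
```

Even if Profiler is the first singleton anything touches, Log is still constructed before it.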
  3. a map<T,T> as a default parameter...

    You would need to do this:

```cpp
bool createWindowImp(DemString winName,   // window name
                     DemString winTitle,  // window title
                     DemUInt   winWidth,  // window width
                     DemUInt   winHeight, // window height
                     ExtraParameters params = ExtraParameters());
```

This will push an instance of ExtraParameters (which will be empty) onto the stack and pass it on to the function. The instance will get popped off the stack after the function's scope terminates. A more efficient approach would look like this:

```cpp
bool createWindowImp(DemString winName,   // window name
                     DemString winTitle,  // window title
                     DemUInt   winWidth,  // window width
                     DemUInt   winHeight, // window height
                     ExtraParameters* params = 0);
```

Inside createWindowImp you will need to check whether params is non-null and, if so, dereference it and extract values.

[EDIT]

I see ExtraParameters is a std::map; in that case, IF you want to go with the first option you will definitely want to tweak it a bit:

```cpp
bool createWindowImp(DemString winName,   // window name
                     DemString winTitle,  // window title
                     DemUInt   winWidth,  // window width
                     DemUInt   winHeight, // window height
                     const ExtraParameters& params = ExtraParameters());
```

If you don't pass by reference, then each time the function is called a temporary std::map will be created for the params variable and a deep copy will be performed between it and whatever you happen to be passing in; that equates to a lot of memory that gets allocated only to be promptly deleted. It's good practice to toss the 'const' in there too, unless you want to pass info out of createWindowImp via params (usually frowned upon).
  4.   Not necessarily, but as DigitalFragment pointed out there are many advantages to doing so; either way the concept is the same. If your hardware supports texture fetches in the vertex shader and floating point texture formats, it's generally a good idea (such features are common these days).
  5.   The skeleton is just a big list of matrices or quaternion-translation pairs that are usually stored in a constant buffer. "Sending the bone transforms" is just a matter of updating a GPU resource; this only needs to happen once per frame. To use that data you simply bind it to the GPU; this isn't free, but it's generally very cheap as it doesn't involve moving data from the CPU to the GPU. After binding the resource, the shaders invoked by subsequent draw calls will have access to the skeleton data.

Generally speaking, you should pack your entire skeleton into a single resource and update it once per frame regardless of how many draw calls it will take to render the skinned mesh.
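The once-per-frame update vs. cheap per-draw bind can be sketched with hypothetical stand-in types (nothing here is a real GPU API; a real renderer would use its API's constant-buffer update and bind calls):

```cpp
#include <array>
#include <cstring>
#include <vector>

// The "skeleton" is just an array of 4x4 matrices living in one constant
// buffer. It is written once per frame; each draw merely binds it, which
// moves no data.
struct Mat4 { std::array<float, 16> m{}; };

struct GpuConstantBuffer {                 // stand-in for a real GPU resource
    std::vector<unsigned char> storage;
    int uploadsThisFrame = 0;
    void upload(const void* src, size_t bytes) {
        storage.resize(bytes);
        std::memcpy(storage.data(), src, bytes);
        ++uploadsThisFrame;                // counts CPU->GPU transfers
    }
};

// Called once per frame, regardless of how many draws use the skeleton.
void updateSkeleton(GpuConstantBuffer& cb, const std::vector<Mat4>& bones) {
    cb.upload(bones.data(), bones.size() * sizeof(Mat4));
}

// Called per draw: binding is cheap bookkeeping, not a data transfer.
const GpuConstantBuffer* bindSkeleton(const GpuConstantBuffer& cb) {
    return &cb;                            // a real API would set a shader slot here
}
```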
  6. D3DXMatrixLookAtLH internally

    The documentation for D3DXMatrixLookAtLH contains a description of the algorithm. Note that the matrix is identical to an object-to-world space transform that has been inverted.

The gist of the algorithm is to compute the local x, y and z axes of the camera in world space and then drop them into a matrix such that the camera is placed at the origin.

Transforms containing rotations and translations only are constructed as follows:

| ux uy uz 0 |   // x-axis of basis (ux, uy, uz)
| vx vy vz 0 |   // y-axis of basis (vx, vy, vz)
| nx ny nz 0 |   // z-axis of basis (nx, ny, nz)
| tx ty tz 1 |   // origin of basis (tx, ty, tz)

That's a simple world transform; you are building a view matrix, so you actually want the inverse, which is computed as follows:

| ux          vx          nx          0 |   // u = (ux, uy, uz)
| uy          vy          ny          0 |   // v = (vx, vy, vz)
| uz          vz          nz          0 |   // n = (nx, ny, nz)
| -(t dot u)  -(t dot v)  -(t dot n)  1 |   // t = (tx, ty, tz)
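The construction above can be sketched directly in C++ (my own minimal vector/matrix types, not D3DX's), following the same row-vector layout with the translation in the last row:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b)  { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b)  { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3  cross(Vec3 a, Vec3 b) {
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
static Vec3 normalize(Vec3 a) {
    float len = std::sqrt(dot(a, a));
    return {a.x / len, a.y / len, a.z / len};
}

struct Mat4 { float m[4][4]; };

Mat4 lookAtLH(Vec3 eye, Vec3 at, Vec3 up) {
    Vec3 n = normalize(sub(at, eye));   // camera z-axis in world space
    Vec3 u = normalize(cross(up, n));   // camera x-axis
    Vec3 v = cross(n, u);               // camera y-axis
    // Inverse of the camera's world transform, laid out as in the post.
    return {{
        { u.x,          v.x,          n.x,          0.0f },
        { u.y,          v.y,          n.y,          0.0f },
        { u.z,          v.z,          n.z,          0.0f },
        { -dot(eye, u), -dot(eye, v), -dot(eye, n), 1.0f },
    }};
}
```

A quick sanity check: a camera at the origin looking down +z with +y up should produce the identity.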
  7. How does depth bias work in DX10 ?

    This funny looking quantity:

2**(exponent(max z in primitive) - r)

doesn't contain a typo. The double star should be interpreted like so:

pow(2, exponent(max z in primitive) - r)

Note that exponent() here is not e^x; it extracts the base-2 exponent of the floating point value (the same thing frexp reports), so the whole quantity is 2 raised to that exponent minus r.

'max z in primitive' is the maximum depth value (in projection space, so values on [0, 1]) of the current primitive being rendered. If you are drawing a triangle with three vertices that have depth values of 0.5, 0.75 and 0.25, the max z will be 0.75.

The constant 'r' is the number of mantissa bits used by the floating point representation of your depth buffer. If you are using a float32 format then r = 23, if you are using a float16 format then r = 10 and if you are using a float64 format then r = 52.

As for the utility behind the formula: 2^(exponent(z) - r) is exactly the distance between adjacent representable floating point values (one ULP) near depth z, so the bias gets scaled to the actual precision available at the primitive's depth, which varies non-linearly across the depth buffer.

[EDIT] typed clip-space when I should have typed projection-space.
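A sketch of computing 2**(exponent(max z in primitive) - r) for a float32 depth buffer, interpreting exponent() as the base-2 exponent that std::frexp reports; the function name is mine:

```cpp
#include <cmath>

// One depth-bias unit for a float32 depth buffer (r = 23 mantissa bits):
// 2^(exponent(z) - r), i.e. one ULP at depth z.
float depthBiasUnit(float maxZInPrimitive) {
    const int r = 23;                       // float32 mantissa bits
    int exp2 = 0;
    std::frexp(maxZInPrimitive, &exp2);     // maxZ = f * 2^exp2, f in [0.5, 1)
    // frexp's exponent convention is one higher than the IEEE unbiased
    // exponent, so subtract 1 to match exponent(max z) in the formula.
    return std::ldexp(1.0f, (exp2 - 1) - r);
}
```

For example, the result for z = 0.75 matches the spacing to the next representable float above 0.75.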
  8. This is how I've done it in the past:   Grass "clumps" are placed in the world individually or via a spray brush in the world editor.  The brush and placement tool have various options to make this easy, including an 'align to terrain' behavior and various controls over how sizes, orientations and texture variations are handled.  This process generates a massive list of positions, scales and orientations (8 floats per clump).  There are millions of grass clumps, so storing all this in the raw won't do...   At export time the global list of grass clumps is partitioned into a regular 3d grid.  Each partition of the grid has a list of the clumps it owns, with each clump's position, scale and orientation quantized into two 32-bit values.  The first value contains four 8-bit integers: the first 3 ints represent a normalized position w.r.t. the partition's extents and the 4th is just a uniform scale factor.  The second 32-bit value is a quaternion with its components quantized to 8-bit integers (0 = -1 and 255 = 1).   At runtime the contents of nearby partitions are decompressed on the fly into data that's amenable to hardware instancing.  This was a long time ago, so it was important the vertex shader didn't spend too much time unpacking data.  These days, with bit operations available to the GPU, you might be able to use the actual compressed data directly and not need to manage compressing and uncompressing chunks in real time; if you do, use a background thread.   It worked pretty well and was extended to support arbitrary meshes, so the initial concept of a grass clump evolved to include pebbles, flowers, sticks, debris etc.  Any mesh that was static and replicated many times throughout the world was a good candidate for this system, as long as its vertex count was low enough to justify the loss of the post-transform cache caused by the type of HW instancing being used.
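The two 32-bit packed values described above might look something like this; the helper names and rounding choices are mine, not from the original system:

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Word 1: x, y, z normalized to the partition's extents plus a uniform
// scale, each quantized to 8 bits. Word 2: a quaternion with each
// component mapped from [-1, 1] to [0, 255].
static uint8_t quantize01(float v) {             // v in [0, 1] -> [0, 255]
    return static_cast<uint8_t>(std::round(v * 255.0f));
}
static uint8_t quantizeSigned(float v) {         // v in [-1, 1] -> [0, 255]
    return quantize01((v + 1.0f) * 0.5f);
}
static float dequantizeSigned(uint8_t q) {       // inverse mapping
    return (q / 255.0f) * 2.0f - 1.0f;
}

uint32_t packPositionScale(float nx, float ny, float nz, float scale01) {
    return  uint32_t(quantize01(nx))
         | (uint32_t(quantize01(ny))      << 8)
         | (uint32_t(quantize01(nz))      << 16)
         | (uint32_t(quantize01(scale01)) << 24);
}

uint32_t packQuaternion(const std::array<float, 4>& q) {
    return  uint32_t(quantizeSigned(q[0]))
         | (uint32_t(quantizeSigned(q[1])) << 8)
         | (uint32_t(quantizeSigned(q[2])) << 16)
         | (uint32_t(quantizeSigned(q[3])) << 24);
}
```

Note the 0 = -1, 255 = 1 mapping means the endpoints round-trip exactly, which matters for axis-aligned orientations.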
  9. Distortion/Heat Haze FX

      The 'cascades' are generated by slicing up the view frustum with cutting planes that are perpendicular to the view direction. Objects that use distortion are rendered in the furthest distortion cascade first and the nearest distortion cascade last. The frame buffer is resolved into a texture between rendering each cascade. Folding water into this is definitely ad hoc. The cascades get split into portions that are above the water plane and below the water plane. Render the cascades below normally, then resolve your frame buffer and move on to render the cascades that are above the water plane. If the camera is below the water, reverse the order. Ugly, huh? Obviously none of this solves the problem; it just mitigates artifacts.

Yeah, those posts (especially the second one) are misinformation, or at least poorly presented information. Alpha blending, whether using pre-multiplied color data or not, still multiplies the contents of your frame buffer by 1-alpha, so the result is order dependent. Consider rendering to the same pixel 3 different times using an alpha pre-multiplied texture in back-to-front order. Each pass uses color and alpha values of (c0, a0), (c1, a1) and (c2, a2) respectively, and 'color' is the initial value of the frame buffer. Note I'm writing aX instead of 1-aX here because it requires fewer parentheses and is therefore easier to visually analyze; this doesn't invalidate the assertion.

Pass 1: c0 + color * a0
Pass 2: c1 + (c0 + color * a0) * a1
Pass 3: c2 + (c1 + (c0 + color * a0) * a1) * a2 = result of in-order rendering

Now let's reverse the order:

Pass 1: c2 + color * a2
Pass 2: c1 + (c2 + color * a2) * a1
Pass 3: c0 + (c1 + (c2 + color * a2) * a1) * a0 = result of out-of-order rendering

Unfortunately:

c2 + (c1 + (c0 + color * a0) * a1) * a2 != c0 + (c1 + (c2 + color * a2) * a1) * a0

You still need to depth sort transparent objects regardless of how you choose to blend them... unless of course you are just doing additive blending, then it doesn't matter.
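The order dependence can be checked numerically; the single-channel color and alpha values below are arbitrary examples, not from any real texture:

```cpp
// Premultiplied "over" for one channel: c + dst * a, where a plays the
// 1-alpha role exactly as in the expressions above.
float blend(float dst, float c, float a) {
    return c + dst * a;
}

float compositeInOrder(float color) {
    float r = blend(color, 0.30f, 0.50f);    // (c0, a0)
    r = blend(r, 0.20f, 0.25f);              // (c1, a1)
    return blend(r, 0.10f, 0.75f);           // (c2, a2)
}

float compositeReversed(float color) {
    float r = blend(color, 0.10f, 0.75f);    // (c2, a2) first
    r = blend(r, 0.20f, 0.25f);              // (c1, a1)
    return blend(r, 0.30f, 0.50f);           // (c0, a0) last
}
```

For an initial frame buffer value of 0.8 the two orders give different results, confirming the inequality above.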
  10.   OpenGL never had this problem!?  Sigh.  I've been using the wrong API all these years...
  11. Your code is a little strange but it seems to work; nothing jumps out as wrong.

Off topic note: it is a little strange to render as frequently as possible but lock updating at 60Hz. If nothing moved, why render again? The generated image will be the same. Don't get me wrong, games do this all the time, but usually in a slightly more complicated way. For example: if updating happens at 30Hz and rendering happens at 60Hz, then the renderer can linearly interpolate between (properly buffered) frame data generated by the updater to give the illusion of the entire game running at 60 fps.

Back to the problem at hand...

This is a valid observation (something is amiss), but using fps to measure relative changes in performance can be misleading. A change from 57-58 fps to 42-43 fps means each frame is taking approximately 6.1 milliseconds longer to finish. A change from 6k-7k fps to 125-130 fps means each frame is taking approximately 7.6 milliseconds longer to finish. Not as drastic as it first seems. Think of fps measurements as velocities; they are inversely proportional to time. If you have a race between two cars it doesn't make a lot of sense to say car A finished the race at 60 mph and car B finished the race at 75 mph. What you really want to know is the time it took each car to finish the race.

Agreed, but I'm not sure why your performance is tanking. Hopefully someone who knows a bit more about OpenGL can point to something that's a blatant performance problem.
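The fps-to-milliseconds arithmetic above, as a tiny helper (names are mine): fps is 1/time, so compare frame times, not frame rates, when judging a performance change.

```cpp
#include <cmath>

// Convert a frame rate to a per-frame cost in milliseconds.
double frameMilliseconds(double fps) { return 1000.0 / fps; }

// How many extra milliseconds each frame costs after a change.
double slowdownMs(double fpsBefore, double fpsAfter) {
    return frameMilliseconds(fpsAfter) - frameMilliseconds(fpsBefore);
}
```

Plugging in the post's numbers: 57.5 -> 42.5 fps is about a 6.1 ms slowdown, while 6500 -> 127.5 fps is about 7.7 ms, so the two changes really are comparable.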
  12. Bullet Impact on Mesh Edge

      I'm curious as well.

At one point I was dealing with a game that had highly detailed meshes used for rendering and exceptionally simplified meshes used for collision. This meant the normal solution of projecting and clipping your decal against the collision geometry resulted in a decal that was rarely consistent with the topology of the rendered object. Obviously you can solve this by giving your decal system access to the render geometry for the purposes of projecting and clipping, but this wasn't practical; the computational geometry code wanted triangles in a specific format that didn't mix well with rendering, so there was a huge memory overhead. Our renderer was deferred, which meant easy access to the depth buffer and a normal buffer; for this reason the decision was made to render decals as boxes and use various tricks in the pixel shader to create the illusion of decals being projected onto the scene. Essentially UV coordinates would be tweaked based on the result of a ray cast. It worked fairly well but had some view dependent artifacts (sliding) that were distracting if you zoomed in on and rotated around a decal that had been projected onto a particularly complicated depth buffer topology. There was also the issue of two dynamic objects overlapping: if one of the objects had a decal stuck to it in the region of the overlap, the decal would appear on the other object; this only happened when the collision response was faulty, so it wasn't that big a deal.

Anyway, waiting for Hodgman's insight...
  13. Render queue design decision

      Sounds like you can fix this by tweaking the way your renderer works. Why not hand off the entire Renderable component to the renderer, which can then ask for the RenderToken vector (via some interface in Renderable) for the purposes of sorting?
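A sketch of that suggestion, reusing the post's Renderable/RenderToken names; the interface details and sort key are invented for illustration:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct RenderToken {
    uint64_t sortKey;        // e.g. material/depth bits packed together
};

struct Renderable {
    std::vector<RenderToken> tokens;
    // The interface the renderer uses to ask for tokens.
    const std::vector<RenderToken>& renderTokens() const { return tokens; }
};

// The renderer receives whole Renderable components and pulls their token
// vectors to build one globally sorted queue.
std::vector<RenderToken> buildSortedQueue(const std::vector<Renderable*>& scene) {
    std::vector<RenderToken> queue;
    for (const Renderable* r : scene)
        queue.insert(queue.end(), r->renderTokens().begin(),
                     r->renderTokens().end());
    std::sort(queue.begin(), queue.end(),
              [](const RenderToken& a, const RenderToken& b) {
                  return a.sortKey < b.sortKey;
              });
    return queue;
}
```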
  14.   This. Class hierarchies have a way of spiraling out of control toward the end of a project, making bug hunting tedious just as it becomes the most important task. There is also the runtime flexibility it allows, which is handy given that live updating your world via an external or embedded editor has huge implications for the productivity of the design team. Adding and removing components on an entity and propagating those changes to a running instance of the game is exceptionally awkward to do with a standard OOP design.

I think there is room for both approaches. The Entity-Component model works best at a high level, the extreme end of that being a single entity GameObject that contains components. The components themselves are built from a usual class hierarchy of Render(ables), Update(ables), Stream(ables), etc. In this scenario a game object representing a vehicle would have a component for the engine, the wheels, the seats, etc. In turn, Engine would inherit from IUpdatable but not much else, while Wheel would inherit from IUpdatable and IRenderable, and so on. This is the approach I favor and it works well as long as the team as a whole is clear about where one design paradigm ends and the other begins and what type of functionality belongs in which system. It's easy for people not privy to the intent of the two systems to start blurring the line and eventually create a mess that must be cleaned up later. Code reviews help with this.
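A minimal sketch of that hybrid, mirroring the vehicle example (all type names are assumed): a flat GameObject holds components, while the components themselves use a small interface hierarchy.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

struct IUpdatable  { virtual void update(float dt) = 0; virtual ~IUpdatable() = default; };
struct IRenderable { virtual void render() = 0;         virtual ~IRenderable() = default; };

struct Engine : IUpdatable {              // updates but never draws
    float rpm = 0.0f;
    void update(float dt) override { rpm += 100.0f * dt; }
};

struct Wheel : IUpdatable, IRenderable {  // both updates and draws
    float spin = 0.0f;
    void update(float dt) override { spin += dt; }
    void render() override {}
};

struct GameObject {
    std::vector<std::shared_ptr<IUpdatable>>  updatables;
    std::vector<std::shared_ptr<IRenderable>> renderables;

    // Registers the component with whichever interface lists apply to it.
    template <typename T> std::shared_ptr<T> add() {
        auto c = std::make_shared<T>();
        if constexpr (std::is_base_of_v<IUpdatable, T>)  updatables.push_back(c);
        if constexpr (std::is_base_of_v<IRenderable, T>) renderables.push_back(c);
        return c;
    }
    void update(float dt) { for (auto& u : updatables) u->update(dt); }
};
```

Adding or removing a component at runtime is just a vector edit, which is what makes live editing from an external tool tractable.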
  15. Holy crap, man! Couldn't you just create separate ctors for normal construction and deserialization construction? I mean, I get the OnPlacedInWorld() part, but it sounds like OnCreate() and OnStartup() could just be normal ctors that call a common member function after the serializable values are set. Why do you guys avoid using the ctor? Or, rather, if it needs to be plugged in this way, why not a pair of factory functions?

This is pretty standard. It's not that you can't achieve the same with multiple constructors, but breaking it up into multiple function calls or passes allows a few different things:

1) External systems can run code in between the initialization passes. This is almost a necessity, especially for games.

2) Code reuse. Oftentimes the default constructor does stuff that all the other initialization functions want done. If everything was broken out into multiple, overloaded constructors then a lot of code would get copied across them.

3) Clear intent. OnPlacedInWorld(), OnCreate() and OnStartup() tell would-be modifiers of the class where to put various bits of initialization code, while the systems using the class know when to call what.

4) Performance. As mentioned, the minimally filled out default constructor allows faster serialization, as you can avoid doing things that will simply be undone by the serializer.
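A sketch of the phased initialization: the phase names come from the post, while the Entity class and the spawner are invented for illustration.

```cpp
class Entity {
public:
    Entity() = default;          // minimal default ctor: cheap for the serializer

    void OnCreate()        { created = true; }      // runs after fields are set
    void OnStartup()       { started = created; }   // other systems may run between phases
    void OnPlacedInWorld() { inWorld = started; }   // final pass

    int  hp = 0;                 // a field the serializer would fill in
    bool created = false, started = false, inWorld = false;
};

// A hypothetical spawner shows why the passes are split: external code
// (a serializer, an editor) gets to run between them, which a single fat
// constructor wouldn't allow.
Entity spawnFromData(int serializedHp) {
    Entity e;                    // phase 0: near-empty construction
    e.hp = serializedHp;         // serializer writes fields directly; nothing to undo
    e.OnCreate();                // phase 1
    e.OnStartup();               // phase 2 (systems could inspect e in between)
    e.OnPlacedInWorld();         // phase 3
    return e;
}
```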