• Content count

  • Joined

  • Last visited

Community Reputation

1205 Excellent

About snowmanZOMG

  • Rank
  1. GCC auto vectorizer alignment issues

    Pretty awesome talk by Andreas Fredriksson from Insomniac Games:   I highly recommend it.
  2. GCC auto vectorizer alignment issues

    If you really want to use those vector instructions and be guaranteed that they are being used, you need to use intrinsics.   The auto-vectorization provided by compilers, as Ravyne mentioned, should be viewed as a bonus if it occurs.  If you must get vector instructions it's far simpler to just write the intrinsics yourself than to go through strange hoops to write C/C++ code that the auto-vectorizer will accept, which is a totally backwards (and unreliable) way of getting these instructions to be emitted.  Additionally, I've found that it's just annoying to try to get the compiler to realize things that I know is true for my code (such as alignment or aliasing).  Intrinsics are the way to go here.   I tried compiling your code on g++ 4.8.4 with the options you give (except c++14, since my compiler is not new enough) and I get identical assembly for both functions and the assembly is in the aligned form with no prologue/epilogue to deal with unaligned values.
  3. The Simple Funnel algorithm pre-visited

      Can you show me an image to describe what's going on?  Something doesn't vibe with me very well with what you've described, but I'm not sure I'm fully understanding your situation.  For instance, I don't understand what you mean by "concave polygons with interior edges that wrap around islands".  If my recollection of CDT is correct, it shouldn't make a difference if you have a concave polygon or not.
  4. The Simple Funnel algorithm pre-visited

      I tried to implement Demyen's approach but I gave up after trying for a really long time to get all the details just right.  Although I haven't tried it myself, I suspect that shrinking the nav mesh would indeed be a much simpler approach, and if you can afford it, it might be best to have a few different nav meshes tuned for particular sizes.   As for the other person (, I do exactly this approach on my game and if you can operate under the same assumptions, namely that the portals you have are guaranteed for the agent of radius R, it works great (at least I think so)!  Performance is absolutely a non-issue for me.  On very long paths, my entire string pulling + tangent to circle computation takes on the order of ~50000 cycles which is around 0.02 ms on my machine.  The A* search takes far, far longer.  The A* search alone to generate that path could take me 0.25 to 0.5 ms.   You have far bigger problems to worry about when it comes to performance in pathfinding.  The A* search (if you're using it) will be the lion's share of the time spent.
  5. The Simple Funnel algorithm pre-visited

    Ah ha!  Yes, I worked through your example and I think what you were referring to is when you get to this case:     And it had me stumped for a second until I remembered that you need to treat the destination point as a final portal to process.  It's not obvious in Mikko's explanation and in most other explanations, but it's mentioned in Jeff Erickson's description of the funnel algorithm:     In my own code, I create dummy portals for the start and end positions where the left and right endpoints of the portals are the same as the positions themselves.  This should fall into place nicely with the rest of the algorithm as long as your cross product comparisons are using >= or <= (not the strictly < or >).   I hope this helps!
  6. The Simple Funnel algorithm pre-visited

      Off the top of my head I it seems that quads are guaranteed to generate a valid funnel due to their inherently regular topology. In the case I presented above, just take a piece of paper and follow the left and right edges from the initial apex and you'll notice that due to the elongated shape of the triangles the left and right edges never cross and therefore never generate a new apex before the goal is reached. Unless I'm reading it totally wrong, the algorithm simply can't cope with input like this, which is why I started the topic in the first place.   Could you please try and elaborate on where my description fails? I'd much rather this is a fault on my part than that the algorithm is inherently flawed .     I'm not sure how the topology works out here to guarantee a funnel?  Maybe we are misunderstanding:           The squares are all nav mesh nodes and there are tiny red and blue markings which indicate the endpoints and midpoints of portals.   I will try to remind myself of the exact details of the algorithm and do a step by step sample through your case (Maybe I'm wrong!  But I'm fairly certain your case is not a failure... but I should probably check, shouldn't I?).
  7. The Simple Funnel algorithm pre-visited

    I myself have implemented Mikko's funnel algorithm and it works just fine in my version (albeit, my nav mesh is restricted to quads which may not be aligned).  Your description of the "failure case" in the example isn't correct.  I don't recall the exact details of the algorithm, but you are basically setting an apex and then walking the left and right sides to succeeding portals until the left or right cross the other.   When the left/right crosses you've found a new apex point which needs to be a part of the path.   Are you sure you understand or have implemented the algorithm correctly?  Mikko's algorithm simply backtracks a few points to avoid having to use the deque at all which is an annoying data structure to have to use for something so simple.   Edit:   Sorry didn't catch your description under the image since clicking the image just blows it up and I never saw the text under it.  It's not enough to go off of to know if you understand the algorithm fully, but the left and right legs definitely do cross in this example.   Check out 2.2.4 Funnel in this document if you want another description
  8.   Sure you can.  The STL heap functions don't provide an update function, but you don't exactly need one in the context of A* if you keep enough extra state.  A properly written heap update performs in O(log n) time, but the only way to perform an update using the STL provided functions is to modify the contents directly yourself and then call make_heap() which runs in O(n) time.  This is not desirable.  However, you don't need to do a full heap rebuild; as long as you keep flags to know when a node is opened or closed, you can simply keep throwing on the updated node cost values into your heap and every time you pop off the heap you should check if the node you retrieved is open or not.  If it isn't open, just skip it!   This is how I've implemented my A* function.  You may be concerned that you'll have multiple instances of a node in the heap with different costs, but this doesn't actually affect the correctness of the algorithm as long as you have a flag (or some other structure) to determine if the node is in the open list or not.  The heap serves only to keep track of the next min cost node to explore, not to actually tell you if a node is open or not.  And due to the relaxation process of edges inherent in shortest path algorithms, you will always be popping off the most current cost estimates to a node if you have multiple copies.   Using a good structure to retrieve the min cost node is critical to good performance in cases where you have a large map to path through, but possibly just as important is the initialization time.  I haven't looked thoroughly at your code but I also suspect your graph initialization can be a bottleneck.  You should profile or instrument your code so you know exactly which parts take the most time.  If your main loop takes up a lot of time, chances are you need a faster structure to retrieve your min cost nodes.  Currently, I'm inclined to believe that most of your performance is dependent the open list data structure, simply because you stated earlier that the performance depends heavily on the length of the path.
  9.   Your running times for binary heaps are wrong?  I'm not aware of any binary heap that functions with those running times.  Fibonnaci heaps may give you that insert time (if my memory serves correctly) but I don't think the delete is O(log n) in the worst case.
  10. How to write fast code without optimizing

    Your question seems to imply that you want to write code that is fast/efficient the first time around without having to rewrite it later due to slow performance.  I find that this rarely is the case in practice, although, you can definitely design systems to be more efficient up front or design them to be made easier to optimize later.  But this is incredibly difficult to do.  It requires a lot of experience in algorithms, computer architecture, the problem domain, and just plain programming to be able to know how to write code to make it "fast enough" for your purposes in such a way that you won't have to come back to it later.   I've been programming pretty seriously for about 2 or 3 years now in addition to 4 years of schooling from a reasonably good CS program.  I would say I have just enough experience now to sort of see where I can afford to "be lazy" with my programming since I just know it won't make any bit of perceptible difference to the user and where I should definitely think more about the problem before embarking on a coding frenzy since performance might be a concern.   It's really difficult (in my opinion) to get a good feel for this other than to just write a lot of code, profiling, inspecting, and really getting to know your code and its relationship to performance.  You eventually get to a point where you can just sort of think about a problem and be like "Yeah, that problem size is so small it won't matter." or "That algorithm might be a little slow, but because I run it only once a frame on a small number of items, it's not a big deal".  If you really know your algorithms and the problem domains, you will be able to easily identify huge red performance flags, like "Oh wow, I definitely don't want to do this A* search in this loop since the graph is thousands of nodes" and therefore spend more time and energy on designing for good performance on those cases instead.
  11. You might want to check out
  12. Have you made a game before? Do you have experience with the operating systems on all the platforms you are thinking about supporting? Are you comfortable with getting into the guts of setting up a proper build system and understand the compilers you'll need to use for each platform? Do you know which language features are supported by the compilers? Are you using libraries which are available on all platforms? If you've answered no to any of these questions, you almost certainly should not try to make a game that is cross platform.   You'll waste so much time and energy on making something work on all the systems rather than finishing a game on just one of them.
  13. It's possible to take certain portions of the behavior tree and make it shared amongst all agents to help reduce memory overhead and potentially improve performance.  But the cost of this I find is usually higher than it's worth.  The cost you usually incur is separating the core structure of the tree with the runtime state that's necessary to allow for proper initializations and restarts of the tree of a behavior that was previously running.  This, in turn, can make maintaining the code a bit more of a hassle.   In my own personal projects, I don't see that cost being worth it.  It may just be I've never found a particularly elegant way to make the core tree structure shared but the tree traversal state on a per agent basis, but I've also never needed to optimize that portion of my code.  I have relatively simple behaviors in my game but I still tick all the agents every frame and I may spend at most 0.5% total frame time in the behavior tree?   I really don't see the point of trying to do this aside from educational purposes.  Chances are you'll gain very little from attempting to make this optimization.
  14. Pathfinding Algorithm, grid to regions

    You might want to look up rectangulation and rectangular partitioning algorithms.  It has actively been researched in other areas, specifically for VLSI and chip manufacturing, so you may need to look up some literature with those terms.  But this is a very active area of research (basically anything that has to do with computation geometry is).  In the general case, rectangulation can be quite difficult and most of those algorithms are probably not fast enough for your needs, and quite frankly, you probably don't need something that general or optimal.  Check out   I think you'll find that an optimal solution will be quite difficult to achieve.  Without going into a lot of digging and careful reading of the literature, I suspect this problem is NP-complete. not exactly the same problem, but very similar to yours.   Am I correct in understanding that your search space is 16 * 16 * 64 = 16384?  If that's the number of nodes you have in your A* search space, it's large but still manageable without significant optimization...  My game currently has some levels which can have 10000 nodes if I force a very fine nav mesh resolution and with ~80 agents I still have ~300 FPS (can't tell you the A* search time since I haven't instrumented that).  It's not completely smooth, but it definitely isn't as bad as 30 ms, so I don't agree with your assertion that "this is of course to be expected".  My A* implementation isn't hyper optimized or anything.   I think your best strategy right now is to drop the quest of "minimum" or optimality and go for a reasonable reduction in the search space.  Try something dead simple first, like trying to build the largest cube/square and just go in the same direction every time.  Perhaps before even that, make sure your A* search is reasonably performant.  It doesn't sound like you have a very good implementation to me, but I need a lot more context to be able to know.  Are your data structures designed for good cache locality?  Are you using a decent priority queue to maintain your open set?   I've found that in my own A* implementation, the biggest gains were in simply making the initialization of the graph faster.  My first implementation had this terrible Array of Structs format where each node contained all of its A* state information.  For me to initialize the whole graph required individual setting of f, g, and parent states on each node.  Very bad for memory performance.  I rearranged the data structures to be a Struct of Arrays format where the graph now contains an entire array of f, g, and parent values and now initialization of the graph are very fast memsets() on entire blocks of memory.  There are some other clever ways to make the initialization closer to O(1) time but I didn't want to deal with that.   This was a very simple transformation for me to make and yielded enormous performance improvements, so much so that I basically have come to a point where I just don't have to worry about A* performance in my game.
  15. Sticking to a rectangular grid/mesh can be easy from the initial implementation point of view, but they aren't without their own problems.  You have to consider the needs of your game to reach a good solution.  If your game's actual play space is grid based then the rectangular grid solution should work quite well, but if you have a full 3D continuous game space, you'll probably find lots of potential problems with using a grid (How do you generate the grid?  What resolution?  How many nodes are necessary to get good coverage/accuracy?).   Going with a triangulated mesh isn't a clear win either.  Given a 3D game, you could use some of the level geometry as a starting point for your triangulation, but you have to solve a lot of different problems (many of which don't have clear right answers/solutions) to get something that would be usable for your game.  If you have the expertise and the capability, you can have the mesh and it will give you a lot of nice properties, but getting the mesh in the first place is often very difficult.   Check out Recast/Detour It's a very nice (so I've heard, never used it myself, but the features look impressive) navigation mesh generation library and if I'm not mistaken, it was used in Killzone: Shadow Fall!  The source is available so you can take a look at some of the techniques used.   If you're learning the basics of A*, pathfinding, and path following, I would highly recommend sticking to the rectangular grid as a starting point.  If you do in fact have a 3D game, it probably won't be ideal but it's simple enough that it should get you going and have you focused on getting a complete workable solution for AI agents to reason about your world and path through it.  Once you complete this, you could move on to having some sort of triangulated mesh.  If so, I would recommend reading up on Delaunay triangulations and more specifically, constrained Delaunay triangulations.