Optimisation of core code


I have a 3D physics engine, which has a couple of performance-critical sections (broadphase collision detection, narrowphase detection and collision response). I've pushed them quite far (as far as possible from an algorithmic standpoint) and would like to try to extract just a little more out of all the systems.

I would like to start looking at cache misses and cache hits in specific sections of the program. Using Visual Studio (Professional), how would I go about detecting these things and trying to optimise my methods? Note that I know there are better ways to get performance boosts; however, this is a learning exercise.

I've used the profiler in VS a bit, and I've used CodeXL for OpenGL debugging, but the only thing I could get from it was "here's where your method is spending the most time", which stops being helpful. Can I look at cache hits or misses? That's pretty much the only other thing I'm aware I should be optimising for. I'm not too concerned about cross-platform (although my code does work on Mac as well, not tested on Linux), but I'm most familiar with Windows as a dev environment, so I'd like to start here. I've come across VTune before but figure it's not going to tell me any more than the VS profiler with my current level of knowledge.


  • Similar Content

    • By TheTurkishMoose
      I'm currently working on an open-world, dense urban project using the dev tools released by a design studio for a game they have released.
      I can run the base game (which is also set in dense urban areas) at 1080p with ultra settings with a solid 60FPS but if I go to 4K then I get about 30FPS.
      The minimum specs of the game:
      CPU: 3.4 GHz
      GPU: 1 GB VRAM
      The recommended specs of the game:
      CPU: 4.0 GHz
      GPU: 2 GB VRAM
      My question is about LOD switching. Using the dev tools, you can create your own unique buildings, but I'm worried about how I should create the LODs. All the different buildings I create will use many similar objects such as windows and detailed objects like air conditioners, chimneys, etc. It seems more convenient to me to create many smaller LODs rather than a new single LOD for every building, to save time; also, if I edit the values of a smaller decoration, the change would take effect across all the other smaller LODs already created. I hope this makes sense.
      If I create LODs for the individual parts of a building, then I can keep creating new buildings easily because all the LODs already exist. But of course, that could affect performance. However, with minimum specs like 3.4 GHz, could I compromise with more LOD switches?
      I'm new to generating and creating LODs and could use a bit of advice and guidance. Unfortunately, I cannot disclose too much about the project or show screenshots as it is currently under wraps. Any help would be appreciated - thanks!
    • By ManHunter
      Hi community,

      This is Emanuele from the Crimson Games Development Department.
      A user asked me how I am dealing with the main MMOG problems in Heroes of Asgard, so I prepared an article on the topic.

      So today we will discuss the main optimization problems you can run into when developing an MMO game.

      I will be happy if others add their contributions, so we can learn together: I will add them to the opening post!


      By definition, an MMOG should allow you to play with a huge number of people at once and interact with them as if you were in a normal multiplayer game, all in a persistent world.
      If we dissect this statement a little more, we will see that it is impossible without applying various “tricks” behind the scenes.


      You can easily see that as the number of connected players grows, server performance degrades.
      Many operations on the server must run over all connected players or a subset of them, over all objects in the world, over all monsters and their AI, etc. All these calculations are executed several times per second: imagine having to iterate over 200 players, 2,000 players or 20,000 players on every frame of your server simulation. For each iteration, I have to send packets, make calculations, change positions, etc. The computational load therefore grows with every new connected player: at least linearly for per-player work, and roughly quadratically when every player must be told about every other player.
      As you can imagine, this is a very large amount of work for a single machine, which has obvious hardware limits.
      Usually, therefore, there is a maximum number of concurrent players, beyond which the server itself (the physical machine) cannot keep up, creating a negative game experience (lag, unresponsive commands, etc.).
      You cannot accept new connections beyond this threshold until a slot becomes available, so as not to ruin the experience for those who are already connected and playing.
      You could then start multiple servers on different machines to host more players, but of course they cannot interact with players on other servers.
      Dividing the world into various “server instances” does not really fit the definition of an MMOG, as it does not allow you to interact with all players in a persistent world; it creates different instances of the same world. It is acceptable, of course, but it isn’t what we want to achieve.

      That said, what can we do to “bypass” this problem a little? And what have I already done for Heroes of Asgard? What I describe is the result of my experience and is therefore also what I implemented for Heroes of Asgard, obviously trying to get the best out of it.


      There are several measures that can be applied to raise the maximum threshold. Yes, raise it: there will always be a maximum threshold beyond which it is difficult to go (on the same hardware, of course).


      As a first step: write good code, with your full attention on the task and without wasting resources. It may seem obvious and trite, but it is not. Wasting resources reduces what the server has available.
      Wasting bandwidth means exhausting it in no time; every single piece of data that is transmitted has to be carefully selected. If I send an extra byte for each user when my server hosts 20,000 players, it means sending about 20 KB more each frame.
      Wasting CPU cycles is like shooting yourself in the foot: the actions you perform must be kept to a bare minimum. Adding a single extra function call per user may mean adding N additional CPU cycles, which for 20,000 users is N x 20,000 additional CPU cycles.
      Wasting memory (that is, allocating unnecessary resources) is harmful: allocation costs both additional CPU cycles and memory. And system memory runs out.
      In managed environments, leaving allocated resources behind also triggers garbage collection, which may mean spending a huge number of CPU cycles freeing resources instead of serving the players and simulating the world.
      Ultimately, wasting resources in your code means you will spend more money, and more often, upgrading your servers (as your userbase grows) in order to maintain acceptable performance.


      As you certainly know, the server executes the simulation of a virtual world a certain number of times per second. This means that every second, all entities and systems in the world are “simulated” a certain number of times. The simulation can include AI routines, position/rotation updates, etc. It is what breathes “life” into your virtual world.
      The number of times your simulation is performed per second is called FPS, or frames per second. If the simulation is heavy and takes time, our hardware will simulate the world fewer times per second. This can degrade the simulation.
      But consider: do we need the server to run that many simulation steps? Do we need to strain our hardware like this? Can we improve on it?
      Yes. For most games with few players on the same map and a high game speed (think of first-person shooters, with their high number of commands), the world can be simulated 60 times per second (or less; obviously it depends on the game type).
      For an MMOG a smaller number can be enough, depending on the genre.
      There is no need to simulate the world as many times per second as possible, since this changes the simulation only minimally while wasting more resources than necessary.
      In Heroes of Asgard, for example, the world is simulated 20 times per second (at the moment).
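A fixed simulation rate such as the 20 updates per second mentioned above is commonly driven by a time accumulator. What follows is only a minimal C++ sketch of that idea, not the Heroes of Asgard code; the class name `FixedTickClock` and its `advance` method are invented for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: turns irregular elapsed time into fixed-size simulation steps.
class FixedTickClock {
public:
    explicit FixedTickClock(int ticksPerSecond) {
        assert(ticksPerSecond > 0 && ticksPerSecond <= 1000000);
        tickLength_ = std::chrono::microseconds(1000000 / ticksPerSecond);
    }

    // Adds the elapsed time to the accumulator and calls step(dtSeconds)
    // once for every whole tick that has accumulated. Returns the number
    // of ticks that were run; leftover time is kept for the next call.
    int advance(std::chrono::microseconds elapsed,
                const std::function<void(double)>& step) {
        accumulator_ += elapsed;
        int ticksRun = 0;
        while (accumulator_ >= tickLength_) {
            step(std::chrono::duration<double>(tickLength_).count());
            accumulator_ -= tickLength_;
            ++ticksRun;
        }
        return ticksRun;
    }

private:
    std::chrono::microseconds tickLength_{0};
    std::chrono::microseconds accumulator_{0};
};
```

In a server loop you would measure the time since the previous iteration with `std::chrono::steady_clock`, pass it to `advance`, and sleep for the remainder of the tick, so the world is stepped at the chosen rate regardless of how fast the loop itself spins.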


      We said that in an MMOG I must be able to interact with other players and with the surrounding environment, and I should be able to do it with anyone in the world at that time. Quite right, of course.
      But, from the point of view of a player, do you really need to know what another player is doing on the other side of the map? No, not always. Indeed, in the majority of cases a player isn’t interested in knowing whether another player is, for example, walking around in some far-off area. Sending information that cannot be displayed on the user’s screen is a waste of resources.
      This observation is important: it allows us to implement a big optimization.
      How can I inform a particular player only about the entities that may interest them?
      Why not break the map (or maps) into zones? A simple subdivision is a grid: divide the map into N x M zones, where N and M are greater than or equal to 1. This technique is also known as space partitioning or zone partitioning.
      In this way, a player can receive information only on the entities contained in their zone, without needing to know about distant entities. If 8,000 entities are uniformly distributed over my map and it is divided into a 4 x 4 grid, the player in zone [1, 1] only has to receive information about 500 entities. A great advantage, isn’t it?
      But consider: what if the player is on a zone’s border? They will not see the players in the nearby zones, although those players are visible.
      So the player has to be informed about the entities contained in their own zone and in the immediately adjacent zones.
      The size of the zones lets you tune this method a lot, so the size of the grid can vary with the size of the map to get the best effect. The shape of the zones can also vary, to better fit the layout of the map.
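The grid subdivision described above can be sketched roughly as follows. This is an illustrative C++ sketch under simplifying assumptions (a rectangular 2D map with its origin at (0, 0), equally sized zones); `ZoneGrid`, `EntityId` and the method names are hypothetical and not taken from Heroes of Asgard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

using EntityId = std::uint32_t;

// Hypothetical uniform grid over a rectangular map whose origin is (0, 0).
class ZoneGrid {
public:
    ZoneGrid(float worldWidth, float worldHeight, int cols, int rows)
        : cols_(cols), rows_(rows) {
        assert(cols >= 1 && rows >= 1 && worldWidth > 0 && worldHeight > 0);
        zoneWidth_ = worldWidth / static_cast<float>(cols);
        zoneHeight_ = worldHeight / static_cast<float>(rows);
        zones_.resize(static_cast<std::size_t>(cols) * static_cast<std::size_t>(rows));
    }

    void insert(EntityId id, float x, float y) { zoneAt(x, y).insert(id); }
    void remove(EntityId id, float x, float y) { zoneAt(x, y).erase(id); }

    // Entities in the zone containing (x, y) plus its up to eight neighbours.
    std::vector<EntityId> nearby(float x, float y) const {
        std::vector<EntityId> result;
        const int cx = column(x);
        const int cy = row(y);
        for (int zy = cy - 1; zy <= cy + 1; ++zy) {
            for (int zx = cx - 1; zx <= cx + 1; ++zx) {
                if (zx < 0 || zx >= cols_ || zy < 0 || zy >= rows_) continue;
                const auto& zone = zones_[index(zx, zy)];
                result.insert(result.end(), zone.begin(), zone.end());
            }
        }
        return result;
    }

private:
    static int clampTo(int value, int maxIndex) {
        return value < 0 ? 0 : (value > maxIndex ? maxIndex : value);
    }
    int column(float x) const { return clampTo(static_cast<int>(x / zoneWidth_), cols_ - 1); }
    int row(float y) const { return clampTo(static_cast<int>(y / zoneHeight_), rows_ - 1); }
    std::size_t index(int zx, int zy) const {
        return static_cast<std::size_t>(zy) * static_cast<std::size_t>(cols_) +
               static_cast<std::size_t>(zx);
    }
    std::unordered_set<EntityId>& zoneAt(float x, float y) {
        return zones_[index(column(x), row(y))];
    }

    int cols_;
    int rows_;
    float zoneWidth_ = 1.0f;
    float zoneHeight_ = 1.0f;
    std::vector<std::unordered_set<EntityId>> zones_;
};
```

When an entity moves, it would be removed using its old position and inserted using its new one; a real server would probably cache each entity's current zone so that this only happens when a border is actually crossed.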


      As mentioned, zone division already offers a decent level of optimization, allowing us to send information about a single entity only to the players who can really benefit from it.
      But let us ask ourselves a question: can we identify useless information within our zone division (remember that it also includes the adjacent zones, so in a regular grid we have to deal with 9 zones in the worst case)? Of course we can.
      Most likely a player does not care about entities outside of their field of view.
      If I cannot see an entity, I do not care to track what it is doing, even though it may be in my own zone. So sending information about that entity is a waste of resources.
      How can you determine what your server needs to send to a specific player? The easiest way is to track the field of view as a radius around the player. Everything within that radius is what matters to that player; entities outside it are not needed for that player’s view of the world.
      And since we already have a zone subdivision, we can simply iterate over the entities in the player’s zones of interest (instead of all entities on the map) to determine which are within the field of view. This concept is also called area of interest, or AoI.
      So, continuing the earlier example, we iterate over 500 entities (or up to 4,500 when all nine zones are involved) instead of 8,000, to pick out the hypothetical 25 that fall within visual range, and exchange information over the network only with them.
      From 8,000 to 25: a good result, isn’t it? And the user does not suffer from the missing information, since they cannot see those entities anyway. If anything, they will notice lower resource usage.
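As a rough illustration of the area-of-interest filter, the C++ sketch below takes the candidates already gathered from the player's zones and keeps only those inside a view radius. The `Entity` struct, the 2D positions and the function name `entitiesInView` are assumptions made for this example; comparing squared distances is simply one common way to avoid a square root per entity.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal entity: an id and a 2D position.
struct Entity {
    int id;
    float x;
    float y;
};

// Returns the ids of all candidates within viewRadius of the player,
// excluding the player itself. `candidates` is expected to hold only the
// entities from the player's zones of interest, not the whole map.
std::vector<int> entitiesInView(const Entity& player, float viewRadius,
                                const std::vector<Entity>& candidates) {
    assert(viewRadius >= 0.0f);
    std::vector<int> visible;
    const float radiusSquared = viewRadius * viewRadius;
    for (const Entity& e : candidates) {
        if (e.id == player.id) continue;
        const float dx = e.x - player.x;
        const float dy = e.y - player.y;
        // Squared-distance comparison: no sqrt needed.
        if (dx * dx + dy * dy <= radiusSquared) {
            visible.push_back(e.id);
        }
    }
    return visible;
}
```

The server would then send state updates for a given player only about the ids this function returns.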
      You can further enhance the area of interest by applying various measures:
        • organize several levels of view radius, so that updates about the most distant entities are sent less frequently;
        • filter the interesting entities depending on the terrain of the map: if an entity is within our view range but behind a mountain, I can possibly ignore it. This measure, however, (in my opinion) only makes sense if you already use culling for other things, so that you don’t introduce additional calculations just to filter out a few more entities.

      DISTRIBUTE YOUR COMPUTATION LOAD

      We already said that a single machine will always have a threshold beyond which, despite all the optimizations made, you will experience performance degradation (and thus a bad gaming experience).
      Fine, but then why not take advantage of multiple computers at once?
      There are obviously different ways to do it.
      For example, in Heroes of Asgard each map that makes up the world is hosted in a separate process. This means each map can be hosted on a different physical machine.
      Obviously, you can go further and host sets of zones in separate processes (so a single map may be divided into several parts hosted by different servers).


      You can also move global services (such as chat) into separate server processes, to give your players the impression that, even when connected to different maps (and so to different servers), they can interact with distant players. Furthermore, separating those services from the main world gives an additional performance gain.


      As mentioned, allocating memory costs a good amount of resources. So why not reuse what we have already allocated? Object pools are of great importance in multiplayer development. They let you shift the cost of allocation to a moment when it can be paid without problems, for example during the bootstrap of our app server.
      A monster is defeated and dies? Well, I put it aside. I can use it again when another monster must be spawned, just by retrieving it from my pool.
      Of course, you need some criterion for choosing which objects to keep in memory and which not. Should I keep in memory a pool of a monster that spawns once a month? No, it would probably be useless. Should I keep in memory a pool of objects representing currency drops? Yes, that makes more sense.
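The monster example can be sketched as a small object pool. This is a minimal, single-threaded C++ sketch under assumptions of my own: the `Monster` fields and the `MonsterPool` interface are invented for illustration, and a production pool would also need a policy for shrinking and for thread safety.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical pooled object.
struct Monster {
    int health = 100;
    float x = 0.0f;
    float y = 0.0f;
};

class MonsterPool {
public:
    // Pays the allocation cost up front, e.g. during server bootstrap.
    explicit MonsterPool(std::size_t initialSize) {
        free_.reserve(initialSize);
        for (std::size_t i = 0; i < initialSize; ++i) {
            free_.push_back(std::make_unique<Monster>());
        }
    }

    // Hands out a recycled monster, or allocates one if the pool is empty.
    std::unique_ptr<Monster> acquire() {
        if (free_.empty()) {
            return std::make_unique<Monster>();
        }
        std::unique_ptr<Monster> monster = std::move(free_.back());
        free_.pop_back();
        *monster = Monster{};  // reset state left over from the previous use
        return monster;
    }

    // Puts a defeated monster aside for later reuse instead of freeing it.
    void release(std::unique_ptr<Monster> monster) {
        assert(monster != nullptr);
        free_.push_back(std::move(monster));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<std::unique_ptr<Monster>> free_;
};
```

Using `std::unique_ptr` keeps ownership explicit: a monster is either in the world or in the pool, never both.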

      Of course, an important part of this thread is resources: articles, papers, anything you think can be useful on this topic.

      Spatial Partitioning
      Objects Pooling
      Game loop

      Feel free to add your questions or your contributions!

      Best regards,
    • By gamervb
      When checking whether an object is within a certain area, which approach would perform better when using the glm library?
      vec3 pos;
      // Using a bounding box to represent that area
      1) if ( pos.x > box.MinX && pos.x < box.MaxX
           && pos.y > box.MinY....) { // do something..}
      // Using a sphere to represent that area
      2) if ( glm::length ( pos - sphere.centerPosition ) <= sphere.radius ) { // do something..}
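For comparison, here are both tests written out as a self-contained C++ sketch. The plain `Vec3`, `Box` and `Sphere` structs are stand-ins invented for this example so it compiles without glm; with glm, `glm::dot(d, d)` on the difference vector gives the same squared length. The sphere test compares squared distances, which avoids the square root inside `glm::length`. Which variant is actually faster in a given program is something to measure rather than assume.

```cpp
#include <cassert>

// Stand-ins for glm::vec3 and the question's box/sphere types (hypothetical).
struct Vec3 { float x, y, z; };
struct Box { Vec3 min, max; };
struct Sphere { Vec3 center; float radius; };

// Axis-aligned box test: six comparisons, no arithmetic.
bool insideBox(const Vec3& p, const Box& b) {
    return p.x > b.min.x && p.x < b.max.x &&
           p.y > b.min.y && p.y < b.max.y &&
           p.z > b.min.z && p.z < b.max.z;
}

// Sphere test on squared distance: a few multiplies and adds, no sqrt.
bool insideSphere(const Vec3& p, const Sphere& s) {
    assert(s.radius >= 0.0f);
    const float dx = p.x - s.center.x;
    const float dy = p.y - s.center.y;
    const float dz = p.z - s.center.z;
    return dx * dx + dy * dy + dz * dz <= s.radius * s.radius;
}
```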
    • By Heelp
      Guys, I have 7 animated enemies, and my fps is 62 for now (not capped). But when I add 5 more enemies to make 12, my fps drops to 31. I traced the problem and finally found it: it's my BoneTransform() function, which fills the vector of TransformMatrices that I use in the vertex shader to animate the skeleton. But it hammers my CPU (when I comment out the BoneTransform() function, the framerate goes from 30 to 166, and sometimes jumps between 166 and 200). I took most of the function from a tutorial on skeletal animation, and I'm sure it's pretty optimized, so there must be some other reason.


      I used some models from World of Warcraft. The interesting thing is that I have the game, and when I play WoW I can have 20 players around me and my fps is great, but when I add the same models to my own game, my fps drops like crazy and it's 10 times slower than the original game. Why? (Bear in mind that I haven't even loaded a map; I just spawn 12 enemies walking on air, and my CPU is already struggling.)
    • By lonewolff
      Hi Guys,
      At present, I send the W, V, & P matrices to the shader where they are multiplied within the shader to position vertices.
      Would it be more efficient to pre-multiply these on the CPU and then pass the result to the shader?
      Thanks in advance :)