Only 12 Enemies, And My Fps Drops To 30, Why Is That?

Guys, I have 7 animated enemies, and my fps is 62 for now (not capped). But when I add 5 more enemies to make 12, my fps drops to 31. I traced the problem and finally found it: it's my BoneTransform() function, which fills my vector of TransformMatrices that I use in the vertex shader to animate the skeleton. It hammers my CPU: when I comment out the BoneTransform() call, the framerate goes from 30 to 166 (sometimes jumping between 166 and 200). I took most of the function from a tutorial on skeletal animation, and I assumed it was pretty well optimized, so there must be some other reason.

[attachment=32770:lowfps.gif]

I used some models from World of Warcraft. The interesting thing is that I have the game, and when I play WoW I can have 20 players around me and my fps is great, but when I add the same models to my own game, my fps drops like crazy and it's 10 times slower than the original game. Why? (Bear in mind that I haven't even loaded a map; I just spawn 12 enemies walking on air, and my CPU is already struggling.)

Edited by jbadams

Enemies shouldn't "have" a deltaTime; you should just pass the current deltaTime to their update function.

(Right now it looks like you are mixing your updating, player input event processing, AI thinking, and rendering all in one function.)

 

I used some models from World of Warcraft. The interesting thing is that I have the game, and when I play WoW I can have 20 players around me and my fps is great, but when I add the same models to my own game, my fps drops like crazy and it's 10 times slower than the original game. Why?

 

Because it's not about what data you're loading, it's about how your code uses it. Well, okay, it's about the data and the code working together.

 

Your code and WoW's code are different, and thus your framerate and WoW's framerate are different.

 

(Make sure you don't use WoW's models in any copy of your game you distribute publicly, btw. That's copyright infringement.)

Thanks for the answer.

But!! :)

The way I see it, there are two ways.

First way: Pass deltaTime to each function in the Enemy class. The problem with this method is that if I have 100 movement functions, I need to pass deltaTime 100 times every frame.

Second way: Store deltaTime in the class as a member variable and set it once each frame. The advantage is that I pass deltaTime only once per frame and can use it in 100 functions if I want to. The second way sounds better, because you pass the variable only once and use it as much as you want.

And about the fps drop, what would you suggest? I'm pretty sure I have loaded the skeletal animation correctly, because most of the code is taken from different tutorials, and I don't know what's wrong.

I think the problem may be that my game uses only 1 core on my laptop, whereas WoW uses all 4 cores. Can this be the problem?

Edited by codeBoggs

Ok, thanks frob, I will definitely search for a profiling tool because I haven't used one so far. It seems to me that this will take time to fix, so I will leave it for later.

Another question came to mind and I don't want to make another post.

I loaded a very simple map for my paintball game. The map has a floor with a couple of walls and that's it. The floor is perfectly flat, and for the walls I copied the floor and rotated it 90 degrees.

I need very, very simple collision detection for this map. The only two options that come to mind are:

1. Make AABBs for the floor and the walls and do checks every frame.

2. Take a picture of the scene from above, store the depth values in a framebuffer, and somehow use them to decide if the player is going to collide.

Is there something else I can do, and if not, which of these two options should I choose?
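
For reference, option 1 needs very little code. The following is a minimal sketch, not code from this project: the AABB struct, makePlayerBox() and collidesWithAny() are hypothetical names, and it assumes the walls are axis-aligned and few enough that a brute-force check every frame is acceptable.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical axis-aligned bounding box: min and max corners in world space.
struct AABB {
    float minX, minY, minZ;
    float maxX, maxY, maxZ;
};

// Two boxes overlap only if their extents overlap on all three axes.
bool overlaps(const AABB& a, const AABB& b) {
    return a.minX <= b.maxX && a.maxX >= b.minX &&
           a.minY <= b.maxY && a.maxY >= b.minY &&
           a.minZ <= b.maxZ && a.maxZ >= b.minZ;
}

// Build the player's box from a center position and half-extents
// (halfW is used for both X and Z, halfH for Y).
AABB makePlayerBox(float x, float y, float z, float halfW, float halfH) {
    return AABB{ x - halfW, y - halfH, z - halfW,
                 x + halfW, y + halfH, z + halfW };
}

// Returns true if the player's box at a proposed position touches any wall.
bool collidesWithAny(const AABB& player, const std::vector<AABB>& walls) {
    return std::any_of(walls.begin(), walls.end(),
                       [&player](const AABB& wall) { return overlaps(player, wall); });
}
```

One way to use this is to compute the player's box at the position they would move to and only apply the move if collidesWithAny() returns false.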

Edited by codeBoggs

It reads like you hit your geom limit, sometimes known as an object limit.

Luckily it is easy to test: double the poly count of the animated model and note the frame rate, then use a very low-poly (100 polygon) animated model and note the frame rate.

If you still run at the same frame rate with low polygon models and high polygon models, give or take a frame or two, then it's the geom limit.

If you get a low frame rate no matter the poly count, it is often the geom limit or the shaders, in my experience.

PC graphics cards can only render so many objects in real time; this is known as the geom limit or object limit. You can batch models into one model to use fewer objects, or you can use instancing.

For animated objects you want instancing, as dynamic batching can be very hard and unpredictable.

The problem is that this can be many things, from draw calls to bad programming, modeling and many other things.

First way: Pass the deltaTime to each function in the Enemy class. The problem with this method is that if I have 100 movement functions, I need to pass deltaTime 100 times every frame.

Passing a single float, even 100 times, is very cheap, if not free. You're pre-optimizing and obfuscating code without any real gain.
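
A minimal sketch of what passing deltaTime down looks like. The Enemy class, its members and the speed value here are hypothetical and only illustrate the pattern; they are not taken from the original code.

```cpp
#include <vector>

// Hypothetical enemy: deltaTime is a parameter, not stored state.
struct Enemy {
    float x = 0.0f;
    float speed = 2.0f; // units per second (illustrative value)

    // Each movement function takes deltaTime by value; a float is
    // typically passed in a register, so this costs next to nothing.
    void move(float deltaTime) { x += speed * deltaTime; }

    void update(float deltaTime) { move(deltaTime); }
};

// The game loop measures deltaTime once per frame and hands it to every enemy.
void updateAll(std::vector<Enemy>& enemies, float deltaTime) {
    for (Enemy& enemy : enemies) {
        enemy.update(deltaTime);
    }
}
```

With this shape an enemy can never read a stale deltaTime left over from a previous frame, because the value only exists for the duration of the call.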

ferrous, true story man, I always forget to think before optimizing.

Scouting Ninja, I don't think I've hit any limits, because in the original game my PC can handle 10 times more enemies and everything is ok. But I will give it a try; I just need to change the models' poly count with Blender. Thanks for the idea, by the way.

Guys, I solved it.

const aiNodeAnim* Model::FindNodeAnim( const aiAnimation* pAnimation, const string NodeName )
{
    for ( uint i = 0 ; i < pAnimation->mNumChannels ; i ++ )
    {
        const aiNodeAnim* pNodeAnim = pAnimation->mChannels[i];

        if ( string( pNodeAnim->mNodeName.data ) == NodeName )
        {
            return pNodeAnim;
        }
    }

    return NULL;
}

This function swallows 250 of my fps (from 380 to 30), and it is the function that finds the proper animation channel for every node in the model. Basically it counts from 0 to 120 (in my case), and on every iteration it does a string comparison to find the proper animation for the node. Can you believe it? I still can't. It is called from a recursive function, ReadNodeHeirarchy(), which reads all the node matrices and calculates the interpolation and, consequently, the final transformation matrix.

 

Here is the function:

void Model::ReadNodeHeirarchy( float AnimationTime, const aiNode* pNode, const Matrix4f& ParentTransform, int currentAnim )
{
    string NodeName( pNode->mName.data );

    const aiAnimation* pAnimation = this->scene->mAnimations[currentAnim];

    Matrix4f NodeTransformation( pNode->mTransformation );

    const aiNodeAnim* pNodeAnim = FindNodeAnim( pAnimation, NodeName ); //Only this function swallows 250 fps, believe it or not.

    if ( pNodeAnim )
    {
        // Interpolate scaling and generate scaling transformation matrix
        aiVector3D Scaling;
        calcInterpolatedScaling( Scaling, AnimationTime, pNodeAnim );
        Matrix4f ScalingM;
        ScalingM.InitScaleTransform( Scaling.x, Scaling.y, Scaling.z );

        // Interpolate rotation and generate rotation transformation matrix
        aiQuaternion RotationQ;
        calcInterpolatedRotation( RotationQ, AnimationTime, pNodeAnim );
        Matrix4f RotationM = Matrix4f( RotationQ.GetMatrix( ) );

        // Interpolate translation and generate translation transformation matrix
        aiVector3D Translation;
        calcInterpolatedPosition( Translation, AnimationTime, pNodeAnim );
        Matrix4f TranslationM;
        TranslationM.InitTranslationTransform( Translation.x, Translation.y, Translation.z );

        // Combine the above transformations
        NodeTransformation = TranslationM * RotationM * ScalingM;
    }

    Matrix4f GlobalTransformation = ParentTransform * NodeTransformation;

    if ( boneMapping.find( NodeName ) != boneMapping.end( ) )
    {
        uint BoneIndex = boneMapping[NodeName];
        boneInformation[BoneIndex].FinalTransformation = GlobalInverseTransform * GlobalTransformation *
                                                    boneInformation[BoneIndex].BoneOffset;
    }

    for ( uint i = 0; i < pNode->mNumChildren; i ++ )
    {
        ReadNodeHeirarchy( AnimationTime, pNode->mChildren[i], GlobalTransformation, currentAnim );
    }
}

This magical statement gulps all my CPU power:   const aiNodeAnim* pNodeAnim = FindNodeAnim( pAnimation, NodeName );

 

This is because the ReadNodeHeirarchy() function is recursive. For example, you have one transformation matrix for the fingers, but then you need to multiply it by the arm's transformation matrix because the arm moves the fingers too, and then the body moves the arm, which moves the fingers, and so on. And every time that happens, FindNodeAnim() counts from 0 to 120 (mNumChannels) to find the proper animation based on the node's name, doing string comparisons and other work millions of times per second.
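
A rough back-of-the-envelope estimate shows how this adds up. It assumes the skeleton has about as many nodes N as the animation has channels C (an assumption; the post only gives mNumChannels = 120), with E = 12 enemies, and it counts the worst case where the linear search scans every channel:

```latex
\[
E \cdot N \cdot C = 12 \times 120 \times 120 = 172{,}800 \ \text{string comparisons per frame}
\]
\[
172{,}800 \times 30 \ \text{frames per second} \approx 5.2 \times 10^{6} \ \text{comparisons per second}
\]
```

A successful linear search stops about halfway on average, so the real figure is probably lower, but each comparison in FindNodeAnim() also constructs a temporary string from mNodeName.data, which adds a copy (and possibly a heap allocation for longer names) on top of the comparison itself.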

 

And this was the code from this tutorial: http://ogldev.atspace.co.uk/www/tutorial38/tutorial38.html

 

It does the job for a tutorial and it is readable, but it is very unoptimized. What I did was cache each animation channel's index with its corresponding bone in a map container. Now the same 12 enemies run at 100 fps instead of 30 fps, so I gained 70 fps by caching the animation channel lookups.
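
The caching idea can be sketched roughly as follows. This is not the exact code used here: NodeAnim and Animation are simplified stand-ins for Assimp's aiNodeAnim and aiAnimation so the example compiles on its own, buildChannelCache() and findNodeAnim() are hypothetical names, and the sketch stores pointers in a std::unordered_map where the fix described above stored indices in a map.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Stand-ins for Assimp's aiNodeAnim / aiAnimation, reduced to what is needed here.
struct NodeAnim { std::string nodeName; };
struct Animation { std::vector<NodeAnim> channels; };

// Node name -> channel lookup table, built once per animation at load time.
using ChannelCache = std::unordered_map<std::string, const NodeAnim*>;

ChannelCache buildChannelCache(const Animation& anim) {
    ChannelCache cache;
    cache.reserve(anim.channels.size());
    for (const NodeAnim& channel : anim.channels) {
        // emplace keeps the first entry if a name repeats,
        // matching the first-match behavior of the linear search.
        cache.emplace(channel.nodeName, &channel);
    }
    return cache;
}

// Replaces the per-node linear scan with a single hash lookup.
// Returns nullptr when the node has no animation channel.
const NodeAnim* findNodeAnim(const ChannelCache& cache, const std::string& nodeName) {
    auto it = cache.find(nodeName);
    return it != cache.end() ? it->second : nullptr;
}
```

The cached pointers stay valid only as long as the animation's channel list is not modified or destroyed, so the cache should live alongside the loaded scene. Passing the node name by const reference also avoids copying a string on every call.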

 

Here is the fps with 24 enemies.

 

[attachment=32776:gamefixed20fps.gif]

 

I wonder what crazy fps gain can be made if I cache the interpolation matrices too?

Edited by codeBoggs
