Performance issue using for loop in script

Recommended Posts

pjmendes    100
Hi all, While attempting to implement a tile map system in AS, I met with the following issue: running a single for loop (withouth any code inside) will make my framerate drop considerably (from about 500 to 30 fps); accessing and calling values inside a single dimensional or multi-dimensional array (containing handles to classes defined script-side), will drop the framerate to unusable levels. (Project has been compiled in "Release" mode). Am I doing something wrong, or is AngelScript not supposed to be used this way? Any recomendations? How I'm defining my "for" loops:
uint totalTilesX = 300; // small test value, will be used a much larger value in-game
uint totalTilesY = 10;
uint totalTiles = this.totalTilesX * this.totalTilesY;
for(uint idxTile = 0; idxTile < totalTiles; ++idxTile) {}


Script classes definitions and calling methods: TileMap structure:
class TileMap
{
TileMap(uint totalTilesX, uint totalTilesY, uint tileWidth, uint tileHeight)
{
this.totalTilesX = totalTilesX;
this.totalTilesY = totalTilesY;
this.tileWidth = tileWidth;
this.tileHeight = tileHeight;
this.totalLayers = 10;

// Resize it to desired dimension
this.tileMap.resize(this.totalTilesX * this.totalTilesY);
}

// this function is called once every cycle, contains several for loops and behaves very poorly
void update(float timeSinceLastFrame)   // update tile properties
{
// Test: contains the loop code above
}

void addTile(Tile@ tile, uint tileX, uint tileY, uint layerIndex)    // tile and tilemap x,y and layer coordinates
{
@this.tileMap[tileY * this.totalTilesY + tileX] = tile;
}

void drawTilemapGrid()
{
// draw column lines
for(uint idxColumn = 0; idxColumn <= this.totalTilesX; ++idxColumn)
{
Vector2 lineStart(idxColumn * tileWidth, 0);
Vector2 lineEnd(idxColumn * tileWidth, totalTilesY * tileHeight);
basicEngine.drawLine(lineStart, lineEnd, 1.0f, 0, 0, 255, 255);
}

// draw row lines
for(uint idxLine = 0; idxLine <= this.totalTilesY; ++idxLine)
{
Vector2 lineStart(0, idxLine * tileHeight);
Vector2 lineEnd(totalTilesX * tileWidth, idxLine * tileHeight);
basicEngine.drawLine(lineStart, lineEnd, 1.0f, 0, 0, 255, 255);
}
}

// properties
Tile@[] tileMap;
uint totalTilesX;
uint totalTilesY;
uint tileWidth;
uint tileHeight;
uint totalLayers;
}


Definitions of array structure:
Tile@[] tileMap; // also tried "Tile@[][]", to a worse performance when accessing inside loops


Tile class (contains another script defined class named "Sprite")
class Tile
{
Tile(Vector2 position, Sprite@ sprite) // The default constructor
{
this.position = position;
@this.sprite = sprite;
}

~Tile() {}

void resetHasUpdated()
{
this.sprite.hasUpdated = false;
}

void update(float timeSinceLastUpdate)
{
if(!this.sprite.hasUpdated)
{
this.sprite.imageSprite.update(timeSinceLastUpdate);   // update animation state
this.sprite.hasUpdated = true;
}
}

void draw(Vector2 tileOffset)
{
this.sprite.imageSprite.setPosition(this.position + tileOffset);
this.sprite.imageSprite.draw();
}

// class properties
Vector2 position;   // x,y position of sprite within tile
Sprite@ sprite; // reference sprite to draw as tile
}


[Edited by - pjmendes on April 2, 2010 11:22:54 PM]

Share on other sites
Thy Reaper    100
At first guess, I'd suggest that line callbacks could be causing the slowdown, as they should be called for each iteration of the loop, probably even for empty loops.

Share on other sites
pjmendes    100
Thanks for the tip. However, I'm not calling "asIScriptContext::SetLineCallback" anywhere, so I suppose no line callback is defined (I have defined Exception and Message callbacks though, but they are not being invoked).

Share on other sites
WitchLord    4677
The loop itself shouldn't be a bottleneck. In my performance test I have loops with 10 million iterations and with several expressions inside them execute in just a couple of hundred milliseconds. I also have other tests that make 10 million calls to a script function, which execute in just over 1 second. You can have a look at these tests in the tests/test_performance sub folder in the SVN.

Some generic things that can be done to improve performance:

1. don't use line callback (you've already told us you don't)
2. do not create a new context for each call, instead reuse them
3. do not search for the function id for each call, do it once and store the id
4. compile the library without support for multithread unless you really need it
5. compile the script without the line cues (engine property asEP_BUILD_WITHOUT_LINE_CUES)

Can you show us how you're calling the script functions?

Share on other sites
pjmendes    100
Thanks for the tips, I'll try them out. I expected the engine to perform well in for loops, so I was looking for ways in which my code could be behaving incorrectly.
Regarding each tip:

2. "do not create a new context for each call, instead reuse them" - already doing that.
3. "do not search for the function id for each call, do it once and store the id " - not currently doing this, will do.
4. "compile the library without support for multithread unless you really need it" - I'm not using multithread, so I will try this.
5. "compile the script without the line cues (engine property asEP_BUILD_WITHOUT_LINE_CUES)" - ok, will try it.

I will look at "tests/test_performance" results as well, thanks.

"Can you show us how you're calling the script functions?"

In my main cycle, every frame, the only call to the engine is through the following set of instructions, passing "update" as the function name ("update" does a couple of things and then calls TileMap.update(), where the performance drop happens):

void ScriptEngine::runScript(std::string functionName){    asIScriptModule *mod = scriptEngine->GetModule("MyModule");    int result = mod->GetFunctionIdByName(functionName.c_str());    if(result < 0)    {        std::cout << "An error occurred obtaining the function '"<< functionName <<"'\n";        switch(result)        {            case asEXECUTION_EXCEPTION:   // The execution didn't complete as expected. Determine what happened.                std::cout << "An exception '"<< this->context->GetExceptionString() << "'occurred. Please correct the code and try again.\n";                return;            break;            case asERROR:                std::cout << "The module was not compiled successfully.\n";            break;            case asMULTIPLE_FUNCTIONS:                std::cout << "Found multiple matching functions.\n";            break;            case asNO_FUNCTION:                std::cout << "Didn't find any matching functions.\n";            break;        }        //std::cout << "Now running the function...\n";        //assert(result >= 0);        return;    }    int funcId = result;    result = this->context->Prepare(funcId);    result = this->context->Execute(); // execute function    if( result != asEXECUTION_FINISHED )    {        switch(result)        {            case asERROR:                printf("Invalid context object, the context is not prepared, or it is not in suspended state.\n");            break;            case asEXECUTION_ABORTED:                printf("The execution was aborted with a call to Abort.\n");            break;            case asEXECUTION_SUSPENDED:                printf("The execution was suspended with a call to Suspend.\n");            break;            //case asEXECUTION_FINISHED 	The execution finished successfully.            case asEXECUTION_EXCEPTION:    // The execution ended with an exception.                // An exception occurred, let the script writer know what happened so it can be corrected.                printf("An exception '%s' occurred. Please correct the code and try again.\n", this->context->GetExceptionString());            break;        }    }}

Share on other sites
andrew1b    229
Quote:
 Original post by WitchLord4. compile the library without support for multithread unless you really need it

How do I do that?

Share on other sites
Thy Reaper    100
Quote:
Original post by andrew1b
Quote:
 Original post by WitchLord4. compile the library without support for multithread unless you really need it

How do I do that?

I'm pretty sure you add a define in as_config.h:

It's listed as the first thing in //FEATURES in that file.

Share on other sites
WitchLord    4677
Quote:
 Original post by pjmendesThanks for the tips, I'll try them out. I expected the engine to perform well in for loops, so I was looking for ways in which my code could be behaving incorrectly.

It should perform well even for the default settings. I find it strange that you got the performance issue that you reported. Nobody else have reported this before. Some developers are even using AngelScript to do particle systems, which is something that have a very high number of iterations.

I definitely want to find out what the problem is here.

Quote:
 Original post by pjmendesIn my main cycle, every frame, the only call to the engine is through the following set of instructions, passing "update" as the function name ("update" does a couple of things and then calls TileMap.update(), where the performance drop happens):*** Source Snippet Removed ***

Remove these two lines:

    asIScriptModule *mod = scriptEngine->GetModule("MyModule");    int result = mod->GetFunctionIdByName(functionName.c_str());

These are notably slow, as they are currently doing simple linear searches. It would definitely be possible to improve the performance of them, but since they are not meant to be called frequently there is not much use in that.

Make these calls once, right after compiling the script, and store the function id for later use.

Also, the script shows that TileMap is a class. But the runScript function is only implemented to call a global function. Where is the missing link?

Share on other sites
WitchLord    4677
Quote:
 Original post by Thy ReaperI'm pretty sure you add a define in as_config.h:#define AS_NO_THREADSIt's listed as the first thing in //FEATURES in that file.

That is correct. It can also be defined in the makefile or project settings, if you prefer not to change the as_config.h file itself.

Share on other sites
pjmendes    100
Quote:
 Original post by WitchLordIt should perform well even for the default settings. I find it strange that you got the performance issue that you reported. Nobody else have reported this before. Some developers are even using AngelScript to do particle systems, which is something that have a very high number of iterations.I definitely want to find out what the problem is here.

Cool, thanks for looking into it! From your description of other projects, I should be able to implement what I plan to, expecting good performance.

Quote:
 Original post by WitchLordRemove these two lines: asIScriptModule *mod = scriptEngine->GetModule("MyModule"); int result = mod->GetFunctionIdByName(functionName.c_str());These are notably slow, as they are currently doing simple linear searches. It would definitely be possible to improve the performance of them, but since they are not meant to be called frequently there is not much use in that.Make these calls once, right after compiling the script, and store the function id for later use.

I did that, but it didn't do much difference. I agree that it's probably not relevant to improve their performance. I believe the bottleneck is elsewhere, either related to using script defined classes, or the fact that the script classes are using application registered classes through handles and/or values (I'm using both: "BasicSpriteEngine" is a handle and "Vector2" is a class type I defined myself, not the Vector2 you provide in Angelscript's source).

Quote:
 Original post by WitchLordAlso, the script shows that TileMap is a class. But the runScript function is only implemented to call a global function. Where is the missing link?

Here is my complete script file, containing the class declarations and the main game cycle's code ("update", called every frame). The global "update" function called from the application is the last function, which calls "TileMap.update" (that single call is what disturbs the performance).

// container for a BasicEngineSprite and indication if it has already been called "update" or notclass Sprite{    Sprite(BasicEngineSprite@ imageSprite)    {        @this.imageSprite = imageSprite;        this.hasUpdated = false;    }    BasicEngineSprite@ imageSprite;    bool hasUpdated;    // to ensure update is only called for this sprite once a cycle}// The class declarationclass Tile{    Tile(Vector2 position, Sprite@ sprite) // The default constructor    {        this.position = position;        @this.sprite = sprite;    }    ~Tile() {}    void resetHasUpdated()    {        this.sprite.hasUpdated = false;    }    void update(float timeSinceLastUpdate)    {        //this.imageSprite.update(timeSinceLastUpdate);   // update animation state        if(!this.sprite.hasUpdated)        {            this.sprite.imageSprite.update(timeSinceLastUpdate);   // update animation state            this.sprite.hasUpdated = true;        }    }    void draw(Vector2 tileOffset)    {        // update position based on this object's index layer        //print("final tile pos: "+ (this.position + tileOffset).str() +"\n");        this.sprite.imageSprite.setPosition(this.position + tileOffset);        this.sprite.imageSprite.draw();    }    // class properties    Vector2 position;   // x,y position of sprite within tile    Sprite@ sprite; // reference sprite to draw as tile    //uint8 layerIndex;   // layer index for this tile    (0 = deepest BG)}class TileMap{    TileMap(uint totalTilesX, uint totalTilesY, uint tileWidth, uint tileHeight)    {        this.totalTilesX = totalTilesX;        this.totalTilesY = totalTilesY;        this.tileWidth = tileWidth;        this.tileHeight = tileHeight;        this.totalLayers = 10;        // Resize it to desired dimension        this.tileMap.resize(this.totalTilesX);        for(uint idxColumn = 0; idxColumn < this.totalTilesX; ++idxColumn)            this.tileMap[idxColumn].resize(totalTilesY);    }    void update(float timeSinceLastFrame)   // update tile properties    {        for(uint idxColumn = 0; idxColumn < this.totalTilesX; ++idxColumn)        {            for(uint idxLine = 0; idxLine < this.totalTilesY; ++idxLine)            {//                //print("    Updating y = "+ idxLine + "\n");//                if(this.tileMap[idxColumn][idxLine] !is null)//                {//                    //print("Updating tile ["+ idxColumn +"],["+ idxLine +"]\n");//                    Vector2 tileOffset(this.tileWidth * idxColumn, this.tileHeight * idxLine);//                    //print("tileoffset: "+ tileOffset.str() +"\n");//                    this.tileMap[idxColumn][idxLine].update(timeSinceLastFrame);//                    //print("updating ["+ idxColumn +"]["+ idxLine +"]");////                    this.tileMap[idxColumn][idxLine].draw(tileOffset);//                }            }        }        // draw tilemap grid        //this.drawTilemapGrid();        // reset all tiles "updated" status        for(uint idxColumn = 0; idxColumn < this.totalTilesX; ++idxColumn)        {            uint idxLine = 0;            for(uint idxLine = 0; idxLine < this.totalTilesY; ++idxLine)            {                //if(this.tileMap[idxColumn][idxLine] !is null)                //    this.tileMap[idxColumn][idxLine].resetHasUpdated();            }        }    }    void addTile(Tile@ tile, uint tileX, uint tileY, uint layerIndex)    // tile and tilemap x,y and layer coordinates    {        @this.tileMap[tileX][tileY] = tile;    }    void drawTilemapGrid()    {        // draw column lines        for(uint idxColumn = 0; idxColumn <= this.totalTilesX; ++idxColumn)        {            Vector2 lineStart(idxColumn * tileWidth, 0);            Vector2 lineEnd(idxColumn * tileWidth, totalTilesY * tileHeight);            basicEngine.drawLine(lineStart, lineEnd, 1.0f, 0, 0, 255, 255);        }        // draw row lines        for(uint idxLine = 0; idxLine <= this.totalTilesY; ++idxLine)        {            Vector2 lineStart(0, idxLine * tileHeight);            Vector2 lineEnd(totalTilesX * tileWidth, idxLine * tileHeight);            basicEngine.drawLine(lineStart, lineEnd, 1.0f, 0, 0, 255, 255);        }    }    // properties    Tile@[][] tileMap;  // tileMap[1] is x = 1, tilemap[1][2] is x = 1, y = 2    uint totalTilesX;    uint totalTilesY;    uint tileWidth;    uint tileHeight;    uint totalLayers;}float scrollSpeed = 300.0f;TileMap tm;void initialize(){    basicEngine.loadResourceImage("Resources/Gfx/test_anim1.png","img1");    basicEngine.loadResourceImage("Resources/Gfx/test_anim2.png","img2");    basicEngine.loadResourceImage("Resources/Gfx/test_anim3.png","img3");    basicEngine.loadResourceImage("Resources/Gfx/test_anim4.png","img4");    basicEngine.loadResourceImage("Resources/Gfx/tile1_64x.png","tile1");    BasicEngineSprite@ ball = basicEngine.createSprite("Sprite1");   // correct way to get pointers    basicEngine.addImageToSprite("Sprite1", "img1", 0.25f);    basicEngine.addImageToSprite("Sprite1", "img2", 0.25f);    basicEngine.addImageToSprite("Sprite1", "img3", 0.25f);    basicEngine.addImageToSprite("Sprite1", "img4", 0.25f);    BasicEngineSprite@ besTile1 = basicEngine.createSprite("Tile1");    basicEngine.addImageToSprite("Tile1", "tile1", 0);//    basicEngine.addButton("buttonB", "Button", Vector2(10,10), Vector2(0,0));    Sprite spriteBall(ball);    Tile tile1(Vector2(10,10), spriteBall);    Tile tile2(Vector2(10,10), spriteBall);    Tile tile3(Vector2(0,0), spriteBall);    Tile tile4(Vector2(0,0), spriteBall);    Sprite spriteWall1(besTile1);    Tile tileWall1(Vector2(0,0), spriteWall1);    tm = TileMap(1000, 10, 64, 64);    tm.addTile(tile1, 0, 0, 0);    tm.addTile(tile2, 2, 0, 0);    tm.addTile(tile3, 2, 1, 0);    tm.addTile(tile4, 2, 2, 0);    tm.addTile(tileWall1, 0, 1, 0);}void update(){    float timeSinceLastUpdate = basicEngine.getElapsedTime();    //print(" elapsed time:"+ timeSinceLastUpdate + "\n");    Vector2 mousePosition(basicEngine.getMouseX(), basicEngine.getMouseY());    //basicEngine.drawRectangle(mousePosition, Vector2(40,20), 255,100,100,255);    if(mousePosition.getX() > 800 - 50)        basicEngine.setCameraPosition(basicEngine.getCameraPosition() + Vector2(scrollSpeed * timeSinceLastUpdate,0));    else    if(mousePosition.getX() < 0 + 50)        basicEngine.setCameraPosition(basicEngine.getCameraPosition() - Vector2(scrollSpeed * timeSinceLastUpdate,0));    Vector2 mousePositionWorld = basicEngine.convertCameraToWorldCoordinates(mousePosition);    //ball.setPosition(mousePositionWorld);    // update tilemap    tm.update(timeSinceLastUpdate); // commenting this line gives about 500 fps, with it is at about 40/50    // exit on esc press    //if(basicEngine.keyPressed(Event::Input::Keyboard::Code::Escape))    //    basicEngine.endMainCycle();}

I'll look into re-compiling and trying the other tips in the weekend.

Share on other sites
WitchLord    4677
Hmm, your loops are accessing the script class members for each iteration. This may be the cause. Can you try rewriting the function like this:

    void update(float timeSinceLastFrame)   // update tile properties    {        uint totalTilesX = this.totalTilesX;        uint totalTilesY = this.totalTilesY;        for(uint idxColumn = 0; idxColumn < totalTilesX; ++idxColumn)        {            for(uint idxLine = 0; idxLine < totalTilesY; ++idxLine)            {            }        }        for(uint idxColumn = 0; idxColumn < totalTilesX; ++idxColumn)        {            for(uint idxLine = 0; idxLine < totalTilesY; ++idxLine)            {            }        }    }

This is likely going to give a great performance boost. Local variables are always the most efficient to access, as they are referencable directly by the op codes in the VM. Class members and global variables need sevaral steps to get to the final value.

I'll definitely look into optimizing the bytecode instructions to reduce the amount of steps needed for the VM to access the class members, but it will never be as quick as the local variables. So the above suggestion will always be valid.

Regards,
Andreas

Share on other sites
pjmendes    100
Indeed there is some improvement in referencing local variables following your suggestion (about 10% from a quick glance). However, since my tilemap implementation does a heavy use of accessing script class members inside several loops, the performance goes down too much for me to be able to implement this part in AngelScript, so I plan on moving it application-side.

I will keep this in mind when designing my engine, critical areas will not be done script-side, but I still wish to use the great power of AngelScripts (possibly for AI, each AI entity calling script callbacks).

Thanks for all the help, and for creating and improving and providing us with AngelScript!