Sign in to follow this  
babaliaris

Unity Devide a scene to increase fps.

Recommended Posts

babaliaris    112

I'm using python for coding.

So i'm making a game engine which the main loop is something like that:

def mainLoop(self):
    '''This loop run's every frame per second.'''
     
    #Go through all sprite objects inside this scene.
    for sprite in self.sprites:
        
        #Update the sprite inside the quadtree.
        if sprite.position.changed():
            self.quadtree.update(sprite)
        
        #Render the sprite only if it is inside the screen resolution.
        if sprite.onScreen():
            sprite.render()

        
        #----------Get possible collision----------#

        possible_collisions = self.quadtree.get()

        for object in possible_collisions:
            sprite.getCollision(object)

        #----------Get possible collision----------#
        
        #Run the logic of this sprite object.
        if sprite.hasScript():
            sprite.runScript()

Firstly i thought that the quadtree (for reducing the collision checks) was the problem. But it wasn't that, my quadtree is working perfectly. So in order to find out what was causing latency for a big amount of sprite objects into the scene, i disabled the collision detection algorithm and run the engine again in order to see if the efficiency will be better. Then i found out that it was still very slow. Finally the problem was the for loop. Even that the game loop wasn't doing to much to waste time, because the for loop was too big the frames where being delayed anyway.

 

So i thought of a solution but it costs in game mechanics. I will devide my scene using a quadtree, and i will only work with sprites inside this area (depended where the camera position is). So i will only run the scripts of those sprites. This will make the engine a lot faster but objects outside the specified area will be disabled because their scripts won't run. For example if an enemy is hunting the player, but the player moves far enough so the enemy will be placed in a quadtree node where is outside of the current node, the enemy's script will be disabled and he will freeze. In the image below, you can see how my scene will be devided.

 

Please notice that the red rectangles are not devisions of the scene but the big red rect is. The 9 rectangles shows how many windows the quadtree node is (A window has the screen (game) resolution).

Quadtree_Space.png

The collision checks will also occur only for the sprites inside this area.

 

So the main loop now will loop like this:

def mainLoop(self):

    area = self.quadtree.get()

    for sprite in area:

        #Update the sprite inside the quadtree.
        if sprite.position.changed():
            self.quadtree.update(sprite)

        #Render the sprite only if it is inside the screen resolution.
        if sprite.onScreen():
            sprite.render()

        #This is O(N^2) but i think will be ok because the area list will not be to big.
        for sp in area:
            if sp != sprite:
                sp.getCollision(sprite)

        #Run the logic of this sprite object.
        if sprite.hasScript():
            sprite.runScript()

There must  be a better solutions for this problem, so i'm hearing suggestions.

 

Thank you for your time.

 

Just from curiosity, in modern game engines like unity 3d, an enemy who is at the other side of the map while the player is in the oposite side, will the enemy's logic run? Also assume that the scene is realy big!!!

Edited by babaliaris

Share this post


Link to post
Share on other sites
Wyrframe    2426

Well, one thing you can improve within your stated architecture is that if A.getCollission(B) updates both A's and B's collision-state buffers, you only need to touch each pair once.

for( i=1; i < area.members; ++i )
  for( j=i-1; j >= 0; --j )
    area.members[i].testCollision( area.members[j] );

Presumably each sprite's 'script' logic then resolves any collisions discovered during that phase.

Share this post


Link to post
Share on other sites
babaliaris    112

Well, one thing you can improve within your stated architecture is that if A.getCollission(B) updates both A's and B's collision-state buffers, you only need to touch each pair once.

for( i=1; i < area.members; ++i )
  for( j=i-1; j >= 0; --j )
    area.members[i].testCollision( area.members[j] );

Presumably each sprite's 'script' logic then resolves any collisions discovered during that phase.

Yes i understand that and already know this. Just assume that this works perfectly. My main problem is the devision.

Edited by babaliaris

Share this post


Link to post
Share on other sites
samoan62    120

When you loop through self.sprites are you looking at every sprite in the scene even the ones that don't move? If your bottleneck is the number of sprites, maybe just checking the ones that can move would reduce the iterations of your main loop if done the first way. This way every collision will still have a dynamic sprite attached to it since your static objects don't move.

In modern game engines like Unity you can mark non-moving objects as static, and also the scripts on objects always run regardless of proximity to the camera.

Share this post


Link to post
Share on other sites
babaliaris    112

When you loop through self.sprites are you looking at every sprite in the scene even the ones that don't move? If your bottleneck is the number of sprites, maybe just checking the ones that can move would reduce the iterations of your main loop if done the first way. This way every collision will still have a dynamic sprite attached to it since your static objects don't move.

In modern game engines like Unity you can mark non-moving objects as static, and also the scripts on objects always run regardless of proximity to the camera.

 

Yes, i look every sprite. So what you are saying, is to split game objects to two lists and make the loop work something like this:

def mainLoop(self):
    
    '''staticSprites + movingSprites = all sprites inside the scene.'''

    for sprite in self.staticSprites:

        #Run static sprite scripts.
        #Render static sprite.

        for moving in self.movingSprites:

            #Run scripts of moving sprites.
            #Render moving sprites.
            #check collision using the quadtree and the moving sprites.

please notice that inside the quadtree exists all the gameobjects (static and none static)
because the quadtree is a space division 

Edited by babaliaris

Share this post


Link to post
Share on other sites
samoan62    120

 

When you loop through self.sprites are you looking at every sprite in the scene even the ones that don't move? If your bottleneck is the number of sprites, maybe just checking the ones that can move would reduce the iterations of your main loop if done the first way. This way every collision will still have a dynamic sprite attached to it since your static objects don't move.

In modern game engines like Unity you can mark non-moving objects as static, and also the scripts on objects always run regardless of proximity to the camera.

 

Yes, i look every sprite. So what you are saying, is to split game objects to two lists and make the loop work something like this:

def mainLoop(self):
    
    '''staticSprites + movingSprites = all sprites inside the scene.'''

    for sprite in self.staticSprites:

        #Run static sprite scripts.
        #Render static sprite.

        for moving in self.movingSprites:

            #Run scripts of moving sprites.
            #Render moving sprites.
            #check collision using the quadtree and the moving sprites.

please notice that inside the quadtree exists all the gameobjects (static and none static)
because the quadtree is a space division 

Yea something like that would be good although I would avoid nesting the moving sprites loop in the static loop. It didn't look like this was intentional. 

An even better idea, in my opintion, for organization purposes is to just have a

static
property on the sprites that way they can be distinguished but still be in the same list. This way they can still share a lot of the same logic. Kind of like below:

[source]

def mainLoop(self):     
    for sprite in self.sprites:
        #logic for all sprites
        if sprite.onScreen():
            sprite.render()

        if sprite.static:

            #static sprite logic here

        else:

            #logic for dynamic sprites

            if sprite.position.changed():

                self.quadtree.update(sprite)

 

            #collisions will only occur when at least one object is not static
            possible_collisions = self.quadtree.get()

            for object in possible_collisions:
                sprite.getCollision(object)
 

[/source]

Share this post


Link to post
Share on other sites
babaliaris    112

@samoan62 

I understood everything you said, but just to be sure i want to show all of you step by step what i'm doing in order to discuss the efficiency of my algorithms.

First i will show you a game with the only functionality of rendering 15000 sprite objects.

Structure:

structure.png

 

Sprite Class:

class Sprite:

    def __init__(self, image = None, pos = (0,0), tag = ""):
        '''Constructor.'''

        self.image = image #Image.
        self.pos   = pos   #Position.
        self.tag   = tag   #A tag name.

        #Width and height of the sprite.
        self.width = image.get_width()
        self.height= image.get_height()


    def render(self, screen):
        '''Is drawing this sprite's image to the screen.'''

        #Draw the image into the screen.
        if self.onScreen(screen) and self.image != None:
            screen.blit(self.image, self.pos)
            



    def onScreen(self, screen):
        '''Return's True if this sprite is inside the screen borders.'''

        #Get the size of the screen.
        screenWidth, screenHeight = screen.get_size()

        #Get coordinates of this sprite.
        x, y = self.pos

        #Do the calculation.
        if x + self.width > 0 and x < screenWidth:
            if y + self.height > 0 and y < screenHeight:
                return True

        return False

        

Scene Class:

class Scene:

    def __init__(self):
        '''Constructor.'''

        #Sprites list.
        self.sprites = []



    def addSprite(self, sprite):
        '''Add a new sprite to the scene.'''
        self.sprites.append(sprite)



    def render(self, screen):
        '''Render this scene.'''

        #Go through all sprites in this scene.
        for sprite in self.sprites:

            #Render the sprite object.
            sprite.render(screen)

Core Class:

import pygame

from pygame.locals import *

class Core:

    def __init__(self, width, height):
        '''Constructor.'''

        pygame.init()
        self.pygame = pygame
        self.screen = pygame.display.set_mode((width, height))

        self.running = True
        self.scene   = None




    def mainloop(self):
        '''Main loop of the game, runs every frame.'''

        #Pygame clock.
        clock = pygame.time.Clock()

        while self.running:

            #Fill the screen with black color.
            self.screen.fill((0,0,0))

            #Check keyboard and mouse input.
            for event in pygame.event.get():

                #Quit button pressed.
                if event.type == QUIT:
                    self.running = False
                    break


            #Render the screen.
            if self.scene != None:
                self.scene.render(self.screen)


            #Count time.
            clock.tick()

            #Display the fps to the window title.
            pygame.display.set_caption( str( int(clock.get_fps()) ) )

            #Update the screen.
            pygame.display.update()

        #Destroy the window.
        pygame.quit()

Main Program:

import pyworld

#Create a game core.
core = pyworld.Core(600,600)

#Create a scene and load it to the core.
core.scene = pyworld.Scene()

#Add 15000 sprites into the scene.
x,y = 0,0
for i in range(15000):
    
    core.scene.addSprite( pyworld.Sprite(core.pygame.image.load("enemy.jpg"),
                                         pos = (x,y),
                                         tag = "enemy"+str(i)
                                         )
                        )
    x += 100
    y += 100
    

#Run the game.
core.mainloop()

print("Game Terminated!")

How it looks like:

Capture.png

As you can see only six images are rendered to the screen (the other 14996 are off the screen).

The fps are between 50-60. My processor is: AMD FX(tm) - 6350 Six Core 3.90 GHz.

Is it efficient for this processor to run 15000 game objects without scipts and collision detection at 50 fps?

Also notice that i'm using https://www.pygame.org/docs/ref/surface.html#pygame.Surface.blit to draw images to a window. But i guess this is fast enough, i don't know it's implementation.

 

Well i think its not efficient! Grand theft auto 5 has extremly better graphics with a big open world city and its running faster that the game i just made!!! How is this possible!!!

Edited by babaliaris

Share this post


Link to post
Share on other sites
__Toz__    320

So in order to find out what was causing latency for a big amount of sprite objects into the scene, i disabled the collision detection algorithm and run the engine again in order to see if the efficiency will be better. Then i found out that it was still very slow.

If you want a reliable way of finding out what code is slow then your best bet is to profile your game. Luckily Python comes with a profiler in its standard library.

Just make a new python file and copy this into it:

import cProfile
import pstats
import your_game_module

profile = cProfile.Profile()
profile.runcall(your_game_module.main)

with open("profile_results.txt", 'w') as resultsFile:
    stats = pstats.Stats(profile, stream=resultsFile)
    stats.sort_stats('time') #cumulative
    stats.print_stats()

Replace the your_game_module with your own and run it. Then you'll have a much better idea of how much time everything takes.
 

Is it efficient for this processor to run 15000 game objects without scipts and collision detection at 50 fps?

Do you need your game to have this many objects? Keep in mind that Python is not the fastest language in the world and although I think it's a great language for making indie games with you will not be creating the next GTA V with it.

Share this post


Link to post
Share on other sites
babaliaris    112

 

So in order to find out what was causing latency for a big amount of sprite objects into the scene, i disabled the collision detection algorithm and run the engine again in order to see if the efficiency will be better. Then i found out that it was still very slow.

If you want a reliable way of finding out what code is slow then your best bet is to profile your game. Luckily Python comes with a profiler in its standard library.

Just make a new python file and copy this into it:

import cProfile
import pstats
import your_game_module

profile = cProfile.Profile()
profile.runcall(your_game_module.main)

with open("profile_results.txt", 'w') as resultsFile:
    stats = pstats.Stats(profile, stream=resultsFile)
    stats.sort_stats('time') #cumulative
    stats.print_stats()

Replace the your_game_module with your own and run it. Then you'll have a much better idea of how much time everything takes.
 

Is it efficient for this processor to run 15000 game objects without scipts and collision detection at 50 fps?

Do you need your game to have this many objects? Keep in mind that Python is not the fastest language in the world and although I think it's a great language for making indie games with you will not be creating the next GTA V with it.

 

 

No it does not, but i'm trying to make my engine be able to handle as many gameobjects per scene it can, without lagging. This is my objective.

Share this post


Link to post
Share on other sites
__Toz__    320

No it does not, but i'm trying to make my engine be able to handle as many gameobjects per scene it can, without lagging. This is my objective.

Well, if you want to render as much sprites as possible then you could look into rendering using OpenGL or Direct3D. For fast collision you could try a Python port of Box2D or maybe somebody has made a different physics package for Python.

But I'm wondering, you are going to use this engine to create a game with right? So wouldn't you rather focus on what that game needs?

Share this post


Link to post
Share on other sites
Scouting Ninja    3970

As you can see only six images are rendered to the screen (the other 14996 are off the screen).

Since you count these off screen ones does that mean you don't cull them?

rendering 15000 sprite objects.

15000 * 2 =  30000 triangles = half a mesh buffer worth of triangles; so you don't batch your meshes.

Learn batching, it is a process where you merge a lot of sprites into one as a way of saving performance.

 

If you did use batching you could get 64 000 triangles per mesh = 32 000 sprites per mesh = 32 000 * 300 (Mid range graphics card) = 96 000 00 static sprites. On a brand new graphics card you will get millions of static sprites.

 

You can start here: https://www.gamedev.net/resources/_/technical/opengl/opengl-batch-rendering-r3900

You could also search on the web, someone should have done it using Python.

Share this post


Link to post
Share on other sites
babaliaris    112

 

No it does not, but i'm trying to make my engine be able to handle as many gameobjects per scene it can, without lagging. This is my objective.

Well, if you want to render as much sprites as possible then you could look into rendering using OpenGL or Direct3D. For fast collision you could try a Python port of Box2D or maybe somebody has made a different physics package for Python.

But I'm wondering, you are going to use this engine to create a game with right? So wouldn't you rather focus on what that game needs?

 

Actually i'm trying to make something like Unity 3D but for 2D games with python :p Of course not too powerfull, just a start.

Share this post


Link to post
Share on other sites
babaliaris    112

 

As you can see only six images are rendered to the screen (the other 14996 are off the screen).

Since you count these off screen ones does that mean you don't cull them?

 

 

rendering 15000 sprite objects.

15000 * 2 =  30000 triangles = half a mesh buffer worth of triangles; so you don't batch your meshes.

Learn batching, it is a process where you merge a lot of sprites into one as a way of saving performance.

 

If you did use batching you could get 64 000 triangles per mesh = 32 000 sprites per mesh = 32 000 * 300 (Mid range graphics card) = 96 000 00 static sprites. On a brand new graphics card you will get millions of static sprites.

 

You can start here: https://www.gamedev.net/resources/_/technical/opengl/opengl-batch-rendering-r3900

You could also search on the web, someone should have done it us

So the basic idea is, instead of saying "Ok draw me 15000 rectangles", group pictures together to create one large image and draw this one on the screen? Like for example, 15000 / 50 = 300, so create 300 "big" images and draw these instead.

 

Also the method blit of pygame is not using graphics accelarator hardware.

 

But still, i don't think that this is the problem, because in this example ONLY six images are being rendered on the screen the rest 14996 are just skipped because of the if statement:

def render(self, screen):

    if self.onScreen(screen) and self.image != None:
        screen.blit( self.image, self.pos )
Edited by babaliaris

Share this post


Link to post
Share on other sites
babaliaris    112

OK! So i asked pygame developpers about the blitting method and they answered me:

(1) PyGame is essentially CPU-bound rasterization. The fastest way to do this is with the Sprite module, or so I'm told.
 
(2) Using PyOpenGL instead of PyGame (for blitting, anyway), you instead use the GPU, which is deigned for doing just this. You have to know what you're doing, but it will probably be faster.
In your case, most of your objects are off the screen, so you're probably CPU-bound anyway. If you really need more performance, and you really need to process every object every frame, you'll need to switch to a language with less overhead, like C++, or jitted like PyPy.

Share this post


Link to post
Share on other sites
Scouting Ninja    3970

So the basic idea is, instead of saying "Ok draw me 15000 rectangles", group pictures together to create one large image and draw this one on the screen? Like for example, 15000 / 50 = 300, so create 300 "big" images and draw these instead.

No that is atlasing and is part of how batching works.

 

In short every sprite object has a once off cost, known as a a draw call, that cost the same no matter how many polygons the sprite is made from.

So you can merge a lot of sprites into one, keep the same image and only have one draw call; this is how Unity get's it's performance.

Here I used your image to give a better idea of how it works.

DYLQsIb.png

 

The idea is that you merge all sprites into one batch, this allows them all to sneak in, paying only one draw call for all of them to get in.

To understand you should know that each sprite is drawn on a quad. Each quad is made of two triangles.

If you make a object for each quad, then each quad will have it's own draw call. If you make only one object for a lot of quads, they share a draw call.

Using PyOpenGL instead of PyGame
 

I advice this also, I don't know if it is possible to batch using PyGame. Making your own engine using PyGame, would be near impossible to match Unity.

You could still use Python, however if speed is your focus then you will need to change to a other language.

Share this post


Link to post
Share on other sites
__Toz__    320

Also the method blit of pygame is not using graphics accelarator hardware.

Yup, which means it'll never be as fast as using a sprite renderer that uses OpenGL or Direct3D.

But still, i don't think that this is the problem, because in this example ONLY six images are being rendered on the screen the rest 14996 are just skipped because of the if statement:

Why don't you profile your code using the script I provided? Then you'll know for sure if it's the sprite blitting that's slow or the onScreen check or something else.

Share this post


Link to post
Share on other sites
babaliaris    112

 

Also the method blit of pygame is not using graphics accelarator hardware.

Yup, which means it'll never be as fast as using a sprite renderer that uses OpenGL or Direct3D.

But still, i don't think that this is the problem, because in this example ONLY six images are being rendered on the screen the rest 14996 are just skipped because of the if statement:

Why don't you profile your code using the script I provided? Then you'll know for sure if it's the sprite blitting that's slow or the onScreen check or something else.

 

I tried but it shows me so many other methods that python runs and its a mess. Isn't there any way to tell the profiler "ok, show me only my methods"?

Share this post


Link to post
Share on other sites
__Toz__    320

I tried but it shows me so many other methods that python runs and its a mess. Isn't there any way to tell the profiler "ok, show me only my methods"?

You could modify the script to take the lines of the profile results and delete every line that doesn't have the path of your project in it. Although the profile results being a mess shouldn't be a problem, you only need to look at the top results to figure out what is slowest.

The column of interest is the 'tottime' column which shows the amount of time spent in a function but not its sub-functions. So that way the main() function of the program doesn't show on top, since we already know that everything else is indirectly called from it.

So what functions does it show on top? Is it something like {pygame.sprite.draw}?

Share this post


Link to post
Share on other sites
SeanMiddleditch    17565

Someone else mentioned the word "culling" which is a key part of the equation here. This is related to those for loops.

If 15,000 objects aren't on your screen, there's no need to loop over them when drawing. You already have a quad tree for physics so you've got the basics of the solution in place already: use a scene partitioning system to efficiently find the subset of objects that are actually visible. If 6 sprites are visible and 15,000 are not, your rendering code should only even try to draw those 6 sprites and entirely ignore the other 15,000. A good scene graph (or even just a quad-tree) helps here a massive amount. You can hypothetically reduce your O(N) rendering to O(log(N)) or less which is a huge win.

One of your for loops then basically goes away. You never need to iterate over the static objects. Never. Both graphics and physics should be using an efficient graph of some kind to find relevant objects. (Note that physics and graphics might use _different_ graphs, because what makes the most sense for physics isn't necessarily true for graphics.)

You can then also heavily reduce the load on your dynamic objects for loop. After all, collision detection is already using an efficient data structure and you'll be changing your graphics to also use a tree of some kind, so the only work for dynamic objects that remains is to apply velocity. Luckily, that work only needs to be done on objects that _have_ a non-zero velocity. You can thus split your dynamic objects into two sets: active and inactive objects. The active objects are the ones that are currently moving.

After that split, you might notice that there's no longer any practical difference between static and dynamic objects, so you can perhaps remove that entire concept from the code. Objects with velocity (or some other forces) are put into the active set and are removed from the active set when they stop moving.

The physics code need only loop over the active objects to apply velocity and forces. The collision system uses an efficient graph. Rendering uses an efficient graph. Unless you (foolishly) make all 15,000 objects active at the same time, you never need to loop over all your objects even once, much less twice.

The next big step then is the batching improvements others have recommended.

The third big step I'd recommend is to use an accelerator like PyPy or whatever's in vogue today. Remember that even the simplest operation like adding two small integers should just be a single ADD instruction on the CPU, but in plain ol' CPython that same operation can take hundreds of CPU instructions, branches, memory accesses, and so on. Python does not translate to CPU code very well and requires a sophisticated JIT engine to do even marginally well in terms of performance, and Python does not include such a JIT engine out of the box like JavaScript engines do, so you have to use an add-on/replacement like PyPy.

Share this post


Link to post
Share on other sites
babaliaris    112

 

Someone else mentioned the word "culling" which is a key part of the equation here. This is related to those for loops.

If 15,000 objects aren't on your screen, there's no need to loop over them when drawing. You already have a quad tree for physics so you've got the basics of the solution in place already: use a scene partitioning system to efficiently find the subset of objects that are actually visible. If 6 sprites are visible and 15,000 are not, your rendering code should only even try to draw those 6 sprites and entirely ignore the other 15,000. A good scene graph (or even just a quad-tree) helps here a massive amount. You can hypothetically reduce your O(N) rendering to O(log(N)) or less which is a huge win.

One of your for loops then basically goes away. You never need to iterate over the static objects. Never. Both graphics and physics should be using an efficient graph of some kind to find relevant objects. (Note that physics and graphics might use _different_ graphs, because what makes the most sense for physics isn't necessarily true for graphics.)

You can then also heavily reduce the load on your dynamic objects for loop. After all, collision detection is already using an efficient data structure and you'll be changing your graphics to also use a tree of some kind, so the only work for dynamic objects that remains is to apply velocity. Luckily, that work only needs to be done on objects that _have_ a non-zero velocity. You can thus split your dynamic objects into two sets: active and inactive objects. The active objects are the ones that are currently moving.

After that split, you might notice that there's no longer any practical difference between static and dynamic objects, so you can perhaps remove that entire concept from the code. Objects with velocity (or some other forces) are put into the active set and are removed from the active set when they stop moving.

The physics code need only loop over the active objects to apply velocity and forces. The collision system uses an efficient graph. Rendering uses an efficient graph. Unless you (foolishly) make all 15,000 objects active at the same time, you never need to loop over all your objects even once, much less twice.

The next big step then is the batching improvements others have recommended.

The third big step I'd recommend is to use an accelerator like PyPy or whatever's in vogue today. Remember that even the simplest operation like adding two small integers should just be a single ADD instruction on the CPU, but in plain ol' CPython that same operation can take hundreds of CPU instructions, branches, memory accesses, and so on. Python does not translate to CPU code very well and requires a sophisticated JIT engine to do even marginally well in terms of performance, and Python does not include such a JIT engine out of the box like JavaScript engines do, so you have to use an add-on/replacement like PyPy.

 

 

You are right. I aldready thought of it, but my problem was this: If i use a quadtree to render only the sprites which are "close" to the camera then the other sprites which i will skip, will also skip their logic (if those sprites have a script attached on them). So if an enemy is hunting the player but the player goes far enough from him (so the enemy will be placed somewhere else in the tree), then the sprite objects which are going to be choosed for rendering will not contain the enemy and also his script. So the enemy's logic will be paused until the player go back to the region where the enemy is inside the quadtree. This is why i had a for loop to go through all sprite objects into the scene and run the scipts.

 

But an hour ago i though of something really simple! Not all of the 15000 objects will have a script!!! Most of these objects will be something like a tree or a house or something else which do not require a logic. So why do i need to loop through all objects and run the scipt of each (if they have on)?

Anyway i reached to this conclution:

I will make a list containing all the sprite objects with scripts attached on them (Brained Objects).                                                                       I will also make a list containing moving objects (Moving Objects).                                                                                                                       Then i will create a spatial hash which its bucket will point to a quadtree. This mixed data structure will divide my whole space.

I will consider that my whole world is a square with length 2^32 .

Finally my main loop should look something like this (I'm using only quadtree in here without spatial hashing):

def mainloop(self):

    #Get the sprite objects which are close to the camera.
    area = self.quadtree.get_area_for_render(self.camera.position)

    #Render those sprites.
    for sprite in area:
        sprite.render(self.screen)
        pass

    #Run scripts.
    for brained_sprite in self.Brained:
        brained_sprite.run_scripts()
        pass


    #Check collision.
    for moving_sprite in self.Moving:

        #Check the moving sprite if collides with all the sprites
        #which are close to him (using the quadtree).
        self.quadtree.checkCollision(moving_sprite)

        

So i guess this will be better.

Share this post


Link to post
Share on other sites
__Toz__    320

Yup, the time is in seconds. It's seems like the rendering and onScreen checking are taking up most of the time, so those should be optimized. Like Sean said, instead of having to check if every sprite is on the screen you could instead ask your quad-tree object which sprites are visible with the current view port. Then only those have to be rendered. 

But further it seems a bit odd to me that just drawing 6 sprites would be that slow. If you just draw 6 sprites in a program without doing anything else, is it also slow?

Share this post


Link to post
Share on other sites
babaliaris    112

 

Someone else mentioned the word "culling" which is a key part of the equation here. This is related to those for loops.

If 15,000 objects aren't on your screen, there's no need to loop over them when drawing. You already have a quad tree for physics so you've got the basics of the solution in place already: use a scene partitioning system to efficiently find the subset of objects that are actually visible. If 6 sprites are visible and 15,000 are not, your rendering code should only even try to draw those 6 sprites and entirely ignore the other 15,000. A good scene graph (or even just a quad-tree) helps here a massive amount. You can hypothetically reduce your O(N) rendering to O(log(N)) or less which is a huge win.

One of your for loops then basically goes away. You never need to iterate over the static objects. Never. Both graphics and physics should be using an efficient graph of some kind to find relevant objects. (Note that physics and graphics might use _different_ graphs, because what makes the most sense for physics isn't necessarily true for graphics.)

You can then also heavily reduce the load on your dynamic objects for loop. After all, collision detection is already using an efficient data structure and you'll be changing your graphics to also use a tree of some kind, so the only work for dynamic objects that remains is to apply velocity. Luckily, that work only needs to be done on objects that _have_ a non-zero velocity. You can thus split your dynamic objects into two sets: active and inactive objects. The active objects are the ones that are currently moving.

After that split, you might notice that there's no longer any practical difference between static and dynamic objects, so you can perhaps remove that entire concept from the code. Objects with velocity (or some other forces) are put into the active set and are removed from the active set when they stop moving.

The physics code need only loop over the active objects to apply velocity and forces. The collision system uses an efficient graph. Rendering uses an efficient graph. Unless you (foolishly) make all 15,000 objects active at the same time, you never need to loop over all your objects even once, much less twice.

The next big step then is the batching improvements others have recommended.

The third big step I'd recommend is to use an accelerator like PyPy or whatever's in vogue today. Remember that even the simplest operation like adding two small integers should just be a single ADD instruction on the CPU, but in plain ol' CPython that same operation can take hundreds of CPU instructions, branches, memory accesses, and so on. Python does not translate to CPU code very well and requires a sophisticated JIT engine to do even marginally well in terms of performance, and Python does not include such a JIT engine out of the box like JavaScript engines do, so you have to use an add-on/replacement like PyPy.

 

 

You are right. I aldready thought of it, but my problem was this: If i use a quadtree to render only the sprites which are "close" to the camera then the other sprites which i will skip, will also skip their logic (if those sprites have a script attached on them). So if an enemy is hunting the player but the player goes far enough from him (so the enemy will be placed somewhere else in the tree), then the sprite objects which are going to be choosed for rendering will not contain the enemy and also his script. So the enemy's logic will be paused until the player go back to the region where the enemy is inside the quadtree. This is why i had a for loop to go through all sprite objects into the scene and run the scipts.

 

But an hour ago i though of something really simple! Not all of the 15000 objects will have a script!!! Most of these objects will be something like a tree or a house or something else which do not require a logic. So why do i need to loop through all objects and run the scipt of each (if they have on)?

Anyway i reached to this conclution:

I will make a list containing all the sprite objects with scripts attached on them (Brained Objects).                                                                       I will also make a list containing moving objects (Moving Objects).                                                                                                                       Then i will create a spatial hash which its bucket will point to a quadtree. This mixed data structure will divide my whole space.

I will consider that my whole world is a square with length 2^32 .

Finally my main loop should look something like this (I'm using only quadtree in here without spatial hashing):

def mainloop(self):

    #Get the sprite objects which are close to the camera.
    area = self.quadtree.get_area_for_render(self.camera.position)

    #Render those sprites.
    for sprite in area:
        sprite.render(self.screen)
        pass

    #Run scripts.
    for brained_sprite in self.Brained:
        brained_sprite.run_scripts()
        pass


    #Check collision.
    for moving_sprite in self.Moving:

        #Check the moving sprite if collides with all the sprites
        #which are close to him (using the quadtree).
        self.quadtree.checkCollision(moving_sprite)

        

So i guess this will be better.

The only think i dont know is how to create the spatial hash function. I already searched about spatial hashing and l learn that spatial hashing is dividing your space in squares with a length x and its square is actually a bucket of your array hash table. So the hash function must take as parameter a sprite object, and depending the x, y, width and height of this object (inculding the x value which is the length of its bucket) will return you a number between 0 and max_length_of_hash_table - 1 so you will know where excactly must place your sprite object. How do i make this mathematical function, i have no clue and i cant find something on the internet (Even if i'm good with maths)

Share this post


Link to post
Share on other sites
babaliaris    112

Yup, the time is in seconds. It's seems like the rendering and onScreen checking are taking up most of the time, so those should be optimized. Like Sean said, instead of having to check if every sprite is on the screen you could instead ask your quad-tree object which sprites are visible with the current view port. Then only those have to be rendered. 

But further it seems a bit odd to me that just drawing 6 sprites would be that slow. If you just draw 6 sprites in a program without doing anything else, is it also slow?

 

Yes a little faster:

Capture.png

 

Code:

import pygame

from pygame.locals import *

pygame.init()

screen = pygame.display.set_mode((600,600))

running = True

image = pygame.image.load("enemy.jpg")

clock = pygame.time.Clock()

while running:

    clock.tick()

    #Fill the screen with black.
    screen.fill((0,0,0))

    #Keyboard events.
    for event in pygame.event.get():

        if event.type == QUIT:
            running = False
            break

    #Loop.
    x,y = 0,0
    for i in range(15000):

        #If the image is for sure out of the window, just skip.
        if x > 700 or x < -100 and y > 700 or y < -100:
            pass

        #Else draw the image on the this position.
        else:
            screen.blit( image, (x,y) )
            
        x += 100
        y += 100

        

    #Set for window title the fps.
    pygame.display.set_caption( str( int(clock.get_fps()) ) )

    pygame.display.update()


pygame.quit()


But the problem is solved i guess. I just need to to delete this for loop and run a quadtree instead.

Edited by babaliaris

Share this post


Link to post
Share on other sites
babaliaris    112

Anyway! With all of your help i think i've got the solution!

Just tell me how to create the hash function of a spatial hashing and i'm good to go!!!

I will report back to you with my new results!

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this