
More Python


After a little more fiddling around, I want to revise my stance on Python just a little bit. I still think it's fantastic, but there are a few things about it that, coupled with a fundamental lack of understanding on my part, cause a few problems.

The biggest hitch is performance, and the understanding of the language's inner workings required to scrape the last possible fps out of a rendering sequence. Of course, a language such as Python isn't going to match the performance of a 'lower' level language such as C++, especially when dealing with something like OpenGL, which so often gains its largest performance boosts from bare-metal array and pointer based programming. In Python, I am currently running into a large number of unforeseen hitches in trying to optimize a very simple tilemap rendering routine; even with Numeric or numarray, there are still weird things going on (copying, memory allocation, who knows what else; this is where my lack of understanding of the deepest fundamentals of Python really bites me in the ass) that have a major impact on our framerate.

One annoying little issue cropped up yesterday, and is still in a state of only semi-resolution. We have a separate texture class that handles loading a .TGA file and binding it as an OpenGL texture. In the main map rendering loop, the tileset texture is bound and tiles are drawn by using UV coords to snip out squares of the texture. Simple enough, works like a charm, but our frame rate takes a 5x hit simply by changing the line "from Texture import *" to "import Texture" and using fully-qualified identifiers to work with textures. Why? Who knows.
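One thing the two import styles do change is how each name is resolved inside a hot loop: a fully-qualified call such as `Texture.bind(...)` costs a global lookup plus an attribute lookup on every call, while a name pulled in with `from ... import` is a single global lookup. That alone would not normally explain a 5x drop, so treat the following as a minimal sketch of how to measure and remove the lookup cost, not as a diagnosis. The `texture_mod` namespace and its `bind` function are hypothetical stand-ins for the real Texture module, and nothing here touches OpenGL.

```python
import timeit
import types

# Hypothetical stand-in for the real Texture module; bind() does no GL work.
texture_mod = types.SimpleNamespace(bind=lambda tex_id: tex_id)

def draw_qualified(n):
    # Resolves the global `texture_mod`, then its `bind` attribute, every iteration.
    total = 0
    for i in range(n):
        total += texture_mod.bind(i)
    return total

def draw_hoisted(n):
    # Looks the function up once and keeps it in a fast local variable.
    bind = texture_mod.bind
    total = 0
    for i in range(n):
        total += bind(i)
    return total

if __name__ == "__main__":
    # Time both variants; absolute numbers depend on the machine and interpreter.
    for fn in (draw_qualified, draw_hoisted):
        secs = timeit.timeit(lambda: fn(100_000), number=5)
        print(f"{fn.__name__}: {secs:.3f}s")
```

If the hoisted version is only marginally faster, the lookup style is probably not the real culprit and the slowdown is coming from somewhere else in the render loop.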

Both Fruny and I have pored over the code, trying a dozen iterations and revisions. It may even be a problem with the Linux implementation of Python, since Fruny doesn't experience the problem to anywhere near the degree that I do. That simple difference, for me, can mean a change from ~50FPS down to ~8FPS. That such a drastic difference occurs serves only to indicate how little I really grok what is going on in layers of the language below the surface.

And on the issue of FPS, even without this annoying little issue, 50FPS is horribly bad for what we are doing. A simple tile map, one texture bind to set the tileset. True, we are using immediate mode calls (glVertex, glTexCoord, etc.) to draw the tiles, but even so, drawing a 14x7 or so block of 64x64 pixel tiles (a screen's worth, in 800x600 mode), even with the simple vertex lighting calculations we are performing, should get far better performance on my GF6800 than 50FPS. I know for a fact that I can achieve rates in the ballpark of 300FPS doing the exact same thing in C++. How do I know this? I've done it before, using pretty much the same general methodology as this Python experiment uses. Again, there are some behind-the-scenes performance hits going on that I lack the understanding to optimize out. I performed some refactoring this very evening that I thought was going to speed things up, but which in actuality lost me another ~10FPS on average.
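One way to stop guessing at those behind-the-scenes hits is to profile a batch of frames and see which calls dominate. The sketch below uses the standard library's `cProfile` and `pstats` modules; `render_frame` and `light_tile` are hypothetical placeholders for the real drawing and lighting routines, which are not shown in this post.

```python
import cProfile
import io
import pstats

def light_tile(tile):
    # Hypothetical per-tile lighting calculation.
    return tile * 0.5

def render_frame(tiles):
    # Hypothetical stand-in for one frame of tile drawing; no GL calls here.
    return sum(light_tile(t) for t in tiles)

def profile_frames(frames=50, tiles_per_frame=98):
    """Profile repeated render_frame calls and return the report as text."""
    tiles = list(range(tiles_per_frame))
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(frames):
        render_frame(tiles)
    profiler.disable()
    out = io.StringIO()
    # Sort by cumulative time and show the ten most expensive entries.
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()

if __name__ == "__main__":
    print(profile_frames())
```

Running the same profile before and after a refactoring would show whether a change like this evening's moved time into or out of the per-tile work.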

Bear in mind, too, that this is with only test sprite entities that all bind the same sprite sheet. And there are only 300 test entities spawned on a map of 193x193 tiles. At any given time, only 6 or so are ever drawn on the screen at once. I shudder to think what the performance will be like when more and varied entities are added, with AI and game mechanic calculations thrown into the mix.

I'm going to try using arrays and indexed primitives rather than immediate mode calls and see if that helps. I've been avoiding it, due to the fact that each tile will need 4 unique vertices (effectively quadrupling the size of the map) and also due to the fact that having each vertex in the grid duplicated 4 times could impose more performance penalties on the lighting calculation. But even without the performance penalties I introduced this evening, there still just isn't enough framerate to spare. We haven't even started AI; hell, we haven't even implemented an update loop to move those 300 or so test entities around; they just sit there.
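The array layout described above, with 4 unique vertices per tile, can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the function name, its parameters, and the row-major tileset layout are all hypothetical, and it uses the standard library's `array` module rather than Numeric or numarray so that it runs on its own.

```python
from array import array

def build_tile_arrays(cols, rows, tile_size, tile_ids, tileset_cols, tileset_rows):
    """Return (vertices, texcoords) as flat float arrays, 4 corners per tile."""
    vertices = array("f")
    texcoords = array("f")
    du = 1.0 / tileset_cols   # width of one tile in UV space
    dv = 1.0 / tileset_rows   # height of one tile in UV space
    for row in range(rows):
        for col in range(cols):
            # Screen-space corners of this tile.
            x0, y0 = col * tile_size, row * tile_size
            x1, y1 = x0 + tile_size, y0 + tile_size
            # UV rectangle of the tile's image (assumes row-major tileset).
            tile = tile_ids[row * cols + col]
            u0 = (tile % tileset_cols) * du
            v0 = (tile // tileset_cols) * dv
            u1, v1 = u0 + du, v0 + dv
            # Each tile gets its own 4 vertices, so neighbours share none.
            vertices.extend((x0, y0, x1, y0, x1, y1, x0, y1))
            texcoords.extend((u0, v0, u1, v0, u1, v1, u0, v1))
    return vertices, texcoords

if __name__ == "__main__":
    verts, uvs = build_tile_arrays(14, 7, 64, [0] * (14 * 7), 8, 8)
    print(len(verts) // 2, "vertices for one screen of tiles")
```

The point of building the data this way is that the arrays are filled once (or only when the view changes) and then handed to OpenGL's vertex array calls (`glVertexPointer`, `glTexCoordPointer`, `glDrawArrays`) in a single draw, instead of paying Python function call overhead for every `glVertex` and `glTexCoord`. How the Python binding accepts these buffers is not covered here and would need checking against its documentation.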

I'm really starting to understand why so often people will implement engine code in C++ and only implement superficial 'glue' code on top of the engine in Python. This is really very frustrating. For games up to a certain level of complexity, it is just fine, but try to push beyond that and lots of little hidden issues rise up to wreak havoc.


3 Comments



I'm surprised that people could ever write engine code in a fully interpreted language [wink]

Not really, it takes me back to how I began on the Amiga, with AMOS Pro. It made creating games pretty simple, but the speed was nowhere near what you'd get hitting the bare metal. That said, even on the 7MHz machines you could make games that ran at a stable 30fps - so it's probably just Python that is teh sukc [grin].

Hmmmm, I think the immediate mode OGL calls are going to give you a lot of your problems. They tend to suck CPU time at the best of times in C++ (effectively removing all the advantage of your HW T&L, since the GPU will spend most of its time idling), so via Python I would expect it to become a problem faster.

Of course, this is purely guesswork on my part, but it does seem logical.

Immediate mode sucks, but like I said, I have been able to get framerates of nearly 300 fps with immediate mode calls before (in C++); case in point, the initial versions of the Accidental Engine v1 used immediate mode, and I still got FPS in the ballpark of 250.
