a speed question

Started by
15 comments, last by Zahlman 17 years, 4 months ago
I'm curious as to if this will be slow?

int x = 0;
int foo[max];

for (int j=0; j<max; j++)
{
    if (foo[j].x > x) 
        x = foo[j].x;
}


would this be slow? when max could be upwards of around 3000+ a graphical question, assumeing 2D using quads(2 triangles)when using indexbuffers to batch. how much should I be batching? Should I only be batching say the top two used textures or should I batch everything? I was thinking if I batch everything, along with z ordering Alot of it really wouldn't be batched that much, or I would have alot of overlapping, and that would be a problem.
-----------------------------------------------The ZoloProject
Advertisement
As far as that code goes, there's nothing particularly slow about it. The compiler will optimize it into something extremely fast.

I can't really help with the other questions I'm afraid.
Quite an appropriate name for such a question, speedie. ;)

Anyway, why don't you test the performance first? Optimizations can be done later on, optimizing too early on can make things far more complex than they need to be. Well, since you're asking for performance, sorting the list once and taking the last of first, depending on sorting order, is of course much faster, with the cost of a single sort at the start of course. But that depends on the situation: what are you using this list for, what are the requirements?

I can't say much on the batching, I've never done it in my OpenGL programs so far, but I'd test the performance with a representative level first before optimizing. That way you can spot what you need to optimize rather than doing it because you've heard it's a good thing. Someone else may give you some better pointers on this though. :)
Create-ivity - a game development blog Mouseover for more information.
Re: batching.

I personally think that, while premature optimisation is normally a waste of time, you should still batch as much as you possibly can due to the vast range of different graphics hardware set ups you will want your program to run under.

What seems like a pointless and premature optimisation on your set up might in fact make a significant speed difference on a different machine with different graphics hardware.

I'm not sure the usually perfectly sensible argument about "Optimise only when you have identified a bottleneck" applies so well to the area of hardware accelerated graphics.

I'm prepared to stand corrected though if I'm talking rubbish.
I wouldn't say it's premature, Incorparating indexbuffers, is my optimization. I just realy confused as to how I would sort everything while not having images overlap previously drawn images. I seems I would have to incorporate a z order, but I would think after sorting everything by the z axis alot of it per z layer wouldn't share the same texture.
-----------------------------------------------The ZoloProject
The runtime for your loop is O( n ) I believe.
Programs that require a lot of processing power will not worry too much about wether the loop iterates 10 times or 3000 times. The "big O" notation is still O( n ).

If you want to be considerably faster you will have to come up with faster data structures and algorithms.
As an example, binary search trees will give you O( log(n) ). (Correct me if Im wrong again)

If you are talking about optimizing the loop, I have heard that pointer arithmetics is faster than array indexing.
Example:
int foo[max];int *px=foo, x = 0;for (int j=0; j<max; ++j, ++px)  // ++j is faster than j++ if were gona be picky{    if (px->x > x)        x = px->x;}


Wether this is a bottleneck in your program is another question. If not, optimizing like this is not worth it.
You can't access an int array with px->x... Your compiler would throw an error saying that px is not a struct, class, etc... However you may have been inclined to write this because the OP used foo[j].x which is also incorrect

You would need to access like
*(px + x) to get the value at foo[x]

Anyway, why would pointer arithmetic be any faster that array index notation. I would think the compiler would generate exactly the same code for both. In addition you are also incrementing two variables. This may prove insignificant to run time however, although the compiler might find a way to optimise out the second variable.

Also why would ++j be faster than j++. It is true for iterators that the prefix notation is faster but I don't think that it would faster for plain integer types.

If you were to write a more efficient way, you could use a pointer for the loop counter.

int x=0;int foo[max];for (int* px=foo; px<&foo[max]; px++){  if (*px > x)   x=*px;}
About batching, it depends on the number of texture switches you are going to have. However, all else equal, does batching cost you anything? I mean, even if you draw 2 Sprites in a batch, switch states, then draw 2 more, you'll probably be slightly faster as you'll be saving yourself an extra draw primitive call.

Since you'll be sorting things by Z, your render breaks will likely be dependent on texture swaps. This will give you further incentive to use a sprite sheet rather than individual textures for each object.
Absolutely Batch. Changing state is the most expensive graphical operation you can perform. Unless you know that a large majority (say 75% or more) of the batches will only contain a single instance, then its probably a win. This would most definately be a corner-case though, and there's no way to compute (at run time) whether this is the case for this frame without doing much of the work required to batch in the first place. BAtching should be a win in all but the most essoteric cases.

throw table_exception("(? ???)? ? ???");

Understand that if 'max' is very large, you may run out of stack memory. Instead, if you anticipate that you will need to allocate a large number of int's, do it dynamically.

This topic is closed to new replies.

Advertisement