Sign in to follow this  
Spa8nky

Is there any way to combine the following into one loop?

Recommended Posts

Hi there, I've got a loop that creates a polygon as follows:
            // For each polygon face
            for (int i = 0; i < faces; i++)
            {
                if (i > 0)
                {
                    edge[i - 1] = vertices[i] - vertices[i - 1];

                    if (CCW)
                    {
                        Vector2 normal_RH = new Vector2();
                        normal_RH.X = -edge[i - 1].Y;
                        normal_RH.Y = edge[i - 1].X;

                        normal[i - 1] = Vector2.Normalize(normal_RH);
                    }
                    else
                    {
                        Vector2 normal_LH = new Vector2();
                        normal_LH.X = edge[i - 1].Y;
                        normal_LH.Y = -edge[i - 1].X;

                        normal[i - 1] = Vector2.Normalize(normal_LH);
                    }
                }

                // Set vertices/indices for drawing the polygon
                Vector3 position = new Vector3();
                position.X = vertices[i].X;
                position.Y = vertices[i].Y;
                position.Z = 0f;

                vertices_Drawable[i] = new VertexPositionColor(position, Color.Red);
                indices[i] = (short)i;

                // Calculate polygon centroid
                centre += vertices[i];
            }


The problem is that the last side can't be currently calulated in the loop and has to be placed outside the loop:
            // Add the first vertex to the end of the indices to close the line strip
            indices[faces] = 0;

            // Compute final edge and face normal
            edge[faces - 1] = vertices[0] - vertices[faces - 1];

            if (CCW)
            {
                Vector2 normal_RH = new Vector2();
                normal_RH.X = -edge[faces - 1].Y;
                normal_RH.Y = edge[faces - 1].X;

                normal[faces - 1] = Vector2.Normalize(normal_RH);
            }
            else
            {
                Vector2 normal_LH = new Vector2();
                normal_LH.X = edge[faces - 1].Y;
                normal_LH.Y = -edge[faces - 1].X;

                normal[faces - 1] = Vector2.Normalize(normal_LH);
            }


Is there any way to combine both calculations into one loop? Thanks.

Share this post


Link to post
Share on other sites
- Expand the array by one, copy 0-th element last place, ignore it elsewhere
- use (i%faces) instead of just i as index
- refactor the function to take (Face a, Face b) parameters (cuts down the amount of duplicated code

Pick any of the above.

Generalized description of the problem: How to process consecutive pairs of array elements with wrapping.

Share this post


Link to post
Share on other sites
My apologies to whoever wrote the awesome blog post that explained how to do this because I forgot who he was, but he cribbed it from someone else so I don't feel too bad. I assume Knuth did it originally, because, well, he's Knuth. This is a thing that every graphics programmer should know how to do.

Suppose you need to iterate over the following pairs: (0,1), (1,2), (2, 3) ... (n-2, n-1), (n-1, 0). The key is to put the last pair first. Here's the loop:

for(int a=n-1, b=0; b<n; a=b, b++)
{
printf("(%d, %d)\n", a, b);
}

Beautiful, no?

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
My apologies to whoever wrote the awesome blog post that explained how to do this because I forgot who he was, but he cribbed it from someone else so I don't feel too bad. I assume Knuth did it originally, because, well, he's Knuth. This is a thing that every graphics programmer should know how to do.

Suppose you need to iterate over the following pairs: (0,1), (1,2), (2, 3) ... (n-2, n-1), (n-1, 0). The key is to put the last pair first. Here's the loop:

for(int a=n-1, b=0; b<n; a=b, b++)
{
printf("(%d, %d)\n", a, b);
}

Beautiful, no?


That is cool.

Share this post


Link to post
Share on other sites

for(int a=n-1, b=0; b<n; a=b, b++)
{
printf("(%d, %d)\n", a, b);
}


That's a pretty cool trick. You'll get two cache misses right off the bat as you aren't accessing in sequence, but the usual approach will probably cache miss when accessing the final edge anyway, as the very first edge would have been kicked out by that time.

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
My apologies to whoever wrote the awesome blog post that explained how to do this because I forgot who he was, but he cribbed it from someone else so I don't feel too bad. I assume Knuth did it originally, because, well, he's Knuth. This is a thing that every graphics programmer should know how to do.

Suppose you need to iterate over the following pairs: (0,1), (1,2), (2, 3) ... (n-2, n-1), (n-1, 0). The key is to put the last pair first. Here's the loop:

for(int a=n-1, b=0; b<n; a=b, b++)
{
printf("(%d, %d)\n", a, b);
}

Beautiful, no?
I prefer (and always do) the (i+1)%n stuff, as Antheus suggested, that trick is hard to read, and has 2 variables. But if it was just irony, that sorry, I'm tired to see that.

Share this post


Link to post
Share on other sites
Quote:
Original post by szecs
I prefer (and always do) the (i+1)%n stuff, as Antheus suggested, that trick is hard to read, and has 2 variables.


modulus tends to be slow. The only advantage it has is that it works universally for m elements (think wrapped convolution kernel computation).

For pairs, for loop is quite clean, although if I encountered it in code without knowing context, I'd be scratching my head for a while.

Either way, I find that this type of problem, while algorithmically elegant, might be clearest to express explicitly:
for (i:0..n-1) handle(i, i+1);
handle(i+1, 0);


IMHO: if performance is paramount, then it is likely that some other method will work much faster anyway, and special case handling will likely work best, or use something like loop unrolling anyway. For everything else, clarity should probably be a priority. For this reason, I'd suggest moving the body out of the loop in either case, since it makes the intent much more explicit (we're doing something with a and b).

Share this post


Link to post
Share on other sites
Quote:
Original post by Antheus
Quote:
Original post by szecs
I prefer (and always do) the (i+1)%n stuff, as Antheus suggested, that trick is hard to read, and has 2 variables.


modulus tends to be slow. The only advantage it has is that it works universally for m elements (think wrapped convolution kernel computation).

For pairs, for loop is quite clean, although if I encountered it in code without knowing context, I'd be scratching my head for a while.

Either way, I find that this type of problem, while algorithmically elegant, might be clearest to express explicitly:
for (i:0..n-1) handle(i, i+1);
handle(i+1, 0);


IMHO: if performance is paramount, then it is likely that some other method will work much faster anyway, and special case handling will likely work best, or use something like loop unrolling anyway. For everything else, clarity should probably be a priority. For this reason, I'd suggest moving the body out of the loop in either case, since it makes the intent much more explicit (we're doing something with a and b).
Thanks for that, I didn't know modulus is slow. I use it heavily, I should rethink it.

Share this post


Link to post
Share on other sites
Quote:
Original post by szecs
Thanks for that, I didn't know modulus is slow. I use it heavily, I should rethink it.


OTOH, the usual advice about premature optimization applies. And on the other other hand, you could also try something like if (++x == n) { x = 0; }.

Share this post


Link to post
Share on other sites
Quote:
I prefer (and always do) the (i+1)%n stuff, as Antheus suggested, that trick is hard to read, and has 2 variables.
Not only is modulus slow, it's only applicable to random-access containers. I'm surprised you find it hard to read -- prev=cur; increment cur is a pretty common idiom -- but I guess I've seen this idiom enough that I couldn't really give a useful opinion on how easy it is to understand the first time. The 2 variables thing is a complete red herring, though. If you're using (i+1)%n all over your loop body, (i+1)%n is a variable, whether you call it that or not.

Share this post


Link to post
Share on other sites
Okay guys, I'm not a programmer. So I don't know about common idioms. I learned (almost) everything by myself without tutorials. So I have a general advice for everyone: don't learn like the way I did!

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
Quote:
I prefer (and always do) the (i+1)%n stuff, as Antheus suggested, that trick is hard to read, and has 2 variables.
Not only is modulus slow, it's only applicable to random-access containers. I'm surprised you find it hard to read -- prev=cur; increment cur is a pretty common idiom -- but I guess I've seen this idiom enough that I couldn't really give a useful opinion on how easy it is to understand the first time. The 2 variables thing is a complete red herring, though. If you're using (i+1)%n all over your loop body, (i+1)%n is a variable, whether you call it that or not.


If you didn't use single-letter variable names in your example and used "cur" and "prev" like you did when explaining it aloud, it probably wouldn't be quite as obtuse.

Share this post


Link to post
Share on other sites
Hmm? But "prev" and "cur" isn't what they are. That was just a reference to an idiom that has a similar structure to the code I posted.

Share this post


Link to post
Share on other sites
Huge thank you for all your help guys.

I went for readibility over clever programming after toying with the ideas you posted:


/// <summary>
/// Create a 2D polygon
/// </summary>
/// <param name="CCW">Are vertices ordered counter clockwise?</param>
/// <param name="vertex">n vertices that make up polygon</param>
public CD_Polygon(bool CCW, params Vector2[] points)
{
// A polygon must have at least 3 vertices
if (points.Length < 3)
{
return;
}

vertices = points;
faces = vertices.Length;
edge = new Vector2[faces];
normal = new Vector2[faces];

// Drawing parameters
vertices_Drawable = new VertexPositionColor[faces];
indices = new short[faces + 1];

for (int i = 0; i < faces; ++i)
{
int i2 = i + 1 < faces ? i + 1 : 0;

edge[i] = Vertices[i2] - Vertices[i];

// Ensure the edges have non-zero length
//Debug.Assert(edge[i].LengthSquared() > Constants.EPSILON * Constants.EPSILON);

// Compute normals
// Assume +ve Y = "Up"
Vector2 n = new Vector2();

if (CCW)
{
n.X = edge[i].Y;
n.Y = -edge[i].X;
}
else
{
n.X = -edge[i].Y;
n.Y = edge[i].X;
}

// Normalise and set normal for current edge
n.Normalize();
normal[i] = n;

// Calculate polygon centroid
centroid += vertices[i];

// Set vertices/indices for drawing the polygon
Vector3 position = new Vector3();
position.X = vertices[i].X;
position.Y = vertices[i].Y;
position.Z = 0f;

vertices_Drawable[i] = new VertexPositionColor(position, Color.Red);
indices[i] = (short)i;
}

centroid /= faces;

type = Collidable2D_Type.POLYGON;
}



Let me know what you think.

Thanks again.

Share this post


Link to post
Share on other sites
Why do you reserve one extra element in 'indices'? It doesn't look to me like it gets filled in. Also, don't you have constructors taking arguments for Vector2 and Vector3?


// Some random suggestions
public CD_Polygon(bool CCW, params Vector2[] points) {
// Bad input shouldn't pass silently
if (points.Length < 3) {
throw new ArgumentException("Must supply at least 3 points");
}

vertex_count = points.Length; // 'faces' makes it sound like data for each
// face rather than a count of them. And 2D figures don't really have faces anyway.

// Using plurals for collection names is a good idea; be consistent.
edges = new Vector2[vertex_count];
normals = new Vector2[vertex_count];

// Storing both the original vertices and the drawing information is
// redundant; the former can be found from the latter, and if you're actually
// doing 3D work, you're going to want the Z components anyway, even if
// they're all zero.
drawableVertices = new VertexPositionColor[vertex_count];
indices = new short[vertex_count];

// This is clearer than adding to 'centroid', and we didn't zero that out anyway...
Vector2 vertex_sum = new Vector2();

for (int i = 0; i < vertex_count; ++i) {
Vector2 v1 = points[i];
Vector2 v2 = points[i + 1 < vertex_count ? i + 1 : 0];
Vector2 v = v2 - v1;
edges[i] = v;

int scale = CCW ? -1 : 1;
// 'normalized' - note descriptive rather than imperative - would return
// a new value.
normals[i] = new Vector2(v.X * scale, v.Y * -scale).normalized();

vertex_sum += v1;

drawableVertices[i] = new VertexPositionColor(new Vector3(v1.X, v1.Y, 0f), Color.Red);
indices[i] = (short)i;
}
centroid = vertex_sum / faces;
type = Collidable2D_Type.POLYGON;
}

Share this post


Link to post
Share on other sites
Quote:
Original post by Antheus
modulus tends to be slow.

Really? I had assumed there was a machine instruction for it or something. >_>

Do you have any further information? I might see if I can speed up a few toys of mine to corroborate this.

On-topic, I don't mind the loop Sneftel posted, but I probably would've refactored the loop contents into a subroutine and left the final iteration outside of the loop.

Also, I'd need empirical proof of things like cache misses before refactoring an algorithm around them, largely because the majority of performance anecdotes of old just don't apply these days.

Share this post


Link to post
Share on other sites
Quote:
Original post by Fenrisulvur
Quote:
Original post by Antheus
modulus tends to be slow.

Really? I had assumed there was a machine instruction for it or something.
There is. It takes several clock cycles.
Quote:
Also, I'd need empirical proof of things like cache misses before refactoring an algorithm around them, largely because the majority of performance anecdotes of old just don't apply these days.
While memory systems have changed, calling cache line misses "anecdotes of old" is backwards. It's only recently that cache misses have become such a huge problem, because processing speed now far outstrips memory latency. If you're curious as to how long your program spends waiting for cache lines to come in, though, check out VTune or CodeAnalyst.

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
It's only recently that cache misses have become such a huge problem, because processing speed now far outstrips memory latency.

Herb Sutter did a great talk on this topic.

Share this post


Link to post
Share on other sites
Quote:
Original post by Zahlman
Why do you reserve one extra element in 'indices'? It doesn't look to me like it gets filled in.


This because if I am drawing the polygon as a triangle strip then I need a final index of 0 to draw the polygon correctly:


public override void Draw()
{
// Get graphics device and effect
GraphicsDevice graphics_Device = Game1.Instance.graphics_Device;
Effect effect_Current = Game1.Instance.effect_Manager.effect_Debug;

effect_Current.CurrentTechnique = effect_Current.Techniques["Default"];
effect_Current.Parameters["World"].SetValue(World_Transform);

effect_Current.Begin();
effect_Current.CurrentTechnique.Passes[0].Begin();

graphics_Device.VertexDeclaration = Game1.Instance.effect_Manager.vd_VertexPositionColor;
graphics_Device.DrawUserIndexedPrimitives<VertexPositionColor>(PrimitiveType.LineStrip, vertices_Drawable, 0, vertices_Drawable.Length, indices, 0, vertices_Drawable.Length);

effect_Current.CurrentTechnique.Passes[0].End();
effect_Current.End();
}



Quote:

Also, don't you have constructors taking arguments for Vector2 and Vector3?



Vector2 n = new Vector2();
n.X = edges[i].Y * scale;
n.Y = edges[i].X * -scale;


is faster than


Vector2 n = new Vector2(edges[i].Y * scale, edges[i].X * -scale);


Quote:


// Storing both the original vertices and the drawing information is
// redundant; the former can be found from the latter, and if you're actually
// doing 3D work, you're going to want the Z components anyway, even if
// they're all zero.



When dealing with collision tests involving two polygons, is it not quicker to obtain the stored Vector2 vertices than using the X and Y components of the Vector3 vertices?

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
Quote:
I prefer (and always do) the (i+1)%n stuff, as Antheus suggested, that trick is hard to read, and has 2 variables.
Not only is modulus slow, it's only applicable to random-access containers.
Another problem is that on most architectures modulo won't do anything remotely sensible with negative indices. So watch out for expressions like (i-1)%n or the use of i%n on signed variables to guard against buffer overruns.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this