Sign in to follow this  
vaneger

which would be best?

Recommended Posts

I have a scoring function that compares two vectors and returns a % of how closely they match. It gets called alot, so I'm trying to figure out the best way to code it for speed. currently it looks like this :
float score(vector<char> &goal,vector<char> &test)
{
	int i = 0;
	float score = 0.000;
	for(i=0;i<1024;i++)
		if(test[i] == goal[i])
			score++;
	return( (score / 1024.000) * 100.000 );

}

but would something like this be faster ?
float score(vector<char> &goal,vector<char> &test)
{
	int i = 0;
	float score = 0.000;
	for(i=0;i<1024;i+=4)
        {
	     if(test[i] == goal[i])
		score++;
	     if(test[i+1] == goal[i+1])
               score++;
	     if(test[i+2] == goal[i+2])
               score++;
	     if(test[i+3] == goal[i+3])
               score++;
        }
	return( (score / 1024.000) * 100.000 );

}

Share this post


Link to post
Share on other sites
I'd guess they were just as fast/slow as one an other just the second one is pointlessly complicated.

You need to compare each pair in turn and increment a counter if their equal, not really alot you caan do to speed that up. If you could describe what your doing perhaps there's a faster way to do this.

Share this post


Link to post
Share on other sites
How do the vectors change over time? If they change infrequently, you might be able to cache the results and only recalculate when the values change.

Are the vectors sliding windows? You might be able to keep track of the number of matches that are added and removed rather than iterating across the entire array.
Quote:

but would something like this be faster ?

You need to profile the new and old version, and to compare the results. It might be insightful to look at the generated assembly under a variety of compiler switches.

Share this post


Link to post
Share on other sites
Quote:
Original post by vaneger
It gets called alot

I think that's the crux of the problem, the operation itself is simple in fairly inexpensive. Somehow removing the need to compare all the elements every single time would be the most obvious improvement.

Quote:
but would something like this be faster ?

Well, you can speed this up: [wink]
return (score / 10.24f); *

* Jokes aside, obviously any reasonably smart compiler would already optimize this in some way, especially with a power of two division in there.

Share this post


Link to post
Share on other sites
I am pretty sure it would be slower

but difference between them would be very small.

MAybe you should do that, whenever you change a value in arrays, recalculate score but you don't have to scan the whole list again. just check the element you are changing. something like


int score;

void changeelement(int i, int newvalue){
if(test[i] == goal[i]){
score--; //if they were equal before changing the value decrease the score
}
test[i] = newvalue; //or goal[i] = newvalue
if(test[i] == goal[i]){
score++; //if they are equal after changing the value increase the score
}
}



but if you recrate the vertices rather than changing single values this would be even slower

Share this post


Link to post
Share on other sites
You should be asking "what profiler application can I get for free and, if none exists, how can I make my own profiling code?" By searching for "Free C++ Code Profiler" the first result showed this. Unless there's a reason you can't use a free piece of software. Even if you don't have an AMD processor you can still use it, I believe.

Share this post


Link to post
Share on other sites
I grabbed verySleepy from a link in that site that was posted.

it seems 45% of my time is spend on vector operator[] and 45% is spent on vector begin().
out of my own code 8% is spent on score() no matter which version I use.

So now I'm writing up an evaluation function that will determine if a 4 index section has 2 or more items that can be placed in order and if so then I'll move forward with swapping and scoring.

Share this post


Link to post
Share on other sites
I don't think you can speed up that function enough to make it work well in your compression program. At least you're not passing those vector by value now though [smile].

I think the only way you'll get it fast would be to update the score as you perform the rotations and swaps that update the block. You'd count haw many values go from matching to non-matching and vice-versa, and add/subtract that from the previous score. That would then make this function redundant of course.

What about my suggestion of scoring it by how many inversions there are instead. Did you try that?

I gave up on this myself yesterday. I'm convinced that the amount of information you need to store to undo the sorting of a block exceeds the amount of data required to store the original block even for what would normally be fairly compressible data, no matter how you do it. The closest I got was a 19% increase in size.
The amount of unsorting data required becomes about equal to the original data if you drop your block size to around 256 bytes, but then representing the sorted data takes too much space when you get that low because you get very few repeated bytes, so it still ends up bigger. I can send you my updated code if you want, but I didn't end up implementing the grid sorting method you're trying because I couldn't find any approach that was even close to feasible in theory, let alone in practice.

Well at least I ended up fixing up some wikipedia page as a result of playing with this, so something good came out of it.

Share this post


Link to post
Share on other sites
We'll see - it may yet be futile to try this. Once I get the evaluation function coded I may be able to use that as an AI of sorts in choosing which swaps to make and reduce the number of swaps needed enough to cause compression.

And what do you mean by inversions ?

Share this post


Link to post
Share on other sites
The compiler will implement basic roll unlooping for you. You may gain something by doing that manual unfolding and keeping 4 separate counters, which you'll add at the end. That reduces dependencies and gives the processor more opportunities for parallelization.

Of course, only profiling your program with fairly realistic data will tell you if this is an improvement or not.

Also, you should probably use integers for counting.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this