This topic is now archived and is closed to further replies.

q2guy

Profiling and cache

If you want to test part of your code with a cold cache (that is, with the cache filled with unrelated data so that your code and its data are not cached), do you have to read several MB from memory before each iteration of the test loop?
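#include <stdlib.h>
#include <string.h>

// Reads a large buffer (4 MB with 4-byte ints) to push older data out of the cache.
int FillCache()
{
	int tr=0;
	const int mem = 1024*1024;                  // number of ints in the buffer
	int *dir = (int *)malloc(sizeof(int)*mem);
	if (dir == NULL)
		return 0;                               // allocation failed, nothing was read
	memset(dir, 1, sizeof(int)*mem);            // initialize: reading uninitialized memory is not well defined

	for (int i=0;i<mem;i+=4)                    // step divides mem evenly, so no read past the end
	{
		tr += *(dir+i);
		tr -= *(dir+i+1);
		tr += *(dir+i+2);
		tr -= *(dir+i+3);
	}

	free(dir);
	return tr;
}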

// The +, - and the return value are there so that VC, even at full optimization, does not remove the otherwise useless loop

Will this code fill the cache? And if you want your code to be resident in cache, must you execute it once before running the test loop? Thanks

OK, but IMHO that's not the best way to go about profiling your code. The typical way is to run it as part of the full game. The profiler then points out the areas of your code in which the program spends the most time. From there you optimize the bottlenecks, preferably by changing algorithms first and only then by tuning the actual code. Profiling the way you describe might tell you some interesting things about the code, but it's not really going to tell you how that code performs when it's actually running in the context of your game.

Search around for other profiling threads. Another thread mentioned a really nice, free profiler that you can download and use. The name escapes me, but I'd go that route rather than yours.

Anyway, AFAIK, the cache is going to get emptied and refilled with your code as soon as you start running the code that follows your cache filling. So what you are trying to do (run the code entirely out of cache) has no hope of working. The processor will just evict your fake data in favor of the actual data that's being processed.

-me

Thanks for your reply. I want to profile a library on its own, not within a game, so I can't profile the whole program to see the bottlenecks. What I want is to profile the functions the library provides, select the best methods, and discard the rest.

Suppose I measure the FillCache function by itself, then call FillCache inside the test loop and afterwards subtract (fillcache_time * number_of_loops) from the total time/clocks. Will that give me an accurate result?
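One way to avoid the subtraction altogether is to start the clock only after the flush has finished, so the flush cost never enters the total. The following is a minimal sketch of that idea, not a definitive harness. The names `flush_cache`, `time_cold` and `time_warm`, the 8 MB buffer size, and the 64-byte line size are all assumptions for illustration, and `std::chrono::steady_clock` may be too coarse for very short functions. Note also that walking a data buffer mainly evicts data and the shared cache levels; on CPUs with a separate L1 instruction cache it gives no guarantee about the code itself.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Assumption: 8 MB is larger than the last-level cache; adjust for your CPU.
static const std::size_t kFlushBytes = 8 * 1024 * 1024;

// Walks a large buffer to push previously cached data out (best effort only).
long flush_cache()
{
    static std::vector<char> buf(kFlushBytes, 1);
    const volatile char* p = buf.data();  // volatile: the loads cannot be optimized away
    long sum = 0;
    for (std::size_t i = 0; i < buf.size(); i += 64)  // 64 = assumed cache-line size
        sum += p[i];
    return sum;
}

// Average nanoseconds per call, flushing before every call.
// The clock runs only around fn(), so the flush time is never included.
template <typename Fn>
double time_cold(Fn fn, int iterations)
{
    assert(iterations > 0);
    std::chrono::steady_clock::duration total(0);
    for (int i = 0; i < iterations; ++i) {
        flush_cache();
        auto start = std::chrono::steady_clock::now();
        fn();
        total += std::chrono::steady_clock::now() - start;
    }
    return std::chrono::duration<double, std::nano>(total).count() / iterations;
}

// Average nanoseconds per call after one untimed warm-up call.
template <typename Fn>
double time_warm(Fn fn, int iterations)
{
    assert(iterations > 0);
    fn();  // warm-up call: brings the code and its data into cache
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        fn();
    auto total = std::chrono::steady_clock::now() - start;
    return std::chrono::duration<double, std::nano>(total).count() / iterations;
}
```

A call such as `time_cold([] { my_library_function(); }, 1000)` (where `my_library_function` is a placeholder for the routine under test) gives the cold figure, and the same call with `time_warm` gives the warm figure for comparison.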

If I were you I'd look into gprof. It is the GNU profiler and works seamlessly with gcc. I don't remember the specifics, but I believe you can tell it to do exactly what you want: just tell it to profile the code in specific files. Then write a benchmarking program that uses this code, and run the profiler on it. You will get the timings of every function: how much time was spent in each, how much in the functions it called, how many times function X was called from function Y, and so on. It gives you most everything you need to know.
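As a rough illustration of such a benchmarking program, here is a small driver sketch. The functions `sum_forward` and `sum_backward` are invented stand-ins for two candidate library routines, and `run_benchmark` is a hypothetical name; none of them come from this thread. The build steps in the comment are the usual gcc/gprof workflow (compile with `-pg`, run the program to produce `gmon.out`, then run `gprof` on it).

```cpp
// bench.cpp: hypothetical driver that exercises two candidate routines.
// Typical build and profiling steps with gcc and gprof:
//   g++ -pg bench.cpp -o bench
//   ./bench                      (writes gmon.out on exit)
//   gprof ./bench gmon.out
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for one candidate implementation (not from the thread).
long sum_forward(const std::vector<int>& v)
{
    long s = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        s += v[i];
    return s;
}

// Stand-in for a second candidate implementation (not from the thread).
long sum_backward(const std::vector<int>& v)
{
    long s = 0;
    for (std::size_t i = v.size(); i-- > 0; )
        s += v[i];
    return s;
}

// Calls both candidates repeatedly so each accumulates measurable time.
// The checksum is returned so the calls cannot be discarded as dead code.
long run_benchmark(int repetitions)
{
    assert(repetitions > 0);
    std::vector<int> data(1 << 16, 1);
    long checksum = 0;
    for (int r = 0; r < repetitions; ++r) {
        checksum += sum_forward(data);
        checksum += sum_backward(data);
    }
    return checksum;
}
```

To use it, call `run_benchmark` from `main` with enough repetitions for the program to run for a noticeable time, and print or return the checksum. With optimization enabled, small functions may be inlined and then not appear as separate entries in the profile.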

I think it would be more beneficial for you to learn to use gprof, so that when you need something similar later you know just what to do and can apply it quickly. I don't know how reusable your code is or will be.

You know what's best, though...


I'm no expert on CPUs, but from what I do understand I can't see how you could *ever* guarantee that the cache does not contain code.

As such, learn to use your profiler to discard results from other libraries and focus on a specific one. It is easy in DevPartner.
