HELP! Timing from nanoseconds to hours in C++

Started by
19 comments, last by Bacterius 11 years, 3 months ago

Please help, I can't find a solution to fit my needs!

I've got an 18MB array I'm doing operations on. I'm comparing POSIX threads against a non-threaded version for search, sort, read, write, etc.

I am trying to obtain timing information for these activities.

Some take nanoseconds (read, write), some take hours (duplicate search, bubble sort).

I need to get timing info and put it in a std::string.

Using

clock_t end = clock() - start ; // leaves me with accuracy +/- 1 second - not good enough

time_t end, start; time(&start); time(&end); double diffms = difftime(end, start) * 1000.0; // goes all hex-y on me - and time() only has 1-second resolution anyway

Help!

I'm wrapping this up into timer.start() and timer.stop(). timer.stop pushes the time taken into a vector of 'timestamps'.

I fear I'm confusing you now.

Please help! Suggestions?

There are many ways to get a high-resolution timer. Unfortunately, none of them were part of the C++98 standard.

What platform are you working with? Can you use C++11? How about Boost?

If you are using C++11, you can get very precise timing using the standard <chrono> library.

Linux has clock_gettime()

Use it with CLOCK_MONOTONIC, and it should give you nanosecond resolution.


[quote]
Linux has clock_gettime()

Use it with CLOCK_MONOTONIC, and it should give you nanosecond resolution.
[/quote]

Hmmm... That's what I use, but with CLOCK_REALTIME. Do you know why one is preferable over the other?

EDIT: After a bit of searching, it seems CLOCK_MONOTONIC will behave better if the clock is adjusted (say, by ntpd).

[quote name='Álvaro' timestamp='1357144129' post='5016737']
What platform are you working with? Can you use C++11? How about Boost?
[/quote]

+1

If an operation is too quick (e.g. nanoseconds) you won't be able to get proper timing anyway because of processor interrupts, thread scheduling, etc.. why not run them lots of times and take the average? Another way is to let the operation run for X seconds, stop after that time and count how many times N the operation could complete, then the average speed (in ops per second) is just N / X.


Chances are your clock_gettime also supports CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID. Those measure only the CPU time spent in your process or in a particular thread, respectively.

Using the regular realtime or monotonic clock will screw up your results, because of the above.


Hi all, thanks for the many replies - it represents the Journey of discovery I've been on!

Okay..

[quote name='Álvaro' timestamp='1357144129' post='5016737']
What platform are you working with? Can you use C++11? How about Boost?
[/quote]

I'm running on an i586 netbook with Ubuntu, I have GNU libraries available, and I'm compiling with G++ and ICPC (no intel C++ libraries though)..

However I'm also compiling on my Android phone with G++ and Bionic libc, Google's stripped-down C library for Android (its C++ support is pretty bare-bones).

I'd prefer not to use boost, as I'd like to keep it as simple and unconvoluted and generic as possible - my code's already a bit messy and I'm running out of time before I have to give it in :/

I have no idea if I have C++11 - is there a pretty little macro or something to discover this?

I discovered CLOCK_MONOTONIC_RAW, which gives raw hardware-based monotonic time, not subject to NTP adjustments (?)

Thread Interrupts!! Why did this not occur to me?

Heres the scruffy little wrapper I've thrown all my thinking into..

It's not too bad, though I'd like a floating point seconds value, So I can get minutes out of it.

I think because I'm testing some 600,000 unsigned long longs, it takes a while, and the long type of tv_nsec isn't big enough maybe?

#pragma once
#include <time.h>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <fstream>


//Some help with timing

#ifdef CLOCK_MONOTONIC_RAW
#define USED_CLOCK CLOCK_MONOTONIC_RAW
#else
#define USED_CLOCK CLOCK_MONOTONIC
#endif

#define BILLION 1000000000L // 1e9 nanoseconds per second

struct timestamp { // we will stack these in the vector
    double      mins;
    time_t      secs;
    long        nsecs;
    std::string desc;

    timestamp(double m, time_t s, long n, std::string d)
        : mins(m), secs(s), nsecs(n), desc(d) {}
};

// This learned from here
//http://www.guyrutenberg.com/2007/09/22/profiling-code-using-clock_gettime/
//
timespec diff(timespec start, timespec end) {

    timespec tmp;

    if( (end.tv_nsec - start.tv_nsec) < 0 ) { // borrow a second if the nanoseconds underflow
        tmp.tv_sec  = end.tv_sec - start.tv_sec - 1;
        tmp.tv_nsec = BILLION + end.tv_nsec - start.tv_nsec;
    }
    else {
        tmp.tv_sec  = end.tv_sec - start.tv_sec;
        tmp.tv_nsec = end.tv_nsec - start.tv_nsec;
    }

    return tmp;
}

//
// TWO GLOBAL VECTORS! SORRY!
// Need all timers to report onto one list so they can be gotten to elsewhere
//

std::vector<timestamp*> g_nonthread_tests, g_threaded_tests;
std::vector<timestamp*>::iterator g_index;

class Timer
{
private:
    timespec beg;
    timespec end;

public:

    std::string desc;

    void start() {
        clock_gettime( USED_CLOCK, &beg );
    }

    void stop( bool is_thr ) {

        clock_gettime( USED_CLOCK, &end );

        timespec d    = diff( beg, end ); // compute the difference once
        time_t   secs = d.tv_sec;
        long     nsecs = d.tv_nsec;
        double   mins = 0;

        if( secs >= 60 ) mins = secs / 60.0;

        std::cout << " took " << mins << "mins -> " << secs << "s" << std::endl;

        timestamp * t = new timestamp( mins, secs, nsecs, desc ); // NOTE: never freed

        if( is_thr ) g_threaded_tests.push_back( t );
        else         g_nonthread_tests.push_back( t );
    }

    void PublishFinalReportToFile( std::string filename, bool thr )
    {
        std::fstream output;
        output.open( filename.c_str(), std::ios::out | std::ios::trunc );

        if( output.is_open() && output.good() )
        {
            int count = 0;

            // pick one of the two global lists; the loop body is identical
            std::vector<timestamp*> & tests = thr ? g_threaded_tests : g_nonthread_tests;

            for ( g_index = tests.begin(); g_index != tests.end(); ++g_index, count++ )
            {
                output << "\n===============================================================";
                output << "\n::\tTest:\t" << count;
                output << "\n::\tDesc:\t" << (*g_index)->desc;
                output << "\n::\tMins: " << (*g_index)->mins;
                output << "\n::\tSecs: " << (*g_index)->secs;
                output << "\n::\tNanoSecs: " << (*g_index)->nsecs;
            }
            output << "\n===============================================================";
        }

        output.close();
    }
};

[quote name='Trienco' timestamp='1357201424' post='5017013']
Using the regular realtime or monotonic clock will screw up your results, because of the above.
[/quote]

That becomes change number one - thing is, I'm comparing threads to not-threads.. but I could work it into an if/else

[quote name='Bacterius' timestamp='1357191851' post='5016987']
If an operation is too quick (e.g. nanoseconds) you won't be able to get proper timing anyway because of processor interrupts, thread scheduling, etc.. why not run them lots of times and take the average?
[/quote]

Okay fair shout.. but it's a big array - 36MB! Read and write are much faster than search duplicates for example..

[quote name='mynameisnafe' timestamp='1357223926' post='5017114']
Okay fair shout.. but it's a big array - 36MB! Read and write are much faster than search duplicates for example..
[/quote]

No, but you only need lots of runs for the very fast operations, which are difficult to time accurately. The slow operations, by virtue of being slow, have very little relative random error in their timing (error / total time), so you don't need as many runs for those to get an accurate reading. For instance, if a duplicate search operation takes 3.4 seconds on average, you won't care whether it takes ten milliseconds more or less. But if your insert (or whatever) operation takes 1.5 milliseconds, you're going to want to measure it to within at least 0.1 milliseconds.


This topic is closed to new replies.
