Sign in to follow this  
Followers 0
mynameisnafe

HELP! Timing from nanoseconds to hours in C++

20 posts in this topic

Please help, I can't find a solution to fit my needs!

 

I've got 18MB of array I'm doing operations on.. I'm comparing POSIX threads to not-threads on search sort read write etc.

 

I am trying to obtain timing information for these activities.

 

Some take < nanoseconds (read, write), some take hours (duplicate search, bubble sort)

 

I need to get timing info and put it in a std::string.

 

Using

clock_t end = start - clock() ; // leaves me with accuracy +/- 1 second - not good enough

 

time_t end, start; time(&start); time(&end); (double diffms = end - start * 1000) / CLOCKS_PER_SEC ; //goes all hex-y on me

 

Help!

 

I'm wrapping this up into timer.start() and timer.stop(). timer.stop pushes the time taken into a vector of 'timestamps'.

 

I fear I'm confusing you now.

 

Please help! Suggestions?

0

Share this post


Link to post
Share on other sites
There are many ways to get a high-resolution timer. Unfortunately, none of them were part of the C++98 standard.

What platform are you working with? Can you use C++11? How about Boost?
1

Share this post


Link to post
Share on other sites
Linux has clock_gettime()
 
Use it with CLOCK_MONOTONIC, and it should give you nanosecond resolution.

Hmmm... That's what I use, but with CLOCK_REALTIME. Do you know why one is preferable over the other?

EDIT: After a bit of searching, it seems CLOCK_MONOTONIC will behave better if the clock is adjusted (say, by ntpd). Edited by Álvaro
0

Share this post


Link to post
Share on other sites

[quote name='Álvaro' timestamp='1357144129' post='5016737']
What platform are you working with? Can you use C++11? How about Boost?
[/quote]

 

+1

0

Share this post


Link to post
Share on other sites

If an operation is too quick (e.g. nanoseconds) you won't be able to get proper timing anyway because of processor interrupts, thread scheduling, etc.. why not run them lots of times and take the average? Another way is to let the operation run for X seconds, stop after that time and count how many times N the operation could complete, then the average speed (in ops per second) is just N / X.

3

Share this post


Link to post
Share on other sites

Chances are your clock_gettime also supports CLOCK_PROCESS_CPU_TIME_ID and CLOCK_THREAD_PROCESS_CPU_TIME_ID. Those should only measure the amount of time spent in either your process or a particular thread.

 

Using the regular realtime or monotonic clock will screw up your results, because of the above.

0

Share this post


Link to post
Share on other sites

Hi all, thanks for the many replies - it represents the Journey of discovery I've been on!

 

Okay..

 

[quote name='Álvaro' timestamp='1357144129' post='5016737']
What platform are you working with? Can you use C++11? How about Boost?
[/quote]

 

I'm running on an i586 netbook with Ubuntu, I have GNU libraries available, and I'm compiling with G++ and ICPC (no intel C++ libraries though)..

 

However I'm also compiling on my android phone with G++ + Bionic libc, which is apparently like a half-assed C++ library for Android from Google.

 

I'd prefer not to use boost, as I'd like to keep it as simple and unconvoluted and generic as possible - my code's already a bit messy and I'm running out of time before I have to give it in :/

 

I have no idea if I have C++ 11 - is there a pretty little macro or something to discover this?

I discovered CLOCK_MONOTONIC_RAW, which gives actual processor time not adjusted by ntpd (?)

 

Thread Interrupts!! Why did this not occur to me?

 

Heres the scruffy little wrapper I've thrown all my thinking into..

 

It's not too bad, though I'd like a floating point seconds value, So I can get minutes out of it.

 

I think because I'm testing some 600,000 unsigned long longs, it takes a while, and the long type of tv_nsecs isn't big enough maybe?

 

[CODE]#pragma once
#include <time.h>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <fstream>


//Some help with timing

#ifdef  CLOCK_MONOTONIC_RAW
    #define USED_CLOCK CLOCK_MONOTONIC_RAW
#else
    #define USED_CLOCK CLOCK_MONOTONIC
#endif

#define BILLION 1000000000 // 1E9 // ..for nano seconds to seconds

struct timestamp { // we will stack these in the vector
    double mins;
    time_t secs;
    long nsecs;
    std::string desc;
    
    timestamp(double m, time_t s, long n, std::string d) {
            mins = m; secs = s; nsecs = n;
            desc = d;
    }
};

// This learned from here
//http://www.guyrutenberg.com/2007/09/22/profiling-code-using-clock_gettime/
//
timespec diff(timespec start, timespec end) {

    timespec tmp;
    
    if( (end.tv_sec - start.tv_sec) < 0 ) {
        tmp.tv_sec  = end.tv_sec - start.tv_sec - 1;
        tmp.tv_nsec = BILLION + end.tv_nsec - start.tv_nsec;
    }
    else {
        tmp.tv_sec = end.tv_sec - start.tv_sec ;
        tmp.tv_nsec = end.tv_nsec - start.tv_nsec ;
    }
    
    return tmp;
}

//
//     TWO GLOBAL VECTORS! SORRY!
//    Need all timers to report onto one list so they can be gotten to elsewhere
//

std::vector<timestamp*>             g_nonthread_tests, g_threaded_tests;
std::vector<timestamp*>::iterator     g_index;

class Timer
{
private:
        timespec beg ;
        timespec end ;
                
public:

    std::string desc;

    void start() {    
        clock_gettime( USED_CLOCK, &beg );
    }
    
    void stop( bool is_thr ) {
    
        clock_gettime( USED_CLOCK, &end ) ;
        
        long     nsecs = diff( beg, end ).tv_nsec ;
        time_t     secs  = diff( beg, end ).tv_sec  ; // nsecs / BILLION;  
        double     mins  = 0 ;
        
        if( secs > 60.0f ) mins = secs / 60.0f ;        
                
        std::cout << " took " << mins << "mins -> " << secs << "s" << std::endl ;        
                
        timestamp * t = new timestamp( mins, secs, nsecs, desc);
        
        if( is_thr )    g_threaded_tests.push_back( t );
        else            g_nonthread_tests.push_back( t );
    }
    
    void PublishFinalReportToFile( std::string filename, bool thr )
    {
        std::fstream output;
        output.open( filename.c_str(), std::ios::out | std::ios::trunc );
    
        if( output.is_open() && output.good() )
        {    int count = 0 ;
        
            if( thr ) {
                for ( g_index = g_threaded_tests.begin(); g_index  != g_threaded_tests.end(); ++g_index, count++  )
                {
                    output << "\n===============================================================";
                    output << "\n::\tTest:\t"     << count ;
                    output << "\n::\tDesc:\t"     << (*g_index)->desc ;
                    output << "\n::\tMins: "    << (*g_index)->mins ;
                    output << "\n::\tSecs: "    << (*g_index)->secs ;
                    output << "\n::\tNanoSecs: "<< (*g_index)->nsecs;
                }
                    output << "\n===============================================================";
            }
            
            else {
                for ( g_index = g_nonthread_tests.begin(); g_index  != g_nonthread_tests.end(); ++g_index, count++  )
                {
                    output << "\n===============================================================";
                    output << "\n::\tTest:\t"     << count ;
                    output << "\n::\tDesc:\t"     << (*g_index)->desc ;
                    output << "\n::\tMins: "    << (*g_index)->mins ;
                    output << "\n::\tSecs: "    << (*g_index)->secs ;
                    output << "\n::\tNanoSecs: "<< (*g_index)->nsecs;;
                }
                    output << "\n===============================================================";
            }    
        }
        
        output.close();
    }
};[/CODE]

 

[quote name='Trienco' timestamp='1357201424' post='5017013']
Using the regular realtime or monotonic clock will screw up your results, because of the above.
[/quote]

--B3comes change 1 - thing is I'm comparing threads to not-threads.. but I could work it into an if/else

 

[quote name='Bacterius' timestamp='1357191851' post='5016987']
If an operation is too quick (e.g. nanoseconds) you won't be able to get proper timing anyway because of processor interrupts, thread scheduling, etc.. why not run them lots of times and take the average?
[/quote]

Okay fair shout.. but it's a big array - 36MB! Read and write are much faster than search duplicates for example..

0

Share this post


Link to post
Share on other sites

[quote name='mynameisnafe' timestamp='1357223926' post='5017114']
Okay fair shout.. but it's a big array - 36MB! Read and write are much faster than search duplicates for example..
[/quote]

No but you only do lots of runs for the very fast operations which are difficult to time accurately. The slow operations, by virtue of being slow, have very little relative random error in their timing (error / total time), so you don't need as many runs for those to get an accurate reading. For instance, if a duplicate search operation takes on average 3.4 seconds, you won't care about whether it takes ten milliseconds less or more. But if your insert (or whatever) operation, on the other hand, takes 1.5 milliseconds, you are going to want to get it to within at least 0.1 milliseconds.

1

Share this post


Link to post
Share on other sites

Okay fair shout.. but it's a big array - 36MB! Read and write are much faster than search duplicates for example..

 

No but you only do lots of runs for the very fast operations which are difficult to time accurately. The slow operations, by virtue of being slow, have very little relative random error in their timing (error / total time), so you don't need as many runs for those to get an accurate reading. For instance, if a duplicate search operation takes on average 3.4 seconds, you won't care about whether it takes ten milliseconds less or more. But if your insert (or whatever) operation, on the other hand, takes 1.5 milliseconds, you are going to want to get it to within at least 0.1 milliseconds.

 

That does make sense.. and would explain this kind of output 6) executes pretty quickly, but this number is a long (for seconds). 7) takes more time, and gives more feasible output, i.e. a positive number:

 

[CODE]
6) Portioned Duplicate-Search: 4 sequential function calls.
 took 0mins -> -134560216s
    1 Duplicates were found.
 took 0mins -> -134560213s
    0 Duplicates were found.
 took 0mins -> -134560210s
    0 Duplicates were found.
 took 0mins -> -134560207s
    0 Duplicates were found.

7) Portioned Bubble Sort: 4 sequential function calls.
 took 0mins -> 35s
 took 1.26667mins -> 76s
 took 1.86667mins -> 112s[/CODE]

 

Okay, so agreed, the quicker ops will have to benchmarked as an average.. though I wouldn't mind a decimal seconds value like 0.01s or something.

 

My main concern is these negative numbers, appearing.. what've I done wrong?

0

Share this post


Link to post
Share on other sites

[quote name='mynameisnafe' timestamp='1357225609' post='5017128']
My main concern is these negative numbers, appearing.. what've I done wrong?
[/quote]

This often happens when the operation was completed so quickly that the time calculation underflows, possibly because of clock monotonicity issues as highlighted by previous posters. But your timing code is unusual, lots of manipulations happening there, there could be a bug. Do the negative values ever come up for long operations (those who take a measurable time to execute, say on the order of a second)?

0

Share this post


Link to post
Share on other sites

My main concern is these negative numbers, appearing.. what've I done wrong?

This often happens when the operation was completed so quickly that the time calculation underflows, possibly because of clock monotonicity issues as highlighted by previous posters. But your timing code is unusual, lots of manipulations happening there, there could be a bug. Do the negative values ever come up for long operations (those who take a measurable time to execute, say on the order of a second)?

 

Monotonicity?

They do usually appear with ops between say 1-3 seconds? I agree it's a bit convuluted- essentially each threadable function holds a Timer: a description is composed, then timer.start() called, before the operation that is being timed, immediately following the timed op is timer.stop(), and since these functions use clock_gettime() right at the start respectively (as in the code way above), there shouln't be too much overhead.

 

My gut tells me that the timespec's 'long tv_secs' isn't able to hold large-enough unsigned numbers as I require maybe

I dropped the RAW from monotonic, and my negative values dissappeared.. wtf? Here's some sample output from the test I just ran, things are neater in the output file - look at nanosecs compared to secs and my confusion is evident.. this is from the non-threaded, sequential-function-call test.

 

PS I cut down the array size so I can test quicker - waiting 2 hours to see your timing data get smooshed is just way beyond uncool. lol

 

===============================================================
::    Test:    18
::    Desc:    Bubble Sort, in range 0-25000
::    Mins: 0
::    Secs: 8
::    NanoSecs: 518027650
===============================================================
::    Test:    19
::    Desc:    Bubble Sort, in range 25000-50000
::    Mins: 0
::    Secs: 18
::    NanoSecs: 326836672
===============================================================
::    Test:    20
::    Desc:    Bubble Sort, in range 50000-75000
::    Mins: 0
::    Secs: 27
::    NanoSecs: 18446744073391663619
===============================================================
::    Test:    21
::    Desc:    Bubble Sort, in range 75000-100000
::    Mins: 0
::    Secs: 37
::    NanoSecs: 18446744073256571601
===============================================================
::    Test:    22
::    Desc:    Sequential function-call Benchmarking test.
::    01:07:25 Jan  3 2013
::    Testing integer array of size = 100000 @ 6MB.
::    Test Executed Successfully.
::    Mins: 1.55
::    Secs: 93
::    NanoSecs: 48852266
===============================================================

 

+

+ or from the threaded test:

+

===============================================================
::    Test:    17
::    Desc:    Duplicate Search, in range 0-25000
::    1 duplicates were found.
::    Mins: 0
::    Secs: 2
::    NanoSecs: 18446744072997939107
===============================================================
::    Test:    18
::    Desc:    Bubble Sort, in range 0-25000
::    Mins: 0
::    Secs: 12
::    NanoSecs: 3231151
===============================================================

 

How is the large number of ms smaller than the small no of seconds?

 

 

I would use nanoseconds as the unit of time, in a 64-bit integer. It's much harder to make mistakes that way.

 

like a _uint64 ? How would that work? Also makes me wonder - is my #define BILLION signed or unsigned?

0

Share this post


Link to post
Share on other sites
Macros (things defined with #define) are not part of the type system: They are just text-replacement tools. So your `BILLION' is the same as the string `1000000000' anywhere you use it in the code. That probably means it's signed wherever you use it. You could just append `u' or `ul' to make the constant unsigned or unsigned long. But you should probably use a proper variable instead:
unsigned long const BILLION = 1000000000ul;
0

Share this post


Link to post
Share on other sites

There's something horribly wrong with the diff() function. It leads to either negative seconds, or negative nanoseconds. These negative nanoseconds are most likely being casted to some unsigned long somewhere along the line, which underflows into large 18446744072997... values, which are very close to 2^64. As for the negative seconds, I'm not sure yet what the cause is, but apparently you fixed that by changing the clock type, so we can figure it out later.

0

Share this post


Link to post
Share on other sites

Macros (things defined with #define) are not part of the type system: They are just text-replacement tools. So your `BILLION' is the same as the string `1000000000' anywhere you use it in the code. That probably means it's signed wherever you use it. You could just append `u' or `ul' to make the constant unsigned or unsigned long. But you should probably use a proper variable instead:

unsigned long const BILLION = 1000000000ul;

 

 Okay this sounds good, we'll try it! I would of thought a compiler would say 'it's unsigned unless it's signed' O_o

 

There's something horribly wrong with the diff() function. It leads to either negative seconds, or negative nanoseconds. These negative nanoseconds are most likely being casted to some unsigned long somewhere along the line, which underflows into large 18446744072997... values, which are very close to 2^64. As for the negative seconds, I'm not sure yet what the cause is, but apparently you fixed that by changing the clock type, so we can figure it out later.

 

I know right, just to look at it it seems overcomplicated, but the author of the page it's from is using it with CLOCK_REALTIME, CLOCK_PROCESS_CPUTIME_ID, and CLOCK_THREAD_CPUTIME_ID,.. which given what my C/W is to do sounds ideal, but they lead to some strange numbers..

 

I sought help of a guy on in my class and he recommended  CLOCK_MONOTONIC_RAW .. it was dropping the RAW that gave me some accurate figures from that diff() function..?

Okay, heres what I'm gonna do. I'm gonna go trial all these clock type Macros; the CPUTIME ones and the MONOTONIC, and report back what I get, tell you if I magically fixes it, and we'll be able to see the differences between the clock-types. In the mean time, could somebody have a scan of what's on the guys page and let me know if it's correct?

 

Looking at this again, I think it's supposed to say 'if the diff in seconds is negative, make positive, else chill'?

 

[CODE]if( (end.tv_sec - start.tv_sec) < 0 ) {         tmp.tv_sec  = end.tv_sec - start.tv_sec - 1;         tmp.tv_nsec = BILLION + end.tv_nsec - start.tv_nsec;     }     else {         tmp.tv_sec = end.tv_sec - start.tv_sec ;         tmp.tv_nsec = end.tv_nsec - start.tv_nsec ;     }[/CODE]

 

I'll be back with more soon

0

Share this post


Link to post
Share on other sites
I'll say it again: Use a 64-bit integer type for a number of nanoseconds, and you'll save yourself some trouble.
0

Share this post


Link to post
Share on other sites

My netbook f******* hates me for this. Well it can blow me :P (Y)

Okay.. I had been testing each clock-type and seeing what results I got.. before 'it' just died on me and I lost my results I had in this here text box, but I learned that:

  • My system doesn't have MONOTONIC RAW, and that
  • MONOTONIC gave accurate results for threaded but not non-threaded timing - i.e, use this on a thread and the timer class won't give back crazy negative numbers

 

  • The definition of BILLION after the preprocessor and explicitly as a ul has definitely had an impact on all calculations in a good way (Y)

 

  • I was testing multithreaded with CLOCK_REALTIME - when it died on me - I have to be patient with it and not copy and paste to here 'til the whole test finishes..
    CLOCK_REALTIME gives me some hex in the mins and a massive amount of seconds.. on an op of around 1-2 seconds perhaps is not what I'm after ) however it gives accurate results when it returns something back like 34s.. See the copy 'n' paste below..
    [CODE]
    4) Portioned Read: 4 sequential function calls.
     took 0mins -> 0s
     took 0mins -> 0s
     took 0mins -> 0s
     took 0mins -> 0s

    5) Portioned Search: 4 sequential function calls.
     took 0mins -> 0s
     took 0mins -> 0s
     took 0mins -> 0s
     took 0mins -> 0s

    6) Portioned Duplicate-Search: 4 sequential function calls.
     took 2.03782e+07mins -> 1222693435s
        1 Duplicates were found.
     took 2.03782e+07mins -> 1222693438s
        0 Duplicates were found.
     took 2.03782e+07mins -> 1222693441s
        0 Duplicates were found.
     took 2.03782e+07mins -> 1222693444s
        0 Duplicates were found.

    7) Portioned Bubble Sort: 4 sequential function calls.
     took 0mins -> 35s
     took 1.26667mins -> 76s
     took 1.8mins -> 108s
     took 2.48333mins -> 149s
     took 6.33333mins -> 380s
    [/CODE]

    Under concurrent threaded execution, gave perfect-looking results: (sorry about bullet points)

  • [CODE]
    5) Portioned Search: 4 concurrent threads.
     took 0mins -> 0s
     took  took 0mins -> 0s
  • 0mins -> 0s
     took 0mins -> 0s

    6) Portioned Duplicate-Search: 4 concurrent threads.
     took 0mins -> 6s
        0 Duplicates were found.
     took 0mins -> 6s
        1 Duplicates were found.
     took 0mins -> 6s
        0 Duplicates were found.
     took 0mins -> 6s
        0 Duplicates were found.

    7) Portioned Bubble Sort: 4 concurrent threads
     took 0mins -> 45s
     took 1.46667mins -> 88s
     took 2.51667mins -> 151s
     took 3.11667mins -> 187s
     took 3.21667mins -> 193s

    [/CODE]

I'm now going to try the THREAD_CPU_ID and PROCESS_CPU_ID and I'll report back about those next. Does anybody know why I got such strange results with REALTIME?

0

Share this post


Link to post
Share on other sites

OKAY, Hang my head in shame! In duplicate search function I wasn't even calling clock.start().. THATS why it was giving me gip. Oh god forgive me! *hangs head in shame and horror*

However the advice and guidance you guys have given me has been invaluable, thank you! :)

 

Heres how I decided to do it in the end..

 

[CODE]//Some help with timing, I've been trying to work out which for what purpose
#define CLK_CM_     CLOCK_MONOTONIC
//############################################
#define CLK_RT_     CLOCK_REALTIME
//############################################
#define CLK_THR_     CLOCK_THREAD_CPUTIME_ID
#define CLK_PRC_     CLOCK_PROCESS_CPUTIME_ID
//############################################
/*
    Here we define the macros we are actually using
    depending on whether we are timing
    thread or function execution time
*/
#define THR_CLOCK CLK_THR_
#define PRC_CLOCK CLK_PRC_

const unsigned long BILLION = 1000000000ul;  // ns -> s       timespec diff( timespec start, timespec end )
    {
        timespec tmp;
    
        if( ( end.tv_sec - start.tv_sec ) < 0 )  
        {
        
            tmp.tv_sec  = end.tv_sec - start.tv_sec - 1;
            tmp.tv_nsec = BILLION + end.tv_nsec - start.tv_nsec;
            
        }
        else  
        {
        
            tmp.tv_sec  = end.tv_sec - start.tv_sec ;
            tmp.tv_nsec = end.tv_nsec - start.tv_nsec ;
            
        }
        
        return tmp;
    }       void start( bool is_thr )        // Start the timer
    {    
        if(is_thr)        clock_gettime( THR_CLOCK, &beg );
        else            clock_gettime( PRC_CLOCK, &beg );
    }
    
    void stop( bool is_thr )         // Stop the timer
    {
        /*
            Use the right clock type
        */
        
        if( is_thr )    clock_gettime( THR_CLOCK, &end );
        else            clock_gettime( PRC_CLOCK, &end );
        
        /*
            Declare minutes, seconds, nanoseconds
             - calculate time elapsed
        */
        
        double     mins  = 0 ;
        time_t     secs  = diff( beg, end ).tv_sec  ;
        long     nsecs = diff( beg, end ).tv_nsec ;
        
        /*
            Do we need mins?
        */
        
        if( secs > 60.0f ) mins = secs / 60.0f ;        
                
        /*
            Spit out details
        */
        
        std::cout << " took " << mins << "mins -> " << secs << "s"
            << std::endl ;        
    }[/CODE]

 

It was while I was testing all the different types out when I noticed my error, not even in the timer class.. for shame!

0

Share this post


Link to post
Share on other sites
OKAY, Hang my head in shame! In duplicate search function I wasn't even calling clock.start().. THATS why it was giving me gip. Oh god forgive me! *hangs head in shame and horror*

 

hangman.gif

1

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0