Storing function arguments

7 comments, last by rick_appleton 13 years, 1 month ago
For a logging utility I'm making I want to log function calls, with the parameters, and optionally the return value. Since the apps using this are usually high-performance, I don't want to take the time converting the parameters to a string inline with the function call. I've decided I want to store this data, and then in another thread I'll retrieve it and do the printing/analysing.

I think one of the fastest ways to store this data is to have a large block of memory, and simply 'push' the function pointer, the arguments, and the return value onto this sequentially. The question I'm now pondering is how to do this efficiently, and in clear code. Each function that will be logged this way can have a different number and type of arguments. Since arguments can have different sizes (bool vs double), I suspect I'll need to use some template or define magic to get this done nicely. However, so far I've not been able to think of a nice way to do it other than writing it out for each argument.

Basically I want to find a way to achieve the following with only a single line of code (or two when adding the return value):

char* currentPositionInBlock; // write cursor into the pre-allocated log block
int MyFunction( bool arg1, char* arg2 )
{
    int (*fncPtr)( bool, char* ) = &MyFunction;
    memcpy(currentPositionInBlock, &fncPtr, sizeof(fncPtr)); currentPositionInBlock += sizeof(fncPtr);
    memcpy(currentPositionInBlock, &arg1, sizeof(arg1)); currentPositionInBlock += sizeof(arg1);
    memcpy(currentPositionInBlock, &arg2, sizeof(arg2)); currentPositionInBlock += sizeof(arg2);

    int result = ...

    memcpy(currentPositionInBlock, &result, sizeof(result)); currentPositionInBlock += sizeof(result);
    return result;
}

Of course I'll also need to retrieve this data and handle it, but since that isn't time sensitive and I know what function takes what parameters, I'm sure I'll manage that.

Kind regards,
Rick
AFAIK, with the C calling convention a function has no way of knowing how many parameters it received; it can only hope that it got at least as many as it needs.

However, if you are willing to supply the number of parameters of each function by hand, you could write a piece of code that walks back up the stack (or extracts the arguments from registers, depending on the calling convention you use) and writes them to your buffer in a loop.

Keep in mind that pointers and references will be logged as just that: addresses of memory that is probably no longer valid by the time you check the log. The same goes for your code above.

Thank you for your reply. This is pretty much the other option, and certainly more generic. Luckily the functions are plain C functions, and I do know the number of arguments, so I could certainly specify the number by hand. I already expect that however I end up solving this I'll always need to do that, and possibly explicitly note if there is a return value.

Not having access to the memory pointed to by pointers is fine.
Is this C or C++? In C++ you can use templates to solve this in a single line. So you would have definitions like

class Log {
public:
    template< typename ReturnT >
    void log( ReturnT r ) { ... }

    template< typename ReturnT, typename Arg1T >
    void log( ReturnT r, Arg1T a1 ) { ... }

    template< typename ReturnT, typename Arg1T, typename Arg2T >
    void log( ReturnT r, Arg1T a1, Arg2T a2 ) { ... }

    template< typename ReturnT, typename Arg1T, typename Arg2T, typename Arg3T >
    void log( ReturnT r, Arg1T a1, Arg2T a2, Arg3T a3 ) { ... }

    // etc.
};

and then in your function you let function overloading select the right version

int MyFunction( bool arg1, char* arg2 ) {
    int result = ...
    log_instance.log( result, arg1, arg2 );
    return result;
}

Writing log functions for an arbitrary number of arguments will of course be a PITA, but that is where Boost.Preprocessor comes in. It automates this kind of copy-paste code.

What to do with different types of arguments can also be automated via template specialization. Start with a simple case for handling built-in types, and specialize it to handle any custom types you might decide to pass in.

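template< typename T >
struct Converter {
    // Copies the raw bytes of arg into block at offset and returns the new offset.
    size_t operator()( char* block, const T& arg, size_t offset ) const {
        memcpy( block + offset, &arg, sizeof( arg ) );
        return offset + sizeof( arg );
    }
};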
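
For what it's worth, if a C++11 compiler is available, the family of overloads above can probably be collapsed into one variadic template. The following is only a sketch under that assumption: `LogBlock`, `push` and `logCall` are made-up names rather than anything from this thread, and it only handles trivially copyable arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-size scratch block (name and size are illustrative).
struct LogBlock {
    char data[4096];
    std::size_t used = 0;

    // Append the raw bytes of one value to the end of the block.
    template <typename T>
    void push(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "only raw-copyable values can be stored this way");
        assert(used + sizeof(T) <= sizeof(data)); // caller must swap blocks before it fills up
        std::memcpy(data + used, &value, sizeof(T));
        used += sizeof(T);
    }
};

// One template instead of one overload per argument count (requires C++11).
template <typename Fn, typename... Args>
void logCall(LogBlock& block, Fn fn, const Args&... args) {
    block.push(fn);
    // Braced initializer lists are evaluated left to right, so the
    // arguments are stored in the order they were passed.
    int expand[] = {0, (block.push(args), 0)...};
    (void)expand;
}
```

A call site would then be a single line such as `logCall(block, &MyFunction, arg1, arg2);`, followed by `block.push(result);` if the return value should be stored as well.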

[quote]Since the apps using this are usually high-performance, I don't want to take the time converting the parameters to a string inline with the function call. I've decided I want to store this data, and then in another thread I'll retrieve it and do the printing/analysing.[/quote]

This is generally a bad idea.

If processing the logs takes longer than generating them, the logging thread will lag behind the main application. Since it stores unprocessed logs, it will consume ever-increasing amounts of memory.

And if logs can be processed faster than they are generated, then it makes no sense to shuffle them to another thread; just do it at the call site.

[quote]I think one of the fastest ways to store this data is to have a large block of memory, and simply 'push' the function pointer, the arguments, and the return value onto this sequentially.[/quote]

True, but what happens when this storage is exhausted?

Do you block until memory is available? If so, then having a separate thread doesn't help: average throughput will be limited by how long it takes to process the logs. The extra thread also adds overhead, so it's worse than just processing the logs directly.

And for high-performance applications, operations should take O(1) time.

Since some operations cannot be done in such time, a different approach is often taken. The Java VM, for example, is unsuitable for real-time control because of garbage collection. Apart from being non-deterministic, a GC pass also has unbounded running time, meaning that an application can spend seconds, or in the worst case even minutes, in it.
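
One way to keep the call-site cost at O(1) without ever blocking is a fixed-capacity queue that simply drops records when it is full. The sketch below assumes exactly one producer thread and one consumer thread and C++11 atomics; `BoundedLogQueue`, `tryPush`, `tryPop` and `dropped` are hypothetical names, and whether dropping log records is acceptable depends on the application.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity queue for ONE producer and ONE consumer thread.
// One slot is always left empty, so it holds at most Capacity - 1 records.
template <typename Record, std::size_t Capacity>
class BoundedLogQueue {
    static_assert(Capacity > 1, "need at least two slots");
public:
    // Application thread. Never blocks: returns false (and counts a drop) when full.
    bool tryPush(const Record& r) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) {
            dropped_.fetch_add(1, std::memory_order_relaxed);
            return false;
        }
        slots_[head] = r;
        head_.store(next, std::memory_order_release); // publish the record
        return true;
    }

    // Logging thread. Returns false when there is nothing to read.
    bool tryPop(Record& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        out = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release); // free the slot
        return true;
    }

    // Number of records discarded because the queue was full.
    std::size_t dropped() const { return dropped_.load(std::memory_order_relaxed); }

private:
    Record slots_[Capacity];
    std::atomic<std::size_t> head_{0};    // next slot to write
    std::atomic<std::size_t> tail_{0};    // next slot to read
    std::atomic<std::size_t> dropped_{0};
};
```

Memory use is fixed at construction, and the drop counter at least makes it visible when the logging thread cannot keep up.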
Isn't this where varargs become useful? Or am I misunderstanding what you're trying to do?

http://publications.gbdirect.co.uk/c_book/chapter9/stdarg.html
http://msdn.microsoft.com/en-us/library/fxhdxye9%28v=VS.100%29.aspx

Thanks for the template suggestion. I'll likely go for something along these lines.


I'm not sure I'm following you, Antheus. In general I expect applications using this to fall into two categories: either high-performance multi-core, or high-performance single-core. The functions I will be logging will generally be called from a single thread, or at most two. The applications will be doing other work besides my logging, so on the main thread the logging will take a small amount of time compared to the total program time. It is this small amount of time that I want to minimize.

If it's a single-core application, then the other cores will be free to do the 'heavy' logging/analytics. If the application is truly CPU bound on all cores, then the user will just have to take the resulting performance hit, and it won't really matter on which thread I do what.

I'll be swapping blocks of memory to keep the consumption down. However, I will take your comment into account. If the system is fully loaded, I expect it might be possible for the logs to be processed more slowly than they are generated, so I'll need to do something about memory use spiraling out of control.
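
The block swapping mentioned above could be capped with a fixed pool of blocks, so that memory cannot grow when the logging thread falls behind. This is just a rough sketch: `Block`, `BlockPool` and the method names are invented for illustration, and the mutex is only taken once per block swap, not once per logged call.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical scratch block that the application thread fills with raw data.
struct Block {
    char data[1024];
    std::size_t used = 0;
};

// Hypothetical pool with a fixed number of blocks, so memory use is bounded.
class BlockPool {
public:
    explicit BlockPool(std::size_t count) : storage_(count) {
        for (Block& b : storage_) free_.push_back(&b);
    }

    // Producer: get an empty block, or nullptr if the consumer is behind.
    Block* acquireEmpty() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty()) return nullptr;
        Block* b = free_.back();
        free_.pop_back();
        return b;
    }

    // Producer: hand a filled block over to the logging thread.
    void submitFull(Block* b) {
        assert(b != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        full_.push_back(b);
    }

    // Consumer: take the oldest filled block, or nullptr if none is waiting.
    Block* takeFull() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (full_.empty()) return nullptr;
        Block* b = full_.front();
        full_.pop_front();
        return b;
    }

    // Consumer: return a processed block so it can be reused.
    void release(Block* b) {
        assert(b != nullptr);
        b->used = 0;
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(b);
    }

private:
    std::mutex mutex_;
    std::vector<Block> storage_; // never resized, so pointers into it stay valid
    std::vector<Block*> free_;
    std::deque<Block*> full_;
};
```

When `acquireEmpty()` returns `nullptr` the caller has to choose a policy (drop records, overwrite the current block, or wait); the sketch deliberately leaves that decision open.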

As far as I can tell, varargs are useful for accessing a variable number of arguments, not for storing them. I have many functions, for each of which I know the exact number and types of the arguments, and I want to store each call so that I can analyse it later. varargs might be useful at the other end though, when I decode this data.

I also remembered an article by Mark Jawad about deferred function calling. It's in Game Programming Gems 7, so I'll take another look at that.
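
For the decoding end, something along these lines might be enough, since the reader already knows which types each function takes. It assumes the record layout from the first post (function pointer, `bool`, `char*`, `int` result, packed back to back); `BlockReader`, `MyFunctionPtr`, `MyFunctionCall` and `decodeMyFunctionCall` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical cursor over a block that was filled with raw memcpy'd values.
class BlockReader {
public:
    BlockReader(const char* data, std::size_t size)
        : data_(data), size_(size), pos_(0) {}

    // Read the next value. The caller must request the same types,
    // in the same order, as they were written.
    template <typename T>
    T read() {
        assert(pos_ + sizeof(T) <= size_);
        T value;
        std::memcpy(&value, data_ + pos_, sizeof(T)); // memcpy copes with unaligned data
        pos_ += sizeof(T);
        return value;
    }

    bool atEnd() const { return pos_ == size_; }

private:
    const char* data_;
    std::size_t size_;
    std::size_t pos_;
};

// Assumed record layout for MyFunction: function pointer, bool, char*, int result.
typedef int (*MyFunctionPtr)(bool, char*);

struct MyFunctionCall {
    MyFunctionPtr fn;
    bool arg1;
    char* arg2; // only the address; the pointed-to memory may be gone by now
    int result;
};

MyFunctionCall decodeMyFunctionCall(BlockReader& reader) {
    MyFunctionCall call;
    call.fn = reader.read<MyFunctionPtr>();
    call.arg1 = reader.read<bool>();
    call.arg2 = reader.read<char*>();
    call.result = reader.read<int>();
    return call;
}
```

In a real decoder the stored function pointer would presumably be read first and compared against the known functions to pick the matching decode routine.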
Using sizeof( variable-name ) is poor coding practice, particularly when you already know the size of the argument. I recommend, in your example, you use sizeof(bool) and sizeof(char*). sizeof() is not a function, it is a compile-time operator. Using a data-type as an argument for sizeof() rather than a variable-name will help remind you what you're really coding.

I'm not sure of the intent of your example function. Just a comment that it might be better to copy the actual data to the log, rather than a pointer to the data. I.e.,

memcpy( logmemory, &arg2, sizeof( arg2 ) ); // copies only the pointer value (4 bytes on a 32-bit build)
// maybe better is..
memcpy( logmemory, arg2, amountOfDataToCopy );


For instance, from your code, it appears that you expect arg2 to point to memory that will not change by the time you use whatever it is that arg2 points to. That is, are you certain that the string pointed to by arg2 (in your example) will still exist when you attempt to retrieve/interpret the string later?

I.e.,

// this is okay
char *myStr = "Here's a log message."; // static string
MyFunction( true, myStr );

// however
char *myStr2 = new char[256]; // dynamic string
strcpy( myStr2, ..someOtherString.. );
MyFunction( true, myStr2 );
delete[] myStr2; // now you're in trouble
// if you later attempt to read from the myStr2 pointer you wrote to the log,
// you'll probably get an error or access violation


An alternative would be to copy the string itself into your log. I.e., use strlen(arg2), strcpy( yourlog, arg2 ), etc., rather than relying on the memory pointed to by arg2 to exist for the life of the app. Yeah, some compilers (MSVC, for one) flag strcpy as unsafe, but I think you get the idea.
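
Copying the characters instead of the pointer could look roughly like this. It is a sketch only: `pushString` and its parameters are hypothetical, and it stores a length prefix followed by the bytes, with a cap so that one huge string cannot dominate the cost of a logged call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies the characters of text (not the pointer) into dest as a
// length prefix followed by the bytes, without a terminating '\0'.
// Strings longer than maxChars are truncated.
// Returns the number of bytes written, or 0 if the record does not fit.
std::size_t pushString(char* dest, std::size_t destSize,
                       const char* text, std::size_t maxChars) {
    assert(dest != nullptr && text != nullptr);
    std::size_t length = std::strlen(text);
    if (length > maxChars) length = maxChars;
    const std::size_t needed = sizeof(length) + length;
    if (needed > destSize) return 0;
    std::memcpy(dest, &length, sizeof(length));
    std::memcpy(dest + sizeof(length), text, length);
    return needed;
}
```

The price is a strlen plus a copy on the calling thread, so it is a trade-off against the "store only the pointer value" approach rather than a free improvement.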

You are indeed correct. For now I'm not really interested in the contents of the pointers, I'm just going to print the value out. I realize this is not really useful info to have, but I can't very well leave out arguments if I'm logging the function calls. Later on I will indeed copy any info that is passed as a pointer. Likely I'll use a scratch heap for that as well, similar to how I plan to use scratch heaps to store the arguments on.

This topic is closed to new replies.