Kest

Bending const rules


I have a small dilemma. I want to implement a scrambling technique for my file write methods. The saving method currently looks like this:

void savedata(const void *data, uint size)
{
    stream.write( (const char*)data, size );
}

Every file input/output in my game will use this. The function is called per integer, per bool, etc. (file.savedata(&hp_points, sizeof(int))). But to implement the scrambling, I need to do this:

void savedata(const void *data, uint size)
{
    char *temp_data = new char[size];
    memcpy( temp_data, data, size );
    Scramble( temp_data, size );
    stream.write( temp_data, size );
    delete [] temp_data;
}

Or I could do this:

void savedata(const void *data, uint size)
{
    void *goes_out_exactly_like_it_came_in = (void*) data;
    Scramble( goes_out_exactly_like_it_came_in, size );
    stream.write( (const char*) data, size );
    Unscramble( goes_out_exactly_like_it_came_in, size );
}

The allocation and deallocation of so much data will kill performance. I could write a resizable pool for the file object to allow temporary storage of the scrambled data, but it would add a lot of complication.

I already know everyone will spit at me and tell me to obey the rules. So there's really no need to reply unless you believe that breaking the rules may not be that bad in this type of situation, or you have another way around this. Thanks much for any help.

Quote:

I already know everyone will spit at me and tell me to obey the rules.


The rules you have set up yourself? If you're going to cast data to (void*) and change it, then why have you declared it as "const void*" from the beginning? It's like trying to trick yourself.

Quote:

The allocation and deallocation of so much data will kill performance.


I doubt that calling Unscramble would be any faster.

Quote:

I already know everyone will spit at me and tell me to obey the rules. So there's really no need to reply unless you believe that breaking the rules may not be that bad in this type of situation, or you have another way around this.


Another way around what? You want to compute and save a scrambled version of a buffer to a file, but also keep the original. The faster way to do that is to create a temp object. If you're worried about performance, then why don't you try to save data in bigger chunks instead of calling savedata multiple times for every little bit of info?


void Scramble( void* output, const void* input, uint size );

void savedata(const void *data, uint size)
{
    static std::vector<char> temp_buffer;

    temp_buffer.resize( size );

    Scramble( &temp_buffer[0], data, size );
    stream.write( &temp_buffer[0], size );
}


where Scramble copies from the input buffer to the output buffer.

This way you allocate/deallocate only a small, finite number of times (only as size grows toward its maximum value), since resize() to a smaller size doesn't deallocate.

Since it sounds like you're only calling it on smallish types, how about:
template <typename Data>
void savedata(const Data &data) {
    Data temp = data;
    Scramble(&temp, sizeof temp);
    stream.write((const char*)&temp, sizeof temp);
}

My C++ is rusty so that may not be correct, but you get the idea. Or, just do this:
template <typename Data>
void savedata(Data data) {
    Scramble(&data, sizeof data);
    stream.write((const char*)&data, sizeof data);
}


[Edited by - Way Walker on July 20, 2006 10:20:12 AM]

The allocation/de-allocation would be an insignificant performance impact compared to the actual File IO.

Why don't you write out as you scramble, instead of putting it in a buffer? Then you only have to buffer whatever block size you scramble in? Do you scramble each byte? Each 4 bytes? etc... Then your scrambler can read in a block, scramble it, and write it straight out to the file. You don't need any temp buffers in this case, other than the byte or byte array(block size) that you scramble at once. No dynamic memory, just a fixed size local var in your scrambler. Allow your scrambler to take various outputs into the constructor, so you can have it scramble to a memory buffer, to a file, to cout, etc...

Quote:
Original post by Kest
Or I could do this:

void savedata(const void *data, uint size)
{
    void *goes_out_exactly_like_it_came_in = (void*) data;
    Scramble( goes_out_exactly_like_it_came_in, size );
    stream.write( (const char*) data, size );
    Unscramble( goes_out_exactly_like_it_came_in, size );
}


And later on you'll call it like this:

char * p = "This is a test";
savedata(p, strlen(p));

And your program will crash because "This is a test" lives in a read-only data segment...

Usually in cases like this there is a size where the vast majority of your data is below that size and very rarely you'll get something bigger. In that case it doesn't add much complication to have a stack buffer that you use only if the data will fit and revert to allocating only for the exceptional case.

Another obvious thing to do is to change Scramble to not scramble in-place. Then you don't have to do the memcpy.

Quote:
Original post by mikeman
Quote:
The allocation and deallocation of so much data will kill performance.


I doubt that calling Unscramble would be any faster.

Why would you have any doubt at all?

I didn't put that into words very well. Allocating and deallocating thousands of tiny bits of data in Windows will completely destroy the performance to the point where you'll need a backup game to play while you wait for mine to load the next three yards of data into the map. Unscramble just performs simple value adjustments per byte. The difference isn't even worth comparing. Something like 1 to 80.

Quote:
Original post by DrEvil
The allocation/de-allocation would be an insignificant performance impact compared to the actual File IO.

I think file IO is buffered in the background and flushed out in large chunks. Memory allocations are not buffered at all as far as I can tell. Allocations per member variable blow file IO performance hits out of the water. Saving a single small map area of my game results in about 3,000 ints being shoved out. That's 3,000 allocations and deletions per small area. The file IO buffer would probably jam that onto the hard drive in one single 12KB crunch.

Quote:
Why don't you write out as you scramble, instead of putting it in a buffer? Then you only have to buffer whatever block size you scramble in? Do you scramble each byte? Each 4 bytes? etc... Then your scrambler can read in a block, scramble it, and write it straight out to the file. You don't need any temp buffers in this case, other than the byte or byte array(block size) that you scramble at once. No dynamic memory, just a fixed size local var in your scrambler. Allow your scrambler to take various outputs into the constructor, so you can have it scramble to a memory buffer, to a file, to cout, etc...

This is probably exactly what I should do. Instead of calling Scramble() if scrambling is enabled, I'll just diverge to that as the entire writing call instead of the normal call. Saving individual bytes to file at a time might hurt a bit, but nothing compared to most alternatives. I really hope fstream has an internal buffer.

Thanks for all of the advice.

Quote:
Original post by Kest
Quote:
Original post by mikeman
Quote:
The allocation and deallocation of so much data will kill performance.


I doubt that calling Unscramble would be any faster.

Why would you have any doubt at all?

I didn't put that into words very well. Allocating and deallocating thousands of tiny bits of data in Windows will completely destroy the performance to the point where you'll need a backup game to play while you wait for mine to load the next three yards of data into the map.


Did you try it? How do you know how "thousands of tiny bits" get allocated and how (in)efficient it might be?

Even then, did you consider implementing an allocator to improve upon that?

Quote:
Unscramble just performs simple value adjustments per byte. The difference isn't even worth comparing. Something like 1 to 80.


Then writing-encrypted-as-you-go should be even easier.


template<typename T>
// btw, just say no to void* in C++
void savedata(const T& thing) {
    const char* data = reinterpret_cast<const char*>(&thing);
    for (size_t i = 0; i < sizeof(T); ++i) {
        stream.put(scramble_byte(data[i]));
    }
}


Quote:

Quote:
Why don't you write out as you scramble, instead of putting it in a buffer? Then you only have to buffer whatever block size you scramble in? Do you scramble each byte? Each 4 bytes? etc... Then your scrambler can read in a block, scramble it, and write it straight out to the file. You don't need any temp buffers in this case, other than the byte or byte array(block size) that you scramble at once. No dynamic memory, just a fixed size local var in your scrambler. Allow your scrambler to take various outputs into the constructor, so you can have it scramble to a memory buffer, to a file, to cout, etc...

This is probably exactly what I should do. Instead of calling Scramble() if scrambling is enabled, I'll just diverge to that as the entire writing call instead of the normal call. Saving individual bytes to file at a time might hurt a bit, but nothing compared to most alternatives. I really hope fstream has an internal buffer.


Keep scrambling separated out into a separate function. SRP, you know.

Also please note that with templates, you can avoid having to pass in the size value (as shown above). You gain type safety and a simpler interface.

Also please note that using this sort of technique to write basically anything that isn't a non-pointer primitive or pointer-less POD struct is Bad News(TM) (and even then...).

And yes, of course fstream has an internal buffer. But even if it didn't, it wouldn't be hard to implement your own buffering into a *static* buffer instead of allocating for every write. (Also have a look at the class used for fstream's buffer, std::streambuf.) It might actually be the best thing to implement a derived streambuf that 'scrambles' its contents when it is about to 'overflow', and swap that streambuf into the fstream when scrambling is desired.

Quote:

Why would you have any doubt at all?

I didn't put that into words very well. Allocating and deallocating thousands of tiny bits of data in Windows will completely destroy the performance to the point where you'll need a backup game to play while you wait for mine to load the next three yards of data into the map. Unscramble just performs simple value adjustments per byte. The difference isn't even worth comparing. Something like 1 to 80.


That's a pretty bold statement. According to my benchmarks (performing 1 million calls of savedata with an integer as data) on my system, it's something like 1 to 1.4. And I'm talking about a Scramble() that only performs an addition, and an UnScramble() that just performs a subtraction. Never assume things like that before benchmarking yourself. If Scramble/Unscramble are even more complex than that (which I assume they are), the performance may equal or even tip over to the other side.
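In the spirit of "benchmark it yourself", here is a rough harness; the per-byte +1/-1 is a stand-in for Scramble/Unscramble, and a real measurement needs optimizations enabled plus some care that the compiler doesn't elide the loops entirely:

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>

// Times `iterations` rounds of allocate + copy + free for one int,
// returning elapsed milliseconds.
inline double bench_alloc(int iterations, int value) {
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        char* temp = new char[sizeof value];
        std::memcpy(temp, &value, sizeof value);
        delete[] temp;
    }
    return std::chrono::duration<double, std::milli>(
        std::chrono::steady_clock::now() - t0).count();
}

// Times `iterations` rounds of scrambling then unscrambling the value
// in place (per-byte +1 then -1 as a stand-in transform).
inline double bench_inplace(int iterations, int& value) {
    unsigned char* p = reinterpret_cast<unsigned char*>(&value);
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        for (std::size_t j = 0; j < sizeof value; ++j) p[j] += 1;
        for (std::size_t j = 0; j < sizeof value; ++j) p[j] -= 1;
    }
    return std::chrono::duration<double, std::milli>(
        std::chrono::steady_clock::now() - t0).count();
}
```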

Quote:
Original post by Zahlman

template<typename T>
void savedata(const T& thing) {
    const char* data = reinterpret_cast<const char*>(&thing);
    for (size_t i = 0; i < sizeof(T); ++i) {
        stream.put(scramble_byte(data[i]));
    }
}

Wouldn't this make it impossible to send arrays of data?

What about..
template<typename T>
void savedata(const T *thing, uint count=1) ?

Though what would happen in this type of situation?
int mylist[32];
file.savedata( mylist, 32 );

Does typename become int or int[32]?

I appreciate the advice.

Quote:
Original post by mikeman
That's a pretty bold statement. According to my benchmarks(performing 1 million calls of savedata with an integer as data) in my system, it's something like 1 to 1.4.

Try mixing up the sizes. Allocate or save random sizes between 1 and 50. Also, the file IO can't really be benchmarked the same way as allocation. The data should be sent almost instantly, and should start crunching away to the hard drive after the game would have already returned to normal operation.

Quote:

Try mixing up the sizes. Allocate or save random sizes between 1 and 50.


Already tried it, almost the same performance. I don't understand why you're so convinced you know how this works. Every call to "new" isn't necessarily translated into an API call to VirtualAlloc() or whatever. The runtime has its own optimized internal allocator for exactly this kind of situation, which could very likely use a resizable pool like you mentioned in your first post, or an even more effective scheme. And as I said, benchmark it and see for yourself!

Quote:

Also, the file IO can't really be benchmarked the same way as allocation. The data should be sent almost instantly, and should start crunching away to the hard drive after the game would have already returned to normal operation.


And calling UnScramble() after the stream write but before the function returns is going to help that? delete returns almost instantly.

Quote:
Original post by mikeman
Already tried it, almost the same performance. I don't understand why you're so convinced you know how this works.

Even though the argument is pointless, I'm still concerned with your results.

* allocate 4 byte buffer
* copy 4 bytes to buffer
* write 4 bytes to file
* delete 4 byte buffer

1,000,000 times

..gives better, or even nearly the same, performance as..

* write 4 bytes to file
* use addition operator four times

1,000,000 times?

If that's the results you're describing, I'm pretty sure there must have been a mistake at some point.

Quote:
Original post by Kest
Quote:
Original post by Zahlman

template<typename T>
void savedata(const T& thing) {
    const char* data = reinterpret_cast<const char*>(&thing);
    for (size_t i = 0; i < sizeof(T); ++i) {
        stream.put(scramble_byte(data[i]));
    }
}

Wouldn't this make it impossible to send arrays of data?

What about..
template<typename T>
void savedata(const T *thing, uint count=1) ?

Though what would happen in this type of situation?
int mylist[32];
file.savedata( mylist, 32 );

Does typename become int or int[32]?

I appreciate the advice.


I believe you can do an array version like:


template<typename T, size_t n>
void savedata(const T (&thing)[n]) {
    // The (&thing) parentheses make this a reference to an array of n T.
    // Ask MaulingMonkey; this is basically cribbed from his industry::size_of. :)
    const char* data = reinterpret_cast<const char*>(&thing);
    for (size_t i = 0; i < sizeof(T) * n; ++i) {
        stream.put(scramble_byte(data[i]));
    }
}


Using a reference in a template function will prevent the array from 'decaying' to a pointer. Apparently. o_O

I would still look into the streambuf idea instead, though.

