
Member Since 05 Mar 2004
Offline Last Active Oct 07 2015 12:59 AM

Posts I've Made

In Topic: vector push_back creates error when multi threading

03 October 2015 - 02:33 PM

And ideally, you pass the mutex as a parameter.


Statics/globals are the enemy of multi-threading, and having that singular mutex pretty much ensures that you are not, in fact, parallelising much of anything.

Or the better option would be to make this doit function a member function instead, with the mutex as a member variable.

That way the caller can't pass in the wrong mutex.
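A minimal sketch of that member-function approach follows. The class name Collector, the size() accessor and the fill_from_threads driver are made up for illustration; only doit comes from the thread, and its real signature is assumed here.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class: owns both the data and the mutex that guards it.
class Collector {
public:
    // Member-function version of "doit": the lock is taken internally,
    // so a caller has no way to supply the wrong mutex.
    void doit(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        values_.push_back(value);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return values_.size();
    }

private:
    mutable std::mutex mutex_;  // guards values_
    std::vector<int> values_;
};

// Hypothetical driver: several threads push into the same Collector.
std::size_t fill_from_threads(int thread_count, int pushes_per_thread) {
    Collector collector;
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&collector, pushes_per_thread] {
            for (int i = 0; i < pushes_per_thread; ++i) {
                collector.doit(i);
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    return collector.size();
}
```

Each Collector instance carries its own mutex, so two unrelated collectors never contend with each other the way they would with a single static mutex.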

In Topic: Is it possible to optimize this? (keeping a small array in registers)

28 September 2015 - 01:55 AM

Possible ideas that haven't been mentioned yet:

Consider switching from arrays of structures to structures of arrays.

Consider adding prefetch inline assembly language instructions or intrinsics.

You may be able to calculate partial sums of the values in the dist array, keeping track elsewhere which portions of the array have been touched, or heaven forbid you update the sum upon every change to an item in the dist array if it changes rarely.
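To illustrate the first suggestion, here is a rough sketch of the two layouts. The x, y, z fields and all type and function names are assumptions made up for the example; only the idea of a dist array comes from the thread.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures layout: every element carries every field, so a loop
// that only needs dist still drags x, y and z through the cache.
struct ParticleAoS {
    float x, y, z;
    float dist;
};

// Structure-of-arrays layout: one contiguous array per field.
struct ParticlesSoA {
    std::vector<float> x, y, z;
    std::vector<float> dist;
};

// Converts the AoS form into the SoA form.
ParticlesSoA to_soa(const std::vector<ParticleAoS>& ps) {
    ParticlesSoA out;
    out.x.reserve(ps.size());
    out.y.reserve(ps.size());
    out.z.reserve(ps.size());
    out.dist.reserve(ps.size());
    for (const ParticleAoS& p : ps) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
        out.z.push_back(p.z);
        out.dist.push_back(p.dist);
    }
    return out;
}

// Touches only the dist array, which is now densely packed in memory.
float sum_dist(const ParticlesSoA& ps) {
    float total = 0.0f;
    for (float d : ps.dist) {
        total += d;
    }
    return total;
}
```

Whether this helps depends on the access pattern: it tends to pay off when hot loops read one or two fields across many elements, and can hurt when every field of a single element is needed together.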

Okay, I'm starting to get into the territory of wild asz guesses now. Context is key. The more context we have, the more optimisations we can apply. There are plenty of things I could suggest, but they would only work under specific conditions which you have not specified. The less info you provide, the more you lose out. You're making this challenge like doing keyhole surgery blindfolded. But here's the kicker... you can't know what we will need to know in order to pick the best ideas. Really, your only option is to be as open as possible and provide as much contextual information as you can.


Believe me I've done plenty of micro-optimisation. If you've heard of John Carmack's famous parallel divide where you get the floating point perspective divide essentially for free as it is done in parallel with integer instructions... well I've managed to achieve exactly that directly in carefully crafted C++ code, without having to resort to inline asm, but I digress.


I've also seen the problem of summing up all values in an array being parallelised in such a way as to actually gain performance by summing up into more than one variable at once, then summing those at the end. I think this was somewhat compiler specific. This should jog someone else's memory to fill in the details for you.
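For what it's worth, the multiple-accumulator idea can be sketched roughly like this. The function name and the choice of four accumulators are assumptions for illustration; whether it is actually faster depends on the compiler and CPU, so measure it.

```cpp
#include <cstddef>
#include <vector>

// Sums into four independent accumulators so consecutive additions do not
// all wait on the same running total, then combines them at the end.
float sum_four_accumulators(const std::vector<float>& v) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    const std::size_t n = v.size();
    std::size_t i = 0;

    // Main loop: four elements per iteration, one per accumulator.
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }

    // Tail: up to three leftover elements.
    for (; i < n; ++i) {
        s0 += v[i];
    }

    return (s0 + s1) + (s2 + s3);
}
```

Note that floating-point addition is not associative, so the result can differ in the last bits from a plain left-to-right sum. That is also why compilers generally won't do this reordering on their own without relaxed floating-point flags.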

In Topic: lame question how to delete something from vector

05 September 2015 - 05:32 PM

Is the Erase-Remove Idiom still valid programming practice?

Always will be.

Any time you may be removing more than one item, it's probably the better option.
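For anyone who hasn't seen it, the idiom looks roughly like this. The two helper names are made up for the example; std::remove, std::remove_if and vector::erase are the standard pieces.

```cpp
#include <algorithm>
#include <vector>

// Removes every element equal to value. std::remove shifts the kept
// elements to the front and returns the new logical end; erase then
// chops off the leftover tail in a single call.
void remove_all(std::vector<int>& v, int value) {
    v.erase(std::remove(v.begin(), v.end(), value), v.end());
}

// Same idiom with a predicate: drops every negative element.
void remove_negatives(std::vector<int>& v) {
    v.erase(std::remove_if(v.begin(), v.end(),
                           [](int x) { return x < 0; }),
            v.end());
}
```

Erasing matches one at a time shifts the tail of the vector once per match, whereas this does a single pass, which is why it is the better option when more than one item may go.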

In Topic: Most efficient way of designing a vector class in 3D

05 September 2015 - 05:28 PM

This has been solved by others previously. Here's a thread I had bookmarked about it, with a nice solution:


In Topic: C++ | Fixed Byte Size

22 August 2015 - 05:22 PM


You don't. Trying to map a struct to some sequence of bytes on disk or from the network is _wrong_.

It's only really wrong if you're trying to write 100% portable code that will work everywhere.


Lots of engines write data structures that exactly map to their own custom file formats (with manual padding/alignment), as it allows you to completely skip the "deserialization" step after streaming data from disk. You can just stream complex data structures into RAM and start using them immediately.

Yep, this line of discussion is completely off-topic from implementing a ZIP file decoder though!

If you do ever need to make manually-aligned structures, almost all compilers will accept the pack pragma:

#pragma pack(push)
#pragma pack(1) // one-byte alignment - AKA no padding
struct MyUnalignedFudgetry
{
    // ... members go here, laid out with no padding between them
};
#pragma pack(pop)

Even these suggestions for uint16_t and such are technically wrong. The C/C++ standard does not guarantee that sizeof(uint16_t) == 2.

So in your non-portable, platform-specific data-structure code, you should include the line static_assert( sizeof(uint16_t) == 2, "uint16_t must be 2 bytes" );
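Putting the pragma and the assertions together, a sketch might look like the following. RecordHeader and its fields are hypothetical and not part of any real file format; the asserts simply turn the layout assumptions (8-bit bytes, no padding) into compile errors on platforms where they don't hold.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)  // one-byte alignment: no padding between members
struct RecordHeader {  // hypothetical on-disk header, for illustration only
    std::uint8_t  tag;
    std::uint32_t length;
    std::uint16_t flags;
};
#pragma pack(pop)

// Fail the build rather than silently reading garbage if the platform
// does not lay the struct out the way the file format expects.
static_assert(sizeof(std::uint16_t) == 2, "uint16_t must be 2 bytes");
static_assert(sizeof(std::uint32_t) == 4, "uint32_t must be 4 bytes");
static_assert(sizeof(RecordHeader) == 7, "RecordHeader must have no padding");
static_assert(offsetof(RecordHeader, length) == 1, "length must follow tag directly");
static_assert(offsetof(RecordHeader, flags) == 5, "flags must follow length directly");
```

This checks sizes and offsets only. Byte order is a separate assumption that the asserts don't cover.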


Just thought I'd add that although you can generally use pragma pack 1, if the architecture you are using does not allow misaligned reads (e.g. the ARM we use at work gives a bus error), then it won't behave as desired.

I'm not sure what the exact behaviour would be, but I expect it would either cause a bus error when attempting to access the misaligned structures, ignore the request to not add padding, or refuse to compile the code.
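One way to sidestep the question on such architectures is to copy the bytes into a properly aligned local with memcpy instead of dereferencing a misaligned pointer. A minimal sketch, where load_u32 is a made-up helper name and the value is assumed to be stored in the host's byte order:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a 32-bit value stored at any byte offset in
// a buffer. memcpy has no alignment requirement on its arguments, so this
// is safe even when bytes + offset is not 4-byte aligned.
std::uint32_t load_u32(const unsigned char* bytes, std::size_t offset) {
    std::uint32_t value;
    std::memcpy(&value, bytes + offset, sizeof value);
    return value;
}
```

Compilers typically turn a small fixed-size memcpy like this into a plain load where the hardware allows it, so the cost is usually negligible, though that is worth confirming on the target in question.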