What does the standard say about this?

Started by
7 comments, last by Rattrap 10 years ago

I had implemented several hash functions in C++ a while back, and I was going back through them to clean them up a little In the example below (MD5), I converted the input bytes (plus appropriate padding) to an array of 16 elements to be processed by the actual hash algorithm. The FromLittleEndian function reads the number of bytes required to fill the first template value from the input (pointer to element in MessageBlock) and returns the result in the native endian format (in this case std::uint32_t).

[source lang="c++"]

typedef std::array<std::uint32_t, 16> messagechunk_type;

const messagechunk_type ProcessBlock =
{
FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[0]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[4]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[8]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[12]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[16]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[20]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[24]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[28]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[32]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[36]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[40]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[44]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[48]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[52]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[56]),

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[60])
};[/source]

I haven't tested the code below to see if the result is the same, but I was more curious if the behavior is well defined by the standard. I know that the use of commas like this can produce unexpected results in other cases.

[source lang="c++"]

typedef std::array<std::uint32_t, 16> messagechunk_type;

int Index = 0;

const messagechunk_type ProcessBlock =
{
FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index]), //0 - Thanks Alpha_ProgDes

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //4

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //8

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //12

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //16

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //20

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //24

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //28

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //32

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //36

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //40

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //44

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //48

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //52

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //56

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index]) //60
};[/source]

Also, which would you consider more readable (assuming the result is correct in the first place). I'm conflicted since I consider the original version more readable, but something about having the index numbers hard coded bugs me. Or should I just replace this with a for loop and not try to initialize the array like this? I was trying to avoid the double initialization, since I believe that std::array will initialize all of the values to 0 (please correct me if I'm wrong here).

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]

Advertisement
Out of curiosity, should your first line be this:

    typedef std::array<std::uint32_t, 16> messagechunk_type;
    int Index = 0;
    const messagechunk_type ProcessBlock =
     
    {
     
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index]),  // <-- Since Index starts at 0
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //4
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //8
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //12
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //16
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //20
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //24
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //28
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //32
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //36
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //40
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //44
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //48
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //52
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]), //56
        FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index]) //60
     
    };

And honestly, I think a for-loop would be far more readable. Unless you're unrolling the loop for performance reasons as well.

Beginner in Game Development?  Read here. And read here.

 


Out of curiosity, should your first line be this:

Yeah it should be. I had accidently written the Index with the addition on the first line and put the 0 in instead of index when I corrected it.


Unless you're unrolling the loop for performance reasons as well.

Performance is definitely a consideration here (thus trying to avoid the double initialization in the first place), since the function this belongs in is called for every block in the hash. But that being said, I haven't even remotely tried to profile this, so speed enhancements is not my goal here. Partly it is curiosity as far as is the incrementing of index in a single array initialization going to give an equivalent result.

I'll probably just switch to using a for loop, since that is probably much more readable and less prone to human errors in the indexes, and most of these arrays in similar hash functions aren't much bigger than this (16-64 elements on average).

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]

I'd say the first is more readable, and I'd hazard a guess that the compiler would have an easier time optimizing the first version. I think the second version is legal due to the comma operator defining a sequence point between each assignment, leaving the result well-defined.

Edit: I don't have time to find the chapter and verse right now, but #10 on this list: http://en.cppreference.com/w/cpp/language/eval_order

According to the standard, "The full expressions in an initializer-clause are evaluated in the order which they appear." Section 8.5.1 paragraph 17 of C++11. Also, the elements of std::array<> constructed without an initializer are default initialized. In the case of a std::array of primitives this means no initialization is performed.

According to the standard, "The full expressions in an initializer-clause are evaluated in the order which they appear." Section 8.5.1 paragraph 17 of C++11.

Note that correct evaluation order in initializer lists is bugged in Visual Studio: http://connect.microsoft.com/VisualStudio/feedback/details/793638/wrong-evaluation-order-in-initializer-list


According to the standard, "The full expressions in an initializer-clause are evaluated in the order which they appear." Section 8.5.1 paragraph 17 of C++11. Also, the elements of std::array<> constructed without an initializer are default initialized. In the case of a std::array of primitives this means no initialization is performed.

Thanks. Well that pretty much says I should go with the looped initialization then.

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]

Why don't you make a function to handle those

FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Index += sizeof(std::uint32_t)]),

and then use a loop to fill the array? Just saying because, there's clearly a pattern here smile.png. It would be much cleaner this way imo.

Something like this perhaps?


messagechunk_type GetMsgChunkType(int Indx)
{
	return FromLittleEndian<std::uint32_t, CharArray::value_type>(&MessageBlock[Indx]);
}

...

for(int i = 0 i < 16; i++)
	messagechunk_type[i] = GetMsgChunkType(i * sizeof(std::uint32_t));


Why don't you make a function to handle those

I probably would have if I hadn't reduced it to a for loop now.

[source lang="cpp"]

messagechunk_type ProcessBlock;

MessageBlockIndex = 0; // declared elsewhere

for(auto& Element : ProcessBlock)

{

Element = FromLittleEndian<messagechunk_type::value_type>(&MessageBlock[MessageBlockIndex]);

MessageBlockIndex += sizeof(messagechunk_type::value_type);

}

[/source]

Switched to using range based for loop. Also I realized the second template argument of FromLittleEndian can be deduced by the compiler, so I took it out to reduce the code a bit.

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]

This topic is closed to new replies.

Advertisement