&++x Lvalue but &x++ not?

Started by
16 comments, last by TheComet 10 years, 2 months ago

You can use your compiler to generate some assembly, especially if like me you're not an assembly wizard (forgive me for any errors, basic or not, in the following post!).

Let us take the following sample program:


#include <ctime>
#include <cstdio>
#include <cstdlib>
 
int main() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));
    int x = std::rand();
    int n = std::rand();
    std::printf("x: %d\n", x);
    for(int i = 0 ; i < n ; ++i) {
        std::printf("++x: %d\n", ++x);
    }
    std::printf("x: %d\n", x);
    for(int i = 0 ; i < n ; i++) {
        std::printf("x++: %d\n", x++);
    }
    std::printf("x: %d\n", x);
    return 0;
}

I used some randomness so the compiler wouldn't optimise stuff away or unroll loops, and the lines of interest are in loops because otherwise the compiler was being quite clever and not really giving me the kind of output I wanted to talk about. C style I/O was used, because in my experience C++ I/O tends to clutter the assembly.

The full assembly listing will be at the bottom of this post, but first let us look at some interesting snippets. Here is a basic printing of X:


; 13   :  std::printf("x: %d\n", x);
 
push esi
push OFFSET ??_C@_06PLOJGBHI@x?3?5?$CFd?6?$AA@
call ebx
add esp, 8

The address of printf was stored in ebx earlier. The crazy OFFSET value is the address of the format string literal. The register esi contains the current value of x. We push the value to be printed, then the format string address, call printf(), and finally clean up the stack.

Now, the printing of pre-incremented X looks like this:


; 11   :  std::printf("++x: %d\n", ++x);
 
inc esi
push esi
push OFFSET ??_C@_08CEDEMING@?$CL?$CLx?3?5?$CFd?6?$AA@
call ebx
add esp, 8

Very similar, and as expected the only real change is an increment instruction before the printing instructions.

The printing of post-increment of X is also quite similar:


; 15   :  std::printf("x++: %d\n", x++);
 
push esi
push OFFSET ??_C@_08HNNGEGEP@x?$CL?$CL?3?5?$CFd?6?$AA@
call ebx
add esp, 8
inc esi

The increment instruction now appears after the printing instructions.

So here, no temporary was needed. The compiler saw that the same effect could be generated by moving the increment instruction before or after the printing instructions.

The observant might also notice that one loop used pre-increment of i and another used the post-increment of i. In this case, the code for the two loops looks identical - the maximum number of iterations being stored in edi, this value being decremented each iteration.

First loop, using pre-increment:


; 10   :  for(int i = 0 ; i < n ; ++i) {
 
test edi, edi
jle SHORT $LN4@main
npad 2
$LL6@main:
 
; loop body omitted
dec edi
jne SHORT $LL6@main
mov edi, DWORD PTR _n$1$[ebp]
$LN4@main:
 
; 12   :  }

Second loop, using post-increment:


; 14   :  for(int i = 0 ; i < n ; i++) {
 
test edi, edi
jle SHORT $LN1@main
$LL3@main:
 
; loop body omitted
dec edi
jne SHORT $LL3@main
$LN1@main:
 
; 16   :  }

Essentially the same. The only differences are some padding and the instruction that reloads n into edi after the first loop, so that the second loop can count down from it again (the instructions that initialise the loop counter are omitted).

Of course, what have we learned about C++ from this? Very little. This output is not representative of real code where a given value might be used in several complex expressions, rather than just incremented in a loop. But it does illustrate ApochPiQ's point about how a compiler is not restricted to a naive mapping of C++ concepts onto the generated machine instructions.

Here is the full assembly listing


; Listing generated by Microsoft (R) Optimizing Compiler Version 17.00.60315.1 
 
TITLE C:\...\Help.cpp
.686P
.XMM
include listing.inc
.model flat
 
INCLUDELIB OLDNAMES
 
PUBLIC ??_C@_06PLOJGBHI@x?3?5?$CFd?6?$AA@ ; `string'
PUBLIC ??_C@_08CEDEMING@?$CL?$CLx?3?5?$CFd?6?$AA@ ; `string'
PUBLIC ??_C@_08HNNGEGEP@x?$CL?$CL?3?5?$CFd?6?$AA@ ; `string'
EXTRN __imp___time64:PROC
EXTRN __imp__srand:PROC
EXTRN __imp__rand:PROC
EXTRN __imp__printf:PROC
; COMDAT ??_C@_08HNNGEGEP@x?$CL?$CL?3?5?$CFd?6?$AA@
CONST SEGMENT
??_C@_08HNNGEGEP@x?$CL?$CL?3?5?$CFd?6?$AA@ DB 'x++: %d', 0aH, 00H ; `string'
CONST ENDS
; COMDAT ??_C@_08CEDEMING@?$CL?$CLx?3?5?$CFd?6?$AA@
CONST SEGMENT
??_C@_08CEDEMING@?$CL?$CLx?3?5?$CFd?6?$AA@ DB '++x: %d', 0aH, 00H ; `string'
CONST ENDS
; COMDAT ??_C@_06PLOJGBHI@x?3?5?$CFd?6?$AA@
CONST SEGMENT
??_C@_06PLOJGBHI@x?3?5?$CFd?6?$AA@ DB 'x: %d', 0aH, 00H ; `string'
PUBLIC _main
; Function compile flags: /Ogtp
; File c:\program files (x86)\microsoft visual studio 11.0\vc\include\time.inl
; COMDAT _time
_TEXT SEGMENT
_time PROC ; COMDAT
; __Time$dead$ = ecx
 
; 133  :     return _time64(_Time);
 
push 0
call DWORD PTR __imp___time64
add esp, 4
 
; 134  : }
 
ret 0
_time ENDP
_TEXT ENDS
; Function compile flags: /Ogtp
; File c:\....\help.cpp
; File c:\program files (x86)\microsoft visual studio 11.0\vc\include\time.inl
; File c:\....\help.cpp
; COMDAT _main
_TEXT SEGMENT
_n$1$ = -8 ; size = 4
tv193 = -4 ; size = 4
_main PROC ; COMDAT
 
; 5    : int main() {
 
push ebp
mov ebp, esp
sub esp, 8
push ebx
push esi
push edi
; File c:\program files (x86)\microsoft visual studio 11.0\vc\include\time.inl
 
; 133  :     return _time64(_Time);
 
push 0
call DWORD PTR __imp___time64
; File c:\....\help.cpp
 
; 6    :  std::srand(static_cast<unsigned>(std::time(nullptr)));
 
push eax
call DWORD PTR __imp__srand
 
; 7    :  int x = std::rand();
 
mov edi, DWORD PTR __imp__rand
call edi
mov esi, eax
 
; 8    :  int n = std::rand();
 
call edi
 
; 9    :  std::printf("x: %d\n", x);
 
mov ebx, DWORD PTR __imp__printf
mov edi, eax
push esi
push OFFSET ??_C@_06PLOJGBHI@x?3?5?$CFd?6?$AA@
mov DWORD PTR _n$1$[ebp], edi
call ebx
add esp, 16 ; 00000010H
 
; 10   :  for(int i = 0 ; i < n ; ++i) {
 
test edi, edi
jle SHORT $LN4@main
npad 2
$LL6@main:
 
; 11   :  std::printf("++x: %d\n", ++x);
 
inc esi
push esi
push OFFSET ??_C@_08CEDEMING@?$CL?$CLx?3?5?$CFd?6?$AA@
call ebx
add esp, 8
dec edi
jne SHORT $LL6@main
mov edi, DWORD PTR _n$1$[ebp]
$LN4@main:
 
; 12   :  }
; 13   :  std::printf("x: %d\n", x);
 
push esi
push OFFSET ??_C@_06PLOJGBHI@x?3?5?$CFd?6?$AA@
call ebx
add esp, 8
 
; 14   :  for(int i = 0 ; i < n ; i++) {
 
test edi, edi
jle SHORT $LN1@main
$LL3@main:
 
; 15   :  std::printf("x++: %d\n", x++);
 
push esi
push OFFSET ??_C@_08HNNGEGEP@x?$CL?$CL?3?5?$CFd?6?$AA@
call ebx
add esp, 8
inc esi
dec edi
jne SHORT $LL3@main
$LN1@main:
 
; 16   :  }
; 17   :  std::printf("x: %d\n", x);
 
push esi
push OFFSET ??_C@_06PLOJGBHI@x?3?5?$CFd?6?$AA@
call ebx
add esp, 8
 
; 18   :  return 0;
 
xor eax, eax
pop edi
pop esi
pop ebx
 
; 19   : }
 
mov esp, ebp
pop ebp
ret 0
_main ENDP
_TEXT ENDS
END

Note: I regenerated the assembly after tweaking the code while writing this post, so I hope I haven't invalidated any of the earlier snippets along the way. In any case, the idea should still be clear.


Just like x = x+5 won't work but x += 5 will work in a for loop

UNREAL ENGINE 4:
Total LOC: ~3M Lines
Total Languages: ~32

--
GREAT QUOTES:
I can do ALL things through Christ - Jesus Christ
--
Logic will get you from A-Z, imagination gets you everywhere - Albert Einstein
--
The problems of the world cannot be solved by skeptics or cynics whose horizons are limited by the obvious realities. - John F. Kennedy


Just like x = x+5 won't work but x += 5 will work in a for loop

Can you explain what you mean by won't work?

That's why preincrement is usually a tiny bit faster than postincrement when used in a for-loop: it doesn't need an intermediate temporary variable to store the result of the operation.


Only for types with a non-trivial preincrement/postincrement operator implementation, such as some iterators; for things like ints, it's in practice exactly the same, because the compiler sees that nothing uses the intermediate temporary and elides it.

I haven't said that it is somehow significantly faster, just pointed out the way it actually works.

Just like x = x+5 won't work but x += 5 will work in a for loop

Can you explain what you mean by won't work?
When I tried it weeks ago, it didn't work, but when I tried it again after I posted that, it worked.
That's why it's edited out.


That's ok. We'd prefer if you didn't edit earlier posts, as it can make the thread more difficult to follow for others. A better approach would be to edit the post, strike through any errors in the original, and add an explanation.

Here is an example where I corrected a mistake I made in this way. I'm going to restore your earlier post like that.


I think it would be beneficial if he stated what compiler he was using. I've used MinGW and GCC for years and both methods have always worked for me for as long as I can remember. This way we know if it is just a fluke that happened or if a compiler, for whatever reason, is playing (for lack of a better word) willy-nilly with the standard and how the code works.

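What people are referring to as "a temporary value" is known in C++ as an rvalue (more precisely, a prvalue). The postfix unary increment yields an rvalue, whereas the prefix unary increment yields an lvalue. Note that this is specific to C++: in C, the result of ++x is not an lvalue either, so &++x is rejected there as well.

This can be seen if you run the following code, compiled as C++11 or later: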


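#include <iostream>

// Chosen when the argument is an lvalue (an lvalue cannot bind to int&&).
void print( const int& )
{
    std::cout << "this is an lvalue" << std::endl;
}
// Preferred when the argument is an rvalue.
void print( const int&& )
{
    std::cout << "this is an rvalue" << std::endl;
}

int main()
{
    int x = 0;
    print( ++x ); // prints "this is an lvalue"
    print( x++ ); // prints "this is an rvalue"
    return 0;
}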

A more detailed explanation of lvalues and rvalues can be found here:

http://blogs.msdn.com/b/vcblog/archive/2009/02/03/rvalue-references-c-0x-features-in-vc10-part-2.aspx

Another way to explain this behaviour is to look at how the unary increment operators are overloaded in a custom class:


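class X {
    int m_Value = 0; // initialised so that incrementing a default-constructed X is well defined
public:
    // Prefix: increments in place and returns a reference to the object itself.
    X& operator++()
    {
        ++m_Value;
        return *this;
    }
    // Postfix: the unused int parameter only distinguishes it from the prefix form.
    X operator++(int)
    {
        X tmp(*this); // copy of the old state
        operator++();
        return tmp;   // returned by value, i.e. as a temporary
    }
};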

Your example using the class:


X x;
X* p1 = &++x; // calls operator++(), which returns a reference to x itself (return *this;), so taking the address is valid.
X* p2 = &(x++); // calls operator++(int), which returns a temporary copy of the old value, so taking its address is invalid.
"I would try to find halo source code by bungie best fps engine ever created, u see why call of duty loses speed due to its detail." -- GettingNifty

This topic is closed to new replies.
