Out-of-order execution

Started by
3 comments, last by Adam_42 13 years, 4 months ago
I'm curious as to why I see assembly code like this:

mov eax,DWORD PTR[0x3434] ; <--- load value from memory into the register
add eax,eax               ; <--- add the value to itself and store the result in eax

Isn't this inefficient? The CPU will either need to stall to cover the latency of loading the value from memory into the register, or the add would read a value that isn't in the register yet. Anyway, would it be more efficient to write this code out of order? For example:


mov eax,DWORD PTR[0x3434] ; <--- load value from memory into the register
... (other, independent instructions) ...
add eax,eax               ; <--- add the value to itself and store the result in eax

Wouldn't this be more efficient? By the time you get to the add instruction, the value will already be loaded into the register and good to go. -thx
It entirely depends on the processor.
A regular desktop processor does Out-Of-Order instruction re-scheduling. It has a large queue of instructions that it can schedule at any one time. If it can move them around, it will try to remove as many stalls as it can.
Something like a netbook Atom processor is in-order, and won't reschedule the instructions, resulting in a stall.
Then there are other technologies, like hyperthreading. Your thread may stall at the mov, but the other active thread on the core may have instructions that can run in the meantime.
Quote:Original post by nuclear123
would this cause the more efficient use?
It's a superset of an NP-complete problem.

Compilers try to optimize this for small cases, but no general optimal solution is known. Even simpler subsets of the problem are very hard to solve.
As explained above, OOO execution is managed by the hardware, not the programmer.

Given that you're programming in a higher-level language, any standard, well-established compiler will do software ILP (instruction re-scheduling), taking any potential dependencies/hazards into account.

If you don't write in a higher-level language but actually hand-code in assembly language, then what you suggested may improve pipeline throughput, but keep in mind that you would have to be highly aware of the instructions you interleave as well, taking into account the various stall penalties between different types of instructions.
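As a rough illustration of what such manual scheduling looks like (a sketch only - the filler instructions and the address are made up, and the actual benefit depends on the microarchitecture):

```asm
mov  eax, DWORD PTR [0x3434] ; start the load as early as possible
mov  ecx, ebx                ; independent work that doesn't touch eax...
shl  ecx, 2                  ; ...helps fill the load-latency slots
add  eax, eax                ; by now the load has (hopefully) completed
```

On an in-order core this can hide part of the load latency; on an out-of-order core the hardware will often find an equivalent schedule by itself.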
It's also worth noting that it's generally impossible to schedule instructions far enough apart to hide the stall you get from actually reading main memory - it can be hundreds of clock cycles. To avoid those stalls you need prefetching, either manually with prefetch instructions or automatically via the CPU's built-in prefetch logic (if it has any). Instruction schedulers generally assume that data is already in the CPU's L1 cache.
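For example, a sketch of manual prefetching in a loop summing a large array (the register usage and the prefetch distance of 256 bytes are illustrative guesses - the right distance has to be tuned per CPU):

```asm
loop_top:
    prefetcht0 [esi+256]      ; hint: start pulling in a cache line a few lines ahead
    mov  eax, DWORD PTR [esi] ; this load now (ideally) hits the L1 cache
    add  edx, eax             ; accumulate into edx
    add  esi, 4               ; advance to the next dword
    cmp  esi, edi             ; edi holds the end address
    jb   loop_top
```

prefetcht0 is only a hint; the CPU is free to ignore it, and prefetching too far (or not far enough) ahead wastes the effort.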

This topic is closed to new replies.
