# __allshr and bitboard


I am currently rewriting my chess engine, and I use bitboards to get reasonably fast move generation. I have just profiled my code, and it appears that 40% of the time is spent in __allshr. Is this just because of the right shifts? I use Visual Studio 2005 (Beta 2), and the engine generates nearly 16,000,000 moves/sec, which is not so bad but nothing compared to the 40,000,000 moves/sec of GNU Chess. Is there anything I can do about these __allshr calls?

##### Share on other sites
Here is an assembler listing for __allshr. The instructions match a disassembly of the same function exported by ntoskrnl.exe on XP. I doubt your code imports the function from that same place, I just mention that to say that the code in the link is an accurate reflection of the function.

At any rate, I don't think a faster version of that function can be crafted. If you think that's the bottleneck, you might want to reorganize your code to be less reliant on that function. Maybe you don't need to rely on right shifts so much?
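One way to lean less on variable 64-bit shifts is to precompute single-bit masks once and index a table in the hot path. The following is only a sketch of that idea: `Bitboard`, `square_mask`, `init_square_masks` and `is_occupied` are hypothetical names, not anything from the original engine, and whether it helps depends on where the shifts actually occur.

```c
#include <stdint.h>

typedef uint64_t Bitboard;  /* hypothetical typedef: one bit per square */

/* Hypothetical table: square_mask[sq] has only bit `sq` set. */
static Bitboard square_mask[64];

/* Fill the table once at startup. The variable 64-bit shifts happen
   here only, not inside the move generator's inner loop. */
static void init_square_masks(void)
{
    for (int sq = 0; sq < 64; ++sq)
        square_mask[sq] = (Bitboard)1 << sq;
}

/* Test a square with a table lookup and an AND instead of
   computing (bb >> sq) & 1 with a variable shift. */
static int is_occupied(Bitboard bb, int sq)
{
    return (bb & square_mask[sq]) != 0;
}
```

Call `init_square_masks()` once before generating moves; after that, `is_occupied(bb, sq)` needs no shift helper at all.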

##### Share on other sites
Do you really need an arithmetic (signed) right shift for a bitboard? A simple unsigned shift should be much faster.
Quote:
 Original post by LessBread
 At any rate, I don't think a faster version of that function can be crafted.
That may be true for the general case, but does the OP really need to cover all three cases in the code?
If all that's required is a variable shift of 0 to 31 bits, then two (albeit slow) instructions would be enough.

##### Share on other sites
I don't know enough about the code to say whether the general case applies or not. If it doesn't, then there's the code to use to craft a specialized version of the function that doesn't contend with all three cases.

##### Share on other sites
Quote:
 Original post by doynax
 Do you really need an arithmetic (signed) right shift for a bitboard? A simple unsigned shift should be much faster.

The disassembly of aullshr isn't very different from allshr.

    ;********************************************************************************
    ; _allshr (1404)
    0x4026AC: 80F940         CMP      CL,0x40
    0x4026AF: 7316           JAE      0x4026C7
    0x4026B1: 80F920         CMP      CL,0x20
    0x4026B4: 7306           JAE      0x4026BC
    0x4026B6: 0FADD0         SHRD     EAX,EDX,CL
    0x4026B9: D3FA           SAR      EDX,CL
    0x4026BB: C3             RET
    ;********************************************************************************
    0x4026BC: 8BC2           MOV      EAX,EDX                       ; <==0x004026B4(*-0x8)
    0x4026BE: C1FA1F         SAR      EDX,0x1F
    0x4026C1: 80E11F         AND      CL,0x1F
    0x4026C4: D3F8           SAR      EAX,CL
    0x4026C6: C3             RET
    ;********************************************************************************
    0x4026C7: C1FA1F         SAR      EDX,0x1F                      ; <==0x004026AF(*-0x18)
    0x4026CA: 8BC2           MOV      EAX,EDX
    0x4026CC: C3             RET
    ;********************************************************************************
    ;********************************************************************************
    ; _aullshr (1408)
    0x40283F: 80F940         CMP      CL,0x40
    0x402842: 7315           JAE      0x402859
    0x402844: 80F920         CMP      CL,0x20
    0x402847: 7306           JAE      0x40284F
    0x402849: 0FADD0         SHRD     EAX,EDX,CL
    0x40284C: D3EA           SHR      EDX,CL
    0x40284E: C3             RET
    ;********************************************************************************
    0x40284F: 8BC2           MOV      EAX,EDX                       ; <==0x00402847(*-0x8)
    0x402851: 33D2           XOR      EDX,EDX
    0x402853: 80E11F         AND      CL,0x1F
    0x402856: D3E8           SHR      EAX,CL
    0x402858: C3             RET
    ;********************************************************************************
    0x402859: 33C0           XOR      EAX,EAX                       ; <==0x00402842(*-0x17)
    0x40285B: 33D2           XOR      EDX,EDX
    0x40285D: C3             RET
    ;********************************************************************************
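For anyone who finds the listing hard to follow, here is a rough C model of the three branches in each routine (count below 32, count 32 to 63, count 64 or more). It is a reading aid under my own naming: `sar32`, `model_aullshr` and `model_allshr` are hypothetical helpers, with `lo`/`hi` standing in for EAX/EDX and `count` for CL. The only difference between the two routines is what fills the vacated upper bits: zeros for the unsigned one, copies of the sign bit for the signed one.

```c
#include <stdint.h>

/* 32-bit arithmetic right shift for n in 0..31, written with unsigned
   types so it does not rely on implementation-defined signed shifts. */
static uint32_t sar32(uint32_t x, unsigned n)
{
    uint32_t fill = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    return n == 0 ? x : (x >> n) | (fill << (32 - n));
}

/* Model of _aullshr: logical shift of hi:lo (EDX:EAX) by count (CL). */
static uint64_t model_aullshr(uint32_t lo, uint32_t hi, uint8_t count)
{
    if (count >= 64) {                 /* XOR EAX,EAX / XOR EDX,EDX */
        lo = 0; hi = 0;
    } else if (count >= 32) {          /* MOV EAX,EDX / XOR EDX,EDX / SHR EAX,CL */
        lo = hi >> (count & 31); hi = 0;
    } else if (count > 0) {            /* SHRD EAX,EDX,CL / SHR EDX,CL */
        lo = (lo >> count) | (hi << (32 - count));
        hi >>= count;
    }
    return ((uint64_t)hi << 32) | lo;
}

/* Model of _allshr: same structure, but the sign bit is replicated. */
static uint64_t model_allshr(uint32_t lo, uint32_t hi, uint8_t count)
{
    uint32_t fill = (hi & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (count >= 64) {                 /* SAR EDX,31 / MOV EAX,EDX */
        lo = fill; hi = fill;
    } else if (count >= 32) {          /* MOV EAX,EDX / SAR EDX,31 / SAR EAX,CL */
        lo = sar32(hi, count & 31); hi = fill;
    } else if (count > 0) {            /* SHRD EAX,EDX,CL / SAR EDX,CL */
        lo = (lo >> count) | (hi << (32 - count));
        hi = sar32(hi, count);
    }
    return ((uint64_t)hi << 32) | lo;
}
```

Both models run the same two compares before doing any work, which matches the observation that the unsigned routine is not meaningfully shorter than the signed one.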

##### Share on other sites
Quote:
Quote:
 Original post by doynax
 Do you really need an arithmetic (signed) right shift for a bitboard? A simple unsigned shift should be much faster.

The disassembly of aullshr isn't very different from allshr.
Too bad. I suppose you could at least get rid of the case where the count is 64 or more.
The MMX instruction set has a PSRLQ instruction for logical 64-bit right shifts.
But just getting an inlined version should help a lot (no need to preserve any registers, among other things).

edit: If an unsigned shift works, then the new bits shifted in obviously don't matter. So a simple SHRD for the lower dword and SHR for the upper one should work too, as long as the shift count stays below 32 (x86 masks the count in CL to 5 bits).
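In C, the restricted version might be sketched as below. This assumes the count is always in the range 0 to 31 and that a logical shift is what the bitboard code wants; `shr64_small` is a made-up name, and there is no guarantee that a particular compiler turns this into exactly SHRD plus SHR, so check the generated assembly before relying on it.

```c
#include <stdint.h>

/* Logical right shift of a 64-bit value, valid for counts 0..31 only.
   Larger counts are NOT handled; the caller must guarantee the range. */
static uint64_t shr64_small(uint64_t x, unsigned count)
{
    uint32_t lo = (uint32_t)x;          /* low dword  (EAX in the listing) */
    uint32_t hi = (uint32_t)(x >> 32);  /* high dword (EDX); constant shift */

    if (count != 0) {
        /* Equivalent of SHRD EAX,EDX,CL: bits leaving hi enter lo. */
        lo = (lo >> count) | (hi << (32 - count));
        /* Equivalent of SHR EDX,CL. */
        hi >>= count;
    }
    return ((uint64_t)hi << 32) | lo;
}
```

The explicit `count != 0` check is there only because shifting a 32-bit value by 32 is undefined in C; the SHRD instruction itself treats a zero count as a no-op.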