Archived

This topic is now archived and is closed to further replies.

Algorithm Question(s)

This topic is 5146 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hello everyone. I have been reading through the strlen.asm file of VC++.NET. I understand how the algorithm works except for this: The code below I know reads in 4-bytes and checks if there is a null-terminator in those four bytes, but how does it work?
main_loop:
        mov     eax,dword ptr [ecx]     ; read 4 bytes
        mov     edx,7efefeffh
        add     edx,eax
        xor     eax,-1
        xor     eax,edx
        add     ecx,4
        test    eax,81010100h
        je      short main_loop
Why is does the strlen.asm file also have aligning? I kindof understand what it is but sometimes the tutorials and code confuses me with memory aligning and instruction aligning. Could some one explain that to me?
        mov     ecx,string              ; ecx -> string
        test    ecx,3                   ; test if string is aligned on 32 bits
        je      short main_loop

str_misaligned:
        ; simple byte loop until string is aligned
        mov     al,byte ptr [ecx]
        add     ecx,1
        test    al,al
        je      short byte_3
        test    ecx,3
        jne     short str_misaligned

        add     eax,dword ptr 0         ; 5 byte nop to align label below

        align   16                      ; should be redundant
Thanks! // Last Attacker \\---=( LA )=---// LA Extreme \\

ICQ Number : 120585863

E-mail: laextr@icqmail.com

Share this post


Link to post
Share on other sites
1. The trick is all in the ADD and the value being added:

a)
7E+00 = 7E (0 0111 1110) = bit 8 and 1 unchanged
7E+01 = 7F (0 0111 1111) = bit 1 changed
7E+02 = 80 (0 1000 0000) = bit 8 changed
7E+4B = C9 (0 1100 1001) = bit 8 and bit 1 changed
7E+AA = 128 (1 0010 1000) = carry/overflow

b)
FE+00 = FE (0 1111 1110) = bit 1 unchanged
FE+01 = FF (0 1111 1111) = bit 1 changed
FE+02 = 100 (1 0000 0000) = all bits changed + carry
FE+4B = 149 (1 0100 1001) = many bits changed + carry
...

c)
FF+00 = FF (0 1111 1111) = no bits changed
FF+01 = 100 (1 0000 0000) = all changed + carry
FF+02 = 101 (1 0000 0001)...


2. The XORs find which bits have CHANGED compared to the original.


3. The TEST ensures that *all* those bits have changed.


4. The alignment of the code is likely to be for instruction cache efficiency and possibly branch delay slots.


5. The alignment of the data reads is likely to be for data cache efficiency.


6. There''s no substitute for single stepping this kind of thing in a debugger and working it through on paper to see what''s going on.

--
Simon O''Connor
3D Game Programmer &
Microsoft DirectX MVP

Share this post


Link to post
Share on other sites