
#Actualcr88192

Posted 22 March 2013 - 03:08 AM

@Shogun: yes, this is also a possible reason to write an assembler.

 

then you can dynamically assemble whatever you need, without having to manually consult the Intel docs and generate the machine code sequences by hand.

also, in the simple case, writing an assembler isn't really hard; in my case most of the "work" was the long/tedious job of transcribing the instruction listings and similar.

 

(initially, I wrote it over the course of a few days, but the assembler has expanded a bit since then).

 

though, yes, I have still used manually-generated machine code in a few places, as for certain uses this is more useful (the assembler isn't entirely free...).

 

 

as an example, from my listing:

add
        04,ib           al,i8
        X80/0,ib        rm8,i8          aleph
        WX83/0,ib       rm16,i8         aleph
        TX83/0,ib       rm32,i8         aleph
        X83/0,ib        rm64,i8
        X02/r           r8,rm8          aleph
        X00/r           rm8,r8          aleph
        W05,iw          ax,i16
        WX81/0,iw       rm16,i16        aleph
        WX03/r          r16,rm16        aleph
        WX01/r          rm16,r16        aleph
        T05,id          eax,i32
        TX81/0,id       rm32,i32        aleph
        TX01/r          rm32,r32        aleph
        TX03/r          r32,rm32        aleph
        X05,id          rax,i32
        X81/0,id        rm64,i32
        X03/r           r64,rm64
        X01/r           rm64,r64

... 

inc
        W40|r           r16             leg
        T40|r           r32             leg
        XFE/0           rm8             aleph
        WXFF/0          rm16            aleph
        TXFF/0          rm32            aleph
        XFF/0           rm64

 

where 'X' basically means where to put the REX prefix, W/T where to insert the operand-size prefix (when relevant), ...

(other letters have other meanings, like V/S for address-size prefix, H/I/J/K/L for VEX and XOP forms, ...).

 

likewise: '/r' means "insert ModRM here", ',id' means immediate dword, ...
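to make the notation a bit more concrete, here is a rough sketch of how such a pattern string could be walked to emit bytes. this is not the actual assembler code: the `Operands` struct, `encode_pattern`, and the rule for when W/T emit 0x66 (W = 16-bit form, T = 32-bit form, prefix emitted when the mode default differs) are all assumptions for illustration, and the V/S and VEX/XOP letters are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-instruction inputs; these names are not from the real assembler. */
typedef struct {
    int      mode_bits;    /* 16, 32 or 64 (current CPU mode) */
    uint8_t  rex;          /* REX byte to place at 'X', 0 = no REX needed */
    uint8_t  mod, reg, rm; /* ModRM fields (low 3 bits; high bits would go in REX) */
    uint32_t imm;          /* immediate value, if the pattern has one */
} Operands;

static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Walk the pattern left to right, appending bytes to out; returns the byte count. */
size_t encode_pattern(const char *pat, const Operands *op, uint8_t *out)
{
    size_t n = 0;
    while (*pat) {
        if (*pat == 'X') {            /* REX slot: emit only when one is needed */
            if (op->rex) out[n++] = op->rex;
            pat += 1;
        } else if (*pat == 'W') {     /* 16-bit form: 0x66 unless the default is 16-bit */
            if (op->mode_bits != 16) out[n++] = 0x66;
            pat += 1;
        } else if (*pat == 'T') {     /* 32-bit form: 0x66 only in 16-bit mode */
            if (op->mode_bits == 16) out[n++] = 0x66;
            pat += 1;
        } else if (*pat == '/') {     /* "/r" or "/0".."/7": ModRM byte */
            uint8_t reg = (pat[1] == 'r') ? op->reg : (uint8_t)(pat[1] - '0');
            out[n++] = (uint8_t)((op->mod << 6) | ((reg & 7) << 3) | (op->rm & 7));
            pat += 2;
        } else if (*pat == '|') {     /* "|r": OR the register number into the previous byte */
            assert(n > 0 && pat[1] == 'r');
            out[n - 1] = (uint8_t)(out[n - 1] | (op->reg & 7));
            pat += 2;
        } else if (*pat == ',') {     /* ",ib" / ",iw" / ",id": little-endian immediate */
            int len = (pat[2] == 'b') ? 1 : (pat[2] == 'w') ? 2 : 4;
            int i;
            assert(pat[1] == 'i');
            for (i = 0; i < len; i++)
                out[n++] = (uint8_t)(op->imm >> (8 * i));
            pat += 3;
        } else {                      /* two hex digits: literal opcode byte */
            int hi = hex_digit(pat[0]);
            int lo = (hi >= 0) ? hex_digit(pat[1]) : -1;
            assert(hi >= 0 && lo >= 0);
            out[n++] = (uint8_t)(hi * 16 + lo);
            pat += 2;
        }
    }
    return n;
}
```

for example, in 32-bit mode with mod=3, rm=1 (ecx) and imm=0x12345678, the pattern "TX81/0,id" comes out as 81 C1 78 56 34 12, and in 64-bit mode with rex=0x48, mod=3, rm=0 and imm=1, "X83/0,ib" comes out as 48 83 C0 01.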

 

and the last column is basically for flags, such as to indicate when/where sequences are valid (CPU modes, ...).

 

'leg' basically means "legacy only" (not valid in long mode), 'long' means long-mode only, and 'aleph' was basically for a past x86 subset.

I had once considered having a verifiable x86 subset sort of like NaCl (but more like the JVM verifier), but this idea didn't really go anywhere.
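as a rough illustration of how the flag words could be used, here is a small sketch that turns the flags column into a bitmask and checks whether a form is usable. the names (`parse_flags`, `form_is_valid`, the `FLAG_*` constants) and the exact rules are assumptions, in particular the idea that several flag words may appear separated by whitespace and that a subset-only mode would simply reject forms lacking 'aleph'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit assignments for the flag words in the listing. */
enum {
    FLAG_LEG   = 1u << 0,  /* legacy modes only */
    FLAG_LONG  = 1u << 1,  /* long mode only */
    FLAG_ALEPH = 1u << 2   /* member of the old restricted subset */
};

/* Parse a whitespace-separated flags column; unknown words are ignored. */
unsigned parse_flags(const char *s)
{
    unsigned flags = 0;
    while (*s) {
        size_t len;
        s += strspn(s, " \t");      /* skip leading whitespace */
        len = strcspn(s, " \t");    /* length of the next word */
        if (len == 3 && memcmp(s, "leg", 3) == 0)
            flags |= FLAG_LEG;
        else if (len == 4 && memcmp(s, "long", 4) == 0)
            flags |= FLAG_LONG;
        else if (len == 5 && memcmp(s, "aleph", 5) == 0)
            flags |= FLAG_ALEPH;
        s += len;
    }
    return flags;
}

/* Assumed rule: 'leg' excludes long mode, 'long' requires it,
 * and subset_only additionally requires the 'aleph' flag. */
int form_is_valid(unsigned flags, int long_mode, int subset_only)
{
    if ((flags & FLAG_LEG) && long_mode)
        return 0;
    if ((flags & FLAG_LONG) && !long_mode)
        return 0;
    if (subset_only && !(flags & FLAG_ALEPH))
        return 0;
    return 1;
}
```

under these assumptions the short "40|r" forms of inc (flagged 'leg') get rejected in long mode, where the 40..4F bytes are REX prefixes, so the assembler would fall through to the FF/0 forms instead.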

 

 

when the assembler is compiled, a tool takes the instruction listings and converts them into the relevant C source files and headers (basically, big prebuilt tables).
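the generated output might look something like the following. this is only a guess at the shape, not the tool's real output: the `OpForm` struct, the `op_forms` table and `find_form` are made-up names, only a handful of rows from the listing are included, and operands are matched by exact string comparison (a real matcher would presumably let a plain register satisfy an 'rm' operand).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout of one generated table row. */
typedef struct {
    const char *mnemonic;  /* e.g. "add" */
    const char *pattern;   /* encoding pattern, e.g. "TX81/0,id" */
    const char *operands;  /* operand form, e.g. "rm32,i32" */
    const char *flags;     /* flags column, "" if none */
} OpForm;

/* A few rows copied from the listing, as a generator might emit them. */
static const OpForm op_forms[] = {
    { "add", "04,ib",     "al,i8",    ""      },
    { "add", "X80/0,ib",  "rm8,i8",   "aleph" },
    { "add", "TX81/0,id", "rm32,i32", "aleph" },
    { "add", "X03/r",     "r64,rm64", ""      },
    { "inc", "T40|r",     "r32",      "leg"   },
    { "inc", "TXFF/0",    "rm32",     "aleph" },
};

/* Linear search for the first row matching mnemonic and operand form. */
const OpForm *find_form(const char *mnemonic, const char *operands)
{
    size_t i;
    for (i = 0; i < sizeof op_forms / sizeof op_forms[0]; i++) {
        if (strcmp(op_forms[i].mnemonic, mnemonic) == 0 &&
            strcmp(op_forms[i].operands, operands) == 0)
            return &op_forms[i];
    }
    return NULL;  /* no such form in the table */
}
```

a linear scan is fine for a sketch; with the full instruction set one would more likely index the table by mnemonic first (hash or sorted ranges) and only scan the forms for that mnemonic.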

