Getting into Windows

posted in Journal of Caitlin

Published November 05, 2005

Ok, so I've been writing ASM code for all kinds of processors since I was like 14. Most of them have been from the x86 family, and when I programmed them as part of a computer system it was usually under DOS. As much as I don't like it, I think it is about time to get into Windows programming.

What I'm currently working on is an ASM program that's composed of objects. It might sound strange at first, but it's a really simple concept to understand, and has probably already been done by someone.

Basically each object has code that controls it. Its shape, texture, movement, reactions to other objects, etc. are all controlled by the object itself rather than by a traditional engine. I have been working on this for about two years now, mostly when I have nothing else to do. The only major thing that the engine is in control of is integrating all of the interactions between the different objects into a single output stream to the user. Other things it handles include inputs, constant forces, timekeeping, networking, etc.
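To make the idea concrete, here is a minimal sketch in C rather than ASM. Everything in it (the `Object` layout, `move_linear`, `stay_put`, `engine_step`) is a hypothetical illustration and not the actual program described above: each object carries a pointer to its own code, and the engine does nothing except give every object its turn.

```c
#include <stddef.h>

/* Hypothetical layout: every object carries a pointer to its own code. */
typedef struct Object Object;
struct Object {
    float x, y, z;                           /* position */
    float vx, vy, vz;                        /* velocity */
    void (*update)(Object *self, float dt);  /* the object's own behaviour */
};

/* One possible behaviour: move in a straight line. */
void move_linear(Object *self, float dt) {
    self->x += self->vx * dt;
    self->y += self->vy * dt;
    self->z += self->vz * dt;
}

/* Another possible behaviour: ignore time and stay where it is. */
void stay_put(Object *self, float dt) {
    (void)self;
    (void)dt;
}

/* The "engine" only walks the list and lets each object run its own code. */
void engine_step(Object *objects, size_t count, float dt) {
    for (size_t i = 0; i < count; ++i)
        objects[i].update(&objects[i], dt);
}
```

Adding a new kind of object then means writing one more behaviour function; `engine_step` itself does not change.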

While browsing the site I noticed a number of posts about calculating the distance between two points, so I decided to write a quick ASM routine to do this while it was still fresh in my mind. I don't have a library of 3D and graphics related functions yet, so this is a good place to start. Here is the code I threw together in about 10 minutes: unoptimized, untested, and basically unchecked. I will do all that later, but it just occurred to me that I also need a routine that computes the distances between multiple sets of 2D points.

dist_between_points: ;calculates the distance between two points in 3D space

MOVAPS XMM0, point 2 ;load XMM0 with point 2 (packed as n, x2, y2, z2)
MOVAPS XMM1, point 1 ;load XMM1 with point 1 (packed as n, x1, y1, z1; n must match point 2 so that it cancels out)
SUBPS XMM0, XMM1 ;subtract XMM1 from XMM0 (subtract x1, y1, z1 from x2, y2, z2)
MULPS XMM0, XMM0 ;multiply XMM0 by itself (square x, y, and z)
HADDPS XMM0, XMM0 ;add FP values in XMM0 horizontally (n + x, y + z); HADDPS needs SSE3
HADDPS XMM0, XMM0 ;add horizontally again ((n + x) + (y + z)); every lane now holds the sum
SQRTPS XMM0, XMM0 ;calculate the square root to get our distance
MOVAPS distance, XMM0 ;store the distance to use later (all 16 bytes, so distance must be 16-byte aligned)
RET ;return to caller (no EMMS needed: the XMM registers are separate from the x87/MMX registers)
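Since the routine above is untested, a plain C reference can help when checking its output. This is only a sketch under assumptions: the `Point4` layout and both function names are made up for illustration, and `sqrt_newton` is a small stand-in for `sqrtf` from `<math.h>` so the example has no library dependency at link time.

```c
/* Hypothetical reference: the same arithmetic as the SSE routine, one lane at a time. */
typedef struct { float n, x, y, z; } Point4;  /* n is the spare fourth lane */

/* Newton-Raphson square root; stand-in for sqrtf so nothing extra is linked. */
float sqrt_newton(float v) {
    if (v <= 0.0f) return 0.0f;
    float r = v > 1.0f ? v : 1.0f;            /* start at or above the root */
    for (int i = 0; i < 100; ++i) {
        float next = 0.5f * (r + v / r);
        if (next == r) break;                 /* converged */
        r = next;
    }
    return r;
}

float dist_between_points_ref(const Point4 *p1, const Point4 *p2) {
    float dn = p2->n - p1->n;                 /* 0 when both n lanes match */
    float dx = p2->x - p1->x;
    float dy = p2->y - p1->y;
    float dz = p2->z - p1->z;
    /* The two horizontal adds sum all four squared lanes, n included. */
    return sqrt_newton(dn * dn + dx * dx + dy * dy + dz * dz);
}
```

Note that the n lane takes part in the sum, so the result is only the true 3D distance when both points carry the same n.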

...and the code for two sets of 2D points....

dist_2D_two_points: ;calculates the distances between two sets of points in 2D space

MOVAPS XMM0, set 2 ;load XMM0 with point set 2 (packed as x2a, y2a, x2b, y2b)
MOVAPS XMM1, set 1 ;load XMM1 with point set 1 (packed as x1a, y1a, x1b, y1b)
SUBPS XMM0, XMM1 ;subtract XMM1 from XMM0 (x2a - x1a, y2a - y1a, x2b - x1b, y2b - y1b)
MULPS XMM0, XMM0 ;multiply XMM0 by itself (square xa, ya, xb, yb)
HADDPS XMM0, XMM0 ;add FP values in XMM0 horizontally (xa + ya, xb + yb, xa + ya, xb + yb); HADDPS needs SSE3
SQRTPS XMM0, XMM0 ;calculate the square roots to get distances (a, b, a, b)
MOVAPS XMM1, XMM0 ;copy the distances
SHUFPS XMM0, XMM0, 00h ;broadcast lane 0 to all four lanes (distance a, if xa and ya sit in the low lanes)
SHUFPS XMM1, XMM1, 55h ;broadcast lane 1 to all four lanes (distance b); PUNPCKHDQ/PUNPCKLDQ would leave a, a, b, b in both registers
MOVAPS distance_a, XMM0 ;store distance a
MOVAPS distance_b, XMM1 ;store distance b
RET ;return to caller (no EMMS needed: the XMM registers are separate from the x87/MMX registers)
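It can help to see what actually lands in each lane after the horizontal add. The sketch below emulates HADDPS on four-float arrays in C; it assumes lane 0 is the first value listed in the "packed as" comments, and the function names are hypothetical.

```c
/* Emulates HADDPS dst, src on four-float arrays, lane 0 first. */
void haddps(float dst[4], const float src[4]) {
    float r[4] = { dst[0] + dst[1], dst[2] + dst[3],
                   src[0] + src[1], src[2] + src[3] };
    for (int i = 0; i < 4; ++i)
        dst[i] = r[i];
}

/* Squared distances for two 2D point pairs packed as (xa, ya, xb, yb). */
void squared_dist_two_pairs(const float set1[4], const float set2[4], float out[4]) {
    for (int i = 0; i < 4; ++i) {
        float d = set2[i] - set1[i];  /* SUBPS */
        out[i] = d * d;               /* MULPS */
    }
    haddps(out, out);                 /* HADDPS XMM0, XMM0 */
}
```

With both operands the same register, the result repeats: lanes 0 and 2 hold the squared distance for pair a, lanes 1 and 3 the one for pair b. So after the square root both distances already sit side by side in the two low lanes.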

After messing around with trying to multiply two matrices together I decided to take a break and write something easy, a routine to scale a matrix by a scaling value:

scale_matrix: ;scales a matrix by scaling value xs, ys, zs, n

MOVAPS XMM4, scalar ;load XMM4 with the scaling value (packed as xs, ys, zs, n)
MOVAPS XMM0, column 0 ;load XMM0 with first column (packed as x, y, z, n)
MOVAPS XMM1, column 1 ;load XMM1 with second column (packed same as above)
MULPS XMM0, XMM4 ;multiply first column by scalar
MOVAPS XMM2, column 2 ;load XMM2 with third column (packed same as above)
MULPS XMM1, XMM4 ;multiply second column by scalar
MOVAPS XMM3, column 3 ;load XMM3 with fourth column (packed same as above)
MULPS XMM2, XMM4 ;multiply third column by scalar
MOVAPS column 0, XMM0 ;store first column
MULPS XMM3, XMM4 ;multiply fourth column by scalar
MOVAPS column 1, XMM1 ;store second column
MOVAPS column 2, XMM2 ;store third column
MOVAPS column 3, XMM3 ;store fourth column
RET ;return to caller (no EMMS needed: the XMM registers are separate from the x87/MMX registers)

...and naturally the translation of a matrix....

translate_matrix: ;translates a matrix by translation value xt, yt, zt, n

MOVAPS XMM4, translate ;load XMM4 with the translation value (packed as xt, yt, zt, n)
MOVAPS XMM0, column 0 ;load XMM0 with first column (packed as x, y, z, n)
MOVAPS XMM1, column 1 ;load XMM1 with second column (packed same as above)
ADDPS XMM0, XMM4 ;add translation value to first column
MOVAPS XMM2, column 2 ;load XMM2 with third column (packed same as above)
ADDPS XMM1, XMM4 ;add translation value to second column
MOVAPS XMM3, column 3 ;load XMM3 with fourth column (packed same as above)
ADDPS XMM2, XMM4 ;add translation value to third column
MOVAPS column 0, XMM0 ;store first column
ADDPS XMM3, XMM4 ;add translation value to fourth column
MOVAPS column 1, XMM1 ;store second column
MOVAPS column 2, XMM2 ;store third column
MOVAPS column 3, XMM3 ;store fourth column
RET ;return to caller (no EMMS needed: the XMM registers are separate from the x87/MMX registers)
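For checking the scaling and translation routines above, a scalar C version of the same arithmetic is a handy yardstick. This is a sketch under assumptions: the `Matrix4` layout (four columns of four floats) and the `_ref` function names are hypothetical, chosen to mirror the packed columns the ASM loads into XMM0 through XMM3.

```c
/* Hypothetical layout: four columns of four floats, each column packed as (x, y, z, n). */
typedef struct { float col[4][4]; } Matrix4;

/* Multiply every column lane by lane by the scaling value (one MULPS per column). */
void scale_matrix_ref(Matrix4 *m, const float scalar[4]) {
    for (int c = 0; c < 4; ++c)
        for (int lane = 0; lane < 4; ++lane)
            m->col[c][lane] *= scalar[lane];
}

/* Add the translation value lane by lane to every column (one ADDPS per column). */
void translate_matrix_ref(Matrix4 *m, const float translate[4]) {
    for (int c = 0; c < 4; ++c)
        for (int lane = 0; lane < 4; ++lane)
            m->col[c][lane] += translate[lane];
}
```

Running the ASM and these functions on the same input and comparing all sixteen values is a quick way to catch a swapped register or a store to the wrong column.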

So after running a couple of short errands and wasting some time I thought about what I could add to my routines. The next step was to make 'streaming' versions of the scaling and translation routines:

scale_matrix_stream: ;scales a number of matrices by scaling value xs, ys, zs, n

MOVAPS XMM7, scalar ;load XMM7 with the scaling value (packed as xs, ys, zs, n)
MOV EDX, matrix_ptr ;load EDX with a pointer to the first matrix (assumption: matrices are stored back to back, 64 bytes each, 16-byte aligned)
MOV ECX, repititions ;load ECX with the number of matrices to process
SHR ECX, 1 ;ECX = number of pairs; shift LSB of ECX into CF
JNC checkpairs ;the number of matrices to process is even; skip the single-matrix block
MOVAPS XMM0, [EDX] ;load XMM0 with first column (packed as x, y, z, n)
MOVAPS XMM1, [EDX+16] ;load XMM1 with second column (packed same as above)
MULPS XMM0, XMM7 ;multiply first column by scalar
MOVAPS XMM2, [EDX+32] ;load XMM2 with third column (packed same as above)
MULPS XMM1, XMM7 ;multiply second column by scalar
MOVAPS XMM3, [EDX+48] ;load XMM3 with fourth column (packed same as above)
MULPS XMM2, XMM7 ;multiply third column by scalar
MOVAPS [EDX], XMM0 ;store first column
MULPS XMM3, XMM7 ;multiply fourth column by scalar
MOVAPS [EDX+16], XMM1 ;store second column
MOVAPS [EDX+32], XMM2 ;store third column
MOVAPS [EDX+48], XMM3 ;store fourth column
ADD EDX, 64 ;advance the pointer to the next matrix
checkpairs:
TEST ECX, ECX ;see if ECX = 0
JZ streamdone ;finish if there are no pairs to process
evennumber:
MOVAPS XMM0, [EDX] ;load XMM0 with set 0, first column (packed as x, y, z, n)
MOVAPS XMM1, [EDX+16] ;load XMM1 with set 0, second column (packed same as above)
MULPS XMM0, XMM7 ;multiply set 0, first column by scalar
MOVAPS XMM2, [EDX+32] ;load XMM2 with set 0, third column (packed same as above)
MULPS XMM1, XMM7 ;multiply set 0, second column by scalar
MOVAPS XMM3, [EDX+48] ;load XMM3 with set 0, fourth column (packed same as above)
MULPS XMM2, XMM7 ;multiply set 0, third column by scalar
MOVAPS XMM4, [EDX+64] ;load XMM4 with set 1, first column (packed same as above)
MULPS XMM3, XMM7 ;multiply set 0, fourth column by scalar
MOVAPS XMM5, [EDX+80] ;load XMM5 with set 1, second column (packed same as above)
MULPS XMM4, XMM7 ;multiply set 1, first column by scalar
MOVAPS XMM6, [EDX+96] ;load XMM6 with set 1, third column (packed same as above)
MULPS XMM5, XMM7 ;multiply set 1, second column by scalar
MOVAPS [EDX], XMM0 ;store set 0, first column to free XMM0 for use
MULPS XMM6, XMM7 ;multiply set 1, third column by scalar
MOVAPS XMM0, [EDX+112] ;load XMM0 with set 1, fourth column
MOVAPS [EDX+16], XMM1 ;store set 0, second column
MULPS XMM0, XMM7 ;multiply set 1, fourth column by scalar
MOVAPS [EDX+32], XMM2 ;store set 0, third column
MOVAPS [EDX+48], XMM3 ;store set 0, fourth column
MOVAPS [EDX+64], XMM4 ;store set 1, first column
MOVAPS [EDX+80], XMM5 ;store set 1, second column
MOVAPS [EDX+96], XMM6 ;store set 1, third column
MOVAPS [EDX+112], XMM0 ;store set 1, fourth column
ADD EDX, 128 ;advance the pointer past both matrices
DEC ECX ;decrement counter
JNZ evennumber ;process two more matrices if ECX <> 0
streamdone:
RET ;return to caller (no EMMS needed: the XMM registers are separate from the x87/MMX registers)

If you understand ASM well enough to read that, you may ask why I have put two separate calculation blocks in the routine. It's simple: I want to process as many elements at one time as I can without any branching. So the first block is processed only if the number of matrices to process is odd, then the code continues to the next block, where two matrices are processed per pass before branching.
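The control flow described here (peel off one matrix when the count is odd, then handle the rest two at a time) can be sketched in C. The `Matrix4` layout and the function names are hypothetical, and the sketch assumes the matrices are stored back to back in one array. It also treats a count of zero as "do nothing", which the pair loop has to check for explicitly.

```c
#include <stddef.h>

/* Hypothetical layout: four columns of four floats, each column packed as (x, y, z, n). */
typedef struct { float col[4][4]; } Matrix4;

/* Scale one matrix: every column is multiplied lane by lane by the scaling value. */
void scale_one(Matrix4 *m, const float scalar[4]) {
    for (int c = 0; c < 4; ++c)
        for (int lane = 0; lane < 4; ++lane)
            m->col[c][lane] *= scalar[lane];
}

/* Odd count: handle one matrix first, then process the rest two per iteration. */
void scale_matrix_stream_ref(Matrix4 *m, size_t count, const float scalar[4]) {
    size_t pairs = count >> 1;      /* SHR ECX, 1 */
    if (count & 1) {                /* the bit that SHR moved into CF */
        scale_one(m, scalar);
        m += 1;                     /* step past the single matrix */
    }
    while (pairs != 0) {            /* also covers a count of 0 or 1 */
        scale_one(&m[0], scalar);
        scale_one(&m[1], scalar);
        m += 2;                     /* step past the pair */
        --pairs;
    }
}
```

Halving the number of loop branches is the whole point of the second block; the first block exists only so the pair loop never has to deal with a leftover matrix.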

So that's all for now. Feel free to comment as you like, but please keep comments constructive, otherwise I will just ignore them. Bye bye :)

Comments

ApochPiQ

Good luck! Windows is a massive beast, but it's very rewarding once you tame it. Anyone with a Lego avatar has their head on straight, so you should do fine [wink]

Oh, and as I believe is customary, welcome to journal land!

November 05, 2005 12:06 PM

Roboguy

That sounds a bit like OO programming. Out of curiosity, do you just do asm programming, or do you program in other languages also? And, also, if you don't want to go to Windows programming, you could look at some of the other libraries out there (SDL, Allegro, OpenGL, wxWindows, etc).

November 05, 2005 02:54 PM

Caitlin

I only know ASM and a little basic but I'm sure that if I sat down and tried to learn basic it wouldn't be too hard. I have tried at C++ and I usually end up staring at the screen with a blank look on my face and drool coming out of my mouth. The main reason I don't care to go into Windows is that I'm used to system programming, and under Windows it seems like you are limited in what you can do as far as the system is concerned (short of doing some trickery to get around Windows).

November 05, 2005 03:08 PM

Roboguy

I would recommend learning Python. It's easy to learn, it's pretty powerful (it even has some features that C++ doesn't have) and very good library support.

November 05, 2005 03:35 PM

evelyn

Welcome to journalal-ing-a-ling land have some + as a customary welcome to it :)

November 05, 2005 04:14 PM

Sir Sapo

Welcome, mandatory rate++[grin]!

November 05, 2005 05:34 PM

Trapper Zoid

If you can code anything advanced in assembly then you have my respect. Anything more advanced than a few simple subroutines and my brain starts to melt [smile].

If you can handle ASM, then C shouldn't be too much of a stretch to learn. I haven't really got the hang of Windows programming too, because it seems to me to consist too much of looking up the right function calls in a great big reference book.

Best of luck, and it's always good to see another journal around!

November 05, 2005 06:48 PM

Oluseyi

Quote:Original post by Caitlin
The main reason I don't care to go into Windows is that I'm used to system programming, and under Windows it seems like you are limited in what you can do as far as the system is concerned (short of doing some trickery to get around Windows).

a.) Grab the DDK.
b.) Get dirty with the documentation-averse Native API.

Welcome to Journal Land.

November 05, 2005 06:52 PM

Caitlin

Thanks for the welcomes and links :)

Writing ASM isn't really too hard as long as you keep your code's major parts in their own routines and comment them well. What has worked best for me is first writing a routine's core function, then looking everything up to make sure I'm using the proper instructions. After that it's optimization time, and then I can finally write entry and exit code such as setting up registers and pointers, storing data, etc.

November 05, 2005 08:15 PM

Daerax

A first. Usually it is assembly that is looked at in confuselated awe and C++ that is turned to for beginners. Impressive.

You know... your first sentence doesn't really say much. In fact, it doesn't say anything at all..

November 05, 2005 09:37 PM
