
zoggo

Asm/C Performance


The following assembly code is generated by MS Visual C++ 6:

    push ebp
    mov ebp,esp
    push ebx
    push esi
    push edi
    fld dword [ ds:0x4080D0 ]
    pop edi
    pop esi
    pop ebx
    pop ebp
    ret

for this function:

    float dotf( vec3 v1, vec3 v2 )
    {
        return 0;
    }

However, when I assemble the code using NASM (exact same calling conventions etc.), it performs consistently slower than the compiler-generated code. Does anybody know why? I assume NASM is generating slightly different machine code, but I don't know what the difference is or how to change it.

It is not being inlined; I can see this from a debugger. It is not dependent on where it is called from. Also, it is not a timing issue, as various profilers and hand-coded timer systems all show the function as being slower.

Thanks in advance for any help received.

James

It's not being inlined because I have turned inlining off. What I want to know is why two pieces of seemingly identical ASM are performing differently.

The idea is that I have two functions doing the same thing, one in C and one in assembly. I don't want either to be inlined, I just want the asm one to run at the same speed as the C one.

Sorry I didn't make that clear.

Thanks anyway

I suspect it's a data access problem, but check the listing files to see whether both generated the same machine code. I don't remember the option for VC, but in NASM use the -l [listing file] option.

I guess my suggestion would be to link both in under different function names and compare the disassembly of the functions directly from the exe with a disassembler. How are you determining that one is slower than the other?
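
One way to answer the "how are you determining that one is slower" question is a small harness that calls each version through a function pointer and keeps the best time over several runs. The following is only a sketch: the `vec3` layout, the names `dotf_c` and `best_time`, and the use of `clock()` are assumptions, since the thread never shows the original poster's type definition or timing code.

```c
#include <time.h>

/* Assumed layout; the thread never shows how vec3 is defined. */
typedef struct { float x, y, z; } vec3;

typedef float (*dot_fn)(vec3, vec3);

/* Stand-in for the C version from the first post. */
float dotf_c(vec3 v1, vec3 v2)
{
    (void)v1;
    (void)v2;
    return 0;
}

/* Times `runs` batches of `calls` calls each and returns the fastest
   batch in seconds. Taking the minimum reduces noise from other
   processes. Returns -1.0 if runs is zero or negative. */
double best_time(dot_fn fn, long calls, int runs)
{
    vec3 a = { 1.0f, 2.0f, 3.0f };
    vec3 b = { 4.0f, 5.0f, 6.0f };
    volatile float sink = 0.0f; /* keeps the compiler from dropping the calls */
    double best = -1.0;
    int r;
    long i;

    for (r = 0; r < runs; ++r) {
        clock_t start = clock();
        double elapsed;

        for (i = 0; i < calls; ++i)
            sink = fn(a, b);

        elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    (void)sink;
    return best;
}
```

Calling `best_time(dotf_c, 10000000L, 5)` and then the same with the NASM-assembled function linked in under a different name gives two numbers to compare. Because both go through the same function pointer, the call overhead is the same on each side.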

Thanks to all those who replied.

I have figured it out now. One, I was being stupid and looking at the wrong disassembly. Two, and more interestingly, VC generates a 6-byte instruction for FLD but NASM generates a 7-byte one. If anyone knows why, I would love to know.

However, having forced NASM to use the 6-byte version by means of many a DB, the performance difference is negligible.

Just out of interest, does the NASM (7-byte) version differ from the 6-byte version only by a 0x3e at the start, i.e. is the 7-byte code "0x3e d1 d2 d3 d4 d5 d6" where the 6-byte version is "d1 d2 d3 d4 d5 d6"?

Skizz
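
The check Skizz describes can be sketched in C. On x86, 0x3E is the DS segment-override prefix, and FLD with a 32-bit memory operand at an absolute address encodes as D9 05 followed by the little-endian address, which is 6 bytes for 0x4080D0. The 7-byte array below is an assumption about what NASM emitted for the explicit `ds:` in the source; the thread never confirms the actual bytes. The names `fld_short`, `fld_long` and `differs_only_by_ds_prefix` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define DS_OVERRIDE 0x3E /* DS segment-override prefix byte */

/* fld dword [0x4080D0]: opcode D9, ModRM 05 (/0 with disp32),
   then the address in little-endian order. */
const unsigned char fld_short[] = { 0xD9, 0x05, 0xD0, 0x80, 0x40, 0x00 };

/* Assumed NASM output: the same instruction with a redundant
   DS override in front. */
const unsigned char fld_long[] = { 0x3E, 0xD9, 0x05, 0xD0, 0x80, 0x40, 0x00 };

/* Returns 1 if `longer` is exactly `shorter` preceded by a single
   0x3E byte, 0 otherwise. */
int differs_only_by_ds_prefix(const unsigned char *shorter, size_t short_len,
                              const unsigned char *longer, size_t long_len)
{
    return long_len == short_len + 1
        && longer[0] == DS_OVERRIDE
        && memcmp(longer + 1, shorter, short_len) == 0;
}
```

If the bytes copied from the two listing files pass this check, the only difference is the redundant prefix. In that case, dropping the explicit `ds:` from the NASM source is probably a simpler way to get the short form than emitting the bytes with DB.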

Guest Anonymous Poster
Just wondering why, if you're returning zero, you're loading the constant from memory. AFAIK there's an FLDZ instruction on the FPU to handle precisely this case.

I'm not using FLDZ because I wanted to replicate the VC++ asm code exactly in NASM. In their infinite wisdom, the MS compiler writers decided not to use FLDZ and instead load 0 constants from memory.
