Some micro-optimization thing...

Heya, me again. Still working on that VB college project.

MSVC automatically converts multiplications and divisions by powers of 2 into shifts. For instance...

SomeVar = OtherVar * 16;

...becomes (sorry, my ASM is rusty but I'm sure you'll get the idea)...

MOV eax, (OtherVar)
LSL eax, 0x00000004
MOV (SomeVar), eax

(Which is OtherVar << 4...)

This is quite a good thing as shifts are, last I checked, considerably faster than MULs. The fact that MSVC automatically changes the more obvious ones for you when compiling is even better.

What about VB? I have a couple of rather tight loops that involve multiplications by 16 that, I believe, could be significantly improved by this. Does VB automatically convert obvious multiplications and divisions into shifts when compiling or do they remain MULs? Is there a way to speed up multiplications/divisions by 16 (or any power of 2, for that matter)? Should I just build a lookup table and hope for the best?

Thanks

I know this sounds a little ridiculous, what with fussing over what looks to be a really small issue, but as I've said, I end up having a good deal of multiplications/divisions by powers of 2 performed every time the map is rendered and I can see a significant improvement if I were to convert all of these into shifts somehow... I've already looked into the more major optimizations and, on the average PC, it runs at top speed. But the school PCs are a little under-par and I get a slight, not very noticeable but annoying lag. My old man's PC runs it like crap. So I might as well squeeze out every last drop of power I can, I have the time... This isn't due until the end of the semester.

[edited by - RuneLancer on November 12, 2003 7:06:47 PM]
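A minimal C sketch of the equivalence being described, assuming unsigned 32-bit values; the function names are made up for illustration. For unsigned operands, multiplying by 16 and shifting left by 4 give the same result (including wrap-around), and dividing by 16 matches shifting right by 4. Signed negative values are the awkward case: division truncates toward zero, so a compiler has to add a small fix-up around the shift.

```c
#include <stdint.h>

/* Multiply by 16 the "obvious" way; an optimizing C compiler is free to
   emit a shift for this instead of a multiply instruction. */
uint32_t times16_mul(uint32_t x) { return x * 16u; }

/* The same operation written as an explicit left shift by 4 bits. */
uint32_t times16_shl(uint32_t x) { return x << 4; }

/* Unsigned division by 16, written as a division... */
uint32_t div16_div(uint32_t x) { return x / 16u; }

/* ...and as a right shift by 4 bits. */
uint32_t div16_shr(uint32_t x) { return x >> 4; }
```

Both pairs agree for every unsigned input, which is why the source code can keep the readable `* 16` form when the compiler is known to do the rewrite.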

In an interpreted language, the time to load, interpret and execute the code is probably greater than the mult itself... so using shifts instead of mults is always good practice (during optimization!) but not so important here.

If you have to multiply small numbers (i.e. bytes), lookup tables can be a good solution.
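A rough sketch of the byte-sized lookup table idea, written in C for illustration; the table and function names are hypothetical. The table is filled once, after which each "multiplication" is a single array read.

```c
#include <stdint.h>

/* One entry per possible byte value: times16_table[i] == i * 16. */
static uint16_t times16_table[256];

/* Fill the table once at start-up. */
void init_times16_table(void) {
    for (int i = 0; i < 256; ++i) {
        times16_table[i] = (uint16_t)(i * 16);
    }
}

/* Replace the multiplication with an indexed read. */
uint16_t times16_lookup(uint8_t value) {
    return times16_table[value];
}
```

Whether this beats a plain multiply depends on the language and hardware; the memory read is not free either.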

IMO: If you're micro-optimising or even optimising VB code then your overall architecture design has gone wrong somewhere!

If you have some performance critical routine in your program, do that routine in C or C++ (with inline ASM if you're 100% sure you can beat what the compiler produces). Write the rest of the system in VB.


Other than that, I've no idea what will be produced in the compiled result (why not try it and check?). With VB.NET, the JIT stuff is likely to spot obvious stuff. With compiled VB I'm not sure.

So how would I go about using shifts with VB?

I got some very... odd results testing a lookup table. I ran two rather unoptimized loops that did nothing but multiply random numbers for a second and keep track of how many multiplications they made. Both had the exact same code to avoid any ambiguities save for one difference: the first was a clean "* 16". The other was simply a lookup table. At first there was a noticeable increase in speed but after running it a few times...

(multiplications counted per one-second run; left: "* 16", right: lookup table)
18459 ... 21702
20640 ... 21848
21111 ... 21843
21656 ... 21816
21265 ... 21825
21669 ... 21796
20719 ... 21791
21793 ... 21816

Strange. :/ Did I overlook something (caching, maybe?) or is my PC just evolving into a greater lifeform?
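For reference, the kind of test described here can be sketched in C as below. This is an assumption-laden reconstruction, not the poster's code: the names, the pseudo-random generator and the use of `clock()` are all made up for illustration. One plausible reading of the numbers is that generating the random input and checking the timer cost far more than either a multiply or a small table read, so both variants end up limited by the same overhead.

```c
#include <stdint.h>
#include <time.h>

static uint32_t table16[256];       /* hypothetical lookup table: i * 16 */
static volatile uint32_t sink;      /* keeps the work from being optimized away */

void init_table16(void) {
    for (uint32_t i = 0; i < 256; ++i) table16[i] = i * 16u;
}

uint32_t by_multiply(uint32_t v) { return v * 16u; }
uint32_t by_lookup(uint32_t v)   { return table16[v & 0xFFu]; }

/* Count how many calls to op() complete within roughly `seconds` of CPU time.
   The timer check and the random-number step run on every iteration, so they
   are part of what is being measured. */
unsigned long count_ops(uint32_t (*op)(uint32_t), double seconds) {
    unsigned long count = 0;
    uint32_t v = 1;
    clock_t limit = clock() + (clock_t)(seconds * CLOCKS_PER_SEC);
    while (clock() < limit) {
        v = v * 1664525u + 1013904223u;   /* cheap pseudo-random input */
        sink = op(v & 0xFFu);
        ++count;
    }
    return count;
}
```

Comparing `count_ops(by_multiply, 1.0)` with `count_ops(by_lookup, 1.0)` over several runs gives figures of the same shape as the table above; a fairer test would time a fixed, large number of iterations with the timer read only at the start and end.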

quote:
Original post by S1CA
If you have some performance critical routine in your program, do that routine in C or C++ (with inline ASM if you're 100% sure you can beat what the compiler produces). Write the rest of the system in VB.


Sadly, the problem is that I can't use C/C++ or any API other than the "built-in" Windows ones. That means I have to stick to the GDI instead of DirectX, for instance. Which hurts a lot since I'm making an RPG (tile-based, but an RPG nevertheless).

So since all of my code has to be "pure" VB with some Win32 API calls, I'm kinda stuck. If I've already optimized something to the best of my abilities, micro-optimization (where it counts, such as very tight loops called repeatedly; the case here) is the last thing I can do.

What sucks is that I'll probably rewrite the game in C++ with DirectX once I've handed in my project (I like where I'm going with this) and it'll be leagues easier (and faster!) than now. At least I've learned quite a number of interesting optimizations for VB (who'd have thought mid and mid$ were actually different; I, for one, didn't know the $ served a purpose).

To shift a number in VB6 you use *.
To combine flags, you use +.
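A small C sketch of why `+` only works for combining flags when the flag is not already set; the flag names are hypothetical. Adding a flag twice carries into the next bit, while a bitwise OR is safe to repeat.

```c
#include <stdint.h>

/* Hypothetical tile flags, one bit each. */
enum {
    FLAG_SOLID    = 1u << 0,
    FLAG_WATER    = 1u << 1,
    FLAG_ANIMATED = 1u << 2
};

/* Combine with addition: correct only if `flag` is not already in `flags`. */
uint32_t add_flag_with_plus(uint32_t flags, uint32_t flag) {
    return flags + flag;
}

/* Combine with bitwise OR: setting the same flag twice changes nothing. */
uint32_t add_flag_with_or(uint32_t flags, uint32_t flag) {
    return flags | flag;
}
```

With `FLAG_SOLID` already set, adding `FLAG_SOLID` again yields 2 (which reads as `FLAG_WATER`), whereas OR leaves the value at 1.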

There are modules for VB to use DirectX.
It's not much different than using Win32 API calls.

IIRC, VB used to be compiled to p-code, and an exe consisted of the execution engine and the p-code wad. Maybe they actually made a compiler for VB6...

[edited by - Magmai Kai Holmlor on November 12, 2003 9:32:00 PM]

Actually, * multiplies, it doesn't shift. The effect is the same but unless VB optimizes it once compiled, it results in a MUL instead of a LSL (which is something like 50 cycles instead of 4-5 at best; at least on a 586 architecture, dunno how newer processors perform but I'd guess it's a bit faster). A bit of google-work revealed that .NET DOES have a bit-shift operator ( << and >> ) but we're stuck using 6. So that's irrelevant.

As stated before, and as will be repeated because I wish to stress this, I cannot use ANYTHING BUT THE WIN32 API as this is an open-ended college project (the conditions were pretty simple: do something not seen in class so long as you stick to the Win32 API at most; believe me, I asked if DirectX, external DLLs written in another language or anything of the sort could be used and the answer, in all cases, was a rather annoying 'no'; I'd gladly code the graphic side of the game in C++ with DirectX if I could). So I'm forced to make do with what I have. I wouldn't waste any time on optimization other than in major areas if it were otherwise.

I've found a function in the Win32 API that allows shifting of 64 bit integers ("large ints" or some name to that end). What I'm hoping for is a 32-bit equivalent (or 16, or 8; 32 would be best for type reasons though) of that or maybe some other work-around for many integer multiplications and divisions (the only one that comes to mind is a lookup table but who knows, just because I don't know of it doesn't mean it doesn't exist).

VB6 can compile to native code, which allows for quite a number of optimizations (at least, compared to p-code). To be honest, I'm not quite sure what goes in the executable but since it's a few ks smaller (~30-50), I guess the execution engine is removed and the code is more stand-alone. Still requires a runtime DLL though... :/

There is no intrinsic bitwise shift in VB6.
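Without a shift operator, the usual workaround is arithmetic: multiply by a power of two to shift left and integer-divide by one to shift right. The sketch below shows the idea in C with unsigned 32-bit values; the table and function names are made up for illustration, and a language with only signed integers and overflow checks would need extra care around the top bit.

```c
#include <assert.h>
#include <stdint.h>

/* Powers of two 2^0 .. 2^31, filled once by init_pow2(). */
static uint32_t pow2[32];

void init_pow2(void) {
    pow2[0] = 1u;
    for (int i = 1; i < 32; ++i) {
        pow2[i] = pow2[i - 1] * 2u;
    }
}

/* Left shift emulated as multiplication by 2^bits (wraps modulo 2^32). */
uint32_t shl_emulated(uint32_t value, unsigned bits) {
    assert(bits < 32);
    return value * pow2[bits];
}

/* Logical right shift emulated as integer division by 2^bits. */
uint32_t shr_emulated(uint32_t value, unsigned bits) {
    assert(bits < 32);
    return value / pow2[bits];
}
```

This gives shift semantics, not shift speed: the work is still a multiply or a divide unless the compiler recognizes the constant.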

Have you looked at the Advanced Compile options for your project? They might help you performance-wise.

(Project->Properties->Compile->Advanced Optimizations)

I guess you have already though.

Cheers,
Paul Cunningham

You can find out easily if it is doing those optimizations by simply decompiling. Go to the loop that you want to look at, then put something in your VB code like:

dim somevar as long 'integer if you are using .net
somevar = 45634534
' your loop here

Now compile the exe and search for '45634534' or '2B853E6' (the same number in hex).

Right below it should be your loop (hopefully unrolled), but you will notice the compare in the loop because it will look something like this:

mov eax, DWORD [lala]
...
shl eax, 4 ; multiply by 16
...
cmp eax, DWORD [lala2]
jxx SomeAddr

If you see the shl, you know it's optimizing. If you see a bunch of random crap that looks like the inside of your loop, it's probably unrolling.
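One practical detail of the marker trick when searching the raw exe rather than a disassembly listing: x86 stores 32-bit immediates in little-endian order, so 0x02B853E6 shows up as the bytes E6 53 B8 02. The C helper below is a hypothetical sketch of such a search over a buffer holding the file contents.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value as the 4 little-endian bytes x86 uses for immediates. */
void to_little_endian(uint32_t value, uint8_t out[4]) {
    for (int i = 0; i < 4; ++i) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Return the offset of the first occurrence of the marker in buf, or -1. */
long find_marker(const uint8_t *buf, size_t len, uint32_t marker) {
    uint8_t needle[4];
    to_little_endian(marker, needle);
    for (size_t i = 0; i + 4 <= len; ++i) {
        if (memcmp(buf + i, needle, 4) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Calling `find_marker(image, size, 45634534u)` on the loaded file gives the offset to start reading from in a disassembler.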

Good Luck!
Kenny

P.S. Why not start on your DX remake now, while you have the extra time, instead of trying to optimize multiplications... seriously, unless that loop is looping more than 10,000 times a second, you will never see the difference. (that's only ~20,000-30,000 clocks)

quote:
Original post by sophisticatedlimabean
p.S. why not start on your DX remake now, while you have the extra time, instead of trying to optimize multiplications... seriously unless that loop is looping more than 10,000 times a second, you will never see the difference. (that's only ~20,000-30,000 clocks)


Meh, gotta finish the VB version for college first. Once that's out of the way I can slow down and focus on adding spiffy little features. Like palette-swapped blades of grass (oooh!) or maybe a green sky (aaah!)

The main "problem" loop is responsible for drawing every tile at every layer. The viewable map is roughly 40x30 (16x16 tiles in a 640x480 screen) and has three layers (tiles, objects and foreground tiles such as branches; sprites not included). Which is 3600 calls to two multiplications and... ugh... two bitblts (for transparency; the mask then the tile, except for the first layer which just blits the tile without transparency as it would be pointless). Although it alone doesn't account for more than a small handful of lost FPS, because of the game's specs it runs slow enough as it is (I've been able to squeeze 5-6 FPS on my 4.5 year old PC; even 3-4 more would be a blessing, though it's not unplayable).
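For a loop of this shape the two multiplications per tile can be dropped altogether, with or without shifts, by keeping running pixel coordinates and adding the tile size each step. The C sketch below assumes the 40x30x3 layout described above; the callback type and the logging helper are hypothetical stand-ins for the real drawing call.

```c
enum { TILE_SIZE = 16, MAP_W = 40, MAP_H = 30, LAYERS = 3 };

/* Hypothetical stand-in for the real drawing call (BitBlt in the thread). */
typedef void (*draw_tile_fn)(int layer, int pixel_x, int pixel_y, void *ctx);

/* Example callback: records how often and where the last tile was drawn. */
typedef struct { int last_x, last_y, calls; } draw_log;

void log_tile(int layer, int pixel_x, int pixel_y, void *ctx) {
    draw_log *log = ctx;
    (void)layer;
    log->last_x = pixel_x;
    log->last_y = pixel_y;
    log->calls += 1;
}

/* Visit every visible tile, stepping pixel coordinates by TILE_SIZE
   instead of computing tile_x * 16 and tile_y * 16 each time. */
int draw_visible_map(draw_tile_fn draw, void *ctx) {
    int drawn = 0;
    for (int layer = 0; layer < LAYERS; ++layer) {
        int pixel_y = 0;
        for (int ty = 0; ty < MAP_H; ++ty, pixel_y += TILE_SIZE) {
            int pixel_x = 0;
            for (int tx = 0; tx < MAP_W; ++tx, pixel_x += TILE_SIZE) {
                draw(layer, pixel_x, pixel_y, ctx);
                ++drawn;
            }
        }
    }
    return drawn;
}
```

The same running-total pattern carries over directly to a VB `For` loop, since it needs only addition.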

I didn't think there was a way to shift in VB but, hey, ya never know, there might've been. Can anyone recommend a good decompiler? I'd shoot myself if I turned out to be wasting my time on this (not literally) ^^;

As for the advanced optimization options, they allowed me to fit in an extra frame or two. My personal goal is 10, so that's why I'm hoping to get as much out of that problem loop as possible (though frankly, the sheer amount of 16x16 bitblts is more the problem than the multiplications. Unfortunately, the resource leak in TransparentBlt made me shy away from it).

Thanks for the help

Edit: I'm at college right now. Forget it all. We've just received new PCs today and it runs at almost 200 FPS here. So much for need for optimization.

[edited by - RuneLancer on November 13, 2003 12:31:21 PM]

quote:
Original post by RuneLancer
Edit: I'm at college right now. Forget it all. We've just received new PCs today and it runs at almost 200 FPS here. So much for need for optimization.

heh heh, don't you just love it when that happens?!?!

For a better compiler, you might want to try PowerBasic -- it's not free, but it generates good code, and when it comes down to it, you can write your own inline asm for the loops if you need to.

I would suggest, if I were you, making a large "bitmap" and then using your own asm functions to blt your images to that, before doing the final BitBlt Windows function. I don't know how fast that function would be if you wrote it in VB, seeing as there are no bit shifts, but it still *might* save you tons of time (and score hardcore points with your professor).
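The "large bitmap" idea amounts to compositing every tile into an off-screen pixel buffer in memory and presenting the finished frame with a single blit. The C sketch below shows only the compositing step, assuming 32-bit pixels, a 640x480 buffer and a colour key for transparency; all names are hypothetical, and the final copy to the window is left out.

```c
#include <stdint.h>

enum { SCREEN_W = 640, SCREEN_H = 480, TILE = 16 };

/* Copy one 16x16 tile into the off-screen buffer, skipping pixels equal to
   the transparent colour key. The caller must make sure the tile lies fully
   inside the SCREEN_W x SCREEN_H buffer. */
void blit_tile_keyed(uint32_t *screen, const uint32_t *tile,
                     int dst_x, int dst_y, uint32_t key) {
    for (int row = 0; row < TILE; ++row) {
        uint32_t *dst = screen + (dst_y + row) * SCREEN_W + dst_x;
        const uint32_t *src = tile + row * TILE;
        for (int col = 0; col < TILE; ++col) {
            if (src[col] != key) {
                dst[col] = src[col];
            }
        }
    }
}
```

A colour key replaces the mask-then-tile pair of blits with one pass per tile; whether that is a net win in pure VB would have to be measured.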

Clearly, your blt is the bottleneck, so I would optimize that...

P.S. MUL is not 50 cycles and LSL (which doesn't exist) is also not 5 cycles. mul is about 5-11 (depending on your processor) and shl (shift left register by immediate) is 1 cycle.
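Plugging the thread's own figures into a quick estimate shows how little is at stake: 3600 tile draws per frame, two multiplications each, a worst case of 11 cycles per mul against 1 per shl, at the 10 FPS target. The C helpers below are a hypothetical sketch of that arithmetic; the 500 MHz clock used in the note afterwards is an assumption, not a figure from the thread.

```c
/* Cycles saved per second if every multiplication became a shift. */
unsigned long cycles_saved_per_second(unsigned long tile_draws_per_frame,
                                      unsigned muls_per_tile,
                                      unsigned mul_cycles,
                                      unsigned shift_cycles,
                                      unsigned frames_per_second) {
    unsigned long muls_per_frame = tile_draws_per_frame * muls_per_tile;
    return muls_per_frame * (mul_cycles - shift_cycles) * frames_per_second;
}

/* Share of the CPU those cycles represent, for a given clock rate in Hz. */
double fraction_of_cpu(unsigned long cycles_per_second, double cpu_hz) {
    return (double)cycles_per_second / cpu_hz;
}
```

`cycles_saved_per_second(3600, 2, 11, 1, 10)` comes to 720,000 cycles per second, which on an assumed 500 MHz machine is roughly 0.14% of the CPU, consistent with the blits being the real cost.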

For a quick asm reference, you may want to check this gamedev article:
http://www.gamedev.net/reference/articles/article208.asp
http://www.gamedev.net/reference/list.asp?categoryid=20#5

EDIT: lol, cancel that

[edited by - sophisticatedlimabean on November 13, 2003 4:06:45 PM]
