
ridiculous compiler output...

10 replies to this topic

#1 BGB   Crossbones+   -  Reputation: 1554


Posted 17 September 2013 - 12:35 AM

I was off compiling something, having some fairly good success...

 

basically, I went and wrote an implementation of the Apple Video codec (partly because, hell, why not?...), and was thinking things were doing pretty good.

after all, the thing is going along, decoding video at a rate of about 430 Mpix/s (megapixels per second), ...

 

side info:

http://en.wikipedia.org/wiki/Apple_Video

http://wiki.multimedia.cx/index.php?title=Apple_RPZA

 

 

then I was like, "maybe I will run it through the profiler, and see what the profiler has to say about it."

 

but, then I saw it...

 

 

the expressions:
tb[6]=0; tb[7]=0;

compile to:
0xfe08c6a mov edx,00000001h BA 01 00 00 00 0.44537419
0xfe08c6f imul eax,edx,06h 6B C2 06
0xfe08c72 mov [ebp-44h],eax 89 45 BC 0.21256495
0xfe08c75 cmp [ebp-44h],10h 83 7D BC 10
0xfe08c79 jnb $+04h (0xfe08c7d) 73 02 1.38335919
0xfe08c7b jmp $+07h (0xfe08c82) EB 05 0.19569471
0xfe08c7d call $+00018ce1h (0xfe2195e) E8 DC 8C 01 00
0xfe08c82 mov ecx,[ebp-44h] 8B 4D BC
0xfe08c85 mov [ebp+ecx-14h],00h C6 44 0D EC 00 0.25305352
0xfe08c8a mov edx,00000001h BA 01 00 00 00 0.50273299
0xfe08c8f imul eax,edx,07h 6B C2 07
0xfe08c92 mov [ebp-3ch],eax 89 45 C4 0.30703825
0xfe08c95 cmp [ebp-3ch],10h 83 7D C4 10
0xfe08c99 jnb $+04h (0xfe08c9d) 73 02 1.26189351
0xfe08c9b jmp $+07h (0xfe08ca2) EB 05 0.2598016
0xfe08c9d call $+00018cc1h (0xfe2195e) E8 BC 8C 01 00
0xfe08ca2 mov ecx,[ebp-3ch] 8B 4D C4
0xfe08ca5 mov [ebp+ecx-14h],00h C6 44 0D EC 00 0.00337405


which is actually, pretty much, just plain ridiculous...

 

(pretty much the most absurd compiler output I have seen recently...).

 

 

like, whatever happened to something like:

xor eax, eax

mov [...], al

or:

mov [...], 0x00

 

?...

 

 

or, maybe it is just my fault for trying to apply logic to compiler output sometimes?...

 




#2 farmdve   Members   -  Reputation: 194


Posted 17 September 2013 - 01:28 AM

I like it when people re-implement something written by big corporations like Google, Microsoft or Apple. But were the specifications for this codec available and sufficient for you to implement it, or did it take reverse engineering (or is the whole thing open source somewhere)?

 

Also, what compiler are you using? And to me, it looks like you are proficient in assembly, so I am sure you can inline some assembly to handle that part where you say the compiler is doing a nasty job of optimizing.

 

However, if all you are doing is assigning an integer to whatever that index is pointing to, then that is a lot of instructions for something so simple, but hey, the only thing I know about assembly is what it looks like (which depends on the architecture), so I could be wrong.


Edited by farmdve, 17 September 2013 - 01:32 AM.


#3 frob   Moderators   -  Reputation: 21318


Posted 17 September 2013 - 01:47 AM

Considering that the assembly you posted makes two function calls to (0xfe2195e) and is referencing other objects (what is at -44h and -3ch?) it is pretty clear that you left out some important information from your problem description.

The function calls are conditional, the locations of the elements are computed, and it looks like either object existence or size bounds are being tested. My hunch is that it isn't a simple array at all, but is instead a sparse array or other container class, and objects are potentially getting created and added to it if they were not already present at those two locations.

#4 patrrr   Members   -  Reputation: 1006


Posted 17 September 2013 - 02:06 AM

Are you looking at an optimized release build, or debug? Maybe the compiler injected some bounds-checking. Or maybe it just rearranged your code and all that other stuff is actually from other lines, to optimize pipelining for example.


Edited by patrrr, 17 September 2013 - 02:09 AM.


#5 BGB   Crossbones+   -  Reputation: 1554


Posted 17 September 2013 - 02:08 AM

frob wrote: "Considering that the assembly you posted makes two function calls to (0xfe2195e) and is referencing other objects (what is at -44h and -3ch?) it is pretty clear that you left out some important information from your problem description."

 

 

if you look closely though, it seems that this is actually the code generated for those two array-index assignments.

 

it does actually follow through and assign the array elements, just it does so in a very roundabout way (far more so than normal).

 

 

those locations are (I suspect) temporaries on the stack, and the function-calls are compiler-inserted array bound-check handlers (newer versions of MSVC insert these).

 

 

here is the context of the code fragment where this came from:

    byte tb[16];
    byte *cs, *cse, *csm;
    byte *ct, *cte;
    int xs1, ys1, n, n1;
    int i, j, k, l;

...

        case 0xA0:
            j=(cs[0]<<8)|cs[1];
            cs+=2;
            j=((j&0x7FE0)<<1)|(j&0x1F);

            tb[0]=j&0xFF;
            tb[1]=(j>>8)&0xFF;
            tb[2]=j&0xFF;
            tb[3]=(j>>8)&0xFF;
            tb[4]=0; tb[5]=0;    //
            tb[6]=0; tb[7]=0;    //here

            n1=(i&31)+1;
            for(i=0; i<n1; i++)
                { memcpy(ct, tb, 8); ct+=stride; }
            
//            ct+=(i&31)*8;
            break;

where each of the pairs of assignments generated the same sequences of instructions...
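
Read back into C, each of those instruction sequences does roughly the following. This is only a sketch: `range_check_failed` is a made-up stand-in for the handler at 0xfe2195e (the thread never names it), and reading the `imul` as "element size times index" is an interpretation of the listing, not something the compiler documents.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef unsigned char byte;

/* Hypothetical stand-in for the compiler-inserted failure handler. */
static void range_check_failed(void)
{
    fprintf(stderr, "array index out of range\n");
    abort();
}

/* Approximately what the debug build emits for "tb[idx]=0" on byte tb[16]. */
static void checked_store_zero(byte *tb, size_t idx)
{
    size_t elem_size = 1;            /* mov edx,00000001h              */
    size_t tmp = elem_size * idx;    /* imul eax,edx,idx ; spill to a stack temporary */

    if (tmp >= 16)                   /* cmp [temp],10h ; jnb           */
        range_check_failed();        /* call 0xfe2195e                 */

    tb[tmp] = 0;                     /* mov ecx,[temp] ; mov [ebp+ecx-14h],00h */
}
```

Calling `checked_store_zero(tb, 6)` and `checked_store_zero(tb, 7)` gives the same result as the two plain stores; the multiply, the spill and the compare are all work on values that are known constants at compile time.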

 

IOW:

it appears as if MSVC has gone and generated a few sequences of unusually bad code...

 

compiler: Visual Studio Express 2013 RC

language: C



#6 BGB   Crossbones+   -  Reputation: 1554


Posted 17 September 2013 - 02:17 AM

patrrr wrote: "Are you looking at an optimized release build, or debug? Maybe the compiler injected some bounds-checking."

 

this is a debug build.

while debug code isn't usually particularly good, it usually isn't quite this bad.

 

the profiler (AMD CodeXL) only really works correctly on debug builds (in my experience), so you have to do a debug build to profile things.

 

 

but, yes, it is compiler-inserted bounds checking...

and some particularly bizarre way of calculating the array offset.

...

 

 

ADD:

switching over to using a pointer rather than a local array, and making a few minor changes (using an #ifdef to eliminate the use of "memcpy()"), did somewhat boost the performance of the debug build, but had negligible impact on a release build (this implies that most likely the release build did not use the bounds-checks).
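
A minimal sketch of what such a change could look like, assuming the same solid-color case as in the fragment posted earlier. The function name `emit_solid_run` and the macro `AVOID_MEMCPY` are invented for illustration; the thread only says that a pointer and an #ifdef were used, not what they were called.

```c
#include <string.h>

typedef unsigned char byte;

/* Hypothetical switch name; the thread does not give the real one. */
#define AVOID_MEMCPY 1

/* Emit n copies of a solid-color 8-byte block, `stride` bytes apart.
   Returns the advanced output pointer. */
static byte *emit_solid_run(byte *ct, int j, int n, int stride)
{
    byte tb[16];
    byte *p = tb;   /* stores go through a pointer, not tb[...] directly */
    int i;

    p[0] = (byte)(j & 0xFF);
    p[1] = (byte)((j >> 8) & 0xFF);
    p[2] = p[0];
    p[3] = p[1];
    p[4] = 0; p[5] = 0;
    p[6] = 0; p[7] = 0;

    for (i = 0; i < n; i++) {
#ifdef AVOID_MEMCPY
        int k;
        for (k = 0; k < 8; k++)   /* plain byte loop in place of memcpy() */
            ct[k] = p[k];
#else
        memcpy(ct, tb, 8);
#endif
        ct += stride;
    }
    return ct;
}
```

Whether a given compiler version still inserts checks for the pointer form is not something this sketch can show; it only mirrors the shape of the workaround described above.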


Edited by BGB, 17 September 2013 - 03:05 AM.


#7 BGB   Crossbones+   -  Reputation: 1554


Posted 17 September 2013 - 02:34 AM

farmdve wrote: "I like it when people re-implement something written by big corporations like Google, Microsoft or Apple. But were the specifications for this codec available and sufficient for you to implement it, or did it take reverse engineering (or is the whole thing open source somewhere)?"


most of the needed information is given or is linked to by the wiki articles.
I actually implemented it directly using this information, and tested it using videos from a linked-to site.

granted, it depends a lot on a lot of other code I already had lying around (I implemented the codec mostly by fudging it into DXT1, and using my DXT1 decoder to convert it back to RGBA).
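
As a rough illustration of the "fudging it into DXT1" step, the line `j=((j&0x7FE0)<<1)|(j&0x1F);` from the earlier fragment can be read as a color-format conversion. The sketch below assumes the stream stores colors as 15-bit RGB555 (as the linked wiki pages describe) and that the target is the RGB565 layout used by DXT1 endpoints; the function name is invented.

```c
/* Convert RGB555 (bits 14-10 red, 9-5 green, 4-0 blue) to RGB565.
   Red and green move up by one bit; the new low green bit is left 0;
   blue stays where it is. Bit 15 of the input is discarded. */
static unsigned rgb555_to_rgb565(unsigned c)
{
    return ((c & 0x7FE0u) << 1) | (c & 0x1Fu);
}
```

With both DXT1 endpoints set to the converted color and all index bits zero (the `tb[0]`..`tb[7]` stores in the fragment), every pixel of the block decodes to that one color.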

 

farmdve wrote: "Also, what compiler are you using? And to me, it looks like you are proficient in assembly, so I am sure you can inline some assembly to handle that part where you say the compiler is doing a nasty job of optimizing."


as noted, MSVC / Visual Studio 2013 RC.

there are ways to work around this issue without resorting to inline ASM though (usually you don't want to do this if it can be avoided, as this does bad things to portability).

 

farmdve wrote: "However, if all you are doing is assigning an integer to whatever that index is pointing to, then that is a lot of instructions for something so simple, but hey, the only thing I know about assembly is what it looks like (which depends on the architecture), so I could be wrong."


this is the issue.

even if building for debug, this is still a lot of code to accomplish the task...

though, even as such (funky array assignments and being a debug build aside), the codec still gets moderately good performance overall.

Edited by BGB, 17 September 2013 - 02:35 AM.


#8 patrrr   Members   -  Reputation: 1006


Posted 17 September 2013 - 04:02 AM


BGB wrote: "this is the issue. even if building for debug, this is still a lot of code to accomplish the task... though, even as such (funky array assignments and being a debug build aside), the codec still gets moderately good performance overall."

 

So the issue is the amount of code that is generated, not the performance hit? Just compile with optimizations. I don't see the problem, and profiling non-optimized code sounds a bit meaningless. Debug symbols are good to have though.

If it's just the bounds checking that is annoying you, you can probably disable it somehow.

Btw. Didn't you find this out during your research? http://www.gamedev.net/blog/1645/entry-2258708-bounds-checking-in-c-it-would-appear-so-sort-of/



#9 Hodgman   Moderators   -  Reputation: 30385


Posted 17 September 2013 - 04:13 AM

MSVC gives you options for inserting runtime debugging code, such as bounds checking. These are options - the compiler is just doing what you're telling it to. If you don't want the compiler to do this stuff, edit your project properties.

N.B. these validation options are separate from the optimization and debug-database options.

#10 Sik_the_hedgehog   Crossbones+   -  Reputation: 1747


Posted 17 September 2013 - 09:10 AM


patrrr wrote: "profiling non-optimized code sounds a bit meaningless"

This is pretty much what I had to do with my game, the profiler was spitting out stuff that didn't seem to make much sense. I took the release build, copied it and enabled the settings to keep the symbols in (so the profiler could be used), and surprise, suddenly the profiler was showing results that made way more sense with what I'd have expected from the code.



#11 BGB   Crossbones+   -  Reputation: 1554


Posted 17 September 2013 - 10:10 AM

Hodgman wrote: "MSVC gives you options for inserting runtime debugging code, such as bounds checking. These are options - the compiler is just doing what you're telling it to. If you don't want the compiler to do this stuff, edit your project properties. N.B. these validation options are separate from the optimization and debug-database options."

 

yeah.

 

normally bounds-checks aren't a bad thing, but they are kind of pointless with a constant index into a fixed-size array, since it is already obvious at compile time whether the bounds-check will pass or fail.
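
To make that point concrete: with a constant index and a true array (not a pointer), the check can be done entirely at compile time. The macro below is a hypothetical sketch using the old negative-array-size trick; it is not something MSVC does for you, and the names are made up.

```c
typedef unsigned char byte;

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Store val at a constant index; an out-of-range index makes the
   char[...] type have size -1, which fails to compile. No runtime
   check is generated. Only valid when arr is a real array. */
#define STORE_AT_CONST(arr, idx, val) \
    ((void)sizeof(char[((idx) < ARRAY_LEN(arr)) ? 1 : -1]), \
     (arr)[(idx)] = (val))

static int demo_const_store(void)
{
    byte tb[16] = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};

    STORE_AT_CONST(tb, 6, 0);
    STORE_AT_CONST(tb, 7, 0);
    /* STORE_AT_CONST(tb, 16, 0); would be rejected by the compiler */

    return tb[5] + tb[6] + tb[7] + tb[8];   /* 1 + 0 + 0 + 1 */
}
```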

 

luckily, it was fairly trivial to side-step though, via using a pointer instead in this case.

 

 

ADD:

just discovered I can generate optimized code and have debug symbols via "/O2 /Zi".

well, this is good to know I guess...

 

(I tried doing similar once before with an older version of MSVC, but it didn't work.)


Edited by BGB, 17 September 2013 - 02:00 PM.





