About Assembler015

  1. Branch Prediction and MSVC

    Quote:Original post by Starfox MSVC has equivalent functionality - check MSDN for "__assume". Yeah, I stumbled across __assume as well. Unfortunately, the Anonymous Poster is right -- it's not the same as __likely. After spending some time introducing myself to profile-guided optimization, I agree -- it'll do what I'm looking for. Personally, I would still have liked the ability to give the compiler a hint about the probability of a path being taken. No big deal though. I'm sure opinions will vary as well :-) Thanks again for guiding me to the PGO stuff, it'll come in handy. --Andrew
  2. Branch Prediction and MSVC

    Quote:Original post by phil_t Quote:Original post by Assembler015 Hey, I've already identified the critical portions of code. The problem isn't related to inlining of code, but rather the generation of the code. He was talking about profile-guided optimization, which would include the situation you describe (I assume) in addition to other optimizations like inlining. I guess MSVC doesn't allow you to mark up code as such, but you can profile the code in a real-world situation and have the compiler do it for you. I've never tried it though, and I think it requires an expensive version of Visual Studio. Sorry, my mistake. I had never heard of profile-guided optimization till now. I will take some time to explore this avenue -- not exactly what I was looking for, but it might do the trick. Quote:Original post by outRider Are you looking for the compiler to generate branch hint instructions so that the unlikely branch is predicted taken the first time, or to generate a jump for the loop termination instead of the other way around? I was looking to generate a jump for the loop termination :-) I agree, I think I'm stuck re-arranging code. The Intel compiler is tempting, but I just don't have that much money :-( Thanks for the help everyone, --Andrew
  3. Branch Prediction and MSVC

    Quote:Original post by Anonymous Poster msvc++ doesn't provide any special directives to guide inlining based on the likelihood of code paths being entered. profile guided optimization is the way to go in this case which is superior to manually identifying hot code paths. Hey, I've already identified the critical portions of code. The problem isn't related to inlining, but rather to the code the compiler generates. For example, suppose you wanted a loop that only bails out when some rare event occurs:

    while(1)
    {
        if(unlikely_expression)
        {
            /* calculate some stuff and bail */
            return;
        }
    }

    The compiler puts the code together as follows:

    ;test if the unlikely expression is zero
    jnz continue_function
    ;perform some wrapup calculations
    ret
    continue_function:
    ;continue with the loop

    Here, the best that can be hoped for is two branch misses. If a lot of branches occur throughout the loop, it could be as bad as a miss per iteration. In any case, the assembly could easily be rewritten so that static branch prediction predicts the correct path on every iteration but the last, regardless of how many branches occur throughout the loop. Now, I did try rewriting the code to invert the conditional, and the compiler generated better code. That approach isn't acceptable for a couple of reasons. First, there's no guarantee the code will always be generated like this. Second, in my opinion, it made the code harder to read. The one sure-fire way to help the compiler out is to give it more information -- to tell it that the branch won't happen often. --Andrew
  4. Hello - While going through the disassembly of a few performance-critical functions, I noticed that the Visual Studio 2005 C compiler does a poor job at branch prediction in some key areas. The GNU compiler provides a builtin, __builtin_expect (commonly wrapped in likely()/unlikely() macros), that hints to the compiler about the probability of a certain code path being taken. I was interested in equivalent functionality supported by the Microsoft compiler. Unfortunately, my Google and MSDN searches have turned up nothing. So, in a last-ditch effort, I am turning to GameDev for help :-) If anyone has any knowledge in this area it would be greatly appreciated. Thanks, --Andrew
  5. Hello - I'm hoping there are some nmake gurus out there who can help me out. Here's my problem... I'm using Visual Studio 2005 and have a Makefile set up to compile some code. Two values passed into the Makefile are path names. I want these path names to be able to be either relative or absolute. Here's where my question comes in: assuming one of the path names passed in is absolute and the other is relative, how can I go about determining whether the directories are the same? Obviously something like this will not work:

    !if "$(DIRECTORY1)" == "$(DIRECTORY2)"

    I've read through all the nmake documentation I can find but have yet to find an answer. Any help you can offer is greatly appreciated! Thanks :-)
  6. I *HATE* Linux hippies and their unprofessional behaviour

    Well, speaking from experience, I can safely say GPL code is very contaminating. At Motorola we are putting together the 'next generation cell phone'. This next-gen cell phone is a dual core system running Linux on one processor to handle the user interface and a proprietary OS on the other processor to handle proprietary things. Motorola lawyers have been very adamant (based on their interpretation of the GPL) about a few things. First, any code that goes into the kernel *must* be GPL'd. So all of the dynamically linked and statically linked drivers we write have to be GPL'd simply because we are using the exported Linux kernel API. Even worse, the code in the Linux environment cannot be put anywhere near our proprietary OS environment. We have some Linux driver code that we were also running on our proprietary OS, which had to be yanked and rewritten by new developers who weren't influenced by the code in the Linux environment. If the code had been placed into the proprietary OS environment, most of the OS would have to be GPL'd. Anyways, the GPL has done nothing but complicate a lot of the work we do there. It's a blast. --Andrew
  7. OpenGL Optimized Render Pipeline & GLSL

    Thanks for all the replies. I'm going to agree with some of the assumptions that it's a driver issue, for several reasons... 1) A desktop with a GeForce FX 5xxx gets 170fps at 1024x768x32 2) A desktop with an ATI Radeon 9xxx gets 140fps at 1024x768x32 3) I've had driver issues with this laptop before. There is a previous thread of mine going on about difficulties I was having with shaders. For some reason, the Catalyst 5.10 drivers would cause a varying vec4 -- set to vec4(0,0,0,0) for all vertices -- to randomly have the vec4.z component go to a seemingly random value. I love going on about how crappy my experience has been with ATI throughout the years, but I'll let it go for tonight. I'll try finding a new set of drivers to try; thanks for all the help, --Andrew
  8. OpenGL Optimized Render Pipeline & GLSL

    Quote:Original post by Gluc0se Are you doing your drawing in immediate mode? Or are you using display lists? From the likes of this, it sounds like the bottleneck is happening on the CPU somewhere (though posted shader code might change that assumption). I ask this because I discovered first hand just how slow I could make a really fast video card (Quadro 4500) run when I was still using old code that drew each triangle without display listing. -Edit Also after reading above, if it's emulating any part of the vertex or fragment program process, that will take a serious hit on performance. That is one thing I'm not doing. However, I'm not using display lists because none of the objects in my program are static -- so I was under the impression that display lists would be pretty useless to me. Quote: Yeah the fact that the vertex shader is emulated in software might be a big hit. Instead of drawing 4000 tris, draw one quad that fills up the whole screen. If the performance improves, that means that the vertex shader was the bottleneck and not the pixel shader. I gave that a shot and the fps picks up quite a bit. So it sounds like it is the # of vertices being used. So in order to speed things up it seems like I'll have to: 1) Remove surfaces that aren't being seen 2) Optimize the vertex shader (I don't think I can get any simpler though) 3) Try to move things into display lists (how much of a speed increase should I expect?) 4) Get a new video card Did I miss any options? :-) Thanks for everyone's help, your input has been quite helpful, --Andrew
  9. OpenGL Optimized Render Pipeline & GLSL

    Quote:Original post by zedzeek Quote:The entire scene is roughly 4000 triangles, which isn't very many; which is also why I'm surprised to be seeing framerates of about 14fps only when my pixel shaders are enabled. the number of tris is not important for fragshaders as the number of fragments rendered will be the same (if depthtest == on) no matter how many tris there are. post your vertex + fragment shader code, theres prolly a few things u can do That's a good point. I went down to the minimum of vertex/fragment shader code, which boiled down to:

    Vertex:
    void main (void)
    {
        gl_Position = ftransform();
        gl_TexCoord[0] = gl_MultiTexCoord0;
    }

    Fragment:
    uniform sampler2D texture;
    void main (void)
    {
        vec4 color;
        color = texture2D(texture, gl_TexCoord[0].st);
        gl_FragColor = color;
    }

    And this ran just as slowly. There are some other interesting notes, however. Earlier in development, I noticed that if I increased the complexity of the vertex shader (by adding a few varying variables and multiplying by a few matrices), the decrease in framerate was huge. The complexity increase didn't even need to be that much; say, 3 varying vec3's and 3 matrix multiplies. I understand that a vertex shader running in software will be slower than in hardware, but is a slowdown this extreme expected? Is there any way it can be optimized? Also, here are some statistics I've gathered:

    No pixel shaders --> 4000 triangles @ 178fps
    With pixel shaders --> 4000 triangles @ 17fps, 2000 triangles @ 45fps

    Thanks for your help, --Andrew
  10. OpenGL Optimized Render Pipeline & GLSL

    Quote:Original post by Optus Probably try installing the latest drivers for your video card Just got done trying that. I've downloaded the latest Omega drivers for my Radeon. I'm beginning to think that it could just be my system. I am developing on a Turion 64 / ATI Xpress 200M laptop. The Xpress 200M is full of features, but it could be a tad slow. The graphics card supports Pixel Shader 2.0; however, it emulates the vertex shader in software. My only grief is: shouldn't the video card be able to at least handle 4K triangles or so with a simple fragment/vertex shader?
  11. OpenGL Optimized Render Pipeline & GLSL

    It looks like my shader takes 14 cycles. After performing the test, I went to the extreme of making my shader as simple as possible. The vertex shader merely computes the vertex position. The fragment shader reads the texture value and writes it to the frag color. With this simple setup I see frame rates just as low as I did with the complex shader. In fact, there is almost no difference. The only weird thing I've encountered throughout my debugging is that my vertex shader appears to run in software (according to the GLSL link phase). Could this be the cause of the bottleneck? If so, is there anything I can do about it? Thanks, --Andrew
  12. Hello - So I'm having a bit of difficulty with a program I'm working on. More accurately, the application drops by about 50fps when my shaders are enabled. So here's what I know... The slowdown is caused by the fragment shader. I know this because: 1) My vertex shader is very short 2) The framerate drops as resolution increases 3) Turning shaders off altogether increases performance dramatically NOTE: My fragment shader isn't very complex. The most complex operation is one length() call. So, the optimizations I've done so far... 1) All information about the shader (uniform locations / compiling / binding) is obtained at initialization time. 2) The shader program is installed only once - when it needs to be - and then it's uninstalled and replaced with the standard OpenGL pipeline for the remainder of the frame. 3) I've also attempted rendering in 2 passes like this:

    /**** START OF RENDER ****/
    Render(NON-SHADER OBJECTS);
    glDepthFunc(GL_LEQUAL);
    glDisable(GL_TEXTURE_2D);
    glDisable(GL_BLEND);
    Render(ALL-SHADER OBJECTS);
    glDepthFunc(GL_EQUAL);
    glEnable(GL_TEXTURE_2D);
    glEnable(GL_BLEND);
    InstallShaderProgram();
    Render(ALL-SHADER OBJECTS);
    UninstallShaderProgram();
    glDepthFunc(GL_LEQUAL);
    /**** END OF RENDER ****/

    None of my attempted optimizations have much impact on the framerate. The entire scene is roughly 4000 triangles, which isn't very many; which is also why I'm surprised to be seeing framerates of about 14fps only when my pixel shaders are enabled. So, for those more experienced with GLSL: are there any good places to learn about optimizing shaders? Also, do you have any input on my current situation? Thanks for all your help --Andrew
  13. SOLVED (Crappy ATI): varying variables in GLSL

    Hmm, that is interesting. When I first encountered the problem I did try a few sets of drivers, but not the Omega drivers. (I tried the ATI drivers that came with my laptop, the windows default, and the latest catalyst drivers. Between each install I ran ATI's 'ati software removal' tool) I'm content with blaming it on ATI though. Previous experiences with them led to random freezes of my computer (while I was just sitting at the desktop), random crashes of 3d intensive games, and the random rejection of the video card by my system (BIOS complaining about strange things and Windows reverting back to the default video driver). I've been an Nvidia fan ever since all those problems...their drivers have always seemed very stable. Unfortunately, I had to choose a laptop with ATI. -Andrew
  14. SOLVED (Crappy ATI): varying variables in GLSL

    Thanks everyone for your replies... I solved the problem; it was a driver issue. I went and had my brother run the shader on his GeForce FX 5000-something and it ran exactly like it was supposed to. So this is the 4th time ATI's drivers have screwed me... I should have learned the first time. But anyways, I was running ATI's latest Catalyst drivers on my ATI Xpress 200M graphics card (on my laptop) and those drivers just didn't behave right with GLSL shaders (though they worked okay with DX HLSL shaders). I verified this by running additional sample code (some even released by ATI). So I went and grabbed the latest Omega drivers and they worked out of the box... running my shader flawlessly. Thanks for everyone's help, ATI sucks, -Andrew
  15. SOLVED (Crappy ATI): varying variables in GLSL

    Quote:Original post by rollo watch out with float equality! Maybe the fragment shader can't represent 0.0 properly, or the way it's represented in vectors and scalars is different. You really should put in some epsilons to be safe. So try this instead: *** Source Snippet Removed *** I tried something similar to that. After some playing around I found that when the variable is declared as: varying vec3 n; and set in the vertex shader as: n.x = 0.0; n.y = 0.0; n.z = 0.0; then n.x and n.y are always 0 in the fragment shader. However, n.z is a larger value. I tried ruling out rounding error by saying: if(abs(n.z) < 0.0001){} But the value was significantly larger than that. So why would n.xy stay constant (as I expected) but n.z change? EDIT: Also, the output is inconsistent. The model is output as solid green one second, and solid red the next.