Community Reputation

697 Good

About Koen

  1. Note that it's still worth 'unifying' your code. Once you start using render targets as textures, the cancellation will not happen (since no image is being loaded), and you'll need the texture coordinates to be correct for both APIs.
  2. Awesome! That actually works! Thanks a lot!
  3. Thanks for all the suggestions! When I use the newer compiler and use bool instead of float, the selection between the two branches seems to work out. But then the normal is still wrong: it gets normalized in both branches, but not flipped for the backside. When I then try the suggestion to calculate the normal by multiplying by VFACE, I get this:

[code]
//
// Generated by Microsoft (R) HLSL Shader Compiler 9.30.9200.16384
//
// Parameters:
//
//   float4 BackDiffuse;
//   float BackShininess;
//   float4 BackSpecular;
//   sampler2D BackTexture;
//   float4 FrontDiffuse;
//   float FrontShininess;
//   float4 FrontSpecular;
//   sampler2D FrontTexture;
//   float4 LightColor0;
//   float3 LightDirection0;
//
//
// Registers:
//
//   Name            Reg   Size
//   --------------- ----- ----
//   LightDirection0 c0    1
//   LightColor0     c1    1
//   FrontDiffuse    c2    1
//   FrontSpecular   c3    1
//   FrontShininess  c4    1
//   BackDiffuse     c5    1
//   BackSpecular    c6    1
//   BackShininess   c7    1
//   FrontTexture    s2    1
//   BackTexture     s3    1
//

ps_3_0
def c8, 1, -1, 0, 0
dcl_normal v0.xyz
dcl_texcoord v1.xy
dcl vFace
dcl_2d s2
dcl_2d s3
0: cmp r0.x, vFace, c8.x, c8.y
1: mul r0.yzw, -r0.x, v0.xxyz
2: nrm r1.xyz, r0.yzww
5: if_ne r0.x, -r0.x
8: mov r0.xyz, c0
9: add r0.xyz, r0, c8.zzxw
10: nrm r2.xyz, r0
13: dp3_sat r0.x, r1, r2
14: pow r1.w, r0.x, c7.x
17: texld r0, v1, s3
17: mul r2.xyz, r0, c6
18: mul r3.xyz, r1.w, c1
19: mul r2.xyz, r2, r3
20: dp3_sat r0.w, r1, c0
21: mul r3.xyz, r0.w, c1
22: mul r0.xyz, r0, c5
23: mad r0.xyz, r3, r0, r2
24: else
25: mov r2.xyz, c0
26: add r2.xyz, r2, c8.zzxw
27: nrm r3.xyz, r2
30: dp3_sat r0.w, r1, r3
31: pow r1.w, r0.w, c4.x
34: texld r2, v1, s2
34: mul r3.xyz, r2, c3
35: mul r4.xyz, r1.w, c1
36: mul r3.xyz, r3, r4
37: dp3_sat r0.w, r1, c0
38: mul r1.xyz, r0.w, c1
39: mul r2.xyz, r2, c2
40: mad r0.xyz, r1, r2, r3
41: endif
42: mov oC0.xyz, r0
43: mov oC0.w, c8.x

// approximately 46 instruction slots used (2 texture, 44 arithmetic)
[/code]

Look at the if statement in line 5! That can't be right?
How come this stuff is so shaky? It's not like I'm doing super advanced stuff... I guess something else must still be wrong. Some more googling points to forum threads claiming that branching code (especially when combined with textures) is not SM3's strong point. I can imagine performance being poor, but it should still work, right? Besides, when I use the normal's z-component (in camera space) to decide which branch to take, everything works fine, even with texturing on both sides.
  4. So I've tried switching to compiler 4.6 (the one that comes with VS2012). Before it was 4.3. The disassembled code indicates the version number used to be 9.29.952.3111, while the newer one has version number 9.30.9200.20546. I also tried reducing the optimization level. That didn't make any difference. Here's the pixel shader's full HLSL code:

[code]
struct PS_INPUT
{
    float3 normal : NORMAL0;
    float2 textureCoordinate : TEXCOORD0;
};

struct PS_OUTPUT
{
    float4 color : COLOR0;
};

void HandleClipping(PS_INPUT Input)
{
}

float3 LightDirection0;
float4 LightColor0;

float4 FrontDiffuse;
float4 FrontSpecular;
float FrontShininess;
sampler2D FrontTexture : register(s2);

float4 BackDiffuse;
float4 BackSpecular;
float BackShininess;
sampler2D BackTexture : register(s3);

float LambertFactor(float3 normal, float3 lightDir)
{
    return saturate(dot(normal, lightDir));
}

float PhongFactor(float3 normal, float3 lightDir, float shininess)
{
    float3 H = normalize(lightDir + float3(0,0,1));
    float NdotH = saturate(dot(normal, H));
    return pow(NdotH, shininess);
}

float4 ShadeFragment(PS_INPUT Input, float isBackFacing)
{
    if (isBackFacing < 0)
    {
        Input.normal = normalize(Input.normal);
        float4 color = float4(0,0,0,1);
        float4 textureColor = tex2D(FrontTexture, Input.textureCoordinate);
        float4 diffuse = float4(1,1,1,1);
        diffuse *= textureColor;
        diffuse *= FrontDiffuse;
        float4 specular = float4(1,1,1,1);
        specular *= textureColor;
        specular *= FrontSpecular;
        color += LambertFactor(Input.normal, LightDirection0) * LightColor0 * diffuse;
        color += PhongFactor(Input.normal, LightDirection0, FrontShininess) * LightColor0 * specular;
        return float4(color.rgb, 1);
    }
    else
    {
        Input.normal = -normalize(Input.normal);
        float4 color = float4(0,0,0,1);
        float4 textureColor = tex2D(BackTexture, Input.textureCoordinate);
        float4 diffuse = float4(1,1,1,1);
        diffuse *= textureColor;
        diffuse *= BackDiffuse;
        float4 specular = float4(1,1,1,1);
        specular *= textureColor;
        specular *= BackSpecular;
        color += LambertFactor(Input.normal, LightDirection0) * LightColor0 * diffuse;
        color += PhongFactor(Input.normal, LightDirection0, BackShininess) * LightColor0 * specular;
        return float4(color.rgb, 1);
    }
}

void main(in PS_INPUT Input, in float isBackFacing : VFACE, out PS_OUTPUT Output)
{
    HandleClipping(Input);
    Output.color = ShadeFragment(Input, isBackFacing);
}
[/code]
  5. Ok, so it seems this might not be my mistake after all then :-) I currently use the D3DX functionality from the June 2010 SDK. For now I'm still stuck on Windows XP, so I guess switching to newer versions of the compiler won't work (I'm doing runtime shader generation, so I can't compile offline, and the D3DCompiler DLLs are only available for Vista and higher). I'll try on my Windows 7 machine whether the issue is fixed if I use D3DCompile and D3DReflect instead (the documentation is not really clear on whether this will all work with D3D9, but let's find out :-) ). Are there possibilities to get the June 2010 SDK compiler to do the right thing? Using different statements, reordering code, or some such? If all else fails, I can always return to the flawed approach of using the (possibly interpolated) vertex normals to decide on front vs back.
  6. Hi, I've been trying to use the VFACE semantic to get different shading on the front and back side of my geometry. According to the D3D documentation, VFACE should be positive for frontfacing triangles and negative for backfacing ones (I guess it would be more precise to say counterclockwise and clockwise?). I got it working in most scenarios. Only when textures are involved does it seem to fail (more precisely: I've got a batch of tests, and only the ones with textures are failing. What happens is that only one side of the geometry gets shaded, or both sides get shaded with the same material). I've been trying to pinpoint the problem, but without success. In the case where it works, I get shader assembly code like this:

[code]
ps_3_0
def c0, 0, 1, 0, 0
dcl vFace
0: cmp oC0.xz, vFace, c0.xyyw, c0.yyxw
1: mov oC0.yw, c0.xxzy
[/code]

I don't know the instructions well, but this seems logical: in line 0 something different happens, based on the value of vFace. When I now take one of my shaders that use textures, I get this:

[code]
ps_3_0
def c8, -1, 1, 0, 0
dcl_normal v0.xyz
dcl_texcoord v1.xy
dcl vFace
dcl_2d s2
dcl_2d s3
0: cmp r0.x, vFace, c8.x, c8.x
1: if_lt r0.x, c8.x
//blah
24: else
//blah
45: endif
[/code]

That seems very wrong: c8.x gets stuffed in r0.x, no matter what value vFace has! In both cases, the hlsl code is something like:

[code]
void main(in PS_INPUT Input, in float isBackFacing : VFACE, out PS_OUTPUT Output)
{
    Output.color = float4(0,0,0,1);
    if (isBackFacing < 0)
        //blah
    else
        //blah
}
[/code]

Am I misunderstanding this stuff? Are there things I should know about the interaction between VFACE and texturing? I really have no clue why this doesn't work, so any help is much appreciated. BTW: I check all calls into D3D, and have debug output at the highest level, but no errors are reported.
  7. In my (Windows) application I want to render in a thread that is not the main GUI thread. What would be the best way to 'share pictures' between two threads?

My initial plan was to let the render thread draw to D3D textures, and let the main thread blit the textures using a fullscreen quad. That would mean the render thread has to wait while the main thread is drawing the fullscreen quad (as in: both threads should not be using the same Direct3D device at the same time, even when it was created using the multithreading flag).

More context: I'm writing render code for a CAD-like application. It will be used to process lots of unstructured triangles. Rendering those takes some time (vertex data might not fit into video RAM, multipass techniques are really slow on some hardware, ...), but it should not 'block' the application. That's why all 'real' rendering is done on a second thread, and the main thread should only 'blit' (Present, fullscreen quad, ...) the results to a window. So whenever the application generates an event that should trigger a redraw, the render thread will first draw a 'quick' approximation (using LOD, simpler techniques, or maybe even drawing bounding boxes or nothing at all...) for the application to show. Afterwards it will start rendering a slow high-quality image. To avoid flickering (because of always showing the 'quick' image) the main application thread will always wait some time (eg. 30Hz) before presenting the 'quick' image. If within that time the high-quality image is ready, the 'quick' image will be discarded. The render thread, on the other hand, should have the possibility to interrupt rendering the high-quality image, so that when lots of render requests arrive (eg. the user is interactively rotating the view) it can continuously generate the 'quick' frames for immediate feedback.

An alternative idea to avoid the render-to-texture-combined-with-fullscreen-quad approach is to create swap chains with one backbuffer. The quick image could be rendered directly to the backbuffer. The slow image is still rendered to texture, but the final stage would be to blit the slow image to the backbuffer. That way, the main thread only has to call Present() whenever it wants, and very little locking has to happen. Well, actually lots of locking has to happen, but it will be relatively simple, and the waiting will be limited. I would use a render command queue approach where every render command does some state setting and typically one draw call. Each one of those would then be locked separately, giving the main thread the opportunity to call Present() in between render commands.

Thanks for reading!
  8. I'm implementing transparency using depth peeling. I have an implementation that works, but things start to fail when I resize my window. When I glClear my resized blended color texture (the one that stores the result of blending each consecutive peeled layer over the previous ones) before the first pass, it should become all black (alpha being 1.0). Instead it looks like this: [img]http://i40.tinypic.com/23prh4.png[/img] (the white/grey pattern was added by the tool I use to view the texture). Obviously this does not lead to a correct final result :-) [img]http://i44.tinypic.com/kbw19s.png[/img] I check for OpenGL errors after every single gl function call, but no errors are generated. My FBOs are complete, and when I use gDEBugger I can see that all textures/render targets have the same size. I have been looking into textures being attached to an FBO and a texture unit at the same time, and I've been playing with the wrapping and mipmap settings, but I can't find the problem. Can anyone give me hints here? I would be very grateful!
  9. Semantics

    You could create the library using simple abstract classes, and provide a wrapper that adds smart-pointer semantics. That way users can still choose to use the raw-pointer interface.
  10. C# Dictionary

    [quote name='btower' timestamp='1300962553' post='4789890'] which should perform really well (close to O(N) for lookups). [/quote] Shouldn't that be [i]close to O(1)[/i]?
  11. I have heard of similar "problems" using the .NET Timer class. The explanation was that when you keep no reference to the timer object, the garbage collector might collect the timer before the specified time has passed and your function has been called. Maybe that can explain your strange results?
  12. That's an elegant solution. (You can't use the base class members in a derived class's constructor initializer list, so you'll have to assign in the body.)
  13. Quote:Original post by RobTheBloke Maybe in the strictest dictionary definition sense I have always wondered why that constructor limitation was in there. I don't see how a constructor would ever need to influence the physical layout of a class. Maybe the commentary at the end of this note is an explanation: POD "classes" are supposed to mimic C structs, which is broader than memory layout alone. The expression 'new T' should leave the POD object's members uninitialized, which might not be the case if you implement a constructor.
  14. Quote:Original post by Hodgman Quote:...meaning it has no user-declared constructors...As far as I can tell, your class meets these requirements, assuming T is POD. Aren't his constructors user-declared? So I would assume that violates the requirements?
  15. You should definitely consider using the std::string class for this. You don't have to specify any buffer size; it will dynamically resize when it needs more memory:

[code]
#include <iostream>
#include <string>

int main()
{
    using namespace std;
    string firstName;
    string lastName;
    cout << "Please enter your first name: ";
    cin >> firstName;
    cout << endl << "Please enter your last name: ";
    cin >> lastName;
    string fullName = lastName + ", " + firstName;
    cout << endl << fullName << endl;
}
[/code]

edit: You already commented out the string include in your example, so maybe this is not new information :)