tanzanite7

Member
  • Content Count

    270
  • Joined

  • Last visited

Community Reputation

1409 Excellent

About tanzanite7

  • Rank
    Member

Personal Information

  • Role
    3D Animator
  • Interests
    Programming
  1. Minimized vertex shaders (source printout of what gets sent to the compiler):

        // depth prepass
        #version 450
        #pragma shader_stage(vertex)
        #extension GL_ARB_separate_shader_objects : enable
        layout(location=0) in vec3 inPos;
        invariant gl_Position;
        layout(push_constant) uniform Push { vec4 proj, pos, rot; } par;
        vec4 projection(vec3 v) { return vec4(v.xy * par.proj.xy, v.z * par.proj.z + par.proj.w, -v.z); }
        vec3 qrot(vec4 q, vec3 v) { return v + 2.0 * cross(q.xyz, cross(q.xyz, v) + q.w * v); }
        vec4 qinv(vec4 q) { return vec4(-q.xyz, q.w); }
        vec3 transInv(vec3 v, vec4 pos, vec4 rot) { return qrot(qinv(rot), (v - pos.xyz) / pos.w); }
        vec3 projAndGetPos() {
            vec3 pos = inPos * (32767.0 / 1024.0);
            gl_Position = projection(transInv(pos, par.pos, par.rot));
            return pos;
        }
        void main() {
            projAndGetPos();
        }

        // draw pass
        #version 450
        #pragma shader_stage(vertex)
        #extension GL_ARB_separate_shader_objects : enable
        layout(location=0) in vec3 inPos;
        layout(location=1) in vec2 inSelColorTex;
        layout(location=2) in vec4 inNormSelCover;
        layout(location=3) in vec4 inTexSet;
        invariant gl_Position;
        layout(push_constant) uniform Push { vec4 proj, pos, rot; } par;
        layout(location=0) out Frag {
            vec3 pos;
            float selTex;
            vec4 color;
            vec3 normal;
            float selCover;
            vec3 tocam;
            flat vec4 texSet;
        } sOut;
        vec4 projection(vec3 v) { return vec4(v.xy * par.proj.xy, v.z * par.proj.z + par.proj.w, -v.z); }
        vec3 qrot(vec4 q, vec3 v) { return v + 2.0 * cross(q.xyz, cross(q.xyz, v) + q.w * v); }
        vec4 qinv(vec4 q) { return vec4(-q.xyz, q.w); }
        vec3 transInv(vec3 v, vec4 pos, vec4 rot) { return qrot(qinv(rot), (v - pos.xyz) / pos.w); }
        vec3 projAndGetPos() {
            vec3 pos = inPos * (32767.0 / 1024.0);
            gl_Position = projection(transInv(pos, par.pos, par.rot));
            return pos;
        }
        void main() {
            sOut.pos = projAndGetPos();
            sOut.selTex = inSelColorTex.y;
            sOut.color = vec4(0.0);
            sOut.normal = inNormSelCover.xyz;
            sOut.selCover = inNormSelCover.w;
            sOut.tocam = par.pos.xyz - sOut.pos;
            sOut.texSet = inTexSet * 255.0;
        }
  2. I cannot see how depth buffer resolution can be relevant to this kind of z-fighting between two passes. Wiki example to illustrate how exactly it looks in my case (stuff gets through unless math errors make it fail - ie. exact same input with exact same code gives different end results):

    NB! With one major difference from the image given - it is not different geometry intersecting with insufficient depth resolution. It is the exact same geometry fighting with itself (depth prepass) - no amount of extra depth precision can fix it.

    Anyway, the stats: float depth buffer, range 0.1-1000, not reversed. I actually do not need that range. For sanity's sake, i just tried the 0.5-100.0 range (which cuts off half my scene - so, projection math works, yay!) - as expected, it makes no difference at all.

    Both SPIR-V's (prepass and draw pass) do have the "Invariant" decoration for "BuiltIn Position" (named 'gl_Position').

    This kind of thing is essential for many - so, i cannot reasonably believe in driver errors. I must be doing something wrong, but what?
  3. Ah, yeah, that z clipspace fix rings some bells. Now to figure out why it is missing for some shaders - or what is going on. I vaguely remember seeing somewhere something about compatibility options about [-1,1] <-> [0,1] z clipspace - cannot see that to enable/disable itself without me having any say. Have to investigate - perhaps i can force it to be consistent.

    Using shaderc (from VulkanSDK 1.1.73.0) with compile options:
    * target environment: vulkan
    * set warnings as errors
    * shaders have identical headers/extensions "GL_ARB_separate_shader_objects" + what shaderc adds.

    Not using any cache - all always recompiled by the same code.

    edit: Could not find anything. I wonder what convention the hardware prefers - or whether it makes any difference. If it makes no difference for the hardware then it would make sense to completely omit any muckery with y and z (combining what needs to be done into the projection constants pushed to shader). Could not find anything about how to do that either with shaderc x_x.

    edit2: I am now virtually certain the y and z muckery is purely an artifact of NSight decompile. While the SPIR-V is a binary format and hard to read - it is still readable in its "assembler" form.

    Findings for depth prepass:
    * main calls one function (my projection code - a common shared function) and does nothing else.

    Findings for normal pass:
    * SPIR-V has 3 constants with value 2.
    * main calls my projection function and other stuff. Constant 2 is never referenced.

    Findings for both:
    * gl_Position is referenced in only one place and once - inside my shared projection function:

        %135 = OpFunctionCall %9 %12 %134   // %9 = vec4, %12 = one_of_my_helper_functions, %134 = input variable
        %137 = OpAccessChain %136 %122 %40  // %136 = pointer to vec4, %122 = gl_Position, %40 = 0 (struct offset)
        OpStore %137 %135

    Which brings me back to - why is there z-fighting happening?
  4. Cannot get rid of z-fighting (severity varies between: no errors at all - ~40% fail).
    * up-to-date validation layer has nothing to say.
    * pipelines are nearly identical (differences: color attachments, descriptor sets for textures, depth write, depth compare op - LESS for prepass and EQUAL later).
    * did not notice anything funny when comparing the draw commands via NSight either - except, see end of this post.
    * "invariant gl_Position" for all participating vertex shaders makes no difference ('invariant' does not show up in decompile, but is present in SPIR-V).
    * gl_Position calculations are identical for all (also using identical source data: push constants + vertex attribs)

    However, when decompiling SPIR-V back to GLSL via NSight i noticed something rather strange: the depth prepass has "gl_Position.z = 2.0 * gl_Position.z - gl_Position.w;" added to it. What is this!?

    "gl_Position.y = -gl_Position.y;", which is always added to everything, i can understand - Vulkan's NDC is vertically flipped by default in comparison to OpenGL. That is fine. What is the muckery with z there for? And why is it only selectively added?

    Looking at my perspective projection code (the usual matrix multiplication, just simplified):

        vec4 projection(vec3 v) { return vec4(v.xy * par.proj.xy, v.z * par.proj.z + par.proj.w, -v.z); }

    All the added z line ends up doing is doubling the w-part of 'proj' in z (proj = vec4(1.0, 1.33.., -1.0, 0.2)). How does anything show at all given that i draw with compare op EQUAL? Decompile bug? I am out of ideas.
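The "doubling the w-part of 'proj'" claim in the post above can be checked on the CPU. The following is a minimal sketch, not the poster's code: the `Vec3`/`Vec4` structs and the names `decompilerFixup`, `nearlyEqual` and `demoFixup` are hypothetical, and it assumes (as the poster later concludes) that the two extra lines are added by the decompiler to match OpenGL clip-space conventions rather than being present in the SPIR-V.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Vec4 { double x, y, z, w; };

// Same arithmetic as the GLSL projection() quoted in the post.
Vec4 projection(const Vec4& proj, const Vec3& v) {
    return { v.x * proj.x, v.y * proj.y, v.z * proj.z + proj.w, -v.z };
}

// The two lines seen in the decompiled GLSL (assumed to be a clip-space fixup).
Vec4 decompilerFixup(Vec4 p) {
    p.z = 2.0 * p.z - p.w;  // remaps depth from the [0, w] range to the [-w, w] range
    p.y = -p.y;             // vertical flip
    return p;
}

// True when a and b differ by less than a small tolerance.
bool nearlyEqual(double a, double b) { return std::fabs(a - b) < 1e-12; }

// With proj.z == -1, applying the z fixup equals projecting with proj.w doubled:
// 2 * (-v.z + proj.w) - (-v.z) == -v.z + 2 * proj.w.
void demoFixup() {
    const Vec4 proj{1.0, 4.0 / 3.0, -1.0, 0.2};
    Vec4 doubled = proj;
    doubled.w = 2.0 * proj.w;

    const Vec3 v{0.5, -0.25, -10.0};
    const Vec4 fixed = decompilerFixup(projection(proj, v));
    const Vec4 plain = projection(doubled, v);

    assert(nearlyEqual(fixed.z, plain.z));
    assert(nearlyEqual(fixed.w, plain.w));
}
```

If that assumption holds, both passes would see the same remap at most in the decompiled view, so it would not by itself explain a mismatch under compare op EQUAL.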
  5. tanzanite7

    Input attachment reads as black.

    New day, new ideas. My VulkanSDK was a bit outdated and since i use the validation layers it provides i thought it would be worth updating - maybe it sees what i do not. And it did. Found a few seemingly minor things. After fixing those - everything started working as intended again. Apparently not as minor as i thought. Hard to tell which one, or which combination, caused this. So, all i can really recommend is to check that your validation layers are the latest and greatest when facing bizarre problems with Vulkan.
  6. Trying to figure out why input attachment reads as black with NSight VS plugin - and failing. This is what i can see at the invocation point of the shader:
    * attachment is filled with correct data (just a clear to bright red in previous renderpass) and used by the fragment shader:

        // SPIR-V decompiled to GLSL
        #version 450
        layout(binding = 0) uniform sampler2D accum;
        // originally: layout(input_attachment_index=0, set=0, binding=0) uniform subpassInput accum;
        layout(location = 0) out vec4 fbFinal;
        void main(){
            fbFinal = vec4(texelFetch(accum, ivec2(gl_FragCoord.xy), 0).xyz + vec3(0.0, 0.0, 1.0), 1.0);
            // originally: fbFinal = vec4(subpassLoad(accum).rgb + vec3(0.0, 0.0, 1.0), 1.0);
        }

    * the resulting image is bright blue - instead of the expected bright purple (red+blue)

    How can this happen? 'fbFinal' format is B8G8R8A8_UNORM and 'accum' format is R16G16B16A16_UNORM - ie. nothing weird.
  7. Given:

        template<typename T> struct Thing {
            Thing(std::initializer_list<T> arr) {...}
        };
        template<typename T> struct Arbiter {
            template<typename ...Args> Arbiter(type stuff, Args ...args) : subject(args...) {...}
            T subject;
        };
        typedef Arbiter<Thing<int32u>> Arbiter32u;

        Arbiter32u test1(stuff, {1U,2U,3U}); // error C2661 : no overloaded function takes 2 arguments

        auto wtf = {1U,2U,3U};
        Arbiter32u test2(stuff, wtf); // this is fine

    Why does 'test1' fail where 'test2' is fine? Presumably the constructor used is removed by SFINAE - but why? SFINAE unfortunately masks out whatever goes wrong leaving me just puzzled :/.

    edit: Found something relevant to the 'auto' special rule: https://stackoverflow.com/questions/26330499/why-is-there-a-special-type-deduction-rule-for-auto-and-braced-initializers-in-c?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa ... still not sure i get it.
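A braced-init-list such as {1U,2U,3U} has no type of its own, so a plain template parameter (including a pack like Args) cannot be deduced from it; the constructor template drops out because deduction fails, not because of anything inside its body. 'auto' has a special rule that turns the same braces into a std::initializer_list, which is why 'test2' compiles. One possible workaround, sketched below under assumptions, is an extra overload whose parameter is spelled std::initializer_list<U>, a form that template deduction does accept. This is not the poster's code: 'type stuff' is replaced by a plain int, int32u by unsigned, and the second constructor and `demoBracedInit` are hypothetical additions.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

template<typename T>
struct Thing {
    Thing(std::initializer_list<T> arr) : count(arr.size()) {}
    std::size_t count;  // number of elements received, kept only for the demo
};

template<typename T>
struct Arbiter {
    // Generic forwarding constructor, shaped like the one in the post.
    // Args cannot be deduced from a braced-init-list argument.
    template<typename... Args>
    Arbiter(int /*stuff*/, Args... args) : subject(args...) {}

    // Hypothetical extra overload: std::initializer_list<U> is deducible from braces.
    template<typename U>
    Arbiter(int /*stuff*/, std::initializer_list<U> list) : subject(list) {}

    T subject;
};

void demoBracedInit() {
    // Braces now bind to the initializer_list overload.
    Arbiter<Thing<unsigned>> fromBraces(0, {1U, 2U, 3U});
    assert(fromBraces.subject.count == 3);

    // 'auto' deduces std::initializer_list<unsigned>, as in the post's test2.
    auto list = {1U, 2U};
    Arbiter<Thing<unsigned>> fromVariable(0, list);
    assert(fromVariable.subject.count == 2);
}
```

Alternatively the call site can name the type explicitly, for example by passing std::initializer_list<unsigned>{1U, 2U, 3U}, which leaves Arbiter unchanged.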
  8. tanzanite7

    workaround for invalid C2247 error

    Yes, i have considered that. Unfortunately, the code referencing snafu logically may not ask explicitly for global scope - it must use current scope. That said, i am considering forbidding use of global logger without explicitly asking for it - which would work around the actual problem without so much smell. But that has to wait atm.
  9. tanzanite7

    workaround for invalid C2247 error

    No. It (Bar) should not try to access a member it should not even know exists (in Foo) and use the variable in global scope it does see and as far as implementer of Bar is concerned - is the only one the implementer can be reasonably expected to know about. One could make a reasonable argument of precedence that the compiler should issue a warning about Foo hiding global snafu, but not generate any errors (ie. like VC warning about member function parameter name hiding class members) - implementer of Bar does not know an unrelated variable with that name exists inaccessibly in Foo and the compiler should use the global snafu. Anyway, this all is pretty much moot as i am fairly sure it works as specified - so, while idiotic, it is the correct way and that is what matters.
  10. tanzanite7

    workaround for invalid C2247 error

    Neither would i - that was not the problem. Seems none of you understood it. However, it served to make me suspicious ... I suspect it is an error in the language standard itself. Given the horrendous nature of C/C++ - would not surprise me in the slightest. Common sense says that inaccessible private members should be invisible - as doing otherwise would require that the implementer of a derived class must know and account for all private internal details of the whole parent class chain to avoid accidental private identifier name collisions. This is beyond ridiculous to put it very mildly - yet, i suspect now, that seems to be how it is supposed to "work" for some reason i cannot fathom. Is anyone familiar enough with the standard to be able to find where it is specified? Would like to amend my bug report if i can find a tangible reason to do so. Would be my first invalid VS bug report ever in a long line of confirmed/fixed reports - bound to happen one day. edit: Checked with godbolt - same result all around. This very strongly hints that it is indeed a language problem. For crying out loud.
  11. Ran into bizarre vc++ error, minimal example code:

        int snafu;
        struct Foo { static int snafu; };
        int Foo::snafu;
        struct Mid : private Foo {};
        struct Bar : Mid {
            void really() { snafu = 7; } // error C2247: 'Foo::snafu' not accessible because 'Mid' uses 'private' to inherit from 'Foo'
        };

    The error is nonsense - implementer of Bar should not know/care about private internals of Mid. Reported the problem - but obviously cannot wait for a fix. Any idea how to work around this problem?

    Background:
    * 'snafu' is actually a context specific logger instance - the global instance is ... well ... for generic fallback context when there is no need for having a context specific one.
    * Mid is not a Foo, Mid has/shares a Foo implementation. Similar to composition.
    * Composition is not a usable alternative as it would need friending a lot of classes dealing with Foo part of Mid who currently do not even know/care about there being a Mid or whatever else.

    edit: Oh, ffs. I guess having to describe the problem to others helps. Workaround: add a static reference in Bar to global snafu. Ie: int &Bar::snafu = snafu. Have not yet tested it, but pretty sure it will work.

    edit: Yes, it does work. Annoying that i have to do it, but oh well.
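The workaround from the edit can be written out as a compilable sketch. One detail is an assumption about how the poster's real code is spelled: in the out-of-class definition of a static member, an unqualified `snafu` in the initializer is looked up in Bar's scope first and would name Bar::snafu itself, so the sketch qualifies the global as `::snafu`. The `demoHiddenName` helper is hypothetical and only there to exercise the code.

```cpp
#include <cassert>

int snafu = 0;                      // global fallback instance

struct Foo { static int snafu; };
int Foo::snafu = 0;

struct Mid : private Foo {};        // Mid shares Foo's implementation privately

struct Bar : Mid {
    static int& snafu;              // declared in Bar, so it hides Foo::snafu
    void really() { snafu = 7; }    // lookup stops at Bar::snafu; no access error
};

// Bind the reference to the global; "::" avoids naming Bar::snafu itself.
int& Bar::snafu = ::snafu;

void demoHiddenName() {
    Bar b;
    b.really();
    assert(::snafu == 7);           // the write went to the global
    assert(Foo::snafu == 0);        // the privately inherited member is untouched
}
```

Name lookup runs before access checking, which is why the original code finds the inaccessible Foo::snafu and then errors out; giving Bar its own declaration of the name ends the lookup before Foo is ever considered.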
  12. vkQueuePresentKHR is busy waiting - ie. wasting all the CPU cycles while waiting for vsync. Expected, sane, behavior would of course be akin to Sleep(0) till it can finish. Windows 7, GeForce GTX 660. Is this a common problem? Is there anything i can do to make it behave properly?
  13. Followup:   Tried the old code with U3 - same behavior. Which at this point i am confident enough to call a bug (*) - no compiler should fail at this kind of basics. I have compared both compilations and they are nearly identical (even ends up using the exact same registers with same instructions throughout) - with the sole exception of minor difference at the function call site and the unnecessary spill (non-spilling code just uses R8 [free to trash] instead of RBX [requires spill]) and frame code.   Platform Toolset: Visual Studio 2015 (v140). Is this correct? How can i confirm the new compiler is in use (edit: i did notice that the compiler specifically told it had to recompile all functions instead of doing the usual incremental rebuild)?   Did i miss something?   *) Which it obviously technically is not - since the functional end result is exactly what it should be.
  14. Still on Update1. New optimizer? Did not know that, thanks. Well, then i need to update - will do that.

    However, before i revisited the forums today i noticed (since all that code was on slow path i did not care much what happened on that path before) that i can easily convince the compiler to tail-call optimize out the, somehow offending, function call:

    before:
        auto res = man.get(at);
        man.lock.put();
        return res;

    after:
        return man.getAndUnlock(at);

    ... and the compiler happily obliged - all of the spill nonsense is gone (ebx gone and all the frame junk also gone [since all the function calls got tail-call optimized causing the caller to be essentially leaf function enabling all relevant optimizations]).

    So, unfortunately can not tell whether U3 would have made a difference (would be interesting to know, so might un-fix when i update).
  15. I am experiencing a perplexing register spill in a function that uses ONLY (spill inclusive) the following registers: rax, rbx, rcx, rdx, rsp.

    rsp: for spill (and one call - cannot remember whether it is required ... should be pointless in regards to unwind, dunno)
    rbx: for spill

    Why would that happen? What could cause that? It does not even use all the volatile registers up ( https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx ) - at least R8 and R9 are free (i think R10 and R11 are also with standard Win x64 ABI).

    Excluding code for rbx spill itself, rbx ends up used exactly 4 times:

    Twice on fast path:
        movzx       ebx,byte ptr [rcx+rax]
        lea         rdx,[rax+rbx*4]

    Twice on slow path, which i do not really care about:
        lock cmpxchg dword ptr [rbx],ecx
        lock cmpxchg dword ptr [rbx],ecx

    None of them look suspicious (pretty sure LEA can take whatever as its index register). It is hard to believe VS2015 fails that badly with the basics ... umm ... ???

    I guess the function responsible for the call (also on the slow path) is the cause - if i comment it out then rsp and rbx will be unused. If i allow inlining the function then the fast path will be filled with spill code (quadrupled). The function, from the caller's perspective, is a simple one - equivalent to "static int foo(int)". The slow path ends with another function call, but it is tail-call optimized ("jmp", so it does not care about the caller's stack frame at all - irrelevant to the issue).

    Fun times.