# lomateron


1. ## webGL updating view matrix

If I have 10 different shaders for drawing 10 different shaded objects that all need the same view matrix, do I need to update this view matrix for every shader, which means calling gl.uniformMatrix4fv() 10 times every frame? In DirectX 10 I can have 10 shaders in one file and only one global "viewMatrix" variable, which means I only need to update the view matrix once per frame. Can something similar be done in WebGL?
2. ## HLSL reading outside array safe?

I have an ID3D10EffectVectorVariable. In HLSL the code looks like this:

```hlsl
int4 array;

float4 VS()
{
    uint x = 0; //"x" is loaded from a texture and can be a value bigger than 3
    int grab = array[x];
    if (x > 3)
    {
        grab = 0;
    }
    //blablabla
}
```

Is this code safe?

3. ## any reason to use the vertex shader when a technique uses the geometry shader?

If I have a technique that uses the geometry shader, is there any reason to use the vertex shader to do something (other than passing input-assembler data to the geometry shader)? Is it faster to do the heavy part of an algorithm in the vertex shader and use the geometry shader only to generate the extra mesh?
4. ## request HLSL support for sqrt() for integers

The problem is that float operations are not deterministic on the GPU, so sqrt() should support ints for those people who want determinism. I created my own sqrt() for ints. It's actually a function to get the length of a 2D uint vector, but it has the limitation of only getting correct results for numbers from 0 to 2^15:

```hlsl
uint length2D(uint2 u, uint b)
{
    //dot
    uint l = u.x*u.x + u.y*u.y;
    //approximation of sqrt()
    b = b + ((u.x + u.y - b) >> 1);
    //approximate it more using the Babylonian method
    b = (b + (l / b)) >> 1;
    b = (b + (l / b)) >> 1;
    b = (b + (l / b)) >> 1;
    return b;
}
```

"b" is the biggest value in the "u" vector.

Oops, it's actually a request for a faster length() function for ints that can support every vector dimension (from 2D to 4D) and give the correct answer for every number from 0 to 2^32.
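For comparison, an exact integer square root needs no floats at all. This is a classic bit-by-bit version sketched in C (my illustration, not code from the post; the shader above uses the Babylonian refinement instead). It returns floor(sqrt(n)) for the full 32-bit range and, being pure integer math, is bit-identical on any hardware:

```c
#include <stdint.h>

/* Exact integer square root: returns floor(sqrt(n)) for any 32-bit n.
   No floats anywhere, so the result is deterministic across GPUs/CPUs. */
uint32_t isqrt32(uint32_t n)
{
    uint32_t root = 0;
    uint32_t bit = 1u << 30;    /* highest power of 4 that fits in 32 bits */

    while (bit > n)             /* shift down to the first useful bit pair */
        bit >>= 2;

    while (bit != 0) {
        if (n >= root + bit) {  /* can this bit be part of the root? */
            n -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    return root;
}
```

The loop runs at most 16 fixed iterations, so on a GPU it can be fully unrolled.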
5. ## request HLSL support for sqrt() for integers

HLSL trigonometric functions are sin(), cos(), tan(), etc...
6. ## request HLSL support for sqrt() for integers

HLSL trigonometric functions have some magic going on behind them; that's why I want them to make a length() for uints. They could make something that lets the function calculate all vectors that have a length less than 2^32.
7. ## request HLSL support for sqrt() for integers

Wait, what? uint operations are more accurate than float operations. As I said, you can change the definition of 0 to 1 to your own taste, so I can make (0 to 1) = (0 to 2^25), and that will make it 2 times more accurate than a (0 to 1) float.
8. ## request HLSL support for sqrt() for integers

float operations have a margin of error too; when using uints you just have to change your definition of what 0 to 1 means. For example, 0 to 1 in my world physics engine is 0 to 2^9 as a uint.
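A tiny fixed-point sketch in C of what "redefining 0 to 1" means (my illustration; only the 2^9 scale is taken from the post):

```c
#include <stdint.h>

/* Fixed point: 2^9 = 512 integer steps represent the range 0..1.
   All simulation math stays in exact uint arithmetic, so results are
   bit-identical everywhere; floats appear only at the conversion edges. */
#define ONE_FIXED 512u   /* "1.0" in this convention */

uint32_t to_fixed(double v)  { return (uint32_t)(v * ONE_FIXED + 0.5); }
double   to_real(uint32_t f) { return (double)f / ONE_FIXED; }
```

Anything finer than 1/512 of a unit is simply below the resolution this convention chooses to care about.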
9. ## request HLSL support for sqrt() for integers

I just tested the HLSL intrinsic length() vs my integer length2D() and length3D(): my functions are deterministic, the HLSL intrinsic length() isn't. Tested on an old ATI vs a new NVIDIA, on HLSL 4. You should test it yourself, and that should be enough reason for the people who want determinism.
10. ## request HLSL support for sqrt() for integers

So there is that case when the length of the vector is 1, another case when it's 0, and some other cases when it is bigger than 2^32, but there are a whole lot more cases where it actually makes sense, and those outweigh the others, so it actually makes sense to make that function. It's the same with float operations, isn't it? It's even worse: there are more cases where length() doesn't work with float vectors. What do I propose when it's bigger than 2^32? Return the same value as when any number is divided by 0.
11. ## request HLSL support for sqrt() for integers

Arrrgg, I messed up the title, please read the whole question. I would like to request that the people who make the intrinsic HLSL functions make a length() function for uint vectors that gets a correct answer for every vector value uint4(0 to 2^32, 0 to 2^32, 0 to 2^32, 0 to 2^32).
12. ## how does Nvidia do the ball collision detection in Flex without using a 3D grid?

At 7:15 in this video he says it's grid free. You can download the demo to play with the balls, and in some worlds you can move the balls anywhere you want, no walls.
13. ## getting rid of near and far clipping planes

The near and far plane operation makes the depth resolution poor. I want every object in front of the camera and inside its field of view to be rendered no matter the distance, until it's so small it can't be seen, but still rendered.

In the vertex shader, after the Projection multiplication, I have tried this:

```hlsl
float4 r = mul(Pos, Projection);
r.xy /= abs(Pos.z);
r.z = Pos.z * MAX_DEPTH;
r.w = 1.0f;
if (Pos.z < 0.0f)
{
    r.w = -1.0f;
}
return r;
```

MAX_DEPTH is (1.0f / 16777216.0f), so that any float distance gets rendered... 16777216 = 2^24.

But it still doesn't work. Well, the mesh is rendered correctly, but when the camera is near and you read the depth and position pixels across the mesh, they are off, and the error increases as the pixels get near the camera.

I want to know: what happens to SV_POSITION after it leaves the vertex shader? I know that "xyz" gets divided by "w"; what other operations happen in the part where it gets divided by "w"? If someone already has a way to render the way I want, can you share it too?
14. ## render any float value to 32bit depth buffer

Using DirectX 10, I changed `DepthClipEnable = FALSE;` and changed the viewport to `vpTRY.MinDepth = 0.0f; vpTRY.MaxDepth = 16777216.0f;` but I still can't render a 10.0f float value to the 32-bit depth buffer. What else do I need to configure?
15. ## getting rid of near and far clipping planes

My final optimized solution, which has a near clipping plane and an infinite far plane: only 2 values of the Projection matrix are needed, the ones that scale Pos.xy, so we don't need to pass the whole 16 floats of the matrix to the shaders. Let's call these 2 values ProjecXY. Now in the vertex shader, instead of doing this:

```hlsl
Pos = mul(Pos, Projection);
```

do this:

```hlsl
Pos.xy *= ProjecXY;
Pos.w = Pos.z;
Pos.z = NEAR_PLANE; //NEAR_PLANE = 0.5f
```

And finally, reverse the depth test: instead of passing if smaller, pass when greater.
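To see what the hardware then does with that output: the rasterizer divides xyz by w, so the stored depth becomes NEAR_PLANE divided by the view-space z. A small C sketch of that mapping (my illustration, not code from the post, assuming NEAR_PLANE = 0.5f as above):

```c
/* Depth after the hardware divide by w, for the vertex shader output above:
   z was set to the constant NEAR_PLANE and w to the view-space depth. */
float trick_depth(float view_z)
{
    const float NEAR_PLANE = 0.5f;
    return NEAR_PLANE / view_z;
}
```

Depth is exactly 1.0 at the near plane and falls toward 0 as view_z goes to infinity without ever reaching it, which is why the depth test has to be reversed to pass on greater.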
16. ## when will we see DXGI_FORMAT_R64G64B64A64_FLOAT?

Why haven't GPUs made the full leap to 64 bit as CPUs have? Is DirectX 12 going to add this format?
17. ## nvidia atmospheric scattering

I am trying to use the algorithm explained in this nvidia page. In the vertex shader code there is a function scale(); I don't understand what it does. What does it do? There is another part where there is this: scale(fLightAngle) ? scale(fCameraAngle). What does the ? mean?
18. ## nvidia atmospheric scattering

Found it! I downloaded the code from here, which is in HLSL, and it has this in scale():

```hlsl
float scale(float fCos)
{
    float x = 1.0 - fCos;
    return ScaleDepth * exp(-0.00287 + x*(0.459 + x*(3.83 + x*(-6.80 + x*5.25))));
}
```

And for the ?... it is a minus: scale(fLightAngle) - scale(fCameraAngle). Even better, you can download any of the GPU Gems files from here: http://http.download.nvidia.com/developer/GPU_Gems_2/CD/Index.html

19. ## render any float value to 32bit depth buffer

I was trying to render some texels using a depth stencil, which I cleared to the value 2^24 before rendering, but nothing was rendered. So finally I found that the default rasterizer state has DepthClipEnable = TRUE;, which means the clip "0 <= z <= w" happens after the vertex shader. I changed the rasterizer state; still no texels were rendered. Then I found this in the ClearDepthStencilView documentation:

> pDepthStencilView [in] Type: ID3D10DepthStencilView*. Pointer to the depth stencil to be cleared.
> ClearFlags [in] Type: UINT. Which parts of the buffer to clear. See D3D10_CLEAR_FLAG.
> Depth [in] Type: FLOAT. Clear the depth buffer with this value. This value will be clamped between 0 and 1.
> Stencil [in] Type: UINT8. Clear the stencil buffer with this value.

just why?
20. ## best way to connect 2 random people around the world in winsock

I have a multiplayer game that works pretty well. Right now, if I want 2 people in the same house connected, one needs the LAN IP of the other computer. If I want 2 people in different countries connected, one person needs to configure the wifi router to make port "X" connections go to the "Y" IP computer inside his house, and give the house IP to the other person. Now I would like to make 2 random people around the world connect by just pressing one button inside the game: nothing to configure, just open the game, press the button and bam, 2 random people connected. So what's the best way to do this using Winsock?
21. ## trying to make GPU physics deterministic

I am doing collision physics of spheres on the GPU. I am doing the multiplayer by only sending user input, so determinism is very important. One PC has a new NVIDIA and the other has an old ATI. When I use floats in the physics code, I just have to wait less than a second and I see all the balls in completely different positions, and as I said, IEEE strictness (/Gis) doesn't work. Now that I use uint/int, it works: I waited 5 minutes in a very chaotic place, 1000 balls with explosions and collisions, and all the ball positions stayed exactly the same between the 2 PCs.
22. ## trying to make GPU physics deterministic

I have to tryyyyyy... stuck with DirectX 10. 1 update every 1/250 seconds; every 4 updates the input changes. IEEE strictness (/Gis) doesn't work, so I thought about using int textures: every integer from 0 to 2^24 can be represented in a float (meaning I could still use float textures but operate on them as ints). The resolution of the physics space will be 2^24 places; velocity and force vectors will have this resolution too. But then in HLSL, sqrt() only works with floats. There are some functions I could use, like http://stackoverflow.com/questions/4930307/fastest-way-to-get-the-integer-part-of-sqrtn, but in the end, will this work? Will it be too slow? What do you think? Any recommendations?
23. ## how to get the length of an int vector without overflow on 32-bit ints

It's very easy to overflow a 32-bit integer when calculating its length the common way, because of this: I have a 3D space divided into 400 cubes; every cube is 256x256x256, so the biggest vector that can exist here is 102400x102400x102400, and dot(v,v) = 31457280000 is a number bigger than 2^32. So even if I could use a 64-bit int, to get the sqrt() of this very big number I would need a double sqrt() or the integer Newtonian way, which uses a loop, but I can't use those two because I am doing this on the GPU in HLSL 4. Does someone know a fast method to get the length of a big 32-bit int vector using 32-bit ints?
24. ## how to get the length of an int vector without overflow on 32-bit ints

Every solution I tried was too slow, or fast but with a bad approximation. This is in HLSL 4; the biggest problem was dividing a 62-bit uint made of 2x32 bits by a 32-bit uint. But then I thought... I only need a 64-bit uint when the length of the vector is bigger than 2^16, so I just have to scale the vector down when any of (x, y, z) is bigger than 2^16. And that is my solution.
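A minimal sketch in C of that scale-down idea (my reconstruction, not the actual shader from the post; the isqrt32 helper and the exact shift threshold are my assumptions, and I keep components below 2^15 rather than 2^16 so that the sum of three squares cannot overflow 32 bits):

```c
#include <stdint.h>

/* Assumed helper: floor(sqrt(n)) for 32-bit n, bit-by-bit, no floats. */
static uint32_t isqrt32(uint32_t n)
{
    uint32_t root = 0, bit = 1u << 30;
    while (bit > n) bit >>= 2;
    while (bit) {
        if (n >= root + bit) { n -= root + bit; root = (root >> 1) + bit; }
        else root >>= 1;
        bit >>= 2;
    }
    return root;
}

/* Approximate |v| for a 3D uint vector using only 32-bit math:
   if any component is too large, shift the whole vector down, take the
   length, then shift the result back up.  The shift discards low bits,
   which is the speed/accuracy trade-off described in the post. */
uint32_t length3D(uint32_t x, uint32_t y, uint32_t z)
{
    uint32_t shift = 0;
    /* keep every shifted component below 2^15: 3 * (2^15)^2 < 2^32 */
    while (((x | y | z) >> shift) >= (1u << 15))
        ++shift;

    uint32_t sx = x >> shift, sy = y >> shift, sz = z >> shift;
    return isqrt32(sx * sx + sy * sy + sz * sz) << shift;
}
```

For the 102400x102400x102400 worst case from the question, the vector is shifted down by 2, the dot product then fits comfortably in 32 bits, and the result is within a few units of the true length.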
25. ## Fill in the Blank: I am wasting some game dev time by _______

frigging changing all my physics shader code from floats to ints. I hope it works; if not, I will kill myself... no multiplayer.