evelyn4you

Member
  • Content count: 19
  • Community Reputation: 1 Neutral

About evelyn4you
  • Rank: Member

Personal Information
  • Interests: Art
  1. Hello JoeJ, many thanks for your manifold hints. At the moment I am still weighing two options:
    - implementing anisotropic voxel GI (according to my studies, several different methods exist), or
    - improving my reflection cube map / light probe method for GI.
    Some results about the light probes: I changed the code for more efficient creation of the cube maps and tested performance with Sponza Atrium:
    - 1 directional light with shadow map
    - 1 long-range point light with an omnidirectional shadow map
    - a few dozen point lights without shadow maps
    - 100 GI cube maps (all the same resolution, e.g. 32, 64, 128 or 256) with mipmapping
    - frustum culling for every cube map, updating it dynamically only when something in its frustum changes
    To my big surprise, the resolution of the cube maps has nearly NO effect on render performance as long as it stays under 512. That is quite astonishing; it seems the pixel shader is not the bottleneck here. E.g. with 2 dynamic cube maps I get 50 FPS for the scene, and the performance drops to 1 FPS with 100 dynamic cube maps. So about 3 dynamic cube maps per frame do not cause a big render hit, and the reflections are nice eye candy. For slow changes, all 100 cube maps could be refreshed in a cycle of about 1 second. But in this scenario I have to place the maps manually, which is most efficient for render performance but bad when editing the scene, and 100 is still too few. It is also too slow: fast lighting changes, e.g. opening a door into a room that gets filled with light, will not look good because the lighting is updated too slowly. My method is still too inefficient, because the low-resolution cube maps (e.g. the 32 version) should be doable much faster. Later I would like to implement a dynamic regular grid of light probes (SH2, SH3) and interpolate between them, only in space where there is no vertex data (see the sketch below). My frustum culling only checks the planes of the view pyramid with, so to say, endless depth; I don't know how to implement a good layered method. When creating light probes in a regular grid, they should not be treated individually; the knowledge/information of the already updated cube maps should be used to update the others much faster.
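    A rough sketch of what such an SH projection could look like, as a compute shader that folds one already rendered low-resolution GI cubemap into the first two SH bands (4 coefficients, assuming that is what "SH2" means here). All resource and constant names (giCubemap, probeSH, ProbeCB) are placeholders, not existing engine code, and the direction set is a crude stand-in for a proper low-discrepancy sequence:

        TextureCube<float4> giCubemap      : register(t0);
        SamplerState        linearSampler  : register(s0);
        RWStructuredBuffer<float4> probeSH : register(u0); // 4 coefficients per probe, rgb + unused w

        cbuffer ProbeCB : register(b0)
        {
            uint   probeIndex;   // which probe of the grid is projected this dispatch
            uint   sampleCount;  // e.g. 64 directions
            float2 padding;
        };

        // crude hash-based uniform direction on the sphere (placeholder sampling pattern)
        float3 SampleDirection(uint i, uint n)
        {
            float phi = 6.2831853f * frac(i * 0.61803398875f);
            float z   = 1.0f - 2.0f * (i + 0.5f) / n;
            float r   = sqrt(saturate(1.0f - z * z));
            return float3(r * cos(phi), r * sin(phi), z);
        }

        [numthreads(1, 1, 1)]
        void main()
        {
            float3 sh0 = 0, sh1 = 0, sh2 = 0, sh3 = 0;

            for (uint i = 0; i < sampleCount; ++i)
            {
                float3 dir      = SampleDirection(i, sampleCount);
                float3 radiance = giCubemap.SampleLevel(linearSampler, dir, 0).rgb;

                // 2-band SH basis evaluated in direction dir
                sh0 += radiance * 0.282095f;          // Y0,0
                sh1 += radiance * 0.488603f * dir.y;  // Y1,-1
                sh2 += radiance * 0.488603f * dir.z;  // Y1,0
                sh3 += radiance * 0.488603f * dir.x;  // Y1,1
            }

            // Monte Carlo normalization over the sphere: 4*pi / sampleCount
            float norm = 4.0f * 3.14159265f / sampleCount;
            probeSH[probeIndex * 4 + 0] = float4(sh0 * norm, 0);
            probeSH[probeIndex * 4 + 1] = float4(sh1 * norm, 0);
            probeSH[probeIndex * 4 + 2] = float4(sh2 * norm, 0);
            probeSH[probeIndex * 4 + 3] = float4(sh3 * norm, 0);
        }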
  2. Hi JoeJ,
    - all scenes happen indoors in the house, in the castle, or on the balcony or terrace (no open-world scenario)
    - there is no destruction (no things like explosions, no shooter scenario)
    - only the characters and the animals (bird, cat, dog) are dynamic
    - lights have static positions, but there is a slow day and night cycle (sitting on the terrace at sunset ...)
    I put a lot of effort into creating believable characters. This was the reason why I did not use Unity or Unreal: I could not transfer my morph, bone and face animation the way I wanted (lack of C++ skill for Unreal, and problems with Unity shader integration). E.g. a character shall grab a glass with her/his hand and drink. This is NOT done by a simple pre-processed animation for this special character, but with an animation template and real-time IK calculation, taking into account whether the person is small or large (long or short arms, distance from shoulder to glass, and so on). I think lightmapping would probably be the best solution for me, but it was too hard for me to switch between Blender, unwrapping, ray tracing, baking lightmaps, importing them... I have also read about doing lightmap baking myself, but my skills are too limited to integrate an engine like NVIDIA's OptiX (C++) into my C# engine. My intention was to bake the interior scenes without characters and to combine the baked lighting with my dynamic directional, spot and point lights. I have been working on my engine for nearly 16 months and have reached much, much more than I ever thought I could. In principle my results are not bad, but they don't reach the standard I want. In the last weeks I have read a lot about GI, but my English is not the best and nearly all publications are in English with a lot of mathematical background. I think my mathematical level is quite high, but the things explained in the papers exceed my knowledge. Placing small point lights here and there helps to create a pleasing scene, but the overall impression is not what I want to reach; it is how the "old style" games were made. Things like AO, bloom and an HDR pipeline I have already integrated.
  3. Now my misunderstanding is clarified, but I am back at my starting point: which GI solution would be best for me? I have already tried LPVs, but the results did not please me. My game takes place indoors in a castle. Until now I work with a high-resolution dynamic cube map light probe, sampled from 6 viewport directions, with mipmapping. In the room where I place it, the illumination and reflections look marvelous, but I haven't found an automatic solution to determine which vertex shall be lit by which probe. So the probe affects all vertices in its range, right through walls and doors. Does a good standard solution exist for this? (See the sketch below for one candidate.)
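    One common answer is to give each probe an authored influence volume and fade its contribution out towards the border, so a probe placed in one room never reaches through the wall into the neighboring one. A minimal sketch of such a weight function, assuming probes are stored in a structured buffer; the struct layout and all names are assumptions:

        struct Probe
        {
            float3 boxMin;       // world-space influence box, authored roughly per room
            float3 boxMax;
            float  fadeRange;    // distance over which the weight falls to zero near the border
            uint   cubemapIndex; // which cube map / SH set this probe owns
        };

        StructuredBuffer<Probe> probes : register(t0);

        // 0 outside the box, 1 well inside it, smooth in between;
        // the lighting shader blends the probes with these weights (renormalized)
        float ProbeWeight(Probe p, float3 worldPos)
        {
            float3 toMin  = worldPos - p.boxMin;
            float3 toMax  = p.boxMax - worldPos;
            float  border = min(min(min(toMin.x, toMin.y), toMin.z),
                                min(min(toMax.x, toMax.y), toMax.z));
            return saturate(border / p.fadeRange);
        }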
  4. Many thanks for your answers, JoeJ and turanszkij. @JoeJ Your "perfect mirror ball" helped me a lot to understand, as did your info about weighting rays vs. changing the distribution of rays. The part that comes closest to my understanding problem is the front/backside wall problem. Test scene: just imagine a room which is divided by a wall with a closed door. Let's assume the voxel length is 10 units and the wall is exactly 20 units thick. One room has a strong point light, the other is completely dark. The voxels of the middle wall on one side will be bright because of the direct light, but the voxels on the other side of the wall will be completely dark. But at mip levels 1, 2, 3 the voxels of the middle wall will contain a medium-bright area that will give indirect illumination in the dark room. Is this right?
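    For reference, a minimal sketch of one mip downsampling step for such a voxel texture. Because each coarse texel is just the average of its 2x2x2 children, a wall that is lit on one side and dark on the other does become a half-bright slab at the coarser mips, which is exactly the leaking described above. Resource names are placeholders:

        Texture3D<float4>   srcMip : register(t0); // rgb = radiance, a = occupancy
        RWTexture3D<float4> dstMip : register(u0); // next coarser mip, half the resolution

        // dispatch ceil(dstRes / 4)^3 groups for a destination resolution dstRes
        [numthreads(4, 4, 4)]
        void main(uint3 dtid : SV_DispatchThreadID)
        {
            uint3  src = dtid * 2;
            float4 sum = 0;

            [unroll]
            for (uint z = 0; z < 2; ++z)
                for (uint y = 0; y < 2; ++y)
                    for (uint x = 0; x < 2; ++x)
                        sum += srcMip[src + uint3(x, y, z)];

            dstMip[dtid] = sum / 8.0f; // plain box filter: lit and dark children are averaged
        }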
  5. Hi turanszkij, many, many thanks for your answer. Some months ago I studied your source code and your home page intensively. Although I work with C#, I managed with my basic C++ knowledge to compile and run your engine. Sadly I did not save the code, and after trying to download and compile it again I saw that I cannot use the engine anymore, because I only have Win7 64 bit and your engine now requires Win10 / DirectX 12. Do you still have a Win7 version? In the ray marching context I understand it: opacity there is easy to detect and the step is only e.g. one pixel, so the problem I have is not touched; my problem of "understanding the occlusion query" is solved by tracing each pixel step by step. Now here is my problem: when doing mipmapping on a 3D texture, several coarser 3D texture mips are created, right? But when sampling the 3D texture at a given point and a given mip level, it will always return the same value, right? When sampling with "quadrilinear interpolation" we get smooth values, but always the same value, independent of the view direction of the cone. See the following part of your code: the coneDirection is only used to calculate the next position in texture space where we sample from. But if sampling the mip map is view-direction independent, how can it reproduce the correct color? Sampling a point within a 3D mipmap is something totally different from getting a projection (of the view) of the colors of the voxels it consists of (when looking at it from a certain viewpoint)? To my understanding the sampling just gives me the interpolation of the accumulated surrounding voxel colors at the sample point. I still need help understanding, please be patient. Best regards, evelyn
    Part of the code from the turanszkij game engine:

        float diameter = max(g_xWorld_VoxelRadianceDataSize, 2 * coneAperture * dist);
        float mip = log2(diameter * g_xWorld_VoxelRadianceDataSize_Inverse);

        // Because we do the ray-marching in world space, we need to remap into 3d texture space before sampling:
        // todo: optimization could be doing ray-marching in texture space
        float3 tc = startPos + coneDirection * dist;
        tc = (tc - g_xWorld_VoxelRadianceDataCenter) * g_xWorld_VoxelRadianceDataSize_Inverse;
        tc *= g_xWorld_VoxelRadianceDataRes_Inverse;
        tc = tc * float3(0.5f, -0.5f, 0.5f) + 0.5f;

        // break if the ray exits the voxel grid, or we sample from the last mip:
        if (any(tc - saturate(tc)) || mip >= (float)g_xWorld_VoxelRadianceDataMIPs)
            break;

        float4 sam = voxels.SampleLevel(sampler_linear_clamp, tc, mip);
  6. Hello, I am trying to implement voxel cone tracing in my game engine. I have read many publications about this, but some crucial parts are still not clear to me. As a first step I am trying to implement the easiest "poor man's" method:
    a. my test scene (Sponza Atrium) is voxelized completely into a static voxel grid of 128^3 (a structured buffer contains the albedo)
    b. I don't care about conservative rasterization and don't use any sparse voxel access structure
    c. every voxel has the same color for every side (top, bottom, front, ...)
    d. one directional light injects light into the voxels (another structured buffer)
    I will try to state what I think is correct (please correct me).
    GI lighting of a given vertex, in an ideal method:
    A. we would shoot many (e.g. 1000) rays over the hemisphere oriented along the normal of that vertex
    B. we would take every occluder into account (which is a lot of work) and sample the color at the hit point
    C. according to the angle between the ray and the vertex normal, we would weight the color (cosine), sum up all samples and divide by the ray count
    Voxel GI lighting: in principle we want to do the same thing with our voxel structure. Even if we knew where the correct hit points were, we would still have to calculate the weighted sum of many voxels.
    Saving time when summing up the voxel colors: to save this time we build bricks or clusters. Every 8 neighboring voxels make a "cluster voxel" of level 1 (this is done recursively for many levels). The color of a side of a cluster voxel is the average of the colors of the four contained voxel sides with the same orientation. After that we can sample far-away parts just by sampling the corresponding cluster voxel at the corresponding level and get the summed-up color. In practice this is done by mipmapping a texture that contains the voxel colors, which also places the colors of neighboring voxels close together in the texture.
    Cone tracing, how? Here my understanding is confused. How is the voxel structure traced efficiently? I simply cannot understand how the occlusion problem is solved quickly, so that we know which single voxel or which cluster voxel of which level we have to sample. Suppose I am in a dark room that is filled with many boxes of different sizes and I have a pocket lamp with a pyramid-shaped light cone:
    - I would see some single voxels, near and far
    - I would also see many different boxes ("cluster voxels") of different sizes which are partly occluded
    How do I make a weighted sum over this lit area? E.g. if I want to sample a cluster voxel of level 4, I would have to take into account what percentage of the area of this cluster voxel is occluded. (See the sketch below.) Please be patient with me, I really try to understand, but maybe I need some more explanation than others. Best regards, evelyn
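    A minimal sketch of how a single diffuse cone is usually traced against the mipmapped voxel texture. Occlusion is not resolved exactly per voxel: each sample's alpha is treated as coverage and the samples are blended front to back, so everything behind an (accumulated) opaque sample simply stops contributing. All resource and constant names are assumptions, not code from a specific engine:

        Texture3D<float4> voxelRadiance : register(t0); // rgb = radiance, a = occupancy
        SamplerState      linearClamp   : register(s0);

        cbuffer VoxelCB : register(b0)
        {
            float3 gridCenter;   // world-space center of the voxel grid
            float  voxelSize;    // world-space size of one voxel
            float  gridRes;      // e.g. 128
            float  maxDistance;  // how far the cone marches
            float2 pad;
        };

        float4 TraceCone(float3 origin, float3 dir, float aperture /* tan(half angle) */)
        {
            float3 radiance  = 0;
            float  occlusion = 0;
            float  dist      = voxelSize; // start one voxel away to avoid self-sampling

            while (dist < maxDistance && occlusion < 1.0f)
            {
                // cone footprint grows with distance; pick the mip whose texel matches it
                float diameter = max(voxelSize, 2.0f * aperture * dist);
                float mip      = log2(diameter / voxelSize);

                // world position -> [0,1]^3 texture coordinates of the voxel grid
                float3 tc = (origin + dir * dist - gridCenter) / (voxelSize * gridRes) + 0.5f;
                if (any(tc != saturate(tc)))
                    break;

                float4 sam = voxelRadiance.SampleLevel(linearClamp, tc, mip);

                // front-to-back blending: later samples are attenuated by the accumulated coverage
                radiance  += (1.0f - occlusion) * sam.rgb;
                occlusion += (1.0f - occlusion) * sam.a;

                dist += diameter * 0.5f; // step proportional to the footprint
            }
            return float4(radiance, occlusion);
        }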
  7. HowTo: bone weight painting tool

    Thank you, this is a good trick. I think it will save memory but not much workload, because the pixel shader in this case is so simple (just writing the index ID to the render target). Maybe I don't understand correctly, but I feel it would be a heavy workload. E.g. my base characters (from which all others are made just by changing the morphing parameters) are high-poly: 20,000 vertices, about 60,000 triangles. When doing weight painting, the characters are not in bind pose but in a certain user-defined pose and morph. So on the CPU side I would have to apply all morphs and bone transforms to all vertices and store them in the "access vertex array". But there is no need to transform them into screen space with the world-view-projection matrix, right? In my case this would be no problem, because I have a compute shader that does this pre-transform and stores the new world coordinates in a structured buffer (unordered access view). The CPU side would begin here:
    a. read back the transformed vertices from the GPU into an "access vertex array" (still in world space)
    b. now I have to translate the cursor position "somehow" into world space (by the inverse view-projection matrix?)
    c. then the brute-force loop finds the vertices whose squared distance is within the given range
    d. change and update the corresponding bone weights of the vertices in the input vertex buffer on the GPU
    I will simply have to try out how fast the methods are; I think both should not be too hard to implement. Yesterday I made a first implementation of my shader version with a full-screen render target, but the FPS dropped from 48 to 39, which is a lot in my case. (I did the test with a big scene of 8 animated high-poly characters and only 1500 mouse-selectable vertex geometry representations: small cones with only 4 vertices each, all in one big vertex buffer, plus a constant buffer with the transform matrices. Not at all optimized.)
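    As an alternative to the CPU read-back in step a, the same brute-force distance test could run in a small compute shader over the already skinned world-space positions, writing one brush weight per vertex. This is only a sketch; skinnedPositions is assumed to be the structured buffer the existing pre-transform compute shader fills, and all other names are placeholders:

        StructuredBuffer<float3>  skinnedPositions : register(t0); // world space, post morph/skin
        RWStructuredBuffer<float> brushWeights     : register(u0); // per-vertex brush influence

        cbuffer BrushCB : register(b0)
        {
            float3 rayOrigin;    // cursor ray, built on the CPU from the inverse view-projection
            float  brushRadius;
            float3 rayDirection; // normalized
            float  falloffPower; // 1 = linear, 2 = quadratic falloff
            uint   vertexCount;
            uint3  pad;
        };

        [numthreads(256, 1, 1)]
        void main(uint3 dtid : SV_DispatchThreadID)
        {
            if (dtid.x >= vertexCount)
                return;

            float3 p = skinnedPositions[dtid.x];

            // distance from the vertex to the cursor ray
            float3 toVertex = p - rayOrigin;
            float  t = max(dot(toVertex, rayDirection), 0.0f);
            float  d = length(toVertex - t * rayDirection);

            // smooth falloff inside the brush radius, zero outside
            float w = saturate(1.0f - d / brushRadius);
            brushWeights[dtid.x] = pow(w, falloffPower);
        }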
  8. HowTo: bone weight painting tool

    Hi JoeJ, again many thanks for your input. I think for debugging and other purposes I will need a vertex and/or triangle visualizer/picker, so I will begin with my "shader version", which can serve both purposes (visualization AND editing). Here is my method (easiest solution?) that I will try in code:
    a. give every vertex (= point) a small 3D geometry representation, e.g. a very small cube
    b. each cube gets a color representing its vertex index, and the whole "cube cloud" is drawn with one draw call
    c. do a simple render to texture (render target)
    On the CPU side:
    d. read only the small 2D area of the render target that surrounds the mouse cursor position
    e. scan that small area for the different colors and weight them according to the distance from the cursor center (the problem here is over- and undersampling, and visibility depending on the zoom level; it's not exact, but I think for a first solution it should work, with all the disadvantages you mentioned).
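    A sketch of a variation of this ID pass: instead of encoding the vertex index as a color, the index is written directly into an R32_UINT render target, with one instance of the small marker geometry per mesh vertex (so the whole cloud is still a single instanced draw call). All names, the marker size and the buffer layout are assumptions:

        cbuffer CameraCB : register(b0)
        {
            float4x4 viewProjection;
        };

        StructuredBuffer<float3> markerPositions : register(t0); // world position of each skinned vertex

        struct VSOut
        {
            float4 position : SV_Position;
            nointerpolation uint vertexId : VERTEXID; // integers must not be interpolated
        };

        // one instance per painted vertex; the tiny cube geometry comes in through the vertex buffer
        VSOut MarkerVS(float3 localPos : POSITION, uint instanceId : SV_InstanceID)
        {
            VSOut o;
            float3 world = markerPositions[instanceId] + localPos * 0.2f; // assumed marker size
            o.position   = mul(float4(world, 1.0f), viewProjection);
            o.vertexId   = instanceId;
            return o;
        }

        // render target format R32_UINT; clear to 0xFFFFFFFF as the "nothing here" value,
        // then read back only the few pixels around the cursor
        uint MarkerPS(VSOut i) : SV_Target
        {
            return i.vertexId;
        }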
  9. HowTo: bone weight painting tool

    Hi JoeJ, many thanks for your comprehensive answer. What about the following method that came to my mind?
    a. we treat the vertex painting like shading the mesh with a spotlight in a deferred renderer (which I already have)
    b. the brush would be a round cone vertex buffer geometry that is "shining" on the mesh
    c. just like Phong shading, the mesh is shaded e.g. with a cosine intensity and a falloff function based on the deviation from the light vector; this way the problem of touching vertices that are behind visible geometry is also solved, and we automatically get the vertices that surround our "center vertex"
    d. every vertex gets a "unique color" attribute that corresponds to its index; either we update an unordered access view or write to a texture render target
    e. all black parts of the render target then correspond to vertices with no change, and the lit parts give the amount of weight to add.
    But here is my question: how is it possible in the pixel shader to compute a color from which we can later find out which 3 vertex colors the pixel was blended from? Or even better, how do we find the unique color of the vertex that the pixel is nearest to? We need this information in an additional render target or buffer to link the light intensity back to a vertex index. On the CPU side we would then update the bone weights from the information we read back from these two render targets. What do you think?
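    For illustration, a small sketch of the spotlight-style brush term described in b and c: a cosine falloff from the cone axis, zero outside the cone half-angle, evaluated per pixel or per vertex. Names and the exact falloff shape are assumptions:

        cbuffer BrushConeCB : register(b0)
        {
            float3 coneApex;        // brush origin (e.g. the cursor ray origin)
            float  cosOuterAngle;   // cosine of the cone half-angle
            float3 coneDirection;   // normalized brush direction
            float  falloffExponent; // shapes the falloff towards the cone border
        };

        // 0 outside the brush cone, up to 1 on its axis
        float BrushIntensity(float3 worldPos)
        {
            float3 toPoint  = normalize(worldPos - coneApex);
            float  cosAngle = dot(toPoint, coneDirection);

            float t = saturate((cosAngle - cosOuterAngle) / (1.0f - cosOuterAngle));
            return pow(t, falloffExponent);
        }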
  10. Hello, in my game engine I want to implement my own bone weight painting tool, so to say a virtual brush painting tool for a mesh. I have already implemented my own dual quaternion skinning animation system with morphs (= blend shapes) and bone-driven corrective morphs (= a morph that depends on a bending or twisting bone). But now I have no idea which is the best method to implement a brush painting system. Just some proposals:
    a. I would build a kind of additional "vertex structure" that can help me find the surrounding (neighboring) vertex indices of a given "central vertex" index
    b. the structure should also give the distances from the neighboring vertices to the given central vertex
    c. calculate the strength of the weight added to the central vertex and its neighbors with a formula using linear or quadratic distance falloff
    d. the central vertex would be detected as the vertex that is hit by an orthogonal projection from my cursor (= brush) in world space onto the mesh; but my problem is that several vertices could be hit simultaneously, e.g. if I want to paint the inward side of the left leg, the right leg will also be hit.
    I think the given problem is quite typical and there are standard approaches that I don't know. Any help or tutorials are welcome. P.S. I am working with SharpDX, DirectX 11.
  11. Hello turanszkij, again, many thanks for your answer. Now things have hopefully become clear to me. I will try it and report back. I have also opened a second thread about a comparable implementation using the geometry shader stream-out method. If you could spend a little time to give an answer there, that would be very kind of you.
  12. Hi, after implementing skinning with a compute shader I want to implement skinning with the vertex shader stream-out method to compare performance. The following thread is a discussion about it. Here's the recommended setup: use a pass-through geometry shader (point -> point), set up the stream output and set the topology to point list. Draw the whole buffer with context->Draw(). This gives a 1:1 mapping of the vertices. Later, bind the stream-out buffer as a vertex buffer, bind the index buffer of the original mesh, and draw with DrawIndexed like you would draw the original mesh (or whatever draw call you had). I know the reason why a point list is used as input: with the normal vertex topology as input, the output would be a stream of "each on its own" primitives that would blow up the vertex buffer. I assume an index buffer would then be needless? But how can you transform position and normal in one step when feeding the pseudo vertex/geometry shader with a point list? In my vertex shader I first calculate the resulting transform matrix from the bone indices (4) and weights (4) and transform position and normal with this same matrix. Do I have to run 2 passes, one for transforming the position and one for transforming the normal? I think it can be done better? (See the sketch below.) Thanks for any help.
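    A sketch of what that single pass could look like, under the assumption that the point flowing through the vertex shader and the pass-through geometry shader carries position and normal together in one struct; the stream-output declaration on the CPU side then maps both fields into the target buffer, so no second pass is needed. All names and the bone count are placeholders:

        cbuffer BoneCB : register(b0)
        {
            float4x4 bonePoses[128]; // assumed maximum bone count
        };

        struct VSIn
        {
            float3 position : POSITION;
            float3 normal   : NORMAL;
            uint4  boneIds  : BLENDINDICES;
            float4 weights  : BLENDWEIGHT;
        };

        struct SkinnedVertex // both fields are listed in the stream-output declaration
        {
            float3 position : POSITION;
            float3 normal   : NORMAL;
        };

        SkinnedVertex SkinVS(VSIn v)
        {
            // blend the four bone matrices once, then transform position and normal with the result
            float4x4 skin =
                bonePoses[v.boneIds.x] * v.weights.x +
                bonePoses[v.boneIds.y] * v.weights.y +
                bonePoses[v.boneIds.z] * v.weights.z +
                bonePoses[v.boneIds.w] * v.weights.w;

            SkinnedVertex o;
            o.position = mul(float4(v.position, 1.0f), skin).xyz;
            o.normal   = normalize(mul(v.normal, (float3x3)skin));
            return o;
        }

        // pass-through geometry shader: one point in, one point out, both fields streamed out
        [maxvertexcount(1)]
        void SkinGS(point SkinnedVertex input[1], inout PointStream<SkinnedVertex> output)
        {
            output.Append(input[0]);
            output.RestartStrip();
        }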
  13. Hello turanszkij, many, many thanks for your kind answer. Yesterday evening I set up a test framework just for testing compute shader skinning. Your explanations were very helpful, especially the hint about the correct flags to set; without it, it was impossible to create a buffer that can be used both as a vertex buffer and as a UAV. My test framework:
    1. create a structured buffer which can be used as a UAV and as a vertex buffer
    2. create a staging buffer with CPU access and fill it with a test data array
    3. copy the values from the staging buffer to the structured buffer
    4. compile and run a compute shader that changes the data of the structured buffer
    5. create another staging buffer and copy the values from the structured buffer back
    6. access the staging buffer and compare whether the values are correct or not
    This way I check whether the compute shader works correctly. Did I understand things right?
    a. when using a buffer, e.g. a structured buffer, in a shader such as a vertex shader, I always have to use a shader resource view
    b. when using a buffer, e.g. a RWStructuredBuffer, in a compute shader, I always have to use a UAV?
    c. a constant buffer is a special, very fast buffer and can NEVER be used in a compute shader?
    The next step would be to feed my compute shader with, first, a "typical" structured buffer containing position, normal, bone index and weight index; the second buffer would be a "typical" structured buffer containing the bone transform matrices; and the third buffer would be the "special" buffer mentioned above, a raw read-write ByteAddressBuffer that can be used as a UAV and as a vertex buffer, containing position, normal and uv. With this buffer (containing the bone-transformed vertices) I feed my "old pipeline" as before, so the old shaders don't even know that this is a bone-transformed object, and no shader permutations are necessary. Is this the right approach or is my concept wrong?
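    A sketch of the planned layout described above: one structured buffer with the bind-pose vertex data, one with the bone matrices, and a raw read-write buffer (created with both vertex buffer and unordered access bind flags) that the existing pipeline can consume unchanged. The struct layout, the 32-byte output stride and all names are assumptions:

        struct SkinVertexIn
        {
            float3 position;
            float3 normal;
            float2 uv;
            uint4  boneIndices;
            float4 boneWeights;
        };

        StructuredBuffer<SkinVertexIn> inputVertices  : register(t0);
        StructuredBuffer<float4x4>     boneMatrices   : register(t1);
        RWByteAddressBuffer            outputVertices : register(u0); // later bound as vertex buffer

        cbuffer SkinCB : register(b0)
        {
            uint  vertexCount;
            uint3 pad;
        };

        [numthreads(256, 1, 1)]
        void main(uint3 dtid : SV_DispatchThreadID)
        {
            if (dtid.x >= vertexCount)
                return;

            SkinVertexIn v = inputVertices[dtid.x];

            float4x4 skin =
                boneMatrices[v.boneIndices.x] * v.boneWeights.x +
                boneMatrices[v.boneIndices.y] * v.boneWeights.y +
                boneMatrices[v.boneIndices.z] * v.boneWeights.z +
                boneMatrices[v.boneIndices.w] * v.boneWeights.w;

            float3 pos = mul(float4(v.position, 1.0f), skin).xyz;
            float3 nor = normalize(mul(v.normal, (float3x3)skin));

            // assumed output layout: position (12) + normal (12) + uv (8) = 32 bytes per vertex,
            // matching the input layout the old pipeline already expects
            const uint stride = 32;
            uint addr = dtid.x * stride;
            outputVertices.Store3(addr,      asuint(pos));
            outputVertices.Store3(addr + 12, asuint(nor));
            outputVertices.Store2(addr + 24, asuint(v.uv));
        }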
  14. Hi, until now I use the typical vertex shader approach for skinning, with a constant buffer containing the bone transform matrices and a vertex buffer containing bone indices and bone weights. Now I have implemented real-time environment probe cube mapping, so I have to render my scene from many points of view, and the time for skinning becomes too long because it is recalculated for every face of the cube map. For info: I am working on Win7 and therefore use shader model 5.0, not 5.x which has more options; or is there a way to use 5.x on Win7? My graphics card is a DirectX 12 compatible NVidia GTX 960. The member turanszkij has posted a compute shader that is understandable for me (for info: in his engine he uses an optimized version of it): https://turanszkij.wordpress.com/2017/09/09/skinning-in-compute-shader/ Now my questions: is it possible to feed the compute shader with my original vertex buffer, or do I have to copy it into several ByteAddressBuffers as in the following code? The same question applies to the constant buffer of matrices. My more urgent question is how I feed my normal pipeline with the result of the compute shader, which is 2 RWByteAddressBuffers containing position and normal. For example I could use 2 vertex buffer bindings: 1. containing only the uv coordinates, 2. containing position and normal. How do I copy from the RWByteAddressBuffers to the vertex buffer? Here is his shader implementation for skinning a mesh in a compute shader (code from turanszkij):

        struct Bone
        {
            float4x4 pose;
        };
        StructuredBuffer<Bone> boneBuffer;

        ByteAddressBuffer vertexBuffer_POS; // T-Pose pos
        ByteAddressBuffer vertexBuffer_NOR; // T-Pose normal
        ByteAddressBuffer vertexBuffer_WEI; // bone weights
        ByteAddressBuffer vertexBuffer_BON; // bone indices

        RWByteAddressBuffer streamoutBuffer_POS; // skinned pos
        RWByteAddressBuffer streamoutBuffer_NOR; // skinned normal
        RWByteAddressBuffer streamoutBuffer_PRE; // previous frame skinned pos

        inline void Skinning(inout float4 pos, inout float4 nor, in float4 inBon, in float4 inWei)
        {
            float4 p = 0, pp = 0;
            float3 n = 0;
            float4x4 m;
            float3x3 m3;
            float weisum = 0;

            // force loop to reduce register pressure
            // though this way we can not interleave TEX - ALU operations
            [loop]
            for (uint i = 0; ((i < 4) && (weisum < 1.0f)); ++i)
            {
                m = boneBuffer[(uint)inBon[i]].pose;
                m3 = (float3x3)m;

                p += mul(float4(pos.xyz, 1), m) * inWei[i];
                n += mul(nor.xyz, m3) * inWei[i];

                weisum += inWei[i];
            }

            bool w = any(inWei);
            pos.xyz = w ? p.xyz : pos.xyz;
            nor.xyz = w ? n : nor.xyz;
        }

        [numthreads(1024, 1, 1)]
        void main(uint3 DTid : SV_DispatchThreadID)
        {
            const uint fetchAddress = DTid.x * 16; // stride is 16 bytes for each vertex buffer now...

            uint4 pos_u = vertexBuffer_POS.Load4(fetchAddress);
            uint4 nor_u = vertexBuffer_NOR.Load4(fetchAddress);
            uint4 wei_u = vertexBuffer_WEI.Load4(fetchAddress);
            uint4 bon_u = vertexBuffer_BON.Load4(fetchAddress);

            float4 pos = asfloat(pos_u);
            float4 nor = asfloat(nor_u);
            float4 wei = asfloat(wei_u);
            float4 bon = asfloat(bon_u);

            Skinning(pos, nor, bon, wei);

            pos_u = asuint(pos);
            nor_u = asuint(nor);

            // copy prev frame current pos to current frame prev pos
            streamoutBuffer_PRE.Store4(fetchAddress, streamoutBuffer_POS.Load4(fetchAddress));

            // write out skinned props:
            streamoutBuffer_POS.Store4(fetchAddress, pos_u);
            streamoutBuffer_NOR.Store4(fetchAddress, nor_u);
        }
  15. Hello, until now I use structured buffers in my vertex shader to calculate the morph offsets of my animated characters, and it works fine. But so far I only read from this kind of buffer (I use 4 of them). Now I have other things in mind, where I have to use a read-write buffer that I can also write to. But I can't get into my head how to synchronize the write accesses. When I read a value from the buffer at an address that corresponds to e.g. a pixel coordinate and want to add a value to it, another thread could have read the same value in the meantime and will override the value that I have written. How is this typically done? (See the sketch below.)
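    The usual answer to this read-modify-write problem is the family of HLSL Interlocked* intrinsics, which perform the add atomically instead of as a separate read and write. They only operate on 32-bit integers, so a float value is scaled to fixed point in this sketch; the buffer and constant names are placeholders:

        RWByteAddressBuffer accumulation : register(u0); // one uint per pixel, cleared to 0 each frame

        cbuffer TargetCB : register(b0)
        {
            uint  targetWidth;
            uint3 pad;
        };

        static const float FIXED_POINT_SCALE = 1024.0f;

        // can be called from a pixel or compute shader; concurrent calls to the same pixel are safe
        void AccumulateAt(uint2 pixel, float value)
        {
            uint address    = (pixel.y * targetWidth + pixel.x) * 4; // 4 bytes per uint
            uint fixedValue = (uint)(value * FIXED_POINT_SCALE);

            uint previous; // original value, required by this overload of the intrinsic
            accumulation.InterlockedAdd(address, fixedValue, previous);
        }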