

Community Reputation

689 Good

About B_old

  • Rank
    Advanced Member


  1. I'm currently looking at the Hosek sky model and experimenting with the provided source code. At the moment the results of arhosekskymodel_radiance() vs. arhosekskymodel_solar_radiance() are confusing to me.

     Using the spectral arhosekskymodel_radiance() and converting it to RGB gives me: [attachment=31582:hosek_expected.jpg]. Notice that I manually added a sun, but it only affects a very small area of the image.

     Using arhosekskymodel_solar_radiance() and converting to RGB gives me: [attachment=31583:hosek_actual.jpg]. In this case the sun is provided directly by the Hosek model. The positions match, but the color of the sky is completely different. I was expecting the exact same result, except for the small area where the sun is visible. The bright point of the Hosek sun is at the same position as my manually rendered solar disk, but I also get vastly different results for rays that don't hit the sun at all.

     From looking at the interface I was expecting that I can call arhosekskymodel_solar_radiance() with the exact same parameters as arhosekskymodel_radiance() and it will just work. Is this assumption correct? Does anyone have experience with this model? I don't think it is a matter of spectral-to-RGB conversion, as I use the spectral version of the Hosek functions in both cases, so the conversion is identical.

     Could it be that arhosekskymodel_solar_radiance() should only be used when rendering the solar disk? But that would mean that you can render an arbitrarily large sun that doesn't match the assumptions the Hosek model makes.

     EDIT: After reading the comments in the code I'm now pretty sure that arhosekskymodel_solar_radiance() should only be used when rendering the solar disk!
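Following the conclusion in the EDIT, the per-ray choice between the two functions might look like the sketch below. It assumes that both Hosek functions take (theta, gamma, wavelength) after the state argument; `RadianceFunc`, `skyOrSun`, the stub functions and the solar radius value are hypothetical names and placeholders, not part of the Hosek source.

```go
package main

import (
	"fmt"
	"math"
)

// Vec3 is a unit direction vector (hypothetical helper type).
type Vec3 struct{ X, Y, Z float64 }

func dot(a, b Vec3) float64 { return a.X*b.X + a.Y*b.Y + a.Z*b.Z }

// clamp keeps a cosine inside [-1, 1] so math.Acos never returns NaN.
func clamp(c float64) float64 { return math.Max(-1, math.Min(1, c)) }

// RadianceFunc stands in for a wrapper around one of the Hosek C functions.
// theta is the angle from the zenith, gamma the angle to the sun (radians).
type RadianceFunc func(theta, gamma, wavelength float64) float64

// skyOrSun evaluates solar only for rays inside the solar disk and sky for
// every other ray. Directions are assumed normalized, with +Z pointing up.
func skyOrSun(view, sun Vec3, solarRadius, wavelength float64, sky, solar RadianceFunc) float64 {
	up := Vec3{0, 0, 1}
	theta := math.Acos(clamp(dot(view, up)))
	gamma := math.Acos(clamp(dot(view, sun)))
	if gamma < solarRadius {
		return solar(theta, gamma, wavelength)
	}
	return sky(theta, gamma, wavelength)
}

// Stub radiance functions so the sketch runs without the C library.
func stubSky(theta, gamma, wavelength float64) float64   { return 1 }
func stubSolar(theta, gamma, wavelength float64) float64 { return 100 }

func main() {
	sun := Vec3{0, 0, 1}
	solarRadius := 0.25 * math.Pi / 180 // placeholder angular radius
	fmt.Println(skyOrSun(Vec3{0, 0, 1}, sun, solarRadius, 550, stubSky, stubSolar))
	fmt.Println(skyOrSun(Vec3{1, 0, 0}, sun, solarRadius, 550, stubSky, stubSolar))
}
```

With this split the size of the rendered disk is controlled by a single radius parameter, so it is easy to keep it consistent with whatever radius the model itself assumes.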
  2. Hammersley sequences or the like are a higher-level construct than radical inverse functions (in fact the "end product" of the whole algorithm), not a low-level primitive.

     I read the paper and don't understand how they combine the routines to get their RDS. The scrambled part still makes sense to me.

         func ScrambledHammersley(i, numSamples, r uint32) Vector2 {
             return MakeVector2(float32(i) / float32(numSamples), ScrambledRadicalInverse_vdC(i, r))
         }

     But apparently they are combining all three radical inverse functions to get some result.
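For reference, a minimal sketch of the two radical inverse helpers that the snippets in this thread call but do not define. It assumes that the scrambling is an XOR of the bit-reversed index with a random 32-bit value `r`, which is how the paper's van der Corput listing appears to work. The `Vector2` type here is a stand-in for the poster's own, and float64 is used so that the result stays strictly below 1. How the paper combines the other two radical inverse functions into RDS is not shown here.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Vector2 is a stand-in for the poster's vector type.
type Vector2 struct{ X, Y float64 }

// radicalInverse_vdC mirrors the bits of i around the binary point,
// giving the base-2 van der Corput value in [0, 1).
func radicalInverse_vdC(i uint32) float64 {
	return float64(bits.Reverse32(i)) / 4294967296.0 // divide by 2^32
}

// ScrambledRadicalInverse_vdC XORs the reversed bits with r before scaling.
// Assumption: r is one random value shared by all samples of a set.
func ScrambledRadicalInverse_vdC(i, r uint32) float64 {
	return float64(bits.Reverse32(i)^r) / 4294967296.0
}

// ScrambledHammersley pairs i/numSamples with the scrambled radical inverse.
func ScrambledHammersley(i, numSamples, r uint32) Vector2 {
	return Vector2{float64(i) / float64(numSamples), ScrambledRadicalInverse_vdC(i, r)}
}

func main() {
	const numSamples = 4
	for i := uint32(0); i < numSamples; i++ {
		fmt.Println(ScrambledHammersley(i, numSamples, 0), ScrambledHammersley(i, numSamples, 0x9e3779b9))
	}
}
```

With `r = 0` the scrambled variant reduces to the plain Hammersley set, which makes it easy to check one against the other.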
  3. In the paper Efficient Multidimensional Sampling the authors describe a technique they call Random Digit Scrambling (RDS). Does anyone have more implementation details on this algorithm? The paper gives source code for three different radical inverse functions that can be used to implement RDS, but I don't understand how they should be applied. I especially don't understand how they jump from Hammersley samples to RDS.

         func Hammersley(i, numSamples uint32) Vector2 {
             return MakeVector2(float32(i) / float32(numSamples), radicalInverse_vdC(i))
         }

     Can the above snippet easily be changed to something that generates RDS?
  4. If simulation and rendering run at different frequencies, it can be useful to interpolate between two simulation steps during rendering for smooth animations. For moving meshes I simply interpolate the transformation on the CPU before sending it to the GPU.

     For particles, which I simulate entirely on the CPU, I'm less sure about a good strategy. Currently I keep the particle array from the previous simulation frame around and send both arrays to the GPU, where I do the interpolation. I figured doing this on the GPU is faster even though I'm now sending twice the data. Does this make sense, or would you do the interpolation on the CPU as well?

     I have two arrays of particle structs, one for the previous and one for the current frame. Before each simulation frame I just copy the array. I send them to the GPU as two separate buffers. Would it be smarter to store them as one interleaved array?

     Particle rendering is currently not a bottleneck for the scenes I have (at least not at this number of particles), but I would like to set it up somewhat sanely. How would you handle this?
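To make the interleaved option concrete, here is a minimal CPU-side sketch. The `Particle` layout, `ParticlePair`, `interleave` and `interpolate` are hypothetical names chosen for illustration. The same linear blend would run per vertex on the GPU if the pairs were uploaded as a single buffer.

```go
package main

import "fmt"

type Vec3 struct{ X, Y, Z float32 }

// Particle is a hypothetical minimal particle layout.
type Particle struct {
	Position Vec3
	Size     float32
}

// ParticlePair stores both simulation states of one particle side by side,
// which is the interleaved layout asked about in the post.
type ParticlePair struct{ Previous, Current Particle }

func lerp(a, b, t float32) float32 { return a + (b-a)*t }

// interleave zips the previous and current arrays into one array of pairs.
func interleave(previous, current []Particle) []ParticlePair {
	n := len(current)
	if len(previous) < n {
		n = len(previous)
	}
	pairs := make([]ParticlePair, n)
	for i := range pairs {
		pairs[i] = ParticlePair{previous[i], current[i]}
	}
	return pairs
}

// interpolate blends a pair; alpha is the fraction of the simulation step
// that has elapsed at render time (0 = previous state, 1 = current state).
func interpolate(p ParticlePair, alpha float32) Particle {
	return Particle{
		Position: Vec3{
			lerp(p.Previous.Position.X, p.Current.Position.X, alpha),
			lerp(p.Previous.Position.Y, p.Current.Position.Y, alpha),
			lerp(p.Previous.Position.Z, p.Current.Position.Z, alpha),
		},
		Size: lerp(p.Previous.Size, p.Current.Size, alpha),
	}
}

func main() {
	previous := []Particle{{Vec3{0, 0, 0}, 1}}
	current := []Particle{{Vec3{2, 4, 6}, 3}}
	for _, pair := range interleave(previous, current) {
		fmt.Println(interpolate(pair, 0.5))
	}
}
```

Whether one interleaved buffer beats two separate buffers depends on the driver and access pattern, so it is worth measuring before committing to either layout.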
  5. Yes, it returned GL_FRAMEBUFFER_COMPLETE.   Maybe the easiest thing is to render to UNORM and map [-1, 1] to [0, 1] in the shader. Not really a big deal, but I wanted to find out what the problem is.
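The UNORM workaround mentioned here amounts to a scale and bias on write and the inverse on read. A small sketch, with hypothetical function names, that also shows one side effect of the remapping for 16-bit channels: zero no longer lands exactly on a representable value, whereas a true SNORM channel stores zero exactly.

```go
package main

import (
	"fmt"
	"math"
)

// encodeUnorm maps a value from [-1, 1] to [0, 1] before it is written.
func encodeUnorm(v float64) float64 { return v*0.5 + 0.5 }

// decodeUnorm maps a value read from the target back to [-1, 1].
func decodeUnorm(u float64) float64 { return u*2 - 1 }

// quantize16 models storing a [0, 1] value in a 16-bit UNORM channel.
func quantize16(u float64) uint16 { return uint16(math.Round(u * 65535)) }

// dequantize16 models reading a 16-bit UNORM channel back as [0, 1].
func dequantize16(q uint16) float64 { return float64(q) / 65535 }

func main() {
	for _, v := range []float64{-1, 0, 0.25, 1} {
		stored := quantize16(encodeUnorm(v))
		fmt.Println(v, stored, decodeUnorm(dequantize16(stored)))
	}
}
```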
  6. Yes, I'm checking for completeness. The behavior I get is completely identical to just using a UNORM target, so it seems to be silently converting to that. I also don't get any debug output from OpenGL.   This is with a GeForce GTX 570, driver 331.38 on Ubuntu.   Do you have experience with rendering to this format?
  7. I don't know. It says: "We are silent about requiring R, RG, RGB and RGBA rendering. This is an implementation choice."

     As the hardware seems to be perfectly capable of rendering to SNORM, I expected it to be implemented by all drivers. Has anyone here successfully rendered to SNORM with OpenGL?

     EDIT: This OpenGL 4.4 core spec document also does not mark the SNORM formats as something that must be supported for a color target. Maybe rendering to SNORM is really not supported. Can anybody confirm this?
  8. Hi, is it possible to render to a SNORM texture in OpenGL 4.4? Apparently SNORM formats are not required for color targets in 4.2.

     I want to render to an RG16_SNORM target to store normals in octahedron format. The linked paper contains code that expects and outputs data in the [-1, 1] range, and I was just assuming that it would automatically work with SNORM textures.

     The output seems to get clamped to [0, 1] though. I checked with a floating point render target and got the expected results, so I don't think it is an issue with the code.

     Should this work? Am I maybe doing something wrong when creating the texture?

     EDIT: D3D11 hardware supports SNORM render targets, so I guess I'm doing something wrong.
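For context, a sketch of the commonly used octahedral mapping, written from memory of the usual formulation rather than copied from the linked paper. The function names are hypothetical. Both encoded components lie in [-1, 1], and the whole lower hemisphere produces values that a render target clamping to [0, 1] cannot store correctly.

```go
package main

import (
	"fmt"
	"math"
)

type Vec2 struct{ X, Y float64 }
type Vec3 struct{ X, Y, Z float64 }

// signNotZero returns +1 for v >= 0 and -1 otherwise (never 0).
func signNotZero(v float64) float64 {
	if v >= 0 {
		return 1
	}
	return -1
}

// octEncode projects a unit vector onto the octahedron |x|+|y|+|z| = 1 and
// folds the lower hemisphere over the diagonals of the [-1, 1]^2 square.
func octEncode(n Vec3) Vec2 {
	s := 1 / (math.Abs(n.X) + math.Abs(n.Y) + math.Abs(n.Z))
	p := Vec2{n.X * s, n.Y * s}
	if n.Z <= 0 {
		return Vec2{
			(1 - math.Abs(p.Y)) * signNotZero(p.X),
			(1 - math.Abs(p.X)) * signNotZero(p.Y),
		}
	}
	return p
}

// octDecode inverts octEncode and renormalizes the result.
func octDecode(e Vec2) Vec3 {
	v := Vec3{e.X, e.Y, 1 - math.Abs(e.X) - math.Abs(e.Y)}
	if v.Z < 0 {
		x := (1 - math.Abs(v.Y)) * signNotZero(v.X)
		y := (1 - math.Abs(v.X)) * signNotZero(v.Y)
		v.X, v.Y = x, y
	}
	l := math.Sqrt(v.X*v.X + v.Y*v.Y + v.Z*v.Z)
	return Vec3{v.X / l, v.Y / l, v.Z / l}
}

func main() {
	for _, n := range []Vec3{{0, 0, 1}, {0, 0, -1}, {-1, 0, 0}} {
		e := octEncode(n)
		fmt.Println(n, e, octDecode(e))
	}
}
```

A quick way to spot clamping in practice is to encode a normal with a negative x or y component and check whether it comes back with that component zeroed.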
  9. Oh, namespace was indeed a problem. Sorry. The default namespace was set to "scene", but writing scene::Scene in the script did not change the error message. As a test I changed the namespace to "", after which I get "There is no copy operator for the type 'Scene' available."
  10. I have a C++ type that I don't want to instantiate from scripts and only use as handles, preferably by passing it to the constructor for script classes that should be able to interact with it.

      I followed the documentation and used asOBJ_NOCOUNT and did not provide a factory behavior. Actually, this is all I did (never mind the wrapper):

          engine.register_object_type("Scene", 0, asOBJ_REF | asOBJ_NOCOUNT);
          engine.register_object_method("Scene", "const string &name() const", asMETHODPR(scene::Scene, name, () const, const std::string&));

      When I try to load a script that looks like this:

          class Scene_test {
              Scene_test(Scene@ scene) {
                  scene_ = scene;
              }
              void on_scene_loaded() {
                  print("Hello World");
              }
              Scene@ scene_;
          };

      I get an error stating "Identifier 'Scene' is not a data type".

      Any idea what I'm missing?
  11. B_old

    Uniform buffer updates

    OK, I think I understand. If we are thinking about the same thing, calls to glBindBufferRange actually have to have an offset that is a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, which on my machine (and I think on many other cards) is 256.

    Just now I tried to evaluate whether the uniform updates are a bottleneck in my case. For this test I stripped down the rendering pipeline as much as I could with regard to OpenGL interaction. I simulated the performance of the "optimized" uniform updates by replacing glBufferSubData(float4x4) with glBindBufferRange(). I compared the two approaches with 1K and 4K draw calls for very simple geometry (same vb for every draw call) and could not see any noticeable difference.

    I concluded that the optimized version could not possibly be faster than just calling glBindBufferRange() for every differently transformed object, which in turn means this is not my bottleneck.

    So has the driver situation improved, or is my test/conclusion flawed?
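The alignment rule translates into simple arithmetic for the per-object stride. A minimal sketch, assuming a 64-byte float4x4 and the alignment of 256 reported above; `alignUp` is a hypothetical helper name, and the real alignment should be queried from the driver at run time.

```go
package main

import "fmt"

// alignUp rounds size up to the next multiple of alignment (alignment > 0).
func alignUp(size, alignment int) int {
	return (size + alignment - 1) / alignment * alignment
}

func main() {
	const matrixSize = 16 * 4 // one float4x4 in bytes
	const alignment = 256     // value reported in the post, not a constant of the API

	stride := alignUp(matrixSize, alignment)
	fmt.Println("stride per object:", stride)
	fmt.Println("padding per object:", stride-matrixSize)
	for i := 0; i < 3; i++ {
		fmt.Println("object", i, "offset", i*stride)
	}
}
```

With these numbers three quarters of every slot is padding, so putting more per-object data into each range costs nothing extra in buffer size.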
  12. B_old

    Uniform buffer updates

    Thanks, that answers my question! Have you tried this architecture, which works well for OpenGL, with a D3D backend? Will it also work well there? At least that would make it less annoying that this is apparently a driver weakness.

    I'm indeed already iterating over each object twice, but I'm wondering whether I now have to do it a third time, because the first iteration is followed by a sort, which could tell me when I don't need to update the per-material buffers, for instance.

    I'm also wondering about another thing. Is it very important that the uniform buffer is large enough to fit every single object drawn per frame, or can you achieve good performance with a buffer that only holds the data for some of the objects before it has to be refilled? To me it sounds like that should already help, but then again the handling of uniform buffers shouldn't be so hard in the first place.
  13. B_old

    Uniform buffer updates

    What I currently do:

        bindBufferBase(smallUniformBuffer);
        for (o : objects) {
            bufferSubData(smallUniformBuffer, o.transformation);
            draw(o.vertices);
        }

    What I think I should be doing:

        offset = 0;
        for (o : objects) {
            memory[offset] = o.transformation;
            ++offset;
        }
        bufferData(hugeBuffer, memory);
        offset = 0;
        for (o : objects) {
            bindBufferRange(hugeBuffer, offset);
            draw(o.vertices);
            ++offset;
        }

    At first I was a bit frustrated because I am used to the Effects framework of the D3D SDK, but after reading the presentation about batched buffer updates it seems a D3D application can also benefit from doing it this way. So the architecture can be the same for both APIs.

    @richardurich: "Were updates far enough apart to avoid conflicts (can be perf hit if call A writes to first half of block A, and call B writes to second half of block A)?" Can you explain this a bit more? Are you saying that it is not good to write to the first half of the buffer and then to the second half, although the ranges don't intersect?
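The "fill once, bind ranges" idea from the pseudocode can be sketched on the CPU side as follows. `Mat4`, `Range`, `packTransforms` and `readFloat32` are hypothetical names, the stride is assumed to already be a multiple of the driver's uniform buffer offset alignment, and little-endian byte order is an assumption about the target platform. The actual upload and bind calls are left out.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Mat4 is a 4x4 float matrix stored as 16 consecutive floats.
type Mat4 [16]float32

// Range is the (offset, size) pair that would be passed to a
// bindBufferRange-style call for one object.
type Range struct{ Offset, Size int }

// packTransforms writes every transform into one byte slice, one slot of
// stride bytes per object, and returns the range of each object.
func packTransforms(transforms []Mat4, stride int) ([]byte, []Range) {
	const matrixSize = 16 * 4
	memory := make([]byte, stride*len(transforms))
	ranges := make([]Range, len(transforms))
	for i, m := range transforms {
		offset := i * stride
		for j, f := range m {
			binary.LittleEndian.PutUint32(memory[offset+4*j:], math.Float32bits(f))
		}
		ranges[i] = Range{offset, matrixSize}
	}
	return memory, ranges
}

// readFloat32 reads one float back from the packed memory (for checking).
func readFloat32(memory []byte, offset int) float32 {
	return math.Float32frombits(binary.LittleEndian.Uint32(memory[offset:]))
}

func main() {
	transforms := []Mat4{{0: 1}, {0: 2}, {0: 3}}
	memory, ranges := packTransforms(transforms, 256)
	fmt.Println("bytes to upload once per frame:", len(memory))
	for i, r := range ranges {
		fmt.Println("object", i, "range", r, "first float", readFloat32(memory, r.Offset))
	}
}
```

The first loop of the pseudocode corresponds to `packTransforms`, and the second loop would walk `ranges` and issue one bind plus one draw per entry.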
  14. I use uniform buffers to set constants in the shaders. Currently each uniform block is backed by a uniform buffer of the appropriate size, glBindBufferBase is called once per frame, and glNamedBufferSubDataEXT is called for every object without orphaning.

      I tried to optimize this by using a larger uniform buffer, calling glBindBufferRange and updating subsequent regions in the buffer, and this turned out to be significantly slower. After looking around I found this and similar threads that talk about the same problem. The suggestion seems to be to use one large uniform buffer for all objects, update it only once with the data for all objects, and call glBindBufferRange for every draw call.

      Is this definitely the way to go in OpenGL, regardless of whether BufferSubData or MapBufferRange is used? In one place it was suggested that for small amounts of data glUniformfv is the fastest choice. It would be nice to achieve comparable levels of performance with uniform buffers.

      What is your experience with updating shader uniforms in OpenGL?