
Arjan B

Community Reputation

1137 Excellent


  1. Thanks for the help, everybody! For those interested: splitting up the steps shows that the time really is spent in the push_back()/emplace_back() calls. Using a light breeze's suggestion, FPS went up from 1 to 95.
  2. Thanks for the suggestions! Sadly, setting the iterator debug level to 0 made no difference. It seems that range checking on the iterators is not the bottleneck.
  3. I started working on a raytracer project (again) and ran into a problem when compiling a Debug configuration. All I have at the moment is a set of pixels in float RGB format that I convert to unsigned char RGBA format that SFML wants. This happens once per frame, running at 200+ FPS in Release mode, but 1 FPS in debug mode. Please have a look at the attached profiling result. It seems to spend almost all of its time in std::vector::push_back(). Is there any way to speed up this process? Could I create all elements in a batch and then start filling in values? Is there some handy use of std::transform that I could apply? Thank you in advance!

        std::vector<sf::Uint8> SfmlFilm::ToRgba(const std::vector<Spectrum>& image)
        {
            std::vector<sf::Uint8> rgba;
            rgba.reserve(GetWidth() * GetHeight() * 4);
            for (auto spectrum : image)
            {
                const auto max = 255;
                rgba.push_back(static_cast<sf::Uint8>(spectrum.r * max));
                rgba.push_back(static_cast<sf::Uint8>(spectrum.g * max));
                rgba.push_back(static_cast<sf::Uint8>(spectrum.b * max));
                rgba.push_back(max);
            }
            return rgba;
        }
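One way to "create all elements in a batch and then fill in values", as the question puts it, is to size the vector once and write through a raw pointer. This is a minimal sketch and not necessarily the change suggested in the thread. `Spectrum` here is a hypothetical stand-in for the project's own type, and `std::uint8_t` stands in for `sf::Uint8` so the snippet compiles without SFML.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the project's float RGB pixel type.
struct Spectrum { float r, g, b; };

// Converts float RGB pixels to 8-bit RGBA. The output buffer is allocated
// once, then filled through a raw pointer, so no push_back() call (and none
// of its Debug-mode bookkeeping) runs per byte.
std::vector<std::uint8_t> ToRgba(const std::vector<Spectrum>& image) {
    const float max = 255.0f;
    std::vector<std::uint8_t> rgba(image.size() * 4);  // allocate and zero-fill once
    std::uint8_t* out = rgba.data();
    for (const Spectrum& s : image) {  // by reference: no per-pixel copy
        // Like the original, this assumes channel values lie in [0, 1].
        *out++ = static_cast<std::uint8_t>(s.r * max);
        *out++ = static_cast<std::uint8_t>(s.g * max);
        *out++ = static_cast<std::uint8_t>(s.b * max);
        *out++ = 255;  // opaque alpha
    }
    return rgba;
}
```

How much this helps depends on the compiler and its Debug settings, so it is worth profiling again after the change.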
  4. Arjan B

    'Remove' direction from velocity

    Just to add to Alvaro's comment: the projection of A onto B gives you the component of A that lies in the direction of B. This is why he/she subtracts that projection from A: what remains has no component along B. The Wikipedia page on vector projection calls this the rejection of A from B: https://en.wikipedia.org/wiki/Vector_projection
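The projection and rejection described above can be sketched in a few lines. `Vec3`, `Dot`, `Project` and `Reject` are illustrative names, not taken from the thread, and B is assumed to be non-zero.

```cpp
struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Projection of a onto b: the part of a that points along b.
// Assumes b is not the zero vector.
Vec3 Project(const Vec3& a, const Vec3& b) {
    const float s = Dot(a, b) / Dot(b, b);
    return {b.x * s, b.y * s, b.z * s};
}

// Rejection of a from b: what is left of a once its component along b
// has been subtracted. The result is perpendicular to b.
Vec3 Reject(const Vec3& a, const Vec3& b) {
    const Vec3 p = Project(a, b);
    return {a.x - p.x, a.y - p.y, a.z - p.z};
}
```

For a velocity of (3, 4, 0) and a direction of (1, 0, 0), `Reject` returns (0, 4, 0): the motion along x is gone and the rest is untouched.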
  5. Wow, thanks a lot guys! I ended up doing what Juliean said, which brought me right to the glUniform1f() call. And exactly as Nanoha stated, it generated a GL_INVALID_OPERATION error, which was fixed by simply replacing the 'f' with an 'i' (glUniform1i(), since sampler uniforms have to be set as integers). I wish I'd posted here sooner; I spent tons of hours sadly staring at my screen as well. Thanks again!
  6. Hi! I first create a lookup table for a transfer function and then try to upload it as a 1D texture as follows:

        for (unsigned i = 0; i < 1024; i++)
            tfm0[i] = qfeGetProbability(tf0, (float)i / 1023.f);
        glActiveTexture(GL_TEXTURE17);
        if (glIsTexture(tfmTex0))
            glDeleteTextures(1, &tfmTex0);
        glGenTextures(1, &tfmTex0);
        glBindTexture(GL_TEXTURE_1D, tfmTex0);
        glTexImage1D(GL_TEXTURE_1D, 0, GL_R16F, 1024, 0, GL_RED, GL_FLOAT, tfm0);

    Right before the rendering call I make sure all the textures I need are bound to the right texture units:

        glActiveTexture(GL_TEXTURE17);
        glBindTexture(GL_TEXTURE_1D, tfmTex0);

    Then, I set my uniform variable for the 1D texture:

        glUniform1f(glGetUniformLocation(shaderProgram, "tf0"), 17);

    And this is how the 1D texture is defined in the fragment shader, where sampleNorm is a value between 0 and 1:

        uniform sampler1D tf0;
        vec4 tfValue = texture1D(tf0, sampleNorm);

    Somehow, all of the tfValues end up being (0, 0, 0, 1), which I suspect is a default fallback value.

    To be sure that I uploaded the values to the graphics card correctly, I also have this check right before the draw call:

        float values[1024];
        glActiveTexture(GL_TEXTURE17);
        glGetTexImage(GL_TEXTURE_1D, 0, GL_RED, GL_FLOAT, values);

    It reads the values in the texture I uploaded back into "normal" memory, and they turn out to be exactly the values I expect.

    Does anyone have an idea of where things might be going wrong? What would cause the sampler in the fragment shader to return (0, 0, 0, 1), when it should be returning my values in the R-channel?

    Thank you in advance, Arjan
  7. Arjan B

    Is ray tracing hard or is it just me?

    I think it's appropriate to link to Bacterius' journal here: http://www.gamedev.net/blog/2031-ray-tracing-devlog/. He does a good job of thoroughly explaining the process of writing a raytracer.
  8. Arjan B

    The Rendering Equation

    Loving this blog. Keep up the good work!
  9. Depth of field

    My friend has added depth of field to the path tracer, which shows some nice results.

    [Images: without DoF; with DoF]

    This effect was achieved by picking a focal point on the focal plane for every pixel, and then jittering our camera rays so that they all go through this focal point.

    Finished report

    After some significant revisions to the two reports for the two courses for which we did this project, we are finally finished. I feel like I've learned an awful lot more about rendering, mostly from looking at it from a different angle than the approach I'm used to (rasterization). Working on this project has been a joy for me and I'm happy with the results.

    Having finished the report does not mean that we're finished with this project. We do intend to find some time to add more features. But, in reality, time might be scarce. My interests have shifted to learning how to implement these kinds of effects (AA, DoF, GI) in a rasterization setting.

    I hope people enjoyed having a look at this series of blog posts. Maybe there'll be more. Thanks for reading!
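The depth-of-field step described above (a focal point per pixel, with jittered rays aimed at it) might look roughly like the sketch below. All names are hypothetical, and sampling the lens offset is left to the caller; the post does not say how the jitter was generated.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Ray { Vec3 origin; Vec3 dir; };

Vec3 Add(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 Sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 Scale(const Vec3& a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 Normalize(const Vec3& a) { return Scale(a, 1.0f / std::sqrt(Dot(a, a))); }

// Builds one depth-of-field ray for a pixel.
//   primary       - the ordinary (pinhole) camera ray for the pixel
//   forward/right/up - unit camera basis vectors
//   focalDistance - distance from the camera to the focal plane along forward
//   lensU, lensV  - jitter offset on the lens, chosen by the caller
//                   (for example a random point in a small disk)
Ray DepthOfFieldRay(const Ray& primary, const Vec3& forward, const Vec3& right,
                    const Vec3& up, float focalDistance, float lensU, float lensV) {
    // Focal point: where the primary ray crosses the focal plane.
    const float t = focalDistance / Dot(primary.dir, forward);
    const Vec3 focalPoint = Add(primary.origin, Scale(primary.dir, t));

    // Jitter the ray origin, then aim the new ray at the same focal point.
    const Vec3 origin = Add(primary.origin, Add(Scale(right, lensU), Scale(up, lensV)));
    return {origin, Normalize(Sub(focalPoint, origin))};
}
```

Averaging many such rays per pixel keeps objects on the focal plane sharp, because every ray hits the same point there, while objects nearer or farther away are blurred.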
  10. Arjan B

    Off-line rendering and block editor

    Sounds like a good plan! But those courses are now done, which means I have a lot less time for this project. Since the performance is at an acceptable level, I will probably start by adding new features such as textures and meshes. After that I will probably need to work on speeding things up, and I will give your suggestion a try then. :)
  11. A lot of progress in this new entry! We have implemented the octree, which makes our renderer scale well with a large number of objects in our scenes. We have finished our block editor, which allows us to create, save and load worlds to render. And finally, we added some form of offline rendering.

    Off-line renderer - Be sure to watch in 720p!

    [video]

    I've implemented a system in which you give a set of (point, direction) pairs that the camera will pass through. You also specify a list of times, which indicate how long the camera takes to go from one pair to the next. If you now specify the number of frames per second, the system will interpolate the camera positions and directions for all frames.

    Using this system, I set the camera position and direction, let it run a given number of samples per pixel, save the resulting image as a .jpeg file, and then move on to the next position and direction. In the end, I use another program (MonkeyJam) to paste all these images together into a movie file. The YouTube movie you see above is rendered at 1280x720, with 200 samples per pixel, at 30 frames per second.

    Block editor

    [video]

    We can move around, much like a ghost cam in some FPS games. You use WASD to move around, Spacebar and CTRL to ascend and descend, and the mouse to look around. Left mouse-click adds an object, while right mouse-click removes one. Using other keys on the keyboard, you can choose which color the next object will have, its material type, its albedo (or brightness for a light), and its shape. You can also increase and decrease the brightness of the "skylight".

    Octree

    My friend did all the work on the octree. Since CUDA doesn't support recursion very well, we had to implement all operations on this tree without a stack.

    Speed

    We are happy with the results, though we did expect more speed gain from the switch to CUDA. Since we are not using textures or meshes, all we have in GPU memory is our octree, which is rather small. The time spent on memory access was rather insignificant compared to the time spent on performing calculations. This meant that a lot of the memory optimizations we learned about in class were not useful.

    We think that the minor gains are due to the heavily branching nature of our kernels. Running one instruction on a whole lot of different data is fast, but if the calculation for one pixel is at a different instruction than the calculation for another, we don't benefit from this. So whenever one ray hits a different type of material than another ray does, a different piece of code is run to sample the new direction. We think that the path tracer would be a whole lot faster if we could figure out ways to reduce the branching.
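The camera-path system for the off-line renderer can be sketched as follows. This is a guess at the idea rather than the project's code: it assumes plain linear interpolation between consecutive (point, direction) pairs, and `CameraKey`, `Lerp` and `BuildFrames` are made-up names.

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// One (point, direction) pair the camera should pass through.
struct CameraKey { Vec3 position; Vec3 direction; };

Vec3 Lerp(const Vec3& a, const Vec3& b, float t) {
    return {a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t, a.z + (b.z - a.z) * t};
}

// Expands the keys into one camera pose per frame.
// segmentSeconds[i] is the travel time from keys[i] to keys[i + 1], so the
// function expects keys.size() == segmentSeconds.size() + 1.
// Interpolated directions are not renormalized here; normalize them before
// generating rays.
std::vector<CameraKey> BuildFrames(const std::vector<CameraKey>& keys,
                                   const std::vector<float>& segmentSeconds,
                                   float fps) {
    std::vector<CameraKey> frames;
    if (keys.empty()) return frames;
    for (std::size_t i = 0; i + 1 < keys.size() && i < segmentSeconds.size(); ++i) {
        const int count = static_cast<int>(segmentSeconds[i] * fps);  // frames in this segment
        for (int f = 0; f < count; ++f) {
            const float t = static_cast<float>(f) / static_cast<float>(count);
            frames.push_back({Lerp(keys[i].position, keys[i + 1].position, t),
                              Lerp(keys[i].direction, keys[i + 1].direction, t)});
        }
    }
    frames.push_back(keys.back());  // finish exactly on the last key
    return frames;
}
```

Each returned pose would then be rendered with the chosen number of samples per pixel and saved as one image of the movie.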
  12. Arjan B

    CUDA Progress!

    Yes! We've made great progress on my project's conversion to CUDA! Without any optimizations, it's already 6 times as fast as the CPU version. It's not as much as I'd hoped, though. But, hell, it's real-time(-ish) now! No pretty pictures this time... Instead, a YouTube video!

    [video]

    Getting the initial idea of keeping host and device well separated isn't all that hard. However, actually putting it into practice turned out to cause quite some headaches. Debugging a program with hundreds of threads running in parallel is hard, I tell ya. But now, all of the basic functionality seems to be in place.

    Next up:
      • Octree - way more objects in the scene!
      • Optimization - 6 times as fast as a single CPU core should be far from the limit!
      • Block editor - we really need to be able to place/remove objects for our assignment...

    Stay tuned!