JeffRS

Member
  • Content count

    11
Community Reputation

232 Neutral

About JeffRS

  • Rank
    Member
  1. JeffRS

    Handmade Hero after two years

    I didn't find out about Handmade Hero until well after the series had started. It sounded interesting, but it was too hard to catch up on with the limited spare time I have.

    The videos range from an hour to 2.5 hours in length, and I found I had to frequently pause and rewind to make sure I caught what he was saying, writing or coding, which easily added another 30 minutes or so to the watching time. With work and family commitments I just don't have much time to spare, so even finding 2 or 3 hours a day is a significant investment. On top of that, the overall pacing is very slow, and skipping episodes often caused problems because later episodes relied on information or code from earlier ones. I think I got around 20 episodes in when he started completely changing and rewriting code he had already written, and it was just not a worthwhile use of time for me.

    There is some good stuff in what I saw, and I think there are some nice lessons on how to write low-level code and avoid using libraries for everything. I just wish some of the concepts were presented in an easier-to-access format. Someone mentioned there is an episode guide, but as I already pointed out, he is writing the entire core low-level code to handle everything, so if you jump ahead you find he is calling all these custom functions and it just becomes too hard to follow.

    I'm not going to touch the whole OOP debate. There is some good information in these tutorials, but I am not sure how many people can spend 3-4 hours a day for 2-3 years, or however long he drags this out for.
  2. JeffRS

    Not All is Fine in the Morrowind Universe

    Yes, this is obviously promotional material thinly disguised as a tech article. There are multiple articles on here posted by the same person to promote this software. Either it needs to be made clear that it is an advertisement, or at the very least the content should be presented in a way that has some value beyond showing the functionality of this commercial software and then telling people to purchase it.

    Perhaps present some interesting statistics at least. What are the most common types of problems found? What is the average number of errors found per 100/1,000/10,000 lines of code? What are some of the real-world results gained from fixing all the detected problems: speed increase, size decrease, stability?

    For what it's worth, the software seems excellent and very useful. I just feel like this is an abuse of the spirit of this article system.
  3. JeffRS

    Getting OpenGL functions

    Probably the easiest thing to do is use one of the libraries that takes care of managing OpenGL extensions for you. I personally don't use them much because the rendering engines I make are not exactly conventional, so I can't really recommend a specific solution. However, I do know many people use and have success with GLEW, SDL and GLFW, and there are a few other solutions if you google for them.

    Failing that, if you only need a few OpenGL functions and don't want to use a library for some reason, you can just import them yourself, but I don't really recommend it. For example:

        // Requires a current OpenGL context; wglGetProcAddress returns NULL if the function is unavailable.
        const PFNGLCREATEPROGRAMPROC glCreateProgram = ((PFNGLCREATEPROGRAMPROC)wglGetProcAddress("glCreateProgram"));
  4. Once you get a basic understanding, I highly recommend getting Disney's BRDF Explorer: http://www.disneyanimation.com/technology/brdf.html It was a big help for me in understanding how the different components of a BRDF work together, using instant visual feedback and playing with the formulas which come with the software. Once you get your head around it you can then program your own formulas or use combinations of existing ones.   It's a great tool and it's free!
  5. Have a read of this article: http://www.iquilezles.org/www/articles/filtering/filtering.htm
  6. I got the same thing and I am in Australia. There is obviously some other requirement besides geographic location because it stopped me when I selected the job description part.
  7. Just to follow up on this, I actually fixed the problem. Thanks again to everyone who responded, each of you helped in different ways and I appreciate your patience.
  8. Just looking at your result image, I think you are close. The problem is not with the noise itself but with how the noise is applied. The Perlin noise can be substituted with other variations and still give the result you want. I recommend also taking a look at simplex noise, value noise, gradient noise, etc., which all offer different trade-offs between code speed, complexity and refinement of the resulting noise.

    I believe the problem is your implementation of 2D fBm, which has no scaling or rotation. The last picture you posted looks like it might be correct but is scaled so that the pattern is very small. Try also adding a global scale for all the x/y coordinates by dividing them by maybe 50 or so. The other thing is that, to get a nice result like the one you want, you need to rotate your coordinates between each level of noise as well as scale them.

    I almost always implement these types of noise functions in a GPU shader, so I will post GLSL example code and I hope you can interpret it.

        vec2 uv = vec2(X, Y);
        float result = 0.0;
        uv *= 12.0;                             // global scaling factor
        mat2 m = mat2( 1.6, 1.2, -1.2, 1.6 );   // rotation matrix; its columns have length 2, so it also doubles the frequency
        result  = 0.5000*noise( uv );           // 1st octave of weighted noise
        uv = m*uv;                              // rotate and scale coordinates
        result += 0.2500*noise( uv );           // 2nd octave of weighted noise
        uv = m*uv;                              // rotate and scale coordinates
        result += 0.1250*noise( uv );           // 3rd octave of weighted noise
        uv = m*uv;                              // rotate and scale coordinates
        result += 0.0625*noise( uv );           // 4th octave of weighted noise
        result = 0.5 + 0.5*result;              // remap from -1.0/1.0 to 0.0/1.0
        vec3 rgb = vec3(result, result, result);

    The line commented "global scaling factor" sets the overall scale, and each octave has a hard-coded weight. The per-octave scale comes from the matrix m: every multiplication rotates the coordinates and doubles their frequency, so the octaves need no extra scale factor inside the noise() call. For what it's worth, I normally use a simplex noise, which you may find easier to implement, at least until you get your head around how everything is working.

        // pseudorandom number function
        vec2 hash( vec2 p ){
            p = vec2( dot(p,vec2(127.1,311.7)), dot(p,vec2(269.5,183.3)) );
            return -1.0 + 2.0*fract(sin(p)*43758.5453123);
        }

        // simplex noise
        float noise( in vec2 p ){
            const float K1 = 0.366025404; // (sqrt(3)-1)/2
            const float K2 = 0.211324865; // (3-sqrt(3))/6
            vec2 i = floor( p + (p.x+p.y)*K1 );
            vec2 a = p - i + (i.x+i.y)*K2;
            vec2 o = (a.x>a.y) ? vec2(1.0,0.0) : vec2(0.0,1.0);
            //vec2 of = 0.5 + 0.5*vec2(sign(a.x-a.y), sign(a.y-a.x));
            vec2 b = a - o + K2;
            vec2 c = a - 1.0 + 2.0*K2;
            vec3 h = max( 0.5-vec3(dot(a,a), dot(b,b), dot(c,c) ), 0.0 );
            vec3 n = h*h*h*h*vec3( dot(a,hash(i+0.0)), dot(b,hash(i+o)), dot(c,hash(i+1.0)));
            return dot( n, vec3(70.0) );
        }

    I also find it easier to put the fBm function into a loop with some global variables (scale, offset, Iterations and rotationMatrix below) to control the output. This makes it easy to change the weighting and the scaling of each octave of noise with single variables, and also to animate the results. Something like this:

        float result = 0.0;
        float weight = 0.7;
        uv *= scale;
        uv += offset;
        for (int i=0; i<Iterations; i++) {
            result += weight * noise( uv );
            uv = rotationMatrix * uv + offset;
            weight *= 0.6;
        }
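    For anyone who wants to try the octave loop outside a shader, here is a rough CPU-side C++ sketch of the same idea. It is only an illustration under a few assumptions: it swaps in a simple value noise instead of the simplex noise to stay short, and the names Vec2, latticeHash, valueNoise and fbm are made up for this example. The matrix is the same one used in the GLSL, written out by hand.

```cpp
#include <cmath>

struct Vec2 { float x, y; };

// Cheap integer lattice hash mapped to [0, 1). Illustrative stand-in only,
// not the hash() from the GLSL code.
static float latticeHash(int ix, int iy) {
    unsigned int h = static_cast<unsigned int>(ix) * 374761393u
                   + static_cast<unsigned int>(iy) * 668265263u;
    h = (h ^ (h >> 13)) * 1274126177u;
    h ^= h >> 16;
    return static_cast<float>(h & 0xFFFFFFu) / 16777216.0f;
}

// Value noise in [-1, 1]: smoothstep-interpolated hashed lattice values.
static float valueNoise(Vec2 p) {
    float fx = std::floor(p.x), fy = std::floor(p.y);
    int ix = static_cast<int>(fx), iy = static_cast<int>(fy);
    float tx = p.x - fx, ty = p.y - fy;
    float sx = tx * tx * (3.0f - 2.0f * tx);
    float sy = ty * ty * (3.0f - 2.0f * ty);
    float a = latticeHash(ix, iy),     b = latticeHash(ix + 1, iy);
    float c = latticeHash(ix, iy + 1), d = latticeHash(ix + 1, iy + 1);
    float top    = a + (b - a) * sx;
    float bottom = c + (d - c) * sx;
    return 2.0f * (top + (bottom - top) * sy) - 1.0f;
}

// fBm: sum weighted octaves, rotating and doubling the coordinates between them.
float fbm(Vec2 uv, int octaves) {
    float result = 0.0f;
    float weight = 0.5f;
    for (int i = 0; i < octaves; ++i) {
        result += weight * valueNoise(uv);
        // Equivalent of m*uv with m = mat2(1.6, 1.2, -1.2, 1.6):
        // a rotation of about 36.9 degrees combined with a scale of 2.
        uv = Vec2{ 1.6f * uv.x - 1.2f * uv.y, 1.2f * uv.x + 1.6f * uv.y };
        weight *= 0.5f;
    }
    return 0.5f + 0.5f * result;   // remap from [-1, 1] to [0, 1]
}
```

    Calling fbm(Vec2{x * s, y * s}, 4) per pixel, with s as the global scale, gives a grey value that can be written straight into an image.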
  9. Yes, that is very helpful. If you can achieve accurate timing with the same video card and OpenGL, then there must still be a problem with my setup. I did actually spend some time with Nsight to try to compare against my measured timings. Some things seemed close and others were very different and made no sense at all.

    Could you just confirm for me which method you are using for your queries? Are you inserting GL_TIMESTAMP queries or using glBeginQuery/glEndQuery? Are you doing anything to try to sync the time queries with the OpenGL calls you are trying to measure? Are you retrieving the queries every frame, or waiting a frame or 2 before reading them back?

    I will post some of my code later when I get the chance. Perhaps there is something wrong that other people can spot.
  10. Thank you for the helpful reply. Given your comments about glFinish, I re-read where I found that information, and it seems the confusion came from me misunderstanding the documentation on OpenGL queries, along with some false information posted on the OpenGL forums about having to sync the CPU and GPU.

    I have again rewritten my timing code and have been trying to follow closely the explanations and examples from the books OpenGL SuperBible and OpenGL Insights. I have tried using both GL_TIMESTAMP and glBeginQuery/glEndQuery, and I am getting stable and repeatable results from both. I am now making queries at the start and end of each shader call and then retrieving those queries 3 frames (3 SwapBuffers) later, so there is no longer anything causing the CPU to wait. However, I am still getting timings of around 0.04-0.06ms where no OpenGL functions are being called. Both books make a point that a query is inserted into the pipeline after the previous OpenGL call is made, but there is no guarantee that the previous call has 100% completed, due to the parallel way the pipeline is executed. The small incorrect times I am getting seem to be the result of this.

    Both books also mention using glFenceSync/glWaitSync to try to perfectly align the timing queries. I tried an example of this from one of the books, but it only gave me worse results. At this point the inaccuracies are known to me and seem stable, so I could use the timings I am getting and make a small mental adjustment for the error when I read the results. However, it would be nice if someone could confirm whether there is a way to get more precise timing, or whether the errors I am getting are normal and to be expected.
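    The delayed readback described above (issue queries every frame, read them 3 SwapBuffers later) can be sketched roughly as below. This is only the bookkeeping, written so it compiles without an OpenGL context: QueryPair, QueryRing and toMilliseconds are hypothetical names, the integer ids stand in for query objects created with glGenQueries, and the places where glQueryCounter and glGetQueryObjectui64v would be called are marked in comments. It relies on timestamp query results being reported in nanoseconds.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// One pair of query handles per in-flight frame. In a real renderer these
// would be GLuint names from glGenQueries; here they are plain integers.
struct QueryPair {
    std::uint32_t startId;
    std::uint32_t endId;
};

// Ring of Latency + 1 slots, so the slot being written this frame is never
// the slot being read back.
template <std::size_t Latency>
class QueryRing {
public:
    explicit QueryRing(const std::array<QueryPair, Latency + 1>& slots)
        : slots_(slots), frame_(0) {}

    // Slot to issue this frame's timestamps into
    // (glQueryCounter(id, GL_TIMESTAMP) in real code).
    const QueryPair& writeSlot() const {
        return slots_[frame_ % slots_.size()];
    }

    // Slot issued Latency frames ago, or nullptr while the ring is warming up.
    // Real code would call glGetQueryObjectui64v(id, GL_QUERY_RESULT, &ns) on it.
    const QueryPair* readSlot() const {
        if (frame_ < Latency) return nullptr;
        return &slots_[(frame_ - Latency) % slots_.size()];
    }

    // Call once per frame, after SwapBuffers.
    void endFrame() { ++frame_; }

private:
    std::array<QueryPair, Latency + 1> slots_;
    std::uint64_t frame_;
};

// Convert a pair of nanosecond timestamps to elapsed milliseconds.
inline double toMilliseconds(std::uint64_t startNs, std::uint64_t endNs) {
    return static_cast<double>(endNs - startNs) / 1.0e6;
}
```

    With Latency set to 3 this needs four query pairs per measured pass, and the first three frames simply report no result.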
  11. The information on post-processing times is great. I don't expect to come close to what a commercial studio can achieve in terms of quality and optimisation, but it is good to have something to aim for.

    Yes, the SSAO is cheap: it's only 8 samples in a small radius, which is OK for what I am doing. I am testing on an NVIDIA GTX 750, which is a modern but relatively modestly powered GPU. I am aiming for 60fps at 1280x720 and 30fps at 1920x1080.

    I have a lot more questions, but your question about timings made me double-check, and it seems my timing is not accurate. There is no point going any further until I find a more accurate timing method.

    I am using queries with GL_TIMESTAMP. After your question I did some tests and found a delay of anywhere from 0.080 to 0.150ms even when simply making 2 timestamps one after the other and comparing the difference. This is obviously skewing all my timing results by roughly +0.100ms, which of course explains why the very fast passes seem to be taking twice as long.

    I am calling glFinish() before making each timestamp, which as far as I understand is required to make sure all previous calls have completed before checking the time. Without it the times are wildly all over the place. However, it seems that the glFinish() call itself is at least in part causing problems, as the actual fps drops significantly just by calling glFinish after each shader. I did some testing and rewrote the timing code in a number of ways, and found I could easily skew the timing by +0.100ms or so depending on when the queries were read back. For example, if I do a timestamp and read it back immediately, then do another timestamp and read that back, the difference is 0.080ms greater than if I do 2 timestamps in a row and then read the 2 queries back. I can't imagine what is causing 2 back-to-back timestamps, with no other code or OpenGL calls in between, to return up to 0.150ms difference.

    I realise I am trying to measure extremely precise timings, but various documentation and anecdotal stories lead me to believe it is possible to get correct timings using this method, which I believe is also similar to the way it is done in DirectX.

    Perhaps some experienced people could suggest a reliable way of measuring GPU timings in OpenGL, or point me to a source of information I could read on how to do it correctly.
  12. Hi all, I'm new to GameDev and new-ish to modern graphics programming, so you'll have to forgive any extreme ignorance on my part.

    I'm in the process of adding various post-processing effects to a rendering engine which I am working on. I am using OpenGL and have set up a couple of framebuffers with multiple texture attachments, which I am using to render multiple passes of screenspace effects (DoF, SSAO, bloom, etc.).

    Currently I am trying to optimise the post-processing effects as much as possible, and after doing some rudimentary profiling I was surprised to see how much time is being taken by what I thought should be very simple and fast tasks, primarily generating mipmaps and up/downscaling images.

    Much of what I have read about post processing talks about getting a significant speed increase by only processing a half or quarter size screen image where suitable. This makes great sense, although the cost of the downscaling and upscaling passes never seems to be mentioned. My testing shows a significant cost, which often seems to almost cancel out any benefit.

    I am aware that it is much faster for the GPU to run ALU instructions than to read from and write to textures. Is this difference on modern GPUs significant enough to cause the performance I am seeing?

    Some examples:
      0.595ms to generate mipmaps for one fullscreen texture (using glGenerateMipmap(GL_TEXTURE_2D)).
      0.268ms to read a fullscreen texture and render it to another texture at 1/4 size.

    In contrast, my SSAO takes almost exactly 1ms running at fullscreen, doing multiple reads per screen pixel of a 32-bit depth buffer, then reading a fullscreen texture, applying the AO result to it and rendering it out to a new fullscreen texture. Why would it cost almost 1/3 of a fullscreen AO effect just to render a texture at 1/4 size, where the shader is literally one texture fetch then an output? It is just not making sense to me.

    I was hoping to convert my SSAO to run at 1/2 screen size. I also hoped to convert my bloom from a single 1/4 sized blur to a combination of multiple screen sizes, as I have seen in many examples. At this point the cost of writing and reading many textures seems to be very high: around 3ms in total is being spent not actually processing anything other than rescaling.

    I can't help but feel I must be doing something fundamentally wrong. Many papers, books and tutorials talk about using many render passes and many variously scaled images as if it is commonplace and trivial. Many bloom techniques I have read about use 1/2, 1/4, 1/8 and 1/16 size images, blurred and combined. I know for a fact that many modern game engines do many post-processing passes and rescaling for screenspace effects, and I find it hard to believe they are spending several ms just on reading and writing multiple textures.

    Would using compute shaders to resize textures be faster than fragment shaders?

    What is the typical "budget" of processing time for various common effects? That is to say, assuming many of the post effects are screenspace and have a relatively fixed cost per screen size regardless of the scene, what percentage of a theoretical 16.6ms total rendering budget would be allocated to bloom, DoF, SSAO, etc.?
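    To put rough numbers on the figures above, here is a small C++ sketch of the arithmetic. The function names are made up for this example, and it assumes that "1/2, 1/4, 1/8 and 1/16 size" means the scale per axis, with each level rendered as its own pass.

```cpp
#include <cstdint>

// Total pixels written by a chain of successive half-resolution passes
// (1/2, 1/4, ... per axis), starting from a width x height source.
std::uint64_t downsampleChainPixels(std::uint32_t width, std::uint32_t height, int levels) {
    std::uint64_t total = 0;
    for (int i = 0; i < levels; ++i) {
        width  = width  > 1 ? width  / 2 : 1;
        height = height > 1 ? height / 2 : 1;
        total += static_cast<std::uint64_t>(width) * height;
    }
    return total;
}

// Share of the frame budget, in percent, used by a pass at a target frame rate.
double budgetPercent(double passMs, double targetFps) {
    return 100.0 * passMs / (1000.0 / targetFps);
}
```

    For a 1280x720 source, downsampleChainPixels(1280, 720, 4) gives 306,000 pixels against 921,600 for one fullscreen pass, so by pixel count alone the whole four-level chain writes about a third as many pixels as a single fullscreen pass. budgetPercent(3.0, 60.0) puts the 3ms of rescaling at about 18% of a 60fps frame.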