Fast Way To Determine If All Pixels In Opengl Depth Buffer Were Drawn At Least Once?


mhagain:

You misunderstood me - I only load from .obj initially, after I model something in Cinema4D and export it. My engine already has a function that saves the map in my own format, which stores every number as raw bytes, and the space partitioning is saved in that file as well. I don't know exactly what you mean by memory-mapping a file, but from the name it sounds like it would only work for static arrays of objects, not for a linked list of dynamically allocated objects like I have. I like .obj because it's intuitive and readable, and there is no reverse engineering of the format specification needed.

A PVS is a really good idea, I like it. I will definitely try it. I am thinking of making a special class for it: a box area plus a list of other box areas that are potentially visible from it. In the map editor the user will choose these manually, so there will be no need for extra complicated code that determines what is visible from where (even though THIS is exactly where occlusion culling would be useful...), and also no need to recompile the map after every change.
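For illustration, a rough sketch of what such a PVS zone class might look like (the names and layout here are made up, not the engine's actual code):

#include <vector>

// Hypothetical PVS zone: an axis-aligned box plus the list of other zones
// that are potentially visible from inside it (picked by hand in the map editor).
struct PVSZone
{
    float minX, minY, minZ;          // box bounds
    float maxX, maxY, maxZ;
    std::vector<PVSZone*> visible;   // zones potentially visible from this one
};

// At render time: find the zone containing the camera, then draw only the
// geometry belonging to that zone and to the zones in its 'visible' list.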


My testing map has 15k vertices and 26k triangles, and I am using ...

You are looking for optimizations in the wrong place if you are getting such low framerates, with or without any texture filtering. This scene is on my old Radeon 7870 video card from 3 years ago:

This is 3,000 plants drawn on top of a terrain with anisotropic texture filtering set to highest, or at least medium. So we are talking about roughly 1 million triangles and plenty of overdraw between the plants. I don't remember the exact FPS, but it was around 60. You have other issues to sort out, depending on how good your hardware is: 30 FPS is bad, and 60 FPS sounds like you may have vsync on, so it is capping you at 60. With a 60 Hz display, vsync caps the framerate to an integer fraction of the refresh rate (60, 30, 20, ...), nothing in between; if your GPU can only manage, say, 56 FPS, vsync can drop you all the way down to 30.

http://orig04.deviantart.net/7e28/f/2014/271/0/3/desert2_by_dpadam450-d80w64l.jpg

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

dpadam450:

The computer it was tested on is an Asus X751L: NVIDIA GeForce 820M, Intel Core i7-4500U 1.8 GHz, 12 GB RAM. I think the hardware should be good enough.

And yes, I am using vsync, that's why I get 60 FPS. I previously heard that 25 FPS is OK, but the truth is you can subliminally feel that something is wrong, so I chose 60 FPS. The texture on the triangles was just 32x32. When I put a 256x256 texture there, the framerate is the same (so there is so far no need for mipmapping).

Your image looks nice; I even see you have some shadows. Perhaps I could build a similar benchmark scene and test on it.

My next idea is to implement some multithreading (theoretically less than a 4x speedup even when done correctly). Do you guys use it in your games?

I chose 60 FPS. The texture on the triangles was just 32x32. When I put a 256x256 texture there, the framerate is the same (so there is so far no need for mipmapping).

Any other good reasons for this? You might not see any difference right now, but later you may notice a performance drop and spend days (certainly a lot more) figuring out that it is because you don't use mipmapping. One big reason is that you currently use vsync: as long as your GPU can keep up, you'll have 60 FPS, but once it can't, you won't get 59 FPS or so, you'll drop straight to 30 FPS...

Mipmaps are just a matter of a few more lines in your code.
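For example, a minimal sketch of creating a mipmapped texture, assuming GL 3.0+ where glGenerateMipmap is available (on older GL, gluBuild2DMipmaps does a similar job); width, height and pixels stand for your own image data:

GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, pixels);           // upload the base level
glGenerateMipmap(GL_TEXTURE_2D);                           // build the whole mip chain
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR); // trilinear
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);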

Of course, it (almost) doubles the memory requirement for all your images. But unless you really intend to use most of your VRAM for geometry (and other buffers), I don't see any good reason not to use mipmapping.

Mipmapping is not only a performance thing; it also affects image quality, and in fact the primary reason mipmapping was invented in the first place was quality.

https://en.wikipedia.org/wiki/Mipmap

They are intended to increase rendering speed and reduce aliasing artifacts. [...] Mipmapping was invented by Lance Williams in 1983 and is described in his paper Pyramidal parametrics. From the abstract: "This paper advances a 'pyramidal parametric' prefiltering and sampling geometry which minimizes aliasing effects and assures continuity within and between target images." The "pyramid" can be imagined as the set of mipmaps stacked on top of each other.

I suggest that you do some research on aliasing to fully understand the problems this solves. Also be aware that some people mistake aliasing for additional detail.

Mipmaps don't use almost double the memory - they use one-third extra: each successive level is a quarter of the size of the previous one, so the total overhead is 1/4 + 1/16 + 1/64 + ... ≈ 1/3. But don't get fooled into thinking that memory usage is a primary arbiter of performance, because it's not.

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

Mipmapping is not only a performance thing; it also affects image quality

Right. But aliasing effects are easier to notice: aliasing on a textured object probably means a texture issue (or a depth issue, right). A sudden FPS drop, on the other hand, can have many causes.

Mipmaps don't use almost double the memory - they use one-third extra.

Right again. I'm still wondering how I arrived at that figure...

How much faster should a VBO be compared to glBegin/glEnd?

Because with glBegin/glEnd I got 41 FPS and with a VBO about 43 FPS at the same spot. So yes, it IS faster, but honestly I expected more improvement, since it's one of the main tips people give for optimizing rendering. The speedup probably varies depending on what kind of stuff is rendered, but I don't know where it helps most. Two other problems (minor ones, I can live with them) are the ugly syntax compared to glBegin/glEnd, and the fact that when you create a VBO for an object whose geometry changes, the VBO doesn't change with the vertex positions, so you must re-upload the positions that were previously copied into the data buffer.
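For what it's worth, a minimal sketch of re-uploading changed positions into an existing VBO (vbo, vertices, vertexCount and Vertex are placeholder names, not from the engine):

glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferSubData(GL_ARRAY_BUFFER, 0,
                vertexCount * sizeof(Vertex), vertices);   // overwrite the old positions
// Or orphan and refill the whole buffer, often preferred for fully dynamic data:
// glBufferData(GL_ARRAY_BUFFER, vertexCount * sizeof(Vertex), vertices, GL_DYNAMIC_DRAW);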

the VBO doesn't change with the vertex positions, so you must re-upload the positions that were previously copied into the data buffer.

The best thing, as far as you can manage it, is to let the vertex shader do these calculations.
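For example, a sketch of that idea: upload the mesh once as static data and, instead of rewriting vertex positions, pass the object's transform as a uniform that the vertex shader applies (prog, vao, uModel and modelMatrix are assumed names; the shader is assumed to declare 'uniform mat4 uModel;'):

glUseProgram(prog);
GLint loc = glGetUniformLocation(prog, "uModel");
glUniformMatrix4fv(loc, 1, GL_FALSE, modelMatrix);   // modelMatrix: float[16], column-major
glBindVertexArray(vao);                              // static vertex data, set up once at load
glDrawArrays(GL_TRIANGLES, 0, vertexCount);          // nothing is re-uploaded per frame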

How much faster should a VBO be compared to glBegin/glEnd?

It depends on many things. And this might be related to the fact that you update the VBO each frame with new vertex positions.

If you want to, or can, live with GL immediate mode, then why not. But be aware that immediate mode has been deprecated since GL 3. For example, your code won't work on Apple machines, nor on mobile devices.

How much faster should a VBO be compared to glBegin/glEnd?

Because with glBegin/glEnd I got 41 FPS and with a VBO about 43 FPS at the same spot. So yes, it IS faster, but honestly I expected more improvement...


This is the naive expectation. As Silence correctly observes, if your VBO code is just a glBegin/glEnd-style wrapper around the VBO API, or if you're updating the data each frame, glBegin/glEnd will often outperform it; it's common to see bad VBO usage actually perform worse. An alternative for the dynamic-data case is to use client-side vertex arrays.

It's also the case that your actual bottleneck may be elsewhere. GPU pipelines are very deep, and using a VBO addresses performance in only one very small part of them. If you're not actually transferring much data to the GPU (and 15k vertices is not much), then using a VBO, even in the best-case scenario, isn't going to give you much of a performance gain, if any.
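For reference, a minimal sketch of the "good" pattern: fill the buffer once at load time, then only bind and draw each frame (Vertex here is a hypothetical struct of three position floats followed by two texcoord floats):

// Once, at load time:
GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, vertexCount * sizeof(Vertex), vertices, GL_STATIC_DRAW);

// Every frame: bind and draw, no data transfer.
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glVertexPointer(3, GL_FLOAT, sizeof(Vertex), (void*)0);
glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), (void*)(3 * sizeof(float)));
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glDrawArrays(GL_TRIANGLES, 0, vertexCount);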

Direct3D has need of instancing, but we do not. We have plenty of glVertexAttrib calls.

Recently I finished the networking and tested it with the first deathmatch ever. But performance on computers other than mine is still one of the big problems. I figured out that when I replace the item models with simplified ones I get OK FPS, but that's a wimpy solution.

Also - mipmaps have been implemented. An adjustable LOD bias has no effect on performance, even when set so high (like 9) that the textures fade into a single color.

Switching textures really is slow in glBegin/glEnd. Texture sorting helped a bit, but not completely. So I packed all my textures into one big texture, and before compiling a map I will just apply a linear fix to the texcoords so that their xy ranges map into the big texture. The trouble is that they will no longer repeat or wrap, and I need that: texcoords outside 0..1 will continue through the other textures in the big map, which is what I don't want.

I looked on the Internet for solutions, but there is nothing. My idea is to somehow get at the place in memory where the texture is stored and manually hack the width, height and starting pointer, to make it think it is actually just a region of itself. There would be complications with mipmaps, but that's not the immediate problem. Also the texture would have to be stored not as the image I see it, but linearly row after row, so the program could read it. I have a dilemma over whether I should go for it or not, because maybe a function like that already exists - maybe I just haven't looked hard enough. Does anyone know of a function that draws only a region of a texture? I only care about those WITH PRESERVED ABILITY TO WRAP OR REPEAT - the answer TexCoord2f(0.5,0.5) really is not what I am looking for :D
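For illustration, the "linear fix" described above (remapping a tile's own 0..1 texcoords into its sub-rectangle of the atlas) could look like this; as noted, coordinates outside 0..1 then walk into neighbouring tiles instead of repeating, which is exactly the unsolved problem (all names here are placeholders):

// (tileX, tileY) = lower-left corner of the tile inside the atlas, in 0..1 atlas units.
// (tileW, tileH) = size of the tile in atlas units.
void RemapToAtlas(float u, float v,
                  float tileX, float tileY, float tileW, float tileH,
                  float* outU, float* outV)
{
    *outU = tileX + u * tileW;
    *outV = tileY + v * tileH;
    // Caveat: if u or v is outside 0..1, this reads from neighbouring tiles
    // of the atlas rather than wrapping within the original texture.
}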
