maxgpgpu

OpenGL
difficult problem for OpenGL guru

8 posts in this topic

I have a difficult problem that I cannot figure out an efficient way to solve.  Part of my problem is that I'm not intimately familiar with every last nook and cranny of OpenGL and GPU pipelines, and this problem can obviously be solved in several ways.  I need to find the most efficient solution (fastest running), not the simplest, because millions of vertices must be processed each frame.

 

Here is a generic statement of what needs to happen.

 

#1:  The application program on the CPU contains an array of roughly 2 billion vertices in world-coordinates, each with an RGB color and possibly a couple other items of information.  The original coordinates are spherical coordinates (2 angles + distance), but separate files can be created with x,y,z cartesian coordinates to speed processing by CPU and/or GPU if appropriate.

 

#2:  Before the application runs, the vertex data is transferred from disk into CPU memory.  This will consume several gigabytes of RAM, but will fit into memory without swapping.

 

#3:  However, the entirety of vertex data will not fit into the RAM in current generation GPUs.  We can assume the application is only run on GPUs that contain gigabytes of internal RAM, with at least 1 gigabyte always allocated to this vertex data.

 

#4:  The data on disk is organized in a manner analogous to a cube-map to make the following processes efficient.  The data for each face of the cube-map are subdivided into a 1024x1024 array of subsections called "fields", each of which can be easily and efficiently located and accessed independently by the application program to support efficient culling.

 

#5:  The vertex data for all currently visible fields will presumably be held in a fixed number of dedicated VBO/VAOs in GPU memory.  For normal viewport angles, probably a few million to several million vertices will be in these VBO/VAOs, and need to be processed and displayed each frame.

 

#6:  When the camera/viewpoint rotates more than about 0.1 degree, some "fields" will no longer be visible in the viewport, and other "fields" will become visible.  When this happens, the application will call OpenGL functions to write the vertex data of newly visible vertices over the vertex data of no-longer-visible vertices, so no reallocation of VBO/VAOs is ever required (see the buffer-update sketch after this list).

 

#7:  Each frame, the entire scene except for this vertex data is first rendered into the framebuffer and depthbuffer.

 

#8:  The OpenGL state can be modified as necessary before rendering the vertex data.  New vertex and fragment shader programs can be enabled to implement rendering of the vertex data in the required manner.

 

#9:  All this special vertex data is now rendered.
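
For reference, the buffer update described in #6 could look roughly like this (a sketch only; the VBO layout, field size, and bookkeeping names here are hypothetical):

    glBindBuffer(GL_ARRAY_BUFFER, starVBO);
    glBufferSubData(GL_ARRAY_BUFFER,
                    freedFieldSlot * FIELD_BYTES,   // byte offset of a no-longer-visible field
                    FIELD_BYTES,                    // size of one field's vertex data
                    newFieldVertices);              // CPU-side data of the newly visible field
    // glMapBufferRange() with GL_MAP_UNSYNCHRONIZED_BIT is a common alternative when many
    // fields must be overwritten in a single frame.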

 

-----

 

The above is just background.  The tricky question for OpenGL gurus is how to most efficiently render the vertex data in the following manner:

 

#1:  Each vertex is a point.  No lines or triangles are rendered.

 

The following requirements are what make this problem difficult...

 

#2:  For each vertex, find the value in the depth buffer that corresponds to where this vertex/point would be displayed.  The nearest point is what we need to find, since we must assume for this step that the actual physical size of each point is zero (infinitesimal).

 

#3:  If the depth buffer value indicates the depth buffer (and corresponding pixel in color-buffer) has already been written this frame, then no further action may be taken for this vertex (the color-buffer and depth-buffer are not modified).  In effect, we want to discard this vertex and not perform any subsequent processes for this vertex.  Otherwise perform the following steps.

 

#4:  Based upon the brightness and color of the vertex data (the RGB or ARGB values in each vertex structure), render a "blob" of the appropriate brightness and size (maximum 16x16 ~ 64x64 screen pixels), centered on the screen pixel where the depth-buffer was tested.

 

NOTE:  The most desirable way to render the image for this "blob" is "procedurally".  In other words, all screen pixels within 1 to 32 pixels of the computed screen pixel would be rendered by a fragment shader program that knows the vertex data (brightness and color), plus how far away each pixel is from the center of the image (dx,dy).  Based on that information, the fragment shader code would appropriately render its pixel of the blob image.
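
If point sprites end up being the mechanism, gl_PointCoord supplies exactly that dx,dy for free.  A rough fragment-shader sketch, with a placeholder falloff curve standing in for the real blob profile:

    in vec4 vColor;                          // brightness/color passed through from the vertex
    out vec4 fragColor;
    void main()
    {
        vec2 d = gl_PointCoord - vec2(0.5);  // offset from the sprite centre, range -0.5..+0.5
        float r = length(d) * 2.0;           // 0 at the centre, ~1 at the sprite edge
        float falloff = exp(-6.0 * r * r);   // placeholder "blur / airy disc" profile
        fragColor = vec4(vColor.rgb * falloff, 1.0);
    }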

 

Alternatively, a "point sprite texture" could be selected from an array of tiny textures (or a computed subregion of one texture) based the brightness and color information.  Then a point sprite of the appropriate size and brightness could be rendered, centered on the screen pixel computed for the original vertex.

 

In either case, the RGB of each screen pixel must be summed (framebuffer RGB = framebuffer RGB + new RGB).

 

The depth buffer must not be updated.
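
The fixed-function state for this pass would therefore be roughly (sketch):

    glDepthMask(GL_FALSE);           // the depth buffer must not be updated
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE);     // framebuffer RGB = framebuffer RGB + new RGB
    // plus glEnable(GL_PROGRAM_POINT_SIZE) if a vertex shader will set gl_PointSize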

 

-----

 

The above is what needs to happen.

 

What makes these requirements so problematic?

 

#1:  We need to make the rendering of an extended region of the screen conditional upon whether a single vertex/point in the depth-buffer has been written.  I infer that the many pixels in a point-sprite rendering are independently subjected to depth tests by their fragment shaders, and therefore the entire point-sprite would not be extinguished just because the center happened to be obscured.  Similarly, I do not see a way for a vertex shader or geometry shader to discard the original vertex before it invokes a whole bunch of independent fragment shaders (either to render the pixels in the point-sprite, or to execute a procedural routine).

 

#2:  It appears to me that vertex-shaders and geometry-shaders cannot determine which framebuffer pixel corresponds to a vertex, and therefore cannot test the depth-buffer for that pixel (and discard the vertex and stop subsequent steps from happening).

 

The last couple days I've been reading the new SuperBible and the latest OpenGL specs... and they are so chock full of very cool, very flexible, very powerful capabilities and features.  At many points I thought I found a way to meet these requirements, but... always came up short.  In some cases I could see a feature won't work, but in other cases I had to make certain assumptions about subtle details of how GPU pipelines work, and the OpenGL specification.  So... I'm not convinced there isn't some way to accomplish what I need to accomplish.  In fact, I have the distinct feeling "there must be"... but I just can't find it or figure it out.

 

But I bet someone who has been mucking around in OpenGL, GLSL and the details of the GPU pipeline understands how everything works in sufficient detail that... they'll immediately flash on the solution!  And tell me!

 

I'd hate for this process to require two passes.  I suppose we could create a first pass that simply writes or clears a single bit (in a texture or ???) that corresponds to each vertex, indicating whether the screen pixel where the vertex would be displayed as a point is currently written or not (assuming the fragment shader can read the depth buffer, and assume 1.0 means "not drawn").  Then on the second pass the vertex shader could discard any vertex with bit=0 in that texture.  Oops!  Wait... can vertex shaders discard vertices?  Or maybe a whole freaking extra geometry shader would be needed for this.  Gads, I hate multiple passes, especially for something this trivial.

 

I'll even be happy if the solution requires a very recent version of OpenGL or GLSL.

 

Who has the solution?

 

 

If I understand you correctly, you have a depth buffer and a bunch of vertices, and now you want to render point sprites (or quads) based on whether each particular vertex passes the depth test.

 

Assuming that the vertices CANNOT write anything to the depth buffer - the easiest way would be to copy the depth buffer to a texture before drawing the vertices.  Then access it from the vertex shader for testing (compute screen-space coords in the VS, then look up the copied depth and reject in the vertex shader using either gl_PointSize = 0 if you are happy with point sprites, or gl_ClipDistance[0] = -1 if you need to use a geometry shader for the quad expansion (remember to glEnable(GL_CLIP_DISTANCE0); in code!)).  You simply disable depth testing for fragments; everything is done in the vertex shader, and fragments are invoked only for vertices that passed the test.
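
Roughly like this in the vertex shader (untested sketch, GLSL 3.30-ish, names made up):

    #version 330 core
    layout(location = 0) in vec3 inPosition;      // world-space point position
    layout(location = 1) in vec4 inColor;

    uniform mat4 uModelViewProj;
    uniform sampler2D uDepthCopy;                 // copy of the depth buffer made before this pass

    out vec4 vColor;

    void main()
    {
        vec4 clip = uModelViewProj * vec4(inPosition, 1.0);
        vec2 uv   = (clip.xy / clip.w) * 0.5 + 0.5;     // clip -> NDC -> texture coords
        float sceneDepth = texture(uDepthCopy, uv).r;   // 1.0 == cleared / never written

        gl_Position  = clip;
        vColor       = inColor;
        // kill the whole point sprite if its centre pixel was already covered this frame
        gl_PointSize = (sceneDepth < 1.0) ? 0.0 : 17.0; // 17.0 is just a placeholder size
    }

(Remember glEnable(GL_PROGRAM_POINT_SIZE) so the shader-written point size is actually used.)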

 

** If the vertices CAN contribute (write) to the depth buffer values, I don't see any fast way to do this.

 

 

Feel free to ask about details if something is unclear!

maxgpgpu:

#2: the geometry shader indeed CAN discard vertices by simply not emitting any primitives. Any shader stage can read the depth texture, as ADDMX suggested.

 

#1: I don't quite understand; you keep mixing per-fragment and per-vertex depth tests. Remember that vertex/geometry/hull/tessellation shaders operate on vertices; they know nothing about the final fragments that a rasteriser might generate. They can, however, project anything anywhere and sample any textures they like. Only the geometry shader has the ability to emit nothing and effectively exit the pipeline.

 

I assume you don't want to discard a whole primitive (2-triangle sprite) based on its centre. In such a case, when you need a per-fragment depth test, you'll indeed need to do it in the fragment shader, and the above doesn't help.

 

Nevertheless, you might still do some kind of conservative geometry-shader killing, for example using some kind of conservative Hi-Z / "mip-mapped" depth texture and a conservative AABB of the final primitive, or something similar, where only a couple of texture samples would be enough to safely tell that the whole primitive is "behind".


#1:  We need to make the rendering of an extended region of the screen conditional upon whether a single vertex/point in the depth-buffer has been written.  I infer that the many pixels in a point-sprite rendering are independently subjected to depth tests by their fragment shaders, and therefore the entire point-sprite would not be extinguished just because the center happened to be obscured.  Similarly, I do not see a way for a vertex shader or geometry shader to discard the original vertex before it invokes a whole bunch of independent fragment shaders (either to render the pixels in the point-sprite, or to execute a procedural routine).
As you've already discovered: Use point sprites and disable depth testing.

To selectively discard a vertex, either return the actual transformed vertex, or return an invalid/off-screen vertex for vertices to be discarded, such as vec4(0,0,0,0)

 


#2:  It appears to me that vertex-shaders and geometry-shaders cannot determine which framebuffer pixel corresponds to a vertex, and therefore cannot test the depth-buffer for that pixel (and discard the vertex and stop subsequent steps from happening).
Disable hardware depth testing and implement it yourself in the vertex shader. Bind a texture to the vertex shader containing the depth values, and perform the comparison yourself.
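
A minimal sketch of that, assuming the depth values were copied to a texture as described above and already sampled into sceneDepth:

    // in the vertex shader, after computing the clip-space position 'clip'
    if (sceneDepth < 1.0)
        gl_Position = vec4(0.0, 0.0, 0.0, 0.0);  // invalid/off-screen: nothing gets rasterised
    else
        gl_Position = clip;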
If I understand you correctly, you have a depth buffer and a bunch of vertices, and now you want to render point sprites (or quads) based on whether each particular vertex passes the depth test.

 

Assuming that the vertices CANNOT write anything to the depth buffer - the easiest way would be to copy the depth buffer to a texture before drawing the vertices.  Then access it from the vertex shader for testing (compute screen-space coords in the VS, then look up the copied depth and reject in the vertex shader using either gl_PointSize = 0 if you are happy with point sprites, or gl_ClipDistance[0] = -1 if you need to use a geometry shader for the quad expansion (remember to glEnable(GL_CLIP_DISTANCE0); in code!)).  You simply disable depth testing for fragments; everything is done in the vertex shader, and fragments are invoked only for vertices that passed the test.

 

** If the vertices CAN contribute (write) to the depth buffer values, I don't see any fast way to do this.

 

 

Feel free to ask about details if something is unclear!

 

I have a feeling this reply is going to sound stupid, but here goes.  First let me try to clearly answer your first sentence, which is:

 

If I understand you correctly, you have a depth buffer and a bunch of vertices, and now you want to render point sprites (or quads) based on whether each particular vertex passes the depth test.

 

 

 

Yes, the engine has already drawn everything (environment, objects, etc) for this frame.  The only things not yet drawn for this frame are the huge array of vertices.  One point-sprite must be drawn for each visible vertex, where visible means nothing has been drawn on the screen-pixel where the vertex will be drawn.  The peculiarity of my requirement is that we need to suppress or draw the entire point-sprite based upon the contents of the depth-buffer at the screen-pixel where the vertex would be drawn.

 

Here are a couple examples to clarify what this means.

 

peculiar case #1:  Assume the vertex would be drawn on a screen-pixel that had never been drawn during this frame, but immediately adjacent to that pixel is a wall or some other large object.  In this case, the entire 3x3 to 65x65 pixel point-sprite must be drawn, including the portion that overlaps the wall.

 

peculiar case #2:  Assume the vertex would be drawn one pixel closer to the wall or object described in the above case, and thus the vertex would fall on a pixel already drawn to display the wall or object.  The depth-buffer would therefore contain a value less than "infinity" (which is probably 1.0 in practice), and therefore the vertex would not be drawn (since they are all effectively at a distance of "infinity").  In this case, the entire 3x3 to 65x65 pixel point-sprite must be suppressed, and nothing be drawn as a consequence of this vertex.

 

As I read your sentence, you are correct.  However, unless I am mistaken about quads, the only OpenGL mechanism that works for this application is the point-sprite mechanism.  Why?  Because I need the center of the point-sprite image to be displayed at the screen-pixel where the vertex would be drawn, and from what I can tell the point-sprite does this, but there is no way to know where to draw a variable-size quad to assure the center is located on the screen-pixel where the vertex would have been drawn.

 

-----

 

I did say that these vertices cannot write the depth-buffer.  However, that is not absolutely necessary in practice, since presumably we can force the fragment shader to write "infinity" for the depth.  I'm not certain, but this might mean writing 1.0 (or any value greater than 1.0) in the fragment shader.  So if you have some reason to want to write the depth-buffer, I suppose we can do that.  However, since this process will necessarily require switching to special-purpose shaders, we also have the luxury of setting the OpenGL state any way we wish to make the process function as desired.  I looked at that state to see if I could find a way to help make this work, but didn't find any combination that works.

 

Like you say, if somehow I can compute in the vertex or geometry shader which screen-pixel in the framebuffer will be written by the vertex, then I could do exactly as you say.  Well, that assumes the vertex shader can read the depth-buffer (or before we execute this pass my application can copy the whole depth-buffer into a texture that is available to the vertex shader).  Is this computation possible?  I certainly don't know how to perform that computation.  Can you point me at something to show me how to do that?  I suppose in some sense I already have much of that in my fragment shader, but as far as I know the screen-pixel magically appears between the vertex-shader and fragment-shader (and furthermore, as far as I remember, the fragment shader doesn't even know which pixel in the framebuffer it will draw upon).  I'm probably missing something simple here.


maxgpgpu:

#2: the geometry shader indeed CAN discard vertices by simply not emitting any primitives. Any shader stage can read the depth texture, as ADDMX suggested.

 

#1: I don't quite understand; you keep mixing per-fragment and per-vertex depth tests. Remember that vertex/geometry/hull/tessellation shaders operate on vertices; they know nothing about the final fragments that a rasteriser might generate. They can, however, project anything anywhere and sample any textures they like. Only the geometry shader has the ability to emit nothing and effectively exit the pipeline.

 

I assume you don't want to discard a whole primitive (2-triangle sprite) based on its centre. In such a case, when you need a per-fragment depth test, you'll indeed need to do it in the fragment shader, and the above doesn't help.

 

Nevertheless, you might still do some kind of conservative geometry-shader killing, for example using some kind of conservative Hi-Z / "mip-mapped" depth texture and a conservative AABB of the final primitive, or something similar, where only a couple of texture samples would be enough to safely tell that the whole primitive is "behind".

 

 

You say "Any shader stage can read the depth texture".  Do you mean the vertex or geometry shader can read individual depth-values from any x,y location in the depth-buffer?  How?  What does that code look like?  Or if you only mean to say the entire depth-buffer can be copied to a "depth texture" (of the same size), what does that code look like?  I understand the general process, but never seem to understand how the default framebuffer or its depth buffers can be specified.

 

-----

 

Yes, I probably do sound like I'm "mixing per vertex and per fragment depth tests" in my discussion.  Actually, it only seems that way, and that's my problem.  What I need is for each vertex in the VBO to be depth-tested, but the entire 3x3 to 65x65 pixel point sprite must be drawn or non-drawn on the basis of that one test.  Of course the depth-test of that vertex needs to be tested against the value in the depth-buffer where that vertex would be drawn, but as far as I understand, the vertex shader doesn't have a clue at that stage of the pipeline onto which x,y pixel of the framebuffer or depth-buffer the vertex will fall.

 

Though the vertex shader can't "discard" a vertex, it can change its coordinates to assure the vertex is far behind the camera/viewpoint, right?  So that may be one way of effectively performing a discard in the vertex shader (for points only, which is what we're dealing with here).  Or do you think that's a stupid idea?

 

-----

 

You say, "I assume you don't want to discard a whole primitive (2 triangle sprite) based on its centre".  That is precisely what I need to do!!!!!  And that is what makes this problem difficult (for me, and maybe for anyone).  Read the example I gave in my previous reply (to ADDMX) for an example.  I need to discard (not draw) the entire point-sprite if the vertex (the center of the point-sprite) has been drawn to during the previous normal rendering processes.

 

This is the correct behavior of the process we're talking about here.  Consider a star for example, or a streetlight or airplane landing lights many miles away.  They are literally (for all practical purposes) "point sources" of light.  However, in our eyeballs, in camera lenses, on film, and on CCD surfaces a bright pinpoint of light blooms into a many pixel blur (or "airy disc" if the optical system is extraordinarily precise).  So, when the line-of-sight to the star or landing-lights just barely passes behind the edge of any object, even by an infinitesimal distance, the entire blur vanishes.

 

This is the kind of phenomenon I am dealing with, and must represent correctly.  So this is the physical reason why I must in fact do what you imagine I can't possibly want to do, namely "discard the whole primitive (a largish point-sprite) based upon its center".  And thus I do NOT want a "per fragment depth test", unless somehow we can perform a per-fragment depth test ONLY upon the vertex (the exact center of the point-sprite), then SOMEHOW stop all the other pixels of the point-sprite from being drawn.  I don't think that's possible, because all those pixels have already been created and sent to separate shader cores in parallel with the pixel at the exact center of the point-sprite.  That is, unless I don't understand something about how the pipeline works in the case of point-sprites.

 

I don't understand your last paragraph, but that probably doesn't matter, because it appears I am trying to do something you think I can't possibly want to do!  Hahaha.

 


#1:  We need to make the rendering of an extended region of the screen conditional upon whether a single vertex/point in the depth-buffer has been written.  I infer that the many pixels in a point-sprite rendering are independently subjected to depth tests by their fragment shaders, and therefore the entire point-sprite would not be extinguished just because the center happened to be obscured.  Similarly, I do not see a way for a vertex shader or geometry shader to discard the original vertex before it invokes a whole bunch of independent fragment shaders (either to render the pixels in the point-sprite, or to execute a procedural routine).
As you've already discovered: Use point sprites and disable depth testing.

To selectively discard a vertex, either return the actual transformed vertex, or return an invalid/off-screen vertex for vertices to be discarded, such as vec4(0,0,0,0)

 

 

 


#2:  It appears to me that vertex-shaders and geometry-shaders cannot determine which framebuffer pixel corresponds to a vertex, and therefore cannot test the depth-buffer for that pixel (and discard the vertex and stop subsequent steps from happening).
Disable hardware depth testing and implement it yourself in the vertex shader. Bind a texture to the vertex shader containing the depth values, and perform the comparison yourself.

 

 

You say, "To selectively discard a vertex, either return the actual transformed vertex, or return an invalid/off-screen vertex for vertices to be discarded, such as vec4(0,0,0,0)".  That sounds correct to me.  What I don't understand is:

 

#1:  How can my vertex shader know where in the framebuffer and depthbuffer the vertex will fall?
#2:  And if you have an answer to the previous question, how can my vertex shader access that value in the depthbuffer to determine whether it has been written or not?

 

If you have answers to these two questions, I guess the values I will receive back from the depthbuffer will be 0.000 to 1.000 with 1.000 meaning "never written during this frame".

 

-----

 

You say, "Disable hardware depth testing and implement it yourself in the vertex shader. Bind a texture to the vertex shader containing the depth values, and perform the comparison yourself".  Okay, I take this to mean you have a valid answer to question #1 above, but not #2 above (in other words, you do not know any way for my vertex shader to read individual x,y locations in the framebuffer or depthbuffer.  And therefore you propose that after rendering the conventional geometry into the framebuffer and depthbuffer, I should then call OpenGL API functions to copy the depth-buffer to a "depth-texture" (a texture having a depthbuffer format), then draw all these VBOs full of vertices with a vertex shader that somehow computes the x,y location in the framebuffer and depthbuffer each vertex would be rendered to, and on the basis of the depth value, draw the point-sprite if (depth < 1.000) and otherwise throw the vertex to some invisible location to effectively make the vertex shader discard the entire point-sprite.

 

Do I have this correct?  If so, two questions:

 

#1:  How does the vertex shader compute the x,y location in the depth texture to access?
#2:  Is the value I get back from the depth-texture going to be a f32 value from 0.000 to 1.000?  Or a s16,u16,s24,u24,s32,u32 value with the largest positive value being equivalent to "infinity" AKA "never written during this frame"?

 

Thanks for helping!

Just as an alternative -- instead of using quads to achieve this effect, you could use a bloom post-process.

 

Sorry, I'm not an OpenGL guru, so this is all API agnostic:

#1:  How can my vertex shader know where in the framebuffer and depthbuffer the vertex will fall?

That is the main job of every vertex shader! The output position variable is the position in the framebuffer where the vertex will be located.
 
However, the VS outputs values in NDC coordinates, which range from -1 to +1, whereas textures range from 0 to 1.
So:
vertexScreenUV = vertexOutPosition.xy * 0.5 + 0.5;
or depending on the API, sometimes texture coordinates are upside down, so you might need:
vertexScreenUV = vertexOutPosition.xy * vec2(0.5, -0.5) + 0.5;
 

[edit]Oops, I forgot about perspective division:

vertexScreenUV = vertexOutPosition.xy/vertexOutPosition.w * 0.5 + 0.5;

 

#2:  And if you have an answer to the previous question, how can my vertex shader access that value in the depthbuffer to determine whether it has been written or not?

The same way that you read from a texture in the pixel shader. Create a sampler/texture in your shader, and read from it using the texture() (etc.) function.
 

There will be some kind of API for creating a depth-buffer (it will be separate from the regular, automatically created one that comes with the device), and there will be a way to create a special kind of resource that's both bindable as a depth-buffer and as a texture.
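
In OpenGL terms, one way to get such a dual-purpose resource is a depth texture attached to an FBO -- a rough sketch only (error checking omitted; width/height are the framebuffer size):

    GLuint depthTex, fbo;
    glGenTextures(1, &depthTex);
    glBindTexture(GL_TEXTURE_2D, depthTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24, width, height, 0,
                 GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, depthTex, 0);
    // ...attach a colour buffer too, render the normal scene into this FBO, then bind
    // depthTex as an ordinary sampler2D for the star pass (don't sample it while it is
    // still the depth attachment being written).
    // Alternatively, glCopyTexSubImage2D can copy the current depth buffer into such a
    // texture each frame -- simpler, but slower.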
 
 

Older GPUs might not allow you to create a depth buffer that is readable as a texture, but DX10-level GPUs and onwards will allow this.

--To work around this, you can use MRT (multiple render targets) to create your own depth texture. In your main rendering pass, you output your colour values to render-target #0, and manually output depth values to render-target #1.

 

Older GPUs also might not allow you to use textures in the vertex shader (but DX10+ ones will).

--There's a workaround for GPUs that don't support VTF (vertex texture fetch) -- you have the vertex-shader pass the centre point to the pixel-shader as an extra varying/interpolant, and then in the pixel-shader, you fetch the depth value at that coordinate and compare it against the pixel depth.
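
A tiny sketch of that fallback (hypothetical names; the centre UV is computed in the vertex shader exactly as shown above and passed along):

    // vertex shader:   vCenterUV = (clip.xy / clip.w) * 0.5 + 0.5;
    // fragment shader:
    in vec2 vCenterUV;
    uniform sampler2D uDepthCopy;
    out vec4 fragColor;
    void main()
    {
        if (texture(uDepthCopy, vCenterUV).r < 1.0)
            discard;              // the sprite's centre pixel is already covered: kill every fragment
        fragColor = vec4(1.0);    // real blob colour would be computed here as before
    }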

 
 

Is the value I get back from the depth-texture going to be a f32 value from 0.000 to 1.000?  Or a s16,u16,s24,u24,s32,u32 value with the largest positive value being equivalent to "infinity" AKA "never written during this frame"?

The texture function returns a vec4 as usual, no matter what kind of texture it's reading from. The depth value will be in the r/x/[0] component, and yes 1.0 will represent the far plane.
If you've cleared the depth buffer using a value of 1.0, then yes, 1.0 will represent "never written to".
 

However, unless I am mistaken about quads, the only OpenGL mechanism that works for this application is the point-sprite mechanism.  Why?  Because I need the center of the point-sprite image to be displayed at the screen-pixel where the vertex would be drawn, and from what I can tell the point-sprite does this, but there is no way to know where to draw a variable-size quad to assure the center is located on the screen-pixel where the vertex would have been drawn.

Sure there is. Say that you're drawing a quad primitive using 4 verts:
Each vert has a position and a UV. All 4 verts have the same position, but 4 different UV's, e.g.

{ 42, 64, 13, -1, -1 }
{ 42, 64, 13,  1, -1 }
{ 42, 64, 13, -1,  1 }
{ 42, 64, 13,  1,  1 }

In the vertex shader, you can transform all of these points to the same position (e.g. outPos = mul( inPos, matrix )), but then offset them using the unique UV values (e.g. outPos.xy += inUV * scale).
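
A rough GLSL sketch of that idea (names made up; uPixelSize would be 2.0 / viewport dimensions, so the offset works out in whole pixels):

    layout(location = 0) in vec3 inCenter;   // same value in all 4 verts of the quad
    layout(location = 1) in vec2 inCorner;   // (-1,-1), (1,-1), (-1,1), (1,1)
    layout(location = 2) in vec4 inColor;

    uniform mat4  uModelViewProj;
    uniform vec2  uPixelSize;                // one pixel expressed in NDC units
    uniform float uHalfExtentPixels;         // half the blob size in pixels, e.g. 8 or 32

    out vec2 vCorner;
    out vec4 vColor;

    void main()
    {
        vec4 clip = uModelViewProj * vec4(inCenter, 1.0);
        // offset the corner after projection (scaled by w so it survives perspective division),
        // keeping the quad screen-aligned and centred on the pixel the point itself would hit
        clip.xy += inCorner * uHalfExtentPixels * uPixelSize * clip.w;
        gl_Position = clip;
        vCorner = inCorner;                  // doubles as the dx,dy for procedural shading
        vColor  = inColor;
    }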

Hodgman:

 

Okay, rather than copy your message, which makes this message a bit difficult to parse, I'll just ask my followup questions here.

 

As far as I know, the conventional output of the vertex shader has not had the following performed:

  1:  perspective division

  2: viewport transformation

 

Nonetheless, I see that your answer might be correct anyway, if the code added to the vertex shader is written properly.  First of all, I've always had a suspicion that gl_Position.w is always 1.000 and therefore the perspective division doesn't change anything, and can therefore be ignored.  However, even if that is not always true (tell me), perhaps my transformed vertices always have gl_Position.z equal to 1.0000 since they are at infinity, and my model-view and projection transformation matrices don't contain anything especially wacko.

 

Then there's the viewport transformation, which appears like maybe can also be ignored due to the way textures are accessed.  What I mean is, I guess the normal output of the vertex shader is clip coordinates (not NDC = normalized device coordinates), BUT if we assume the output coordinates of the vertex shader in gl_Position always contains gl_Position.w == 1.0000, then "clip coordinates" may be the same as "NDC" (which would then correspond to what you said).

 

Then the viewport transformation scales the NDC coordinates by the width and height of the framebuffer in order to map the NDC to specific pixels in the framebuffer and depthbuffer.  However, if my vertex shader is not able to directly access the framebuffer or depthbuffer and instead has to access a texture, then there's no reason my vertex shader needs to compute the x,y pixel location in the framebuffer or depthbuffer.  Instead, it needs to compute the corresponding texture coordinates (presumably with "none" or "nearest" filtering or something like that).  And since the range of NDC and texture-coordinates are only a factor of two different, your trivial equation does the trick.

 

Very cool!

 

I guess the only thing this depends upon is... gl_Position.w == 1.0000 (but for objects at distance infinity, I'm betting that's pretty much guaranteed).  I know I should remember, but when is the perspective division value in gl_Position.w != 1.0000?  Gads, I can't believe I forget this stuff... it's only been several years since I wrote that part of the engine - hahaha.

 

-----

 

I am programming with the latest version of OpenGL and nvidia GTX680 cards (supports the latest versions of OpenGL and D3D), so fortunately I don't need to worry about compatibility with more ancient versions.  But thanks for noting that anyway.

 

-----

 

I don't entirely follow your last section, but I probably don't need to unless you tell me there is some speed or convenience advantage to displaying these star images with quads instead of point-sprites.  Is there?

 

Note that I much prefer to draw computed color values to the framebuffer with the pixel shader rather than just display a point-sprite texture or quad-primitive texture.  That way I can simulate optical aberrations [that are a function of the position relative to the center of the field], or even simulate atmospheric turbulence (twinkling of the stars) with procedural techniques.  At the moment I forget how to do this, so I'll have to hit the books and OpenGL specs again.  But what I need to compute the appropriate color for each pixel in the 3x3 to 65x65 region is to know the x,y offset from the center of the point-sprite.

 

I suppose the obvious way to do that is to fill the x,y elements in the point-sprite "image" with x,y pixel offset values instead of RG color information (and receive the RGBA color values as separate variables from the original vertex).

 

I sorta maybe half vaguely recall there is a gl_PointSize output from the vertex shader, which would be perfect, because then I can specify the appropriate point-sprite size (1x1, 3x3, 5x5, 7x7, 9x9... 63x63, 65x65) depending on the star brightness.

 

I sorta maybe half vaguely also recall there is a gl_PointCoord input to the pixel shader that the GPU provides to identify where in the point-sprite the current pixel is.  If so, that's perfect, because then the pixel shader can compute the appropriate brightness and color to draw each screen pixel based upon the original vertex RGBA color (which presumably is passed through and not interpolated since there is only one such value in a point) and the gl_PointCoord.xy values, plus a uniform variable that specifies "time" to base twinkling on.

 

Oh, and I guess I'll need to have the vertex shader output the NDC of the vertex unless the screen-pixel x,y is available to pixel shaders (which I don't think it is).  Hmmm... except I need to multiply by the number of x and y pixels in the frame buffer to make the value proportional to off-axis angle.

 

Getting close!

 

Thanks for your help.
