OpenGL Fastest way to stream video to Texture

Ender1618    254

I am using OpenGL 4.2 in my project, using glew and SDL.


What is the fastest way to transfer an RGB 24-bit or greyscale 8-bit image (a decoded video frame) from system memory to an OpenGL texture? Is using PBOs the best way to accomplish this, even with modern OpenGL? I saw an NVIDIA sample using PBOs for this, but it's quite a few years old.


The video frames are coming in at 30-60 Hz, at 640x480.


What is the best way to allocate the gl texture? Should I force power of 2 for the texture (and update a sub rect)?


Should I use GL_BGRA8 for the internal format and GL_BGRA for the pixel format (as opposed to GL_RGB for both)?


Should I use BGRA even for greyscale video?


Thanx for any suggestions.

BornToCode    1185

Create two threads: one thread that pulls the frames, and the main thread, which creates the PBOs and updates the texture with the proper frames. The texture updated from those PBOs can then be applied to a screen-aligned quad to be rendered. Use two PBOs: while the frame from one is being rendered, fill the second one with the next frame, and ping-pong between them.

mhagain    13430

At 640x480 you can probably do this in realtime without needing a PBO.  Even if not, I'd advise that you go about it in the following order:


1) Write the basic version that just updates and displays the texture without anything extra.

2) Add double-buffering to it, using two textures; update texture 0/display texture 1, then swap for the next frame.

3) Add a PBO.


Only go to the next step here if the step you're currently on proves too slow for your needs.
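The bookkeeping for step 2 can be sketched in a few lines of C. This is only an illustration: the `PingPong` type and its helper functions are hypothetical names, and the slot numbers are assumed to index an array of two texture names created elsewhere.

```c
/* Hypothetical bookkeeping for double-buffered textures (step 2 above).
   Slot numbers are assumed to index an array of two GL texture names. */
typedef struct {
    int upload_slot; /* slot that receives the newest decoded frame */
} PingPong;

/* The slot to draw from is always the one not being written this frame. */
static int pingpong_display_slot(const PingPong *p)
{
    return 1 - p->upload_slot;
}

/* Call once per frame, after uploading and drawing, to exchange the roles. */
static void pingpong_swap(PingPong *p)
{
    p->upload_slot = 1 - p->upload_slot;
}
```

Each frame you would upload into the texture at `upload_slot`, draw with the texture at `pingpong_display_slot()`, then call `pingpong_swap()`.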


GPUs have an annoying tendency to prefer texture data in 1, 2 or 4 byte formats, whereas content deliverers have an annoying tendency to prefer texture data in 3 byte formats so there is no fast way to get 24-bit RGB data into a texture.  You should expand (and swap) it to BGRA at some stage in the process, yes, and this should preferably happen during decompression from source.


For your colour data, create your texture one-time-only as follows:


glTexStorage2D (


You'll see that I'm using glTexStorage2D here rather than glTexImage2D - at this stage we just want to allocate storage for the texture and we're not yet concerned about what data is actually going into it.


Each time you get new data in, update the texture as follows:


glTexSubImage2D (


The key parameters here are the third-last and second-last ones (format and type).  You can write a small program to verify this yourself, but the basic summary is that it is absolutely essential that you match these with what the OpenGL driver is going to prefer, otherwise you're going to get nasty slowdowns and this will be irrespective of whether you use a PBO or double-buffer.  I've benchmarked upload performance increases of over 25x (versus GL_RGB/GL_UNSIGNED_BYTE) on some hardware from these two parameters alone.  So get these correct first, then use other methods to make it faster, but only if you need them.
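The two calls above are cut off in this copy of the thread. Going by the parameters named later in the discussion (GL_RGBA8 for the internal format, GL_BGRA with GL_UNSIGNED_INT_8_8_8_8_REV for the format and type), they were presumably along the following lines. Treat this as a sketch and not the original code: the function names and the fixed 640x480 size are assumptions, and it needs a current OpenGL 4.2 context with GLEW already initialised.

```c
#include <GL/glew.h>

/* Assumed frame size, taken from the 640x480 video in the question. */
enum { FRAME_W = 640, FRAME_H = 480 };

/* One-time allocation: one mip level of GL_RGBA8 storage, no data yet. */
static GLuint create_video_texture(void)
{
    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, FRAME_W, FRAME_H);
    /* Only level 0 exists, so use filters that do not sample mipmaps. */
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    return tex;
}

/* Per-frame update: bgra must point at FRAME_W * FRAME_H * 4 bytes. */
static void upload_video_frame(GLuint tex, const void *bgra)
{
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, FRAME_W, FRAME_H,
                    GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, bgra);
}
```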


If your data is coming in at 24-bit RGB and if you don't have control over this, then you should expand and swap yourself; don't rely on the driver to do it for you (by e.g. using GL_RGB/GL_UNSIGNED_BYTE).  Expand and swap to a pre-allocated (or static) buffer and then glTexSubImage it and it will still be faster.
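A minimal sketch of that expand-and-swap step, assuming tightly packed 24-bit RGB input with no row padding. The names `rgb24_to_bgra32`, `convert_frame` and `frame_bgra` are made up for illustration, and the static buffer is sized for the 640x480 frames from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed frame size, taken from the 640x480 video in the question. */
enum { FRAME_W = 640, FRAME_H = 480 };

/* Static staging buffer, allocated once and reused for every frame. */
static uint8_t frame_bgra[FRAME_W * FRAME_H * 4];

/* Expand tightly packed RGB (3 bytes/pixel) to BGRA (4 bytes/pixel),
   swapping red and blue and setting alpha to fully opaque. */
static void rgb24_to_bgra32(const uint8_t *src, uint8_t *dst, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; ++i) {
        dst[0] = src[2]; /* blue  */
        dst[1] = src[1]; /* green */
        dst[2] = src[0]; /* red   */
        dst[3] = 255;    /* alpha */
        src += 3;
        dst += 4;
    }
}

/* Convert one whole frame into the static buffer and return it. */
static const uint8_t *convert_frame(const uint8_t *rgb)
{
    rgb24_to_bgra32(rgb, frame_bgra, (size_t)FRAME_W * FRAME_H);
    return frame_bgra;
}
```

The pointer returned by `convert_frame()` is what you would hand to glTexSubImage2D as the data argument.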


For RGB(A) data, this particular combination of format and type parameters should be the fastest on most systems but may not be the fastest on all.  For traditional desktop GL (which I'm assuming you're targeting based on your mention of 4.2) you should be safe enough, but as always, benchmark and find out for yourself.


For greyscale data you should be good enough just using a greyscale format for your texture; if you don't want to create a second texture then you can also expand it, but greyscale formats are still fine and fast - the only tricky cases are around 24-bit RGB data formats.
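If you do go the expansion route so that greyscale frames can share the BGRA texture, the loop could look something like this. It is only a sketch: `grey8_to_bgra32` is a hypothetical name and the input is assumed to be tightly packed 8-bit luminance.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand 8-bit greyscale to BGRA by replicating the value into the three
   colour channels and setting alpha to fully opaque. */
static void grey8_to_bgra32(const uint8_t *src, uint8_t *dst, size_t pixel_count)
{
    for (size_t i = 0; i < pixel_count; ++i) {
        const uint8_t v = src[i];
        dst[0] = v;   /* blue  */
        dst[1] = v;   /* green */
        dst[2] = v;   /* red   */
        dst[3] = 255; /* alpha */
        dst += 4;
    }
}
```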


Finally, and if the video is also displayed at 640x480, don't underestimate the power of glDrawPixels.  The same trickiness around 24-bit RGB also applies here, but it may well work just fine for you.

Ender1618    254

mhagain, you mention using glTexStorage2D with GL_RGBA8 and glTexSubImage2D with GL_BGRA.  What does it mean that these are different?


What is the difference between glTexStorage2D and glTexImage2D? The docs mostly mention things about mipmap generation, and I don't need mipmaps.


I also notice that




will fail, so I would have to do:




for it to work. Does that mean I must convert my RGB to BGRA so that I can use GL_UNSIGNED_INT_8_8_8_8_REV, since there is no GL_UNSIGNED_INT_8_8_8_REV?

mhagain    13430

TexImage vs TexStorage


TexStorage will just allocate storage for the texture, and can allocate storage for multiple miplevels in one go (ensuring that things are set up correctly for submips).  The miplevels part is not relevant for you here, but using TexStorage is the more correct modern OpenGL way of doing this.


Think of it in terms of malloc, memcpy and free - it's a rough (and not entirely accurate) analogy but should work for the purpose of helping you understand.  TexImage needs to check if the texture storage already exists, delete it if so, allocate new storage, then (if the data pointer is non-NULL) copy in the supplied data.  TexStorage just needs to allocate storage.  Since that's all you need for your initial texture creation, TexStorage is sufficient.


TexStorage vs TexSubImage


The difference here is simple enough - the internal format parameter to TexStorage describes how the texture is represented internally by OpenGL.  The format and type parameters to TexSubImage describe how the data you're feeding it is laid out.  And now unfortunately we need to get into a hangover from legacy OpenGL.  That internal format parameter - it's not prescriptive.  It just means "give me something that I can read data in this format from".  OpenGL itself is allowed to give you more colour channels and more bits-per-channel than you ask for; this point is going to become important shortly, so remember it.




First off, I need to re-emphasise this: don't use GL_BGR/GL_UNSIGNED_BYTE - that's going to punt you right onto the slow path and you'll end up implementing double-buffering, PBOs, and still wondering why you're not getting good performance from it.  The fact that you've mentioned it shows that you're still thinking along the lines of "saving memory" and avoiding what looks like extra CPU-side work in your own code.  This is important - CPU-side work in your own code is not the only CPU-side work you have to deal with; you've also got CPU-side work in the driver, latency, synchronization, format conversion, etc (all in the driver) to worry about and you have no control over those if you get things wrong.  Burn the extra memory, do the extra work in your own code, it's a tradeoff that will allow you to get in and out of that TexSubImage call as fast as possible and that's where the real key to performance is here.


Remember that bit I said was important?  Here's why.  There's no such thing as a 24-bit texture format on the vast majority of hardware.  Ask for a 24-bit format and you'll instead get a 32-bit one, with the extra 8 bits either ignored or set to 255 (as appropriate).  So you gain nothing by trying to transfer 24-bit data, but you lose a lot because that 24-bit data isn't going to match what's actually being stored internally, and the driver will need to convert it.  It's another unfortunate hangover from legacy OpenGL that these options still exist, because they can lead to much confusion and wrong thinking.  See here for further discussion of this:


So match those parameters and the driver will look at them and say "yes!  I can just suck this data straight in without needing to do anything else, hey!, I can even read it in 32-bit chunks too, thanks very much, I'm done, here's control handed back to your program as quickly as possible".  Get them wrong and the driver will say "oops, you've given something I don't like, now I need to go allocating extra buffers, rummaging around in the data, converting it to something I do like, and by the way - do I need to read the original texture back into system memory first?"  You don't have control over what the driver does when you feed it something it doesn't like, and some drivers can do absolutely awful things.  What you do have control over is feeding it something it does like, and because any conversion you need to do is in your own code, you can optimize it to your heart's content.


So yes, convert your RGB source to BGRA; it's a nice fast simple loop that you do have control over (that you can even unwind some).  That's what most GPUs/drivers are going to prefer, so give them that and you'll get the fast transfer.  8_8_8_8_REV is optional but will put you on the absolute fastest path with even crappy low-quality Intels.
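To make the 8_8_8_8_REV layout concrete: with GL_BGRA as the format, GL_UNSIGNED_INT_8_8_8_8_REV describes each pixel as one 32-bit word with blue in the lowest byte and alpha in the highest. The helper below is a small illustration of that packing; `pack_bgra_rev` is a hypothetical name and not part of any API.

```c
#include <stdint.h>

/* Pack one pixel into the 32-bit word layout described by GL_BGRA with
   GL_UNSIGNED_INT_8_8_8_8_REV: blue in bits 0-7, green in bits 8-15,
   red in bits 16-23, alpha in bits 24-31 (0xAARRGGBB). */
static uint32_t pack_bgra_rev(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8)  |  (uint32_t)b;
}
```

On a little-endian CPU such a word sits in memory as the bytes B, G, R, A, which is the same byte order as GL_BGRA with GL_UNSIGNED_BYTE.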

Ender1618    254

So the glTexStorage2D internal format is just a high-level description of what is going to be in the texture and the depth per channel? The actual order of the R, G, B and A bytes is up to the OpenGL driver.
