Dawoodoz

OpenGL CPU + GPU rendering with OpenGL


I just started making a game in OpenGL, but everything OpenGL-specific is in one tiny C module so that I can easily switch to Vulkan when more graphics cards are supported on Linux.

I want faster texture uploads so that I can draw many tiny sprites on the CPU with full control over the depth buffer, and then add deferred normal mapping, global volume light, turbulence, bloom, fog, water and gamma correction on the GPU. The problem is that rendering a texture uploaded from the CPU is very slow, probably because the OpenGL driver stores it in write-often memory; read-often memory would be slow to upload to instead.

 

CPU rasterization alone, without the GPU upload, takes 0.3 ms with hard clipping and 2.0 ms with alpha filtering. This is without multithreading or SIMD optimizations.

Software rasterization + upload + sampling write-often memory on the GPU takes 10.0 ms, which barely makes the 15 ms deadline.

 

Only GPU rendering with fixed textures takes 4.0 ms which is okay for OpenGL but then I cannot write freely to the depth buffer unless there is an extension for that. Copying back from fake depth buffers all the time would stall the GPU while waiting for the output as the next input texture.

 

Is there a memory trick that I can use in OpenGL to avoid stalling on sampling an uploaded texture?

Right now I just upload the software rasterized result to an existing texture ID using glTexImage2D.

 

Before you point out the obvious: my game would probably be much faster with hand-coded DSP assembly on a Snapdragon 820 SoC with a unified memory architecture and an HVX-capable mDSP, but I don't even like playing mobile games, and it would have to be signed as firmware by the hardware vendor to go beyond root access.

Edited by Dawoodoz


Check your parameters to glTexImage2D - it's probable that the driver is having to do a format conversion before it can upload, e.g. if you're using GL_RGB (which doesn't actually exist in hardware).


Is there a memory trick that I can use in OpenGL to avoid stalling on sampling an uploaded texture? Right now I just upload the software rasterized result to an existing texture ID using glTexImage2D.

 

Maybe try triple buffering, so that you round-robin rotate between 3 different texture IDs on different frames. It's possible that the current usage is creating a sync point where the GPU has to wait for it to finish with the previous texture before it can upload the new contents.
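As a rough sketch of the round-robin idea, the slot bookkeeping could look like the following. The names `SLOT_COUNT`, `upload_slot` and `draw_slot` are made up for illustration, and the lag of whole frames between upload and draw is an assumption rather than something the driver requires.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: three textures in the ring, as suggested above. */
#define SLOT_COUNT 3

/* Index of the texture that receives this frame's upload. */
size_t upload_slot(size_t frame)
{
    return frame % SLOT_COUNT;
}

/* Index of the texture to sample this frame: the one filled `lag`
 * frames earlier. During the first `lag` frames this points at a slot
 * that has not been uploaded yet, so the caller should skip the draw
 * or pre-fill the textures. */
size_t draw_slot(size_t frame, size_t lag)
{
    assert(lag < SLOT_COUNT);
    return (frame + SLOT_COUNT - lag) % SLOT_COUNT;
}
```

Each frame you would bind the texture ID at `upload_slot(frame)` for the upload and the one at `draw_slot(frame, 1)` for drawing, so the GPU never samples the texture that is currently being overwritten.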


@ mhagain:

I have tried both
"glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, (unsigned char*)cpuBuffer);"

and

"glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0, GL_BGRA, GL_UNSIGNED_BYTE, (unsigned char*)cpuBuffer);"

 

@ C0lumbo:

Thanks, I will look into triple buffering. Maybe I can set up producer-consumer multithreading if SDL has something like that.


Double buffering the upload saved one millisecond. Enough to fill the screen with sprites on the CPU each frame. :)

Increasing to three buffers did not improve speed for now, but it might with other memory flags.

Drawing the image that I started uploading the previous frame did not improve performance, so I probably have to force the texture into memory that is slow to upload to but fast to sample somehow.
If I have a separate CPU thread for handling OpenGL, then I should be able to do heavy CPU work without stalling the GPU, since 1.5 ms is removed if I stop drawing on the CPU and just upload and draw the background every frame.

Edited by Dawoodoz


I tried many types of multithreading for CPU rendering and GPU uploads, but none gave any performance increase.

I still get the sum of CPU and GPU rendering times instead of the maximum of both.

Either some resource is stalling or OpenGL already did the same thing for me.

 

Game loop with multiple CPU draw targets: (1 ms slower)

    start rendering thread writing to output[(i + 1) modulo 2] on the CPU

    upload output[i modulo 2] to the GPU and draw to the screen

    wait for the rendering thread

    i = (i + 1) modulo 2

 

Game loop with copy from CPU draw target to extra buffer: (same speed)

    outputB = copy of outputA

    start rendering thread writing to outputA on the CPU

    upload outputB to the GPU and draw to the screen

    wait for the rendering thread
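For reference, the first loop (two CPU draw targets) might be sketched in C with POSIX threads roughly as below. This only illustrates the structure, not a speedup: `render_job` fills the buffer with a frame counter instead of rasterizing sprites, and `upload_and_draw` is a hypothetical stand-in for the real texture upload and draw call. Buffer sizes and all names are assumptions.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { WIDTH = 64, HEIGHT = 64 };

/* Two CPU draw targets that swap roles every frame. */
static uint32_t output[2][WIDTH * HEIGHT];

typedef struct {
    uint32_t *target; /* buffer the worker is allowed to write */
    uint32_t color;   /* stand-in for the frame's content */
} RenderJob;

/* Worker thread: placeholder for the software rasterizer. */
static void *render_job(void *arg)
{
    RenderJob *job = arg;
    for (size_t p = 0; p < (size_t)WIDTH * HEIGHT; p++)
        job->target[p] = job->color;
    return NULL;
}

/* Hypothetical stand-in for upload + draw. Returns the first pixel
 * so the caller can see which frame was presented. */
static uint32_t upload_and_draw(const uint32_t *pixels)
{
    return pixels[0];
}

/* Runs the loop and returns the content presented in the last frame. */
uint32_t run_frames(unsigned frame_count)
{
    uint32_t last_shown = 0;
    unsigned i = 0;

    memset(output, 0, sizeof output);
    for (unsigned frame = 1; frame <= frame_count; frame++) {
        /* Start the rendering thread on output[(i + 1) modulo 2]. */
        RenderJob job = { output[(i + 1) % 2], frame };
        pthread_t worker;
        if (pthread_create(&worker, NULL, render_job, &job) != 0)
            return last_shown;

        /* Meanwhile, present output[i modulo 2], which no thread writes. */
        last_shown = upload_and_draw(output[i % 2]);

        /* Wait for the rendering thread, then swap roles. */
        pthread_join(worker, NULL);
        i = (i + 1) % 2;
    }
    return last_shown;
}
```

The presented image is always one frame behind the one being rasterized, which is the price of letting the two stages overlap.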

Edited by Dawoodoz


Have you tried glTexSubImage2D?  It may be faster as it doesn't need to respecify the texture each time.


The performance was about the same with either glTexImage2D or glTexSubImage2D.

 

I might have to try pixel buffer objects and fence objects.

Edited by Dawoodoz


You'll need to use a PBO, keeping it mapped with persistent storage and using fences for synchronization.
 
There is one thing I don't get though:
 

Only GPU rendering with fixed textures takes 4.0 ms which is okay for OpenGL but then I cannot write freely to the depth buffer unless there is an extension for that. Copying back from fake depth buffers all the time would stall the GPU while waiting for the output as the next input texture.

I assume you don't know about gl_FragDepth?

What is that depth buffer manipulation you do? Why do you need it? What are you trying to achieve?
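In case it helps, here is a minimal sketch of writing depth from a fragment shader through gl_FragDepth, kept as a C string so it can be passed to glShaderSource. It assumes GLSL 3.30 and that each sprite comes with a second texture holding its per-pixel depth; the names `colorMap`, `depthMap`, `uv` and `fragColor` are hypothetical.

```c
/* GLSL 3.30 fragment shader source as a C string.
 * All uniform and varying names are hypothetical. */
const char *const SPRITE_DEPTH_FRAGMENT_SHADER =
    "#version 330 core\n"
    "uniform sampler2D colorMap; // sprite color\n"
    "uniform sampler2D depthMap; // per-pixel sprite depth in [0, 1]\n"
    "in vec2 uv;                 // texture coordinate from the vertex shader\n"
    "out vec4 fragColor;\n"
    "void main()\n"
    "{\n"
    "    fragColor = texture(colorMap, uv);\n"
    "    // Replace the interpolated depth with the value stored in the sprite.\n"
    "    gl_FragDepth = texture(depthMap, uv).r;\n"
    "}\n";
```

Keep in mind that a shader which writes gl_FragDepth generally loses early depth-test optimizations, so it is worth measuring against the CPU path before committing to it.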
