
Member Since 30 Aug 2006

Topics I've Started

Problems when moving from Nvidia to ATI card / GPGPU performance comparison

10 September 2014 - 10:29 AM



I have a complex compute shader project that has refused to work since I replaced my GTX 670/480 with an R9 280X.

Hopefully, at the end I can give a useful GPGPU performance comparison without the need to compare OpenCL against CUDA.



The first issue is: I'm unable to modify a Shader Storage Buffer from a shader - maybe I'm missing some stupid little thing...



The setup code is this:


    int sizeLists = sizeof(int) * 4096;
    gpuData.dataDbgOut = (int*)_aligned_malloc (sizeLists, 16);    

    gpuData.dataDbgOut[0] = 10;
    gpuData.dataDbgOut[1] = 20;
    gpuData.dataDbgOut[2] = 30;
    gpuData.dataDbgOut[3] = 40;

    glGenBuffers (1, &gpuData.ssbDbgOut);    
    glBindBuffer (GL_SHADER_STORAGE_BUFFER, gpuData.ssbDbgOut);
    glBufferData (GL_SHADER_STORAGE_BUFFER, sizeLists, gpuData.dataDbgOut, GL_DYNAMIC_COPY);
    glBindBufferBase (GL_SHADER_STORAGE_BUFFER, 1, gpuData.ssbDbgOut);

    gpuData.computeShaderTestATI = GL_Helper::CompileShaderFile ("..\\Engine\\shader\\gi_TestATI.glsl", GL_COMPUTE_SHADER, 1, includeAll);
    gpuData.computeProgramHandleTestATI = glCreateProgram();
    if (!gpuData.computeProgramHandleTestATI) { SystemTools::Log ("Error creating compute program object.\n"); return 0; }
    glAttachShader (gpuData.computeProgramHandleTestATI, gpuData.computeShaderTestATI);
    if (!GL_Helper::LinkProgram (gpuData.computeProgramHandleTestATI)) return 0;




Per frame code:


glBegin (GL_POINTS); glVertex3f (0,0,0); glEnd (); // <- remove this and it works


    glUseProgram (gpuData.computeProgramHandleTestATI);
    glDispatchCompute (1, 1, 1);
    glMemoryBarrier (GL_ALL_BARRIER_BITS);

    glBindBuffer (GL_SHADER_STORAGE_BUFFER, gpuData.ssbDbgOut);
    int* result = (int*) glMapBuffer (GL_SHADER_STORAGE_BUFFER, GL_READ_ONLY);    
    for (int i=0; i<4; i++) base_debug::logF->Print ("dbg: ", float(result[i]));





layout (local_size_x = 1) in;

layout (binding = 1, std430) buffer dbg_block
{
    uint dbgout[];
};

void main (void)
{
    dbgout[0] = 0;
    dbgout[1] = 1;
    dbgout[2] = 2;
    dbgout[3] = 3;
}




For the output I'd expect 0,1,2,3 as written by the shader, but it is still 10,20,30,40.


I've tried GL error checking but no error is reported; the shader program is definitely dispatched, and there are no shader compiler errors.

Any idea what's wrong? Version is ok too:


OpenGL ok
GL Vendor : ATI Technologies Inc.
GL Renderer : AMD Radeon R9 200 Series
GL Version (string) : 4.3.12967 Compatibility Profile Context 14.200.1004.0
GL Version (integer) : 4.3
GLSL Version : 4.40



EDIT: Added the stupid little thing :)

Can't use imageBuffer with integer data (Compute Shader)?

26 June 2014 - 05:02 AM

Hi there,


layout (binding = 3, rgba32f) uniform imageBuffer Tpos; // compiles ok
layout (binding = 4, rgba32ui) uniform imageBuffer TpackInt; // compiler error:


0(20) : error C1318: can't apply layout(rgba32ui) to image type "imageBuffer"



It seems I can use imageBuffer only with float formats,

but the documentation gives no hint of that, and rgba32ui is listed right beside rgba32f.


Does that make any sense? I get the same error on all integer formats.

Maybe a driver bug (GTX 480, GL 4.3)?




What memory storage alternatives do I have for compute shaders?


Initially I used a struct (5 x vec4 and one ivec4) in a shader storage buffer.

That was terribly slow on random access (I need to traverse trees).


Now I've changed to storing every vec in its own image, and that's 10 times faster.

But this stupid error prevents me from doing the same in the traversal shader :(



How to center a rotated line between two angles

18 March 2013 - 04:46 PM

I want to calculate the distance d by which the blue line (at angle beta) has to be moved away from the center of the unit circle so that both alpha angles are equal.

It's not as easy as I initially thought - please help :)



How to use single channel alpha texture

28 August 2011 - 02:32 AM

I'm implementing bitmap fonts as usual:
Use RGBA Texture, R=G=B=1 and A = transparency;
set a color and render... that's ok.

But why waste memory? I want a single channel texture only containing the alpha values.
Loading this texture with GL_ALPHA instead of GL_RGBA does not work as expected; no transparency shows up.

The only thing i've got working is loading the tex as GL_LUMINANCE,
and use glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_COLOR).
This allows transparency and setting a custom color, but the text can only brighten the frame, so it's invisible on a white background.
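
The "can only brighten" effect follows directly from the blend equation. Here is a minimal CPU-side sketch of one color channel, assuming values normalized to [0, 1]; the function name blendOneOneMinusSrcColor is hypothetical and only mirrors what glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_COLOR) computes per channel:

```cpp
#include <algorithm>
#include <cassert>

// One color channel of glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_COLOR):
//   result = src * 1 + dst * (1 - src), clamped to [0, 1].
// src is the text fragment (luminance times text color), dst is the frame.
// Hypothetical helper for illustration, not part of any GL API.
double blendOneOneMinusSrcColor(double src, double dst)
{
    assert(src >= 0.0 && src <= 1.0);
    assert(dst >= 0.0 && dst <= 1.0);
    return std::min(1.0, src + dst * (1.0 - src));
}
```

Since src + dst * (1 - src) = dst + src * (1 - dst), the result is never below dst, and for a white frame (dst = 1) it is always 1 whatever the text color is.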

So my question is: why is it possible to specify a single channel texture as GL_ALPHA when the alpha component is always treated as zero? This makes no sense.
And are there any other ideas to solve the problem? Maybe there's a custom solution for OpenGL ES?

Equal + opposite torques to match rotation offset

12 February 2011 - 06:59 AM

Hi all,

I have two bodies at random orientations and a given rotation offset.
I want to calculate equal but opposite torques for both bodies to get: OrientationBodyA = OrientationBodyB * RotationOffset.
The following method works if both bodies have uniform inertias, like spheres or cubes:

1. Compute QuatA, which rotates OrientationBodyA to match OrientationBodyB * RotationOffset.
2. Convert QuatA to AngularVelocityA and then to TorqueA, using the inertia of BodyA.
3. Do the same as A->B for B->A, leading to: AngularVelocityB = -AngularVelocityA; TorqueB is calculated with B's inertia.
Now I have two torques, and with either one I can move A or B alone to reach the desired result, but I want to move both bodies.
4. Scale torques: TorqueA *= TorqueB.length / (TorqueA.length + TorqueB.length), and the mirrored version for B (TorqueB scaled by TorqueA.length over the same sum).

Now I have two torques of equal length and opposite direction. The simulation is fine and correct.
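
Step 4 can be sketched as follows. This is only an illustration under assumptions: Vec3, length, scale and balanceTorques are hypothetical names, and both scale factors are computed from the original lengths before either torque is modified.

```cpp
#include <cassert>
#include <cmath>

// Minimal hypothetical vector type, just enough for this sketch.
struct Vec3 { double x, y, z; };

double length(const Vec3& v)
{
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

Vec3 scale(const Vec3& v, double s)
{
    return { v.x * s, v.y * s, v.z * s };
}

// Step 4: scale each torque by the other torque's share of the total
// length. Both results get the length lenA * lenB / (lenA + lenB).
void balanceTorques(Vec3& torqueA, Vec3& torqueB)
{
    const double lenA = length(torqueA);
    const double lenB = length(torqueB);
    const double sum = lenA + lenB;
    assert(std::isfinite(sum));
    if (sum == 0.0) return; // both torques are zero, nothing to scale

    torqueA = scale(torqueA, lenB / sum);
    torqueB = scale(torqueB, lenA / sum);
}
```

For example, TorqueA = (2, 0, 0) and TorqueB = (-6, 0, 0) become (1.5, 0, 0) and (-1.5, 0, 0). The scaling only equalizes the lengths; the directions come out opposite only when the input torques were already antiparallel, which is the uniform inertia case.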
But if the inertias are nonuniform, the torque directions are no longer opposite because of the nonuniform inertia scaling, and the simulation starts to vibrate.
The shortest path of rotation is not the easiest one to follow, so in reality the bodies would choose another path.
Tuning the torque vector by hand, I found that it's still possible to match the offset with equal and opposite torques,
but after days of research I still don't know how to calculate them...

Thanks for any help!