# Transforming Angular Velocity

## Recommended Posts

My question: is it possible to transform multiple angular velocities so that they can be reinserted as one? My research is below:

// This works

glMultMatrixf(GEMat4FromQuaternion(quaternion3).array);

// The first two work fine but not the third. Why?
vec3 vector1 = GETransformQuaternionAndVector(quaternion1, angularVelocity1);

vec3 vector2 = GETransformQuaternionAndVector(quaternion2, angularVelocity2);

// This doesn't work
//vec3 vector3 = GETransformQuaternionAndVector(quaternion3, angularVelocity3);

vec3 angularAcceleration = GESetVector(0.0, 0.0, 0.0);

// Sending it through one angular velocity later in my motion engine

glMultMatrixf(GEMat4FromEulerAngle(angleRadiansVector).array);

Also, how do I combine multiple angularAcceleration variables? Is there an easier way to transform the angular values?

##### Share on other sites
39 minutes ago, Jonathan2006 said:

My question: is it possible to transform multiple angular velocities so that they can be reinserted as one?

Do you mean finding the average?

Could you maybe explain or sketch an idea of what you're expecting?

##### Share on other sites

That was one I didn't think of. Thanks for the averaging idea. I still cannot get it working. I am trying to mimic OpenGL's glRotate(), except I need to know the final angular velocity and acceleration vectors. The physics engine I am working on only accepts one angular velocity and one angular acceleration vector. Below I am posting more of the code:

// The Angle Radians Test (code used both by the matrix and quaternions)

// Matrix (This works and is used to test the accuracy of the quaternions)
mat4 rotateMatrix = GEMultiplyMat4(GEMat4FromRotate(angleRadiansVector1.x, 1.0, 0.0, 0.0), GEIdentityMat4());
rotateMatrix = GEMultiplyMat4(GEMat4FromRotate(angleRadiansVector2.z, 0.0, 0.0, 1.0), rotateMatrix);
rotateMatrix = GEMultiplyMat4(GEMat4FromRotate(angleRadiansVector3.x, 1.0, 0.0, 0.0), rotateMatrix);

glMultMatrixf(rotateMatrix.array);
glutWireSphere(2.0, 7.0, 7.0);

The first few lines are extra code that I was hoping I could leave out of the first post. The rotateMatrix works just like OpenGL's glRotate() and is the desired output. I am trying to transform the angular vectors the same way OpenGL does when multiplying two matrices.

##### Share on other sites

Either way, you combine the forces that cause the object to rotate: force1 + force2 + force3.

And apply that to the final acceleration from torque. Or correct me if I'm wrong (bit tired): you just add the accelerations, since acceleration is a vector.

##### Share on other sites

You'll need to transport them too if they don't apply at the same point (see Varignon's theorem).

##### Share on other sites

Yes, usually you represent angular velocity as a 3D vector. This representation is like an axis and angle rotation, but the unit axis is scaled by the angle, so vector length = angle, and it still works with negative angles. As long as those vectors are in the same space, you can add / subtract them like any vector. Otherwise transform them to a matching space, add, transform to another space, whatever.

The same is true for torque.

Due to the axis / angle relationship, you can also convert angular velocity to a quaternion, e.g. to integrate the angular velocity into body orientation, but be aware that quaternions have limitations on the maximum angle they can represent - they cannot represent multiple spins, while angular velocity as a vector can.

It's not clear to me what you're trying to achieve, but if you want to convert multiple rotations to a single angular velocity:

do all those rotations in order,

calculate the rotation from the original orientation to the resulting orientation,

convert this final rotation to axis / angle and finally to angular velocity (where your desired speed affects only the length of the vector).

Edit: Looking at your code again, it seems you know the things I wrote already. I assume your problem comes from flipping cases, similar to the problem of averaging 3 or more angles, but I'm not sure (you can test by using different values for your 3 rotations; in some cases it might work as intended).

Do you mean by 'angleRadiansVector' an axis / angle rotation where the vector length = angle?

Oh, wait - all wrong. I think your mistake is this: at the top of your snippet you do 3 rotations in a given order. Later you try to add those rotations in an order-independent way and expect the same result, but this is wrong if you think about it.

So the solution should indeed be to use the rotation between the initial orientation and the resulting orientation after the 3 rotations have been applied, as explained above.
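That "rotation between two orientations" can be sketched compactly in plain C (assumed (x, y, z, w) quaternion layout and Hamilton product; hypothetical names, not the GE library):

```c
#include <math.h>

/* Sketch: the rotation that takes orientation qInitial to orientation qTarget.
   With the convention qTarget = qDiff * qInitial, for unit quaternions
   qDiff = qTarget * conjugate(qInitial). */
typedef struct { float x, y, z, w; } quat;

quat quatConjugate(quat q)
{
    quat r = { -q.x, -q.y, -q.z, q.w }; /* inverse of a unit quaternion */
    return r;
}

quat quatMultiply(quat a, quat b) /* Hamilton product a*b */
{
    quat r;
    r.w = a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z;
    r.x = a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y;
    r.y = a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x;
    r.z = a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w;
    return r;
}

quat rotationBetween(quat qInitial, quat qTarget)
{
    return quatMultiply(qTarget, quatConjugate(qInitial));
}
```

Depending on your library's multiplication order, the operands may need to be swapped, as JoeJ notes further down.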

Edited by JoeJ

##### Share on other sites

Thanks for the posts and ideas. I already know how to get the final values for position, linear velocity, linear acceleration, and Euler angles.

I have been researching quaternions some more. If it's even possible, I would like to know the instant angular velocity and angular acceleration, which I tried to find in the code below. If it's not possible to find the instant value, I am confused as to what should be a vector and what should be a quaternion in your post above.

If it matters, I am trying to use instant quaternions. I had problems with oscillating after a perfect collision using just the original and resulting vectors (I have not tried it with quaternions though).

The C code below is as close as I can get to the real motion engine:

void rotateShapeX(mat4 *m, quat *q, vec3 *angleRadians, vec3 *angularVelocity, vec3 *angularAcceleration)
{

*m = GEMultiplyMat4(GEMat4FromRotate(angleRadians->x, 1.0, 0.0, 0.0), *m);
}

void rotateShapeZ(mat4 *m, quat *q, vec3 *angleRadians, vec3 *angularVelocity, vec3 *angularAcceleration)
{

*m = GEMultiplyMat4(GEMat4FromRotate(angleRadians->z, 0.0, 0.0, 1.0), *m);
}

void shape(void)
{
mat4 rotateMatrix, rotateMatrix1, rotateMatrix2, rotateMatrix3;
rotateMatrix = rotateMatrix1 = rotateMatrix2 = rotateMatrix3 = GEIdentityMat4();
// These will be static once it works
vec3 angularVelocityVector1 = GERadiansFromDegreesVector(GESetVector(20.0, 0.0, 0.0));
vec3 angularVelocityVector2 = GERadiansFromDegreesVector(GESetVector(0.0, 0.0, 20.0));
vec3 angularVelocityVector3 = GERadiansFromDegreesVector(GESetVector(20.0, 0.0, 0.0));
// This is here so I can test only the angular velocity
vec3 angularAccelerationVector = GESetVector(0.0, 0.0, 0.0);

static quat quaternion, quaternion1, quaternion2, quaternion3;
static char initialize = 0;

if (initialize == 0)
{
quaternion = GESetQuaternion(0.0, 0.0, 0.0, 1.0);
initialize = 1;
}

// Rotate

// Multiply matrices (works)
rotateMatrix = GEMultiplyMat4(rotateMatrix1, rotateMatrix);
rotateMatrix = GEMultiplyMat4(rotateMatrix2, rotateMatrix);
rotateMatrix = GEMultiplyMat4(rotateMatrix3, rotateMatrix);

// Multiply Quaternions? (does not work)
quaternion = GETransformQuaternions(quaternion, quaternion1);
quaternion = GETransformQuaternions(quaternion, quaternion2);
quaternion = GETransformQuaternions(quaternion, quaternion3);

vec3 angularVelocityVector = GEAxisFromQuaternion(quaternion);
// Not sure if this is correct, but I took out multiplying by globalTimeStep since it is already used in rotateShape*();

glColor3f(0.0, 1.0, 1.0);
glPushMatrix();
glTranslatef( 2.0, 0.0, 0.0);
// This spins really fast
glutWireSphere(2.0, 7.0, 7.0);
glPopMatrix();

glColor3f(1.0, 0.0, 0.0);
glPushMatrix();
glTranslatef(-2.0, 0.0, 0.0);
glMultMatrixf(rotateMatrix.array); // (Works)
glutWireSphere(2.0, 7.0, 7.0);
glPopMatrix();
}

Edited by Jonathan2006

##### Share on other sites
2 hours ago, Jonathan2006 said:

I am confused as to what should be a vector and what should be a quaternion in your post above.

Some code:

quat curOrientation = object.worldSpaceOrientation;
quat targetOrientation = curOrientation;
targetOrientation *= quat::FromAxisAndAngle(vec(1,0,0), PI*0.5f);
targetOrientation *= quat::FromAxisAndAngle(vec(0,1,0), PI*-0.3f);
targetOrientation *= quat::FromAxisAndAngle(vec(0,0,1), PI*1.5f);
// now calculate rotation from current to target
quat diff;
quat &qA = curOrientation; quat &qB = targetOrientation;
{
quat q;
if (qA.Dot(qB) < 0.0f) // use dot product so we pick the shortest arc rotation, and invert either axis or angular part
{
q[0] = qA[0]; q[1] = qA[1]; q[2] = qA[2];
q[3] = -qA[3];
}
else
{
q[0] = -qA[0]; q[1] = -qA[1]; q[2] = -qA[2];
q[3] = qA[3];
}

diff = qB * q;
}
// now convert diff to angular velocity
float targetDuration = timestep;
vec3 angVel (0,0,0);
{
// this is quaternion to axis / angle conversion as you'd expect, but with a somewhat confusing 'fix close to zero angle issues' approach.
const float matchTolerance = 0.0001f;
const float faultTolerance = 0.0005f;
quat q = diff;
float sql = q[0]*q[0] + q[1]*q[1] + q[2]*q[2]; // squared length of the vector part
if (sql > (matchTolerance * matchTolerance))
{
float length = sqrt (sql);
if (q[3] < -1) q[3] = -1;
if (q[3] >  1) q[3] =  1;
vec3 axis (q[0] / length, q[1] / length, q[2] / length); // unit rotation axis
angVel = axis * (2.0 * acos(q[3])); // axis * angle
if (length < faultTolerance) angVel *= (length - matchTolerance) / (faultTolerance - matchTolerance);
angVel /= targetDuration; // angle over duration = angular speed
}
}


If we integrate angVel for targetDuration, we reach targetOrientation as desired, but we won't have zero velocity, so the body will keep spinning and overshoot the target. The easiest way to avoid this is to recalculate angVel every step but then integrate only a fraction of it, like angVel*0.3. This way the body will come to rest at the desired orientation, but we are no longer able to tell exactly how long it will take.
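The "integrate only a fraction each step" idea can be sketched with a 1D stand-in (a single value instead of a quaternion; the same pattern applies to the angVel code above):

```c
#include <math.h>

/* Recompute the full correction each step, then apply only a fraction of it.
   The value approaches the target without overshooting, but the arrival
   time is no longer exact. */
float approachTarget(float current, float target, float fraction)
{
    float diff = target - current; /* full correction needed this step */
    return current + diff * fraction;
}
```

Each step covers a constant fraction of the remaining distance, so the error decays geometrically and never changes sign.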

I think my math lib has an unusual rotation order, so where I write diff = qB * q, you may need to write diff = q * qB instead.

##### Share on other sites
2 hours ago, Jonathan2006 said:

vec3 angularVelocityVector = GEAxisFromQuaternion(quaternion); // Not sure if this is correct but I took out multipling globalTimeStep since it is already used in rotateShape*();

This seems wrong (you need to use the angle and time as well); it should be:

vec3 angularVelocityVector = GEMultiplyVectorAndScalar(GEAxisFromQuaternion(quaternion), GEAngleFromQuaternion(quaternion) / targetDuration);

which would be the replacement for my confusing quat to angular velocity code.

By using the timestep for the duration, as I did, you do the rotation as fast as possible - of course you can do it slower as well.

2 hours ago, Jonathan2006 said:

Not sure what you intend with this.

2 hours ago, Jonathan2006 said:

Looks wrong too. You cannot add angular velocity to Euler angles. Euler angles are always a series of 3 rotations in order, while angular velocity encodes a single rotation about its unit direction, with the speed encoded in its length.

I guess this code is an attempt to integrate angular velocity? Here is something simple I have; hope it helps:

void IntegrateAngularVelocity (const sQuat &curOrn, const sVec3 &angvel, float timeStep, sQuat &predictedOrn)
{
sVec3 axis;
float fAngle = angvel.Length();

if ( fAngle < float(0.00001) )
{
predictedOrn = curOrn;
return;
}
else
{
axis = angvel * (sin (fAngle * timeStep * float(0.5)) / fAngle);
}
sQuat dorn (axis.x(), axis.y(), axis.z(), cos (fAngle * timeStep * float(0.5)));
predictedOrn = dorn * curOrn;
predictedOrn.Normalize();
}


##### Share on other sites

Thank you again for all the help! I am still trying to wrap my head around quaternions. I just thought of something: even if I use something like quatCur - quatOld, should the angular oscillating stop if I normalize the quaternion like you did in your code? I think I understand your last post on quaternions, but I am getting weird acceleration. I also added the time step back into the last integrator (hope that's the name). The reason I use the last angleEulerRadians and angularVelocityVector (integrator?) is to only test the final angular velocity. I created a gif of the rotations below. Also, here's the code I am using:

void integrateAngularVelocity(const quat curOrn, const vec3 angvel, const float timeStep, quat *predictedOrn)
{
vec3 axis;
float fAngle = GEMagnitudeVector(angvel);

if (fAngle < 0.00001)
{
*predictedOrn = curOrn;
return;
}
else
{
axis = GEMultiplyVectorAndScalar(angvel, sinf(fAngle * timeStep * 0.5) / fAngle);
}

quat dorn = GESetQuaternion(axis.x, axis.y, axis.z, cosf(fAngle * timeStep * 0.5));
*predictedOrn = GEMultiplyQuaternions(dorn, curOrn);
*predictedOrn = GENormalizeQuaternion(*predictedOrn);
}

void Shape(void)
{
vec3 angularVelocityVector1 = GERadiansFromDegreesVector(GESetVector(20.0, 0.0, 0.0));
vec3 angularVelocityVector2 = GERadiansFromDegreesVector(GESetVector(0.0, 0.0, 20.0));
vec3 angularVelocityVector3 = GERadiansFromDegreesVector(GESetVector(20.0, 0.0, 0.0));

static quat quaternion = {0.0, 0.0, 0.0, 1.0}; // identity; an all-zero quat would be invalid

integrateAngularVelocity(quaternion, angularVelocityVector1, globalTimeStep, &quaternion);
integrateAngularVelocity(quaternion, angularVelocityVector2, globalTimeStep, &quaternion);
integrateAngularVelocity(quaternion, angularVelocityVector3, globalTimeStep, &quaternion);

vec3 angularVelocityVector = GEMultiplyVectorAndScalar(GEAxisFromQuaternion(quaternion), GEAngleFromQuaternion(quaternion));

glColor3f(0.0, 1.0, 1.0);
glPushMatrix();
glTranslatef( 2.0, 0.0, 0.0);
// This spins really fast
glutWireSphere(2.0, 7.0, 7.0);
glPopMatrix();
}

It's not the best quality, but the red sphere is the correct rotation and the cyan is the quaternion code above.

##### Share on other sites

You should post the code that produces the red sphere too, so I can figure out what you're trying to do.

I assume you are trying to match physics to some animation - why?

Also, if you control a physical object by changing its velocity, you typically do not care about acceleration. You care about acceleration only if you control the object by torque, e.g. for a question like "I want an angular velocity of (10,0,0) - what torque do I need to achieve this?" Here acceleration matters, but by setting velocity directly you bypass any current physical state and just change it.

(I assume that's one cause of what you mean with oscillation)
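For the torque question above, a rough sketch (assuming a diagonal inertia tensor in body space; hypothetical names, not part of any engine):

```c
/* torque = I * (change in angular velocity) / dt, per component for a
   diagonal inertia tensor (Ixx, Iyy, Izz) in body space. */
typedef struct { float x, y, z; } vec3;

vec3 torqueForTargetVelocity(vec3 currentW, vec3 targetW, vec3 inertiaDiag, float dt)
{
    vec3 t;
    t.x = inertiaDiag.x * (targetW.x - currentW.x) / dt;
    t.y = inertiaDiag.y * (targetW.y - currentW.y) / dt;
    t.z = inertiaDiag.z * (targetW.z - currentW.z) / dt;
    return t;
}
```

Applying this torque for one step of length dt would (ignoring gyroscopic terms) bring the body from currentW to targetW.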

At this line:

vec3 angularVelocityVector = GEMultiplyVectorAndScalar(GEAxisFromQuaternion(quaternion), GEAngleFromQuaternion(quaternion));

You calculate velocity from orientation after rotating 3 times. This makes no sense. You should calculate it from a rotation.

Note the difference between both terms (comparing angular to linear stuff here):

Orientation is like position or a point, describing state in space.

Rotation is like a displacement or vector, describing a difference between two orientations, or how orientation changes if we rotate it.

Differentiating between those two terms when thinking about a problem helps a lot in dealing with 3D rotations. I do not differentiate between 'point' and 'vector' - that's always easy - but for 'orientation' vs. 'rotation' it really helps.

So I assume you want to calculate a single rotation that has the same effect as applying your 3 hardcoded rotations in order, and then you want to convert this single rotation to an angular velocity so a physical body matches the result. Is that correct? This is what my 'some code' snippet does.

angleEulerRadians = GEAddVectors(angleEulerRadians, GEMultiplyVectorAndScalar(angularVelocityVector, globalTimeStep));

This makes no sense. If you need to convert to Euler angles you need a function that does the conversion (worth its own topic), but I assume you don't need them, so avoid them and do yourself a favour.

Give your final orientation to glLoadMatrix,

or give a rotation to glMultMatrix or glRotate (the latter using axis and angle, not Euler angles)

(assuming you work in global space and you are not inside a transform hierarchy or something)

##### Share on other sites
3 hours ago, Jonathan2006 said:

I just thought of something. Even if I use something like quatCur - quatOld should the angular oscillating stop if I normalize the quaternion like you did in your code?

No, actually this normalization has no effect and should be removed.

3 hours ago, Jonathan2006 said:

I am still trying to wrap my head around quaternions.

You and everybody else. Thinking of it as an axis and angle rotation is good enough to work with them. It's more important to understand what you can do with the various rotation representations, e.g.:

Quat: Easy to interpolate, easy to convert to axis and angle

Rotation Vector (axis times angle): You can sum multiple rotations, no angle limit

Matrix: You can also represent scale and skew

Euler angles: Good for numerical user interaction

##### Share on other sites

The red wire sphere code is the “mat4 rotateMatrix” that I posted further above. Thank you JoeJ for your time. I just don’t know enough to figure this out. Below are some things I have learned.

After skimming through some game developer physics books, euclideanspace.com, and your posts, here's what I think I have learned so far...
1) Axis Angle can be used to represent the length and direction of angular velocity/rotations in quaternions and matrices.
2) Multiplying two quaternions is just like multiplying two matrices.
3) Angular velocity can be represented in a 3x3 skew matrix.

What went wrong
1) I could never get any of my axis angle functions to work with more than one integrator.
2) Not sure about multiplying quaternions since I could never get the correct output after multiplying.
3) I tried multiplying several skew matrices together and the final matrix had all zero values.

I used the GLM library for most of my quaternion and matrix code but I was never really able to debug the axis angle quaternions. I tried axis angle matrices from euclideanspace.com and I did find a bug in the GLM code that I fixed in my code. The output of angular velocity in the matrices never came out correctly after I added more than one integrator.

Since nothing worked I will have to use my old unreliable code below:

Hope this helps anyone else with this problem.

Thanks again,
Jonathan

Edited by Jonathan2006

##### Share on other sites
10 hours ago, Jonathan2006 said:

1) Axis Angle can be used to represent the length and direction of angular velocity/rotations in quaternions and matrices.

Axis and Angle: the axis always has unit length.

"Rotation Vector" = axis * angle (so just 3 numbers instead of 4). Unfortunately I don't know a common name for this - "Rotation Vector" is just my own term.

Both of them can represent angular velocity because both support multiple revolutions; the Rotation Vector is preferred because you can sum up multiple velocities before you perform any rotation.

10 hours ago, Jonathan2006 said:

2) Multiplying two quaternions is just like multiplying two matrices.

Yes (as long as they both represent a rotation / orientation. An exception is e.g. a scaled or skewed matrix; quats can't handle this.)

10 hours ago, Jonathan2006 said:

3) Angular velocity can be represented in a 3x3 skew matrix.

Don't know.

10 hours ago, Jonathan2006 said:

1) I could never get any of my axis angle functions to work with more than one integrator.

You typically use only one integrator: you sum up all forces that affect an object, calculate the velocity and integrate it for the time duration you want. In your case it's easier, as you don't deal with forces, I guess.

I see you integrated 3 times in your code, but that's wrong - you want one single rotation that does the same as your 3 hardcoded rotations, so one single angular velocity calculated from the 3 ordered rotations to integrate.

I would do this for example for a problem like this: I have animation data with Euler angles (each of which is 3 ordered rotations as well), and I need to feed the physics engine of a game so a simulated rigid body matches the animation. I assume your problem is equivalent, and all my response is based on that assumption. (Why else would you want to calculate angular velocity? You don't tell me....)

10 hours ago, Jonathan2006 said:

2) Not sure about multiplying quaternions since I could never get the correct output after multiplying.

If you have some rotation matrices and multiply them in order, the result must be the same as if you first convert each matrix to a quat and multiply those in the same order. Check this to be sure your math lib works as expected.
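That sanity check could be sketched like this (plain C with an assumed row-major 3x3 matrix, (x, y, z, w) unit quaternions, and Hamilton product; not the GE library):

```c
#include <math.h>

/* Verify that multiplying two rotation matrices matches converting the
   quaternion product of the same two rotations to a matrix. */
typedef struct { float x, y, z, w; } quat;
typedef struct { float m[9]; } mat3; /* row-major */

quat quatMultiply(quat a, quat b) /* Hamilton product a*b */
{
    quat r;
    r.w = a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z;
    r.x = a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y;
    r.y = a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x;
    r.z = a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w;
    return r;
}

mat3 mat3FromQuat(quat q) /* standard rotation matrix of a unit quaternion */
{
    mat3 r;
    r.m[0] = 1 - 2*(q.y*q.y + q.z*q.z); r.m[1] = 2*(q.x*q.y - q.z*q.w); r.m[2] = 2*(q.x*q.z + q.y*q.w);
    r.m[3] = 2*(q.x*q.y + q.z*q.w); r.m[4] = 1 - 2*(q.x*q.x + q.z*q.z); r.m[5] = 2*(q.y*q.z - q.x*q.w);
    r.m[6] = 2*(q.x*q.z - q.y*q.w); r.m[7] = 2*(q.y*q.z + q.x*q.w); r.m[8] = 1 - 2*(q.x*q.x + q.y*q.y);
    return r;
}

mat3 mat3Multiply(mat3 a, mat3 b)
{
    mat3 r;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
        {
            r.m[3*i + j] = 0.0f;
            for (int k = 0; k < 3; ++k)
                r.m[3*i + j] += a.m[3*i + k] * b.m[3*k + j];
        }
    return r;
}

/* returns max absolute element difference between the two paths;
   should be near zero for a consistent math lib */
float checkConsistency(quat a, quat b)
{
    mat3 viaMat = mat3Multiply(mat3FromQuat(a), mat3FromQuat(b));
    mat3 viaQuat = mat3FromQuat(quatMultiply(a, b));
    float maxDiff = 0.0f;
    for (int i = 0; i < 9; ++i)
    {
        float d = fabsf(viaMat.m[i] - viaQuat.m[i]);
        if (d > maxDiff) maxDiff = d;
    }
    return maxDiff;
}
```

If this check fails for your own library, the quaternion multiplication order or the matrix convention is the first thing to suspect.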

10 hours ago, Jonathan2006 said:

3) I tried multiplying several skew matrices together and the final matrix had all zero values.

You'd need to look at the numbers; probably there are zero rows or columns that cause this.

10 hours ago, Jonathan2006 said:

Since nothing worked I will have to use my old unreliable code below:

Hope this helps anyone else with this problem.

Noooo!

Angular velocity is not the difference between Euler angles. This may work in special cases, e.g. when two angles are zero or cancel each other out due to gimbal lock, but in general it's terribly wrong. The reason is that Euler angles have a specified order; the angle numbers alone don't carry that order, yet you can't interpret them without defining it. Taking the difference ignores all this. (I remember a game physics book that made the same mistake, but not its title. If you have it, burn it.)

Rule of thumb: avoid using Euler angles unless you have a very good reason.

Understanding 3D rotations requires practice and is hard to communicate. To get better help, you need to describe very precisely what your problem is in practice, and why you need to do conversions from a reference to something else.

Edited by JoeJ

## Create an account

Register a new account

• 11
• 11
• 10
• 11
• 12
• ### Similar Content

• Hi, I'm on Rastertek series 42, soft shadows, which uses a blur shader and runs extremely slow.
http://www.rastertek.com/dx11tut42.html
He obnoxiously states that there are many ways to optimize his blur shader, but gives you no idea how to do it.
The way he does it is :
1. Project the objects in the scene to a render target using the depth shader.
2. Draw black and white shadows on another render target using those depth textures.
3. Blur the black/white shadow texture produced in step 2 by
a) rendering it to a smaller texture
b) vertical / horizontal blurring that texture
c) rendering it back to a bigger texture again.
4. Send the blurred shadow texture into the final shader, which samples its black/white values to determine light intensity.

So this uses a ton of render textures, and I just added more than one light, which multiplies the render textures required.

Is there any easy way I can optimize the super expensive blur shader that wouldnt require a whole new complicated system?
Like combining any of these render textures into one for example?

If you know of any easy way not requiring too many changes, please let me know, as I already had a really hard time
understanding the way this works, so a super complicated change would be beyond my capacity. Thanks.

*For reference, here is my repo, in which I have simplified his tutorial and added an additional light.

• I have never quite been a master of the d3d9 blend modes.. I know the basic stuff, but have been trying for a while to get a multiply/add blending mode... the best I can figure out is mult2x by setting:
SetRenderState(D3DRS_DESTBLEND, D3DBLEND_SRCCOLOR);
SetRenderState(D3DRS_SRCBLEND, D3DBLEND_DESTCOLOR);
//this isn't quite what I want.. basically I wonder if there is a way to like multiply by any color darker than 0.5 and add by any color lighter than that..? I don't know, maybe this system is too limited...
• By EddieK
Hello. I'm trying to make an android game and I have come across a problem. I want to draw different map layers at different Z depths so that some of the tiles are drawn above the player while others are drawn under him. But there's an issue where the pixels with alpha drawn above the player. This is the code i'm using:
int setup(){ GLES20.glEnable(GLES20.GL_DEPTH_TEST); GLES20.glEnable(GL10.GL_ALPHA_TEST); GLES20.glEnable(GLES20.GL_TEXTURE_2D); } int render(){ GLES20.glClearColor(0, 0, 0, 0); GLES20.glClear(GLES20.GL_ALPHA_BITS); GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT); GLES20.glClear(GLES20.GL_DEPTH_BUFFER_BIT); GLES20.glBlendFunc(GLES20.GL_ONE, GL10.GL_ONE_MINUS_SRC_ALPHA); // do the binding of textures and drawing vertices } My vertex shader:
uniform mat4 MVPMatrix; // model-view-projection matrix uniform mat4 projectionMatrix; attribute vec4 position; attribute vec2 textureCoords; attribute vec4 color; attribute vec3 normal; varying vec4 outColor; varying vec2 outTexCoords; varying vec3 outNormal; void main() { outNormal = normal; outTexCoords = textureCoords; outColor = color; gl_Position = MVPMatrix * position; } My fragment shader:
precision highp float; uniform sampler2D texture; varying vec4 outColor; varying vec2 outTexCoords; varying vec3 outNormal; void main() { vec4 color = texture2D(texture, outTexCoords) * outColor; gl_FragColor = vec4(color.r,color.g,color.b,color.a);//color.a); } I have attached a picture of how it looks. You can see the black squares near the tree. These squares should be transparent as they are in the png image:

Its strange that in this picture instead of alpha or just black color it displays the grass texture beneath the player and the tree:

Any ideas on how to fix this?

• This article uses material originally posted on Diligent Graphics web site.
Introduction
Graphics APIs have come a long way from small set of basic commands allowing limited control of configurable stages of early 3D accelerators to very low-level programming interfaces exposing almost every aspect of the underlying graphics hardware. Next-generation APIs, Direct3D12 by Microsoft and Vulkan by Khronos are relatively new and have only started getting widespread adoption and support from hardware vendors, while Direct3D11 and OpenGL are still considered industry standard. New APIs can provide substantial performance and functional improvements, but may not be supported by older hardware. An application targeting wide range of platforms needs to support Direct3D11 and OpenGL. New APIs will not give any advantage when used with old paradigms. It is totally possible to add Direct3D12 support to an existing renderer by implementing Direct3D11 interface through Direct3D12, but this will give zero benefits. Instead, new approaches and rendering architectures that leverage flexibility provided by the next-generation APIs are expected to be developed.
There are at least four APIs (Direct3D11, Direct3D12, OpenGL/GLES, Vulkan, plus Apple's Metal for iOS and osX platforms) that a cross-platform 3D application may need to support. Writing separate code paths for all APIs is clearly not an option for any real-world application and the need for a cross-platform graphics abstraction layer is evident. The following is the list of requirements that I believe such layer needs to satisfy:
Lightweight abstractions: the API should be as close to the underlying native APIs as possible to allow an application leverage all available low-level functionality. In many cases this requirement is difficult to achieve because specific features exposed by different APIs may vary considerably. Low performance overhead: the abstraction layer needs to be efficient from performance point of view. If it introduces considerable amount of overhead, there is no point in using it. Convenience: the API needs to be convenient to use. It needs to assist developers in achieving their goals not limiting their control of the graphics hardware. Multithreading: ability to efficiently parallelize work is in the core of Direct3D12 and Vulkan and one of the main selling points of the new APIs. Support for multithreading in a cross-platform layer is a must. Extensibility: no matter how well the API is designed, it still introduces some level of abstraction. In some cases the most efficient way to implement certain functionality is to directly use native API. The abstraction layer needs to provide seamless interoperability with the underlying native APIs to provide a way for the app to add features that may be missing. Diligent Engine is designed to solve these problems. Its main goal is to take advantages of the next-generation APIs such as Direct3D12 and Vulkan, but at the same time provide support for older platforms via Direct3D11, OpenGL and OpenGLES. Diligent Engine exposes common C++ front-end for all supported platforms and provides interoperability with underlying native APIs. It also supports integration with Unity and is designed to be used as graphics subsystem in a standalone game engine, Unity native plugin or any other 3D application. Full source code is available for download at GitHub and is free to use.
Overview
Diligent Engine API takes some features from Direct3D11 and Direct3D12 as well as introduces new concepts to hide certain platform-specific details and make the system easy to use. It contains the following main components:
Render device (IRenderDevice  interface) is responsible for creating all other objects (textures, buffers, shaders, pipeline states, etc.).
Device context (IDeviceContext interface) is the main interface for recording rendering commands. Similar to Direct3D11, there are immediate context and deferred contexts (which in Direct3D11 implementation map directly to the corresponding context types). Immediate context combines command queue and command list recording functionality. It records commands and submits the command list for execution when it contains sufficient number of commands. Deferred contexts are designed to only record command lists that can be submitted for execution through the immediate context.
An alternative way to design the API would be to expose command queue and command lists directly. This approach however does not map well to Direct3D11 and OpenGL. Besides, some functionality (such as dynamic descriptor allocation) can be much more efficiently implemented when it is known that a command list is recorded by a certain deferred context from some thread.
The approach taken in the engine does not limit scalability as the application is expected to create one deferred context per thread, and internally every deferred context records a command list in lock-free fashion. At the same time this approach maps well to older APIs.
In current implementation, only one immediate context that uses default graphics command queue is created. To support multiple GPUs or multiple command queue types (compute, copy, etc.), it is natural to have one immediate contexts per queue. Cross-context synchronization utilities will be necessary.
Swap Chain (ISwapChain interface). Swap chain interface represents a chain of back buffers and is responsible for showing the final rendered image on the screen.
Render device, device contexts and swap chain are created during the engine initialization.
Resources (ITexture and IBuffer interfaces). There are two types of resources - textures and buffers. There are many different texture types (2D textures, 3D textures, texture array, cubmepas, etc.) that can all be represented by ITexture interface.
Resources Views (ITextureView and IBufferView interfaces). While textures and buffers are mere data containers, texture views and buffer views describe how the data should be interpreted. For instance, a 2D texture can be used as a render target for rendering commands or as a shader resource.
Pipeline State (IPipelineState interface). GPU pipeline contains many configurable stages (depth-stencil, rasterizer and blend states, different shader stage, etc.). Direct3D11 uses coarse-grain objects to set all stage parameters at once (for instance, a rasterizer object encompasses all rasterizer attributes), while OpenGL contains myriad functions to fine-grain control every individual attribute of every stage. Both methods do not map very well to modern graphics hardware that combines all states into one monolithic state under the hood. Direct3D12 directly exposes pipeline state object in the API, and Diligent Engine uses the same approach.
Shader Resource Binding (IShaderResourceBinding interface). Shaders are programs that run on the GPU. Shaders may access various resources (textures and buffers), and setting correspondence between shader variables and actual resources is called resource binding. Resource binding implementation varies considerably between different API. Diligent Engine introduces a new object called shader resource binding that encompasses all resources needed by all shaders in a certain pipeline state.
API Basics
Creating Resources
Device resources are created by the render device. The two main resource types are buffers, which represent linear memory, and textures, which use memory layouts optimized for fast filtering. Graphics APIs usually have a native object that represents a linear buffer. Diligent Engine uses the IBuffer interface as an abstraction for a native buffer. To create a buffer, one needs to populate the BufferDesc structure and call the IRenderDevice::CreateBuffer() method, as in the following example:
BufferDesc BuffDesc;
BuffDesc.Name           = "Uniform buffer";
BuffDesc.BindFlags      = BIND_UNIFORM_BUFFER;
BuffDesc.Usage          = USAGE_DYNAMIC;
BuffDesc.uiSizeInBytes  = sizeof(ShaderConstants);
BuffDesc.CPUAccessFlags = CPU_ACCESS_WRITE;
m_pDevice->CreateBuffer(BuffDesc, BufferData(), &m_pConstantBuffer);

While there is usually just one buffer object, different APIs use very different approaches to represent textures. For instance, in Direct3D11 there are ID3D11Texture1D, ID3D11Texture2D and ID3D11Texture3D objects. In OpenGL, there is an individual object for every texture dimension (1D, 2D, 3D, Cube), each of which may be a texture array and may also be multisampled (e.g. GL_TEXTURE_2D_MULTISAMPLE_ARRAY). As a result, there are nine different GL texture types that Diligent Engine may create under the hood. In Direct3D12, there is only one resource interface. Diligent Engine hides all these details behind the ITexture interface. There is a single IRenderDevice::CreateTexture() method that is capable of creating all texture types. Dimension, format, array size and all other parameters are specified by the members of the TextureDesc structure:
TextureDesc TexDesc;
TexDesc.Name      = "My texture 2D";
TexDesc.Type      = TEXTURE_TYPE_2D;
TexDesc.Width     = 1024;
TexDesc.Height    = 1024;
TexDesc.Format    = TEX_FORMAT_RGBA8_UNORM;
TexDesc.Usage     = USAGE_DEFAULT;
TexDesc.BindFlags = BIND_SHADER_RESOURCE | BIND_RENDER_TARGET | BIND_UNORDERED_ACCESS;
m_pRenderDevice->CreateTexture(TexDesc, TextureData(), &m_pTestTex);

If the native API supports multithreaded resource creation, textures and buffers can be created by multiple threads simultaneously.
Interoperability with the native API provides access to the native buffer/texture objects and also allows creating Diligent Engine objects from native handles, so applications can seamlessly integrate native API-specific code with Diligent Engine.
Next-generation APIs allow fine-grained control over how resources are allocated. Diligent Engine does not currently expose this functionality, but it can be added by implementing an IResourceAllocator interface that encapsulates the specifics of resource allocation and providing this interface to the CreateBuffer() or CreateTexture() methods. If null is provided, the default allocator is used.
Initializing the Pipeline State
As mentioned earlier, Diligent Engine follows the next-gen APIs to configure the graphics/compute pipeline. One big Pipeline State Object (PSO) encompasses all required states (all shader stages, input layout description, depth-stencil, rasterizer and blend state descriptions, etc.). This approach maps directly to Direct3D12/Vulkan, but is also beneficial for older APIs as it eliminates pipeline misconfiguration errors. With many individual calls tweaking various GPU pipeline settings, it is very easy to forget to set one of the states or to assume a stage is already properly configured when in fact it is not. Using a pipeline state object helps avoid these problems because all stages are configured at once.
While in earlier APIs shaders were bound separately, in the next-generation APIs as well as in Diligent Engine shaders are part of the pipeline state object. The biggest challenge when authoring shaders is that Direct3D and OpenGL/Vulkan use different shader languages (while Apple uses yet another language in its Metal API). Maintaining two versions of every shader is not an option for real applications, so Diligent Engine implements a shader source code converter that allows shaders authored in HLSL to be translated to GLSL. To create a shader, one needs to populate the ShaderCreationAttribs structure. The SourceLanguage member of this structure tells the system which language the shader is authored in:
- SHADER_SOURCE_LANGUAGE_DEFAULT - the shader source language matches the underlying graphics API: HLSL for Direct3D11/Direct3D12 modes, and GLSL for OpenGL and OpenGLES modes.
- SHADER_SOURCE_LANGUAGE_HLSL - the shader source is in HLSL. For OpenGL and OpenGLES modes, the source code will be converted to GLSL.
- SHADER_SOURCE_LANGUAGE_GLSL - the shader source is in GLSL. There is currently no GLSL-to-HLSL converter, so this value should only be used for OpenGL and OpenGLES modes.

There are two ways to provide the shader source code. The first is to use the Source member. The second is to provide a file path in the FilePath member. Since the engine is entirely decoupled from the platform and the host file system is platform-dependent, the structure exposes the pShaderSourceStreamFactory member, which gives the engine access to the file system. If FilePath is provided, a shader source stream factory must also be provided. If the shader source contains any #include directives, the source stream factory will also be used to load these files. The engine provides a default implementation for every supported platform that should be sufficient in most cases; a custom implementation can be provided when needed.
When sampling a texture in a shader, the texture sampler was traditionally specified as a separate object that was bound to the pipeline at run time or set as part of the texture object itself. However, in most cases it is known beforehand what kind of sampler will be used in the shader. Next-generation APIs expose a new type of sampler called a static sampler that can be initialized directly in the pipeline state. Diligent Engine exposes this functionality: when creating a shader, textures can be assigned static samplers. If a static sampler is assigned, it will always be used instead of the one initialized in the texture shader resource view. To initialize static samplers, prepare an array of StaticSamplerDesc structures and initialize the StaticSamplers and NumStaticSamplers members. Static samplers are more efficient, and it is highly recommended to use them whenever possible. On older APIs, static samplers are emulated via generic sampler objects.
The following is an example of shader initialization:
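The example itself appears to have been lost from this page. A sketch of what such initialization might look like, assembled from the ShaderCreationAttribs members described above (the exact field and constant names are assumptions, so consult the actual engine headers):

```cpp
ShaderCreationAttribs CreationAttribs;
// Shader is authored in HLSL; in OpenGL modes it will be converted to GLSL
CreationAttribs.SourceLanguage  = SHADER_SOURCE_LANGUAGE_HLSL;
CreationAttribs.Desc.ShaderType = SHADER_TYPE_PIXEL;
CreationAttribs.EntryPoint      = "main";
CreationAttribs.FilePath        = "MyPixelShader.psh";
// Give the engine access to the host file system (also used for #include)
CreationAttribs.pShaderSourceStreamFactory = pShaderSourceFactory;

// Classify variables by expected update frequency
ShaderVariableDesc VarDesc[] =
{
    {"g_tex2DDiffuse",  SHADER_VARIABLE_TYPE_MUTABLE},
    {"cbRandomAttribs", SHADER_VARIABLE_TYPE_DYNAMIC}
};
CreationAttribs.Desc.VariableDesc = VarDesc;
CreationAttribs.Desc.NumVariables = _countof(VarDesc);

// Assign a static sampler to the diffuse texture
StaticSamplerDesc StaticSampler;
StaticSampler.TextureName    = "g_tex2DDiffuse";
StaticSampler.Desc.MinFilter = FILTER_TYPE_LINEAR;
StaticSampler.Desc.MagFilter = FILTER_TYPE_LINEAR;
StaticSampler.Desc.MipFilter = FILTER_TYPE_LINEAR;
CreationAttribs.Desc.StaticSamplers    = &StaticSampler;
CreationAttribs.Desc.NumStaticSamplers = 1;

m_pDevice->CreateShader(CreationAttribs, &pPixelShader);
```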
Creating the Pipeline State Object
After all required shaders are created, the remaining fields of the PipelineStateDesc structure provide depth-stencil, rasterizer and blend state descriptions, the number and formats of render targets, the input layout format, etc. For instance, the rasterizer state can be described as follows:
PipelineStateDesc PSODesc;
RasterizerStateDesc &RasterizerDesc = PSODesc.GraphicsPipeline.RasterizerDesc;
RasterizerDesc.FillMode              = FILL_MODE_SOLID;
RasterizerDesc.CullMode              = CULL_MODE_NONE;
RasterizerDesc.FrontCounterClockwise = True;
RasterizerDesc.ScissorEnable         = True;
RasterizerDesc.AntialiasedLineEnable = False;

Depth-stencil and blend states are defined in a similar fashion.
Another important thing that the pipeline state object encompasses is the input layout description, which defines how inputs to the vertex shader (the very first shader stage) are read from memory. The input layout may define several vertex streams that contain values of different formats and sizes:
// Define input layout
InputLayoutDesc &Layout = PSODesc.GraphicsPipeline.InputLayout;
LayoutElement TextLayoutElems[] =
{
    LayoutElement( 0, 0, 3, VT_FLOAT32, False ),
    LayoutElement( 1, 0, 4, VT_UINT8,   True ),
    LayoutElement( 2, 0, 2, VT_FLOAT32, False ),
};
Layout.LayoutElements = TextLayoutElems;
Layout.NumElements    = _countof( TextLayoutElems );

Finally, the pipeline state defines the primitive topology type. When all required members are initialized, a pipeline state object can be created by the IRenderDevice::CreatePipelineState() method:
// Define shaders and primitive topology
PSODesc.GraphicsPipeline.PrimitiveTopologyType = PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE;
PSODesc.GraphicsPipeline.pVS = pVertexShader;
PSODesc.GraphicsPipeline.pPS = pPixelShader;
PSODesc.Name = "My pipeline state";
m_pDev->CreatePipelineState(PSODesc, &m_pPSO);

When a PSO object is bound to the pipeline, the engine invokes all API-specific commands to set the states specified by the object. In the case of Direct3D12 this maps directly to setting the D3D12 PSO object. In the case of Direct3D11, this involves setting individual state objects (such as rasterizer and blend states), shaders, input layout, etc. In the case of OpenGL, this requires a number of fine-grain state-tweaking calls. Diligent Engine keeps track of the currently bound states and only calls functions to update those states that have actually changed.
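The redundant-call elimination mentioned above can be sketched as a simple cache (hypothetical names, not the engine's internals): a native state change is issued only when the requested state differs from the one currently bound.

```cpp
#include <cstdint>

// Hypothetical sketch of redundant-state filtering, not actual engine code.
struct StateCache {
    uint32_t BoundPSOId     = 0; // 0 == nothing bound yet
    int      NumBackendCalls = 0; // counts real API-level state changes

    // Returns true if a native API call was actually issued.
    bool SetPipelineState(uint32_t psoId) {
        if (psoId == BoundPSOId)
            return false;       // state unchanged: skip the native call
        BoundPSOId = psoId;
        ++NumBackendCalls;      // here the real API calls would be made
        return true;
    }
};
```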
Direct3D11 and OpenGL utilize fine-grain resource binding models, where an application binds individual buffers and textures to certain shader or program resource binding slots. Direct3D12 uses a very different approach, where resource descriptors are grouped into tables, and an application can bind all resources in a table at once by setting the table in the command list. The resource binding model in Diligent Engine is designed to leverage this new method. It introduces a new object called shader resource binding that encapsulates all resource bindings required by all shaders in a certain pipeline state. It also introduces a classification of shader variables based on the expected frequency of change, which helps the engine group them into tables under the hood:
- Static variables (SHADER_VARIABLE_TYPE_STATIC) are expected to be set only once. They may not be changed once a resource is bound to the variable. Such variables are intended to hold global constants such as camera attribute or global light attribute constant buffers.
- Mutable variables (SHADER_VARIABLE_TYPE_MUTABLE) define resources that are expected to change on a per-material frequency. Examples include diffuse textures, normal maps, etc.
- Dynamic variables (SHADER_VARIABLE_TYPE_DYNAMIC) are expected to change frequently and randomly.

The shader variable type must be specified during shader creation by populating an array of ShaderVariableDesc structures and initializing the ShaderCreationAttribs::Desc::VariableDesc and ShaderCreationAttribs::Desc::NumVariables members (see the example of shader creation above).
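The grouping can be illustrated with a small sketch (hypothetical types, not the engine's internals): variables are sorted into one table per type, mirroring how descriptors can be grouped into tables and committed together.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: group shader variables into one table per type.
enum ShaderVariableType { VARIABLE_STATIC, VARIABLE_MUTABLE, VARIABLE_DYNAMIC };

struct VariableDesc {
    std::string        Name;
    ShaderVariableType Type;
};

// Build one table per variable type from the declared variables.
std::map<ShaderVariableType, std::vector<std::string>>
GroupByType(const std::vector<VariableDesc>& vars) {
    std::map<ShaderVariableType, std::vector<std::string>> tables;
    for (const auto& v : vars)
        tables[v.Type].push_back(v.Name);
    return tables;
}
```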
Static variables cannot be changed once a resource is bound to the variable; they are bound directly to the shader object. For instance, a shadow map texture is not expected to change after it is created, so it can be bound directly to the shader. Mutable and dynamic variables are bound via a shader resource binding object (SRB), which is created from the pipeline state:

m_pPSO->CreateShaderResourceBinding(&m_pSRB);

Note that an SRB is only compatible with the pipeline state it was created from. The SRB object inherits all static bindings from the shaders in the pipeline, but is not allowed to change them.
Mutable resources can only be set once for every instance of a shader resource binding. Such resources are intended to define specific material properties. For instance, a diffuse texture for a specific material is not expected to change once the material is defined and can be set right after the SRB object has been created:
m_pSRB->GetVariable(SHADER_TYPE_PIXEL, "tex2DDiffuse")->Set(pDiffuseTexSRV);

In some cases it is necessary to bind a new resource to a variable every time a draw command is invoked. Such variables should be labeled as dynamic, which allows setting them multiple times through the same SRB object:
m_pSRB->GetVariable(SHADER_TYPE_VERTEX, "cbRandomAttribs")->Set(pRandomAttrsCB);

Under the hood, the engine pre-allocates descriptor tables for static and mutable resources when an SRB object is created. Space for dynamic resources is allocated dynamically at run time. Static and mutable resources are thus more efficient and should be used whenever possible.
As you can see, Diligent Engine does not expose low-level details of how resources are bound to shader variables. One reason is that these details differ greatly across APIs. The other reason is that using low-level binding methods is extremely error-prone: it is very easy to forget to bind some resource, or to bind an incorrect one, such as binding a buffer to a variable that is in fact a texture, especially during shader development when everything changes fast. Diligent Engine instead relies on a shader reflection system to automatically query the list of all shader variables. Grouping variables into the three types mentioned above allows the engine to create an optimized layout and do the heavy lifting of matching resources to API-specific resource locations, registers or descriptor table entries.
This post gives more details about the resource binding model in Diligent Engine.
Setting the Pipeline State and Committing Shader Resources
Before any draw or compute command can be invoked, the pipeline state needs to be bound to the context:
m_pContext->SetPipelineState(m_pPSO);

Under the hood, the engine sets the internal PSO object in the command list or calls all the required native API functions to properly configure all pipeline stages.
The next step is to bind all required shader resources to the GPU pipeline, which is accomplished by IDeviceContext::CommitShaderResources() method:
m_pContext->CommitShaderResources(m_pSRB, COMMIT_SHADER_RESOURCES_FLAG_TRANSITION_RESOURCES);

The method takes a pointer to the shader resource binding object and makes all resources the object holds available to the shaders. In the case of Direct3D12, this only requires setting the appropriate descriptor tables in the command list. For older APIs, this typically requires setting all resources individually.
Next-generation APIs require the application to track the state of every resource and explicitly inform the system about all state transitions. For instance, if a texture was previously used as a render target and the next draw command is going to use it as a shader resource, a transition barrier needs to be executed. Diligent Engine does the heavy lifting of state tracking. When the CommitShaderResources() method is called with the COMMIT_SHADER_RESOURCES_FLAG_TRANSITION_RESOURCES flag, the engine commits resources and transitions them to the correct states at the same time. Note that transitioning resources does introduce some overhead. The engine tracks the state of every resource and will not issue a barrier if the state is already correct, but checking the resource state is itself overhead that can sometimes be avoided. The engine provides the IDeviceContext::TransitionShaderResources() method that only transitions resources:
m_pContext->TransitionShaderResources(m_pPSO, m_pSRB);

In some scenarios it is more efficient to transition resources once and then only commit them.
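The tracking logic described above, where a barrier is issued only when the resource is not already in the requested state, can be sketched like this (hypothetical names, not actual engine code):

```cpp
// Hypothetical sketch of per-resource state tracking, not actual engine code.
enum ResourceState { STATE_COMMON, STATE_RENDER_TARGET, STATE_SHADER_RESOURCE };

struct TrackedResource {
    ResourceState Current        = STATE_COMMON;
    int           BarriersIssued = 0;

    // Transition to a new state; the barrier is skipped if the
    // resource is already in the requested state.
    void TransitionTo(ResourceState next) {
        if (Current == next)
            return;         // state already correct: no barrier needed
        ++BarriersIssued;   // here a real transition barrier would be recorded
        Current = next;
    }
};
```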
Invoking Draw Command
The final step is to set the states that are not part of the PSO, such as render targets, vertex and index buffers. Diligent Engine uses a Direct3D11-style API that is translated to the other native API calls under the hood:
ITextureView *pRTVs[] = {m_pRTV};
m_pContext->SetRenderTargets(_countof( pRTVs ), pRTVs, m_pDSV);

// Clear render target and depth buffer
const float zero[4] = {0, 0, 0, 0};
m_pContext->ClearRenderTarget(nullptr, zero);
m_pContext->ClearDepthStencil(nullptr, CLEAR_DEPTH_FLAG, 1.f);

// Set vertex and index buffers
IBuffer *buffer[] = {m_pVertexBuffer};
Uint32 offsets[] = {0};
Uint32 strides[] = {sizeof(MyVertex)};
m_pContext->SetVertexBuffers(0, 1, buffer, strides, offsets, SET_VERTEX_BUFFERS_FLAG_RESET);
m_pContext->SetIndexBuffer(m_pIndexBuffer, 0);

Different native APIs use various sets of functions to execute draw commands depending on the command details (whether the command is indexed, instanced or both, what offsets into the source buffers are used, etc.). For instance, there are 5 draw commands in Direct3D11 and more than 9 in OpenGL, with names like glDrawElementsInstancedBaseVertexBaseInstance not uncommon. Diligent Engine hides all these details behind a single IDeviceContext::Draw() method that takes a DrawAttribs structure as an argument. The structure members define all attributes required to perform the command (primitive topology, number of vertices or indices, whether the draw call is indexed, instanced, indirect, etc.). For example:
DrawAttribs attrs;
attrs.IsIndexed  = true;
attrs.IndexType  = VT_UINT16;
attrs.NumIndices = 36;
attrs.Topology   = PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
pContext->Draw(attrs);

For compute commands, there is the IDeviceContext::DispatchCompute() method, which takes a DispatchComputeAttribs structure that defines the compute grid dimensions.
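The way a single Draw() entry point fans out to different native commands, as described above, might look like this simplified dispatcher (hypothetical names and a reduced attribute set, for illustration only):

```cpp
#include <string>

// Hypothetical sketch: one Draw() entry point selecting among native
// commands based on the attributes, similar in spirit to DrawAttribs.
struct DrawAttribs {
    bool IsIndexed   = false;
    bool IsInstanced = false;
};

// Returns the name of the native command a Direct3D11-style backend
// might invoke for the given attributes.
std::string SelectNativeDrawCommand(const DrawAttribs& attrs) {
    if (attrs.IsIndexed && attrs.IsInstanced) return "DrawIndexedInstanced";
    if (attrs.IsIndexed)                      return "DrawIndexed";
    if (attrs.IsInstanced)                    return "DrawInstanced";
    return "Draw";
}
```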
Source Code
Full engine source code is available on GitHub and is free to use. The repository contains several samples, an asteroids performance benchmark, and an example Unity project that uses Diligent Engine in a native plugin.
AntTweakBar sample is Diligent Engine’s “Hello World” example.

Atmospheric scattering sample is a more advanced example. It demonstrates how Diligent Engine can be used to implement various rendering tasks: loading textures from files, using complex shaders, rendering to multiple render targets, using compute shaders and unordered access views, etc.

The asteroids performance benchmark is based on this demo developed by Intel. It renders 50,000 unique textured asteroids and allows comparing the performance of the Direct3D11 and Direct3D12 implementations. Every asteroid is a combination of one of 1000 unique meshes and one of 10 unique textures.

Finally, there is an example project that shows how Diligent Engine can be integrated with Unity.

Future Work
The engine is under active development. It currently supports Windows desktop, Universal Windows and Android platforms. The Direct3D11, Direct3D12 and OpenGL/GLES backends are now feature-complete. A Vulkan backend is coming next, and support for more platforms is planned.