# OpenGL Drawing 3D "manually"...

Hi all,

I'm currently working on a project (along with a few others) to draw some 3D stuff, which will hopefully turn into a rendering engine eventually. I searched these forums and Google for a bit, with no luck.

What I would like to do is draw 3D points on the 2D screen, with a depth buffer. I understand that the camera is a 4x4 matrix, along with the point, and I know all the rotation matrices and stuff like that. What is getting me is this: given the projection (is it a matrix? In OGL it's gluPerspective(50,1,1,100)), how do I get the 2D point on the screen, and how far away it is for depth?

I've worked with OpenGL for a while now, with a little DirectX, and I think that will help me.

Thanks for anything (sorry if this topic has been brought up before).

~zix~

[Edited by - zix99 on November 22, 2004 3:05:15 PM]

##### Share on other sites
Yes, projection is a matrix. The DX SDK help files detail how the projection matrix is created.

What then occurs is: vertexposition (with an assumed w of 1...[x,y,z,1]) * world * view * projection.

The resulting vector is then divided by W. Strictly speaking, the coordinates before the divide are clip-space coordinates, and after it you have normalized device coordinates. In DX that's -1<x<1, -1<y<1, and 0<z<1. In OpenGL z goes from -1 to 1 too, so it's a full cube.

From here, you discard and/or clip the triangles to fit within those boundaries. You can also do this before dividing by W, where -w<x<w, -w<y<w, etc..

Then you apply viewport scaling. Basically scale and offset the x and y into pixel coordinates. For a 640x480 screen that would be x*320+320 and y*240+240. Then, rasterize your result.
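As a rough illustration of the viewport step just described, here is a minimal Python sketch. The function name `ndc_to_pixels` is made up for this example, and it follows the post's formula literally (no y flip); if row 0 of your framebuffer is at the top of the screen, you would probably negate y first, which is an assumption about your target rather than something stated in the thread.

```python
def ndc_to_pixels(x_ndc, y_ndc, width, height):
    """Scale and offset coordinates in [-1, 1] into pixel coordinates.

    Hypothetical helper: for a 640x480 screen this is exactly
    x*320+320 and y*240+240, as in the post.
    """
    half_w = width / 2.0
    half_h = height / 2.0
    px = x_ndc * half_w + half_w
    py = y_ndc * half_h + half_h
    return px, py


if __name__ == "__main__":
    # The centre of the normalized square lands in the middle of the screen.
    print(ndc_to_pixels(0.0, 0.0, 640, 480))
```

The result is left as floating point on purpose; how you round to integer pixels is up to the rasterizer.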

##### Share on other sites
Any good tutorials that might show exactly how to do this (I'll look into the DirectX SDK)?
Including drawing triangles/texturing triangles...

Thanks again
~zix~

##### Share on other sites
Yeah, look up the D3DXMatrixPerspectiveFovLH function in the DirectX SDK help file; it shows you exactly what the perspective matrix looks like.

It's this:

    w       0       0               0
    0       h       0               0
    0       0       zf/(zf-zn)      1
    0       0       -zn*zf/(zf-zn)  0

where:

h is the view-space height scale. It is calculated from h = cot(fovY/2).

w is the view-space width scale. It is calculated from w = h / aspect (where aspect = screenWidth/screenHeight).

zf = farthest visible distance, zn = nearest visible distance.

So, this works with homogeneous coordinates. For a vertex:

  [x, y, z, 1]

you get the following output vertex:

  [x*w, y*h, (z-zn)*zf / (zf-zn), z]

which becomes (after the homogeneous divide by the fourth component, which here equals z, not the width scale w)

  [x*w/z, y*h/z, (1-zn/z)*zf / (zf-zn), 1]

In the case of D3D's matrices, you end up with x and y in the range [-1, 1] and z in the range [0, 1], 1 being farthest from the camera (the z-far plane) and 0 being nearest (z-near plane).
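To make the matrix and the divide concrete, here is a small Python sketch of the same steps. The function names are invented for this example, and it assumes the row-vector convention (vertex * matrix) used in this thread and w = h / aspect with aspect = width / height; treat it as an illustration, not as the SDK's implementation.

```python
import math


def perspective_fov_lh(fov_y, aspect, zn, zf):
    """Left-handed, D3D-style perspective matrix for row vectors (v * M)."""
    h = 1.0 / math.tan(fov_y / 2.0)  # cot(fovY / 2)
    w = h / aspect                   # assumes aspect = width / height
    q = zf / (zf - zn)
    return [
        [w,   0.0, 0.0,     0.0],
        [0.0, h,   0.0,     0.0],
        [0.0, 0.0, q,       1.0],
        [0.0, 0.0, -zn * q, 0.0],
    ]


def transform(v, m):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(v[k] * m[k][col] for k in range(4)) for col in range(4)]


def perspective_divide(v):
    """Divide by the fourth component; points with w <= 0 must be clipped first."""
    x, y, z, w = v
    return [x / w, y / w, z / w, 1.0]


if __name__ == "__main__":
    proj = perspective_fov_lh(math.radians(90.0), 1.0, 1.0, 100.0)
    clip = transform([2.0, 0.0, 4.0, 1.0], proj)
    print(perspective_divide(clip))
```

A point on the near plane should come out with depth 0 and a point on the far plane with depth 1, which is a handy sanity check for your own matrix code.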

Hope that helps :)

##### Share on other sites
Drillan, thank you very much for that response, quite helpful. Now, since I have one subject down, all that's left is drawing filled triangles (don't forget I'm using depth, so if it is translated to 2D and then drawn, I still need depth). Also texturing, but this can be covered later if needed.

One question for namethatnobodyelsetook
Quote:
 What then occurs is: vertexposition (with an assumed w of 1...[x,y,z,1]) * world * view * projection.

So, if I am correct it would look something like this (if the camera was at 0,0,0 and the point was at 5,5,0):

    1  0  0  5                 1  0  0  0    w  0  0  0
    0  1  0  5   *  WORLD?  *  0  1  0  0  * 0  h  0  0
    0  0  1  0                 0  0  1  0    0  0  zf/(zf-zn)  1
    0  0  0  1                 0  0  0  1    0  0  -zn*zf/(zf-zn)  0

I figure rotating the camera is simply just multiplying the viewport coordinates by the rotation matrix?

Thanks again
~zix~

PS: I rated u guys up.. but it's not doing anything.. is the rating system broken??

##### Share on other sites
To draw a triangle, start by drawing the lines of the triangle. Look into Bresenham's line algorithm for details. When you draw the lines to the screen, you only need to keep track of the smallest x value and the greatest x value that the lines touch on each row of the screen.

So you could keep two arrays, one of min x and one of max x, each with one entry per row:

    MIN X    MAX X
    [0]      [0]
    [1]      [1]
    .        .
    .        .
    [479]    [479]

Each array is as long as the screen is high (480 entries, indexed 0 to 479, for a 480-row screen). Then just iterate through the arrays and connect the two ends of each row horizontally. For example, if position [0] of the min array holds 10 and position [0] of the max array holds 100, then the left-most edge of the triangle on row 0 is at x = 10 and the right-most edge on row 0 is at x = 100. A simple for loop from 10 to 100, filling in each pixel with an interpolated color or a sampled UV coord, will give you a colored/textured triangle.

- Hope that's clear.
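The min/max idea above can be sketched in Python as follows. This is only an illustration under a few assumptions: vertices are already integer pixel coordinates, the function name `fill_triangle` is invented here, and instead of writing colors it just returns the set of covered pixels so the result is easy to inspect.

```python
def fill_triangle(p0, p1, p2, width, height):
    """Return the set of (x, y) pixels covered by a triangle with integer vertices."""
    min_x = [width] * height  # left-most x touched on each row
    max_x = [-1] * height     # right-most x touched on each row

    def walk_edge(a, b):
        # Bresenham's line algorithm; rather than plotting each pixel,
        # record the horizontal extent of the edge on every row it crosses.
        x0, y0 = a
        x1, y1 = b
        dx, dy = abs(x1 - x0), -abs(y1 - y0)
        sx = 1 if x0 < x1 else -1
        sy = 1 if y0 < y1 else -1
        err = dx + dy
        while True:
            if 0 <= y0 < height:
                min_x[y0] = min(min_x[y0], x0)
                max_x[y0] = max(max_x[y0], x0)
            if x0 == x1 and y0 == y1:
                break
            e2 = 2 * err
            if e2 >= dy:
                err += dy
                x0 += sx
            if e2 <= dx:
                err += dx
                y0 += sy

    for a, b in ((p0, p1), (p1, p2), (p2, p0)):
        walk_edge(a, b)

    pixels = set()
    for y in range(height):
        # Rows the triangle never touched keep min_x > max_x and yield no pixels.
        for x in range(max(min_x[y], 0), min(max_x[y], width - 1) + 1):
            pixels.add((x, y))
    return pixels


if __name__ == "__main__":
    covered = fill_triangle((1, 1), (5, 1), (1, 5), 8, 8)
    for row in range(8):
        print("".join("#" if (col, row) in covered else "." for col in range(8)))
```

In a real renderer the inner loop is where you would interpolate color, UV, and depth across the span and test each pixel against the depth buffer.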

##### Share on other sites
Quote:
 Original post by zix99
 So, if I am correct it would look something like (if camera was at 0,0,0 and point was at 5,5,0):

    1  0  0  5                 1  0  0  0    w  0  0  0
    0  1  0  5   *  WORLD?  *  0  1  0  0  * 0  h  0  0
    0  0  1  0                 0  0  1  0    0  0  zf/(zf-zn)  1
    0  0  0  1                 0  0  0  1    0  0  -zn*zf/(zf-zn)  0

 I figure rotating the camera is simply just multiplying the viewport coordinates by the rotation matrix?

The correct order of operations is world*view*proj.

I don't understand what you mean by "point was at 5,5,0"; do you mean the look-at point for the camera? Either way, for the camera everything is inverted, because the world is transformed to move the camera back to 0,0,0. So if the camera was translated by 5,5,0, the view matrix would have -5,-5,0 in its translation portion.

##### Share on other sites
Maybe this is clearer?
    (5,5,0)                    (0,0,0)
    Vertex Pos      World      Camera        Projection
    1  0  0  5                 1  0  0  0    w  0  0  0
    0  1  0  5   *  WORLD?  *  0  1  0  0  * 0  h  0  0
    0  0  1  0                 0  0  1  0    0  0  zf/(zf-zn)  1
    0  0  0  1                 0  0  0  1    0  0  -zn*zf/(zf-zn)  0

According to namethatnobodyelsetook it is
Quote:
 vertexposition (with an assumed w of 1...[x,y,z,1]) * world * view * projection.

And I'm a little confused about what the world matrix vs. the view matrix is. I'm guessing the view matrix is the same as the camera (inverted); in that case, what is the world?

Thanks
~Zix

PS: Thanks for the triangle algo. Once you have the 3 positions of the triangle, you simply interpolate the UV coordinates and draw accordingly (relatively simple), correct? I'll have to do some thinking on this (along with the depth buffer aspect), but I think I may have it.

##### Share on other sites
OpenGL doesn't necessarily have a world, you have a modelview matrix which is both world and view together (I could be wrong, I've never used OpenGL).

In DirectX the World matrix is the matrix that transforms a vertex to world space. ie: It positions and rotates a mesh.

The view matrix is the inverse of the camera's world matrix. If your camera was at 1,2,3, the inverse of that would be a translation matrix of -1,-2,-3. Instead of actually moving a camera through the world, the world is dragged (and rotated) to the camera.

So, let's say we have a vertex at the origin. Let's say our mesh is at 0,0,10. Let's say the camera is at 0,0,2.

    vertex pos   = 0,0,0
    *worldmatrix = 0,0,10
    *viewmatrix  = 0,0,8
    *projection  = ...
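The same numbers can be checked with a few lines of Python. This is a sketch with invented helper names, assuming pure translations and the row-vector convention (vertex * matrix), in which the translation sits in the fourth row of the matrix rather than the fourth column.

```python
def translation(tx, ty, tz):
    """Row-vector translation matrix: the offsets live in the fourth row."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [tx,  ty,  tz,  1.0],
    ]


def transform(v, m):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(v[k] * m[k][col] for k in range(4)) for col in range(4)]


vertex = [0.0, 0.0, 0.0, 1.0]
world = translation(0.0, 0.0, 10.0)  # the mesh sits at 0,0,10
view = translation(0.0, 0.0, -2.0)   # camera at 0,0,2, so the view matrix moves by -2

in_world = transform(vertex, world)  # world-space position of the vertex
in_view = transform(in_world, view)  # position relative to the camera

if __name__ == "__main__":
    print(in_world, in_view)
```

With a rotated camera the view matrix is still the inverse of the camera's world matrix; only the pure-translation case is shown here.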

##### Share on other sites
Thanks everyone.. I'll test all these ideas out and let you know how it goes.

I'll also rate you guys up as soon as it lets me.
EDIT: It just won't let me rate Drillan... is it because he's a GDNet member? Oh well, I guess I'll report it to bugs.

~zix~

##### Share on other sites
Quote:
 Original post by Namethatnobodyelsetook
 What then occurs is: vertexposition (with an assumed w of 1...[x,y,z,1]) * world * view * projection.
 The resulting vector is then divided by W. ...

I have a question on this. What happens if W is now 0? I've been working on a project like this for a class at school and I am having that problem. Where I have a unit pyramid with the top point at (0, .5, 0, 1) and after I do all my transformations and go to do perspective division W is now 0 so it screws it all up. BTW, I have my world axes with X->right, Z->up, Y->in (because our teacher wants it that way).

So maybe I'm doing something wrong with the transformations. I don't know. But if you could just answer that question for me I'd be grateful. :)

##### Share on other sites
If your w is 0 when it comes to perspective division, you probably screwed up your transformation matrix. IIRC w should come out equal to the distance from the camera to the point along the view direction (the point's view-space z)...

For the most part it seems like you can treat w as a binary flag. If it's 1, translations will be applied to the vector. If it's 0, they won't.

Quote:
 Original post by zix99
 I'll also rate you guys up as soon as it lets me.
 EDIT: It just wont let me rate Drillan... is it because he's a GDNet member? o well, i guess i'll report to bugs

Don't worry, your rating's been recorded. It just doesn't have any effect at the moment because you and he both have the same score. I, however, can rate him up, and change his score. *rates*

##### Share on other sites
Quote:
 I have a question on this. What happens if W is now 0? I've been working on a project like this for a class at school and I am having that problem. Where I have a unit pyramid with the top point at (0, .5, 0, 1) and after I do all my transformations and go to do perspective division W is now 0 so it screws it all up. BTW, I have my world axes with X->right, Z->up, Y->in (because our teacher wants it that way).

 So maybe I'm doing something wrong with the transformations. I don't know. But if you could just answer that question for me I'd be grateful. :)

W=0 implies that the point is 'at infinity'. If you imagine a line passing through your eye (from in front of you, through your head and out the back), the perspective transform kind of turns space inside out (it's hard to visualize).

After the perspective transform you are in clip space, and it is perfectly possible for primitives to wrap around infinity and come back (e.g. a line like the one above). You MUST clip to your view frustum (which is actually a 'parallelepiped', or box, in clip space). After clipping you will only get W=0 for a point if you don't have an offset in your z (in other words Zn=0).

##### Share on other sites
Yeah, w=0 implies that the point is at infinity. Another way to look at it is that, if w=0, the x,y, and z coordinates are the direction TO that point (so if the coordinate is [0,1,0,0] then the point is at infinity down the +Y axis).

You definitely shouldn't end up with w = 0 after the perspective matrix is applied.

Really, you want to treat your vector as a 1x4 matrix (a single row), so it'd be:

    (5,5,0)                    (0,0,0)
    Vertex Pos      World      Camera        Projection
                               1  0  0  0    w  0  0  0
    5  5  0  1   *  WORLD   *  0  1  0  0  * 0  h  0  0
                               0  0  1  0    0  0  zf/(zf-zn)  1
                               0  0  0  1    0  0  -zn*zf/(zf-zn)  0

The transform you're doing won't come out correctly because a vector is not the same thing as a transform matrix :)

Hope that helps!

##### Share on other sites
Thanks all again for the information.. and good thing I kept up on this, otherwise I would never have treated the vector as a 1x4 matrix; I would've treated it as a 4x4.

Thanks again Drillan, seems you have some background in this.

And thanks superpig for taking care of the rating

~zix~

##### Share on other sites
No, you can perfectly well have W=0 because after the perspective transform W is proportional to the original Z coordinate.

In fact, taking Drillian's example:

(I'll discard the World and camera transform, presume we're in camera space and this point happens to fall at Z=0)

v' = v * P

x' = (5 * w) + (5 * 0) + (0 * 0) + (1 * 0)
y' = (5 * 0) + (5 * h) + (0 * 0) + (1 * 0)
z' = (5 * 0) + (5 * 0) + (0 * ...) + (-zn*zf/(zf-zn))
w' = (5 * 0) + (5 * 0) + (0 * 1) + (1 * 0)

So this point does have W=0!

The point is that this point will be clipped away, since its camera-space z of 0 is less than Zn (equivalently, z' < 0 in clip space).
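Here is the same calculation as a Python sketch, together with a clip test done before the divide. The helper names are invented for this example, w and h are simply set to 1, and the bounds checked are the D3D-style ones mentioned earlier in the thread (-w<=x<=w, -w<=y<=w, 0<=z<=w); it is a per-point rejection test only, not a full triangle clipper.

```python
def transform(v, m):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(v[k] * m[k][col] for k in range(4)) for col in range(4)]


def inside_clip_volume(clip):
    """D3D-style bounds test on clip-space coordinates, before dividing by w."""
    x, y, z, w = clip
    return -w <= x <= w and -w <= y <= w and 0.0 <= z <= w


zn, zf = 1.0, 100.0
q = zf / (zf - zn)
projection = [
    [1.0, 0.0, 0.0,     0.0],  # w and h are set to 1 to keep the example short
    [0.0, 1.0, 0.0,     0.0],
    [0.0, 0.0, q,       1.0],
    [0.0, 0.0, -zn * q, 0.0],
]

# The point (5, 5, 0) in camera space: its fourth component becomes 0.
clip = transform([5.0, 5.0, 0.0, 1.0], projection)

# Testing before the divide rejects the point, so no division by zero happens.
visible = inside_clip_volume(clip)

if __name__ == "__main__":
    print(clip, visible)
```

Doing the test in clip space is what lets you avoid the divide entirely for points like this one.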

##### Share on other sites
Ah! You're correct. Forgot about the ol' "z = 0" case. Yeah, because the perspective is, at its most basic level, a divide by z (which is why you set the w coordinate to z: because in order to get the normalized coordinate you divide by the w parameter [x y z w] -> [x/w, y/w, z/w, 1]). But when z = 0, the perspective divide becomes a divide by zero, which effectively puts the coordinate at infinity in post-perspective space.

But JuNC is right, that point gets clipped away due to the near Z value, so in practice you should never actually see this.

Good catch!

##### Share on other sites
This seems interesting. I might give it a try and program my own software renderer.

My only question is, what does the projected point look like? Are the coordinates from (-1,-1) to (1,1), where (0,0) is the center of the screen, and outside that range is outside of the screen?

And what's the standard algorithm for, say, a color filled polygon? You project the 3 vertices, draw the edges to find min and max values in terms of rows, and then fill each row?

##### Share on other sites
Quote:
 Original post by Max_Payne
 My only question is, what does the projected point look like? Are the coordinates from (-1,-1) to (1,1), where (0,0) is the center of the screen, and outside that range is outside of the screen?

To do a simple projection to the range -1,-1 to 1,1, you use this matrix:

    1 0 0 0
    0 1 0 0
    0 0 1 1/D
    0 0 0 0

The X, Y, and Z get multiplied by the 1s across the diagonal, and the Z also gets multiplied by the reciprocal of D. So Z/D is stored in the W value of the vertex you multiply by the matrix. This is useful when you actually do the projection and divide (x,y,z,w) by w to get (x/w, y/w, z/w, 1): you can discard the z and w and get your projected coordinates (x/w, y/w).
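A small Python sketch of that simple projection, using the matrix above with row vectors. The function name `simple_project` and the choice to return None for points at or behind the eye are assumptions made for this example.

```python
def simple_project(x, y, z, d):
    """Project a camera-space point with the 1/D matrix, then divide by w."""
    m = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 1.0 / d],
        [0.0, 0.0, 0.0, 0.0],
    ]
    v = [x, y, z, 1.0]
    xc, yc, zc, wc = [sum(v[k] * m[k][col] for k in range(4)) for col in range(4)]
    if wc <= 0.0:
        return None  # at or behind the eye: nothing sensible to project
    # z and w are discarded; only the projected 2D coordinates are returned.
    return xc / wc, yc / wc


if __name__ == "__main__":
    print(simple_project(2.0, 4.0, 8.0, 1.0))
```

Doubling d doubles the projected coordinates, since w holds z/D.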

Those numbers are independent of your viewport or field of view. The viewport is factored into the w (x) and h (y) values of the matrix. In DirectX the width can be calculated using the field of view or the viewport. If you want to depend on the viewport, you would do this:

    w = 2 * z-near / ViewportWidth;
    h = 2 * z-near / ViewportHeight;

    w 0 0 0
    0 h 0 0
    0 0 1 1
    0 0 0 0

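Also, you are probably going to want to factor in z for the near and far planes. To do that, use

    Q = zfar / (zfar - znear)

to get the matrix

    w 0 0        0
    0 h 0        0
    0 0 Q        1
    0 0 -Q*znear 0

The -Q*znear entry in the last row is the same -zn*zf/(zf-zn) term as in the D3D matrix posted earlier in the thread; it is what makes the divided z run from 0 at the near plane to 1 at the far plane.

That will account for a viewport that can be set, rather than the standard -1,-1 to 1,1.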

Quote:
 And what's the standard algorithm for, say, a color filled polygon? You project the 3 vertices, draw the edges to find min and max values in terms of rows, and then fill each row?

Read my post above, but basically what you said sums it up.
