

Member Since 26 May 2004
Offline Last Active Aug 25 2014 09:18 AM

#5139482 can I go without events?

Posted by on 16 March 2014 - 11:23 AM

The problem with polling is you can miss something.  If the user presses and releases a key before you get around to polling again, you will completely miss it.  Mouse clicks are fast, and if your frame rate stalls for whatever reason (not even your program's fault; maybe the dreaded McAfee just started an update in the background...), you're out of luck.


I get that you want to just poll to check if a key is pressed in your code.  I like to do that too.  But it's easy to do with events (rough pseudocode; I haven't done win32 events in a while):

bool keystate[MAX_KEY];
bool mousebutton[MAX_MOUSEBUTTON];
int  mousex, mousey;

#define MOUSE_LEFT 0
#define MOUSE_RIGHT 1


while (event = getnextevent(...)) {

    if (event == KEY_DOWN && keycode < MAX_KEY)
        keystate[keycode] = 1;

    if (event == KEY_UP && keycode < MAX_KEY)
        keystate[keycode] = 0;

    if (event == L_MOUSE_DOWN)
        mousebutton[MOUSE_LEFT] = 1;
    if (event == L_MOUSE_UP)
        mousebutton[MOUSE_LEFT] = 0;

    if (event == MOUSE_MOVE) {
        mousex = event_mouse_x;
        mousey = event_mouse_y;
    }
}

Then, in your code you can do  if(keystate[KEY_L_CTRL]) all you want.  Or sprite.draw(mousex, mousey).


The above could still suffer from missing presses.  If the system slows down and you call the event loop once per frame, a keydown and a keyup can bunch up against each other, so in one frame's processing of events you would set and unset a single key in the same call to the loop.  One way to handle that is to process the player's inputs in the event handler itself:

   if (event == KEY_DOWN && keycode < MAX_KEY) {
        keystate[keycode] = 1;

        if (keycode == playerconfig.firekey)
            player.fire = 1;

        if (keycode == playerconfig.jump)
            player.jump = 1;
    }

This way the event loop is looking for particular events, and sets those flags in the player object.  Each frame, after the event handler is run, when the player's control code runs, it checks its action flags and does the appropriate action.  This way you can't miss a jump or fire event, even if the next event in the queue is releasing the button.


You can abstract that completely and do this:

memset(keypressed, 0, sizeof(keypressed)); //clear pressed events once per frame, before draining the queue
while (event = getnextevent(...)) {

    if (event == KEY_DOWN && keycode < MAX_KEY) {
        keystate[keycode] = 1;
        keypressed[keycode] = 1;
    }

    if (event == KEY_UP && keycode < MAX_KEY)
        keystate[keycode] = 0;
}


Now you can't miss an event.  If the key was pressed at all since the last frame, keypressed will be '1'.  And if it's still pressed, keystate will be '1'.  So now anywhere in your game loop you can do this:

if (keypressed[player_config.fire_key] || (keystate[player_config.fire_key] && (this_frame - last_fire_frame >= 20))) {
    fire_bullet();
    last_fire_frame = this_frame;
}

What this does is fire a bullet every time the player presses the fire key OR if the player holds down the fire key, one bullet every 20 frames.


Also, this all assumes a single-threaded main loop.
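A sketch of what such a single-threaded loop might look like (pseudocode; the helper names are my own invention):

```
while (game_running) {
    pump_events();    // drain the OS queue, update keystate/keypressed
    update_player();  // read keystate/keypressed and act on the flags
    update_world();
    render_frame();
}
```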


#5134181 using other compilers generator

Posted by on 24 February 2014 - 01:44 PM

Your language is very similar to C.  I would suggest you write a simple preprocessor that reads your language, applies a few tweaks to it, and outputs C.  Yes, generating C from a non-C language can be ugly, but generating C from almost-C could just be a matter of some basic text manipulation.  Even if you intend to write a full-featured compiler, this exercise will be useful because it will allow you to write (and run) functioning programs in your model language, and quickly make changes to it as you discover new features you want to add, or discover that certain features just don't work the way you thought they would.  Then when your language design is finished, you can write a full compiler for it, generate llvm code, or whatever.  That's exactly how C++ was made: it started as a C++ to C converter.

#5133631 deinstancing c-strings

Posted by on 22 February 2014 - 01:20 PM


enum Colors {
  red = 0xffff0000,
  green = 0xff00ff00,
  blue = 0xff0000ff
};


I really like that.  I don't think enums are guaranteed to hold 32 bits, but this is more likely to work than merging strings.  If the compiler can't produce 32-bit enums for whatever reason, it should at least spit out a warning or error to alert you of this.


Fir, I know everyone's already told you why this is a bad idea.  I used to code like this at one point.  I completely agree with you that the string literals are just turned into a magic number by the compiler.  Each time you compile, it's different, but it's a unique number because that literal needs a unique address in memory.  It would work perfectly well if all the constants could be merged together across files.  But that's not guaranteed behavior, and even if one compiler does it today, it might not tomorrow.


If you want to keep the idea of an ad-hoc enum, where you can just use it without defining anything, then you could use a syntax like:



At the top of each file have:

#include <adhoc_enums.h>

Then in your code use a notation like:

void color_something(thing* thing, adhoc color)
{
   if (color == adhoc(red)) { /* ... */ }
   if (color == adhoc(green)) { /* ... */ }
}


Your makefile can call a short script, that finds all the adhoc references, and produces adhoc_enums.h:

#define adhoc(x) adhoc_enum_ ## x

typedef enum {
    adhoc_enum_null = 0,
    adhoc_enum_red = 1,

} adhoc_enum;


Pros:
  • You get real enums
  • Your values are basically ints now.  They can be safely =='d against each other
  • You can use switch/case instead of if/else chains.
  • Type safety against other char*.  Also probably compiler warnings against ints or other enum types.  adhoc_enums only 'like' each other
  • No dependence on weird compiler or linker options


Cons:
  • Extra build step w/ an extra utility you need to write (not as gross as you think.  Ever use qmake?)

You also have the added advantage that you could do this once and keep the generated enum file.  When you use new enum values, you can add them to adhoc_enums.h manually.  Then maybe at some point you will decide to split your enums into different types (enum_color, enum_shape).  You might find yourself going down a standards-compliant path.

#5131143 stack overflows: why not turn the stack upside down?

Posted by on 13 February 2014 - 05:41 PM

We all know about stack overflows.  You overwrite your buffer on the stack, and you are now trashing another buffer, or even worse, the return address.  Then when 'ret' (I'm using x86 lingo here) runs, the cpu jumps to a bad place.  This is because on most platforms the stack grows downward.  When you exceed your stack frame's boundary, you start trashing the previous stack frame.


This downward growth is mostly historical.  On older computers, before virtual memory, your stack started at the top of RAM and grew downwards.  The heap started somewhere else and grew up.  You ran out of memory when the two pointers met.  With virtual memory, you can put the stack 'anywhere' and even grow it on a page fault in some cases.  And I think it's time to turn the stack around:



Suppose f() calls g() and that calls h():

classic stack:  'R' is where the return address goes:
low addr  [empty-----------------R][--f--]   high addr
low addr  [empty-----------][--g-R][--f--]   high addr
low addr  [empty-----[--h-R][--g-R][--f--]   high addr

How H overflows:

low addr  [empty-----[--h--*R*****][--f--]   high addr

we crash, or worse, run an exploit from h inserting a return address to who-knows-where

upside-down stack:
low addr  [--f--][-empty------------------]   high addr
low addr  [--f--][R-g--][empty-----------]   high addr
low addr  [--f--][R-g--][R-h--][empty----]   high addr

Now H overflows:
low addr  [--f--][R-g--][R-h***********--]

We've overflowed into empty space.  Return address and previous stack frames are safe

This wouldn't be too hard to do.  Most stack access during a function is done with pointer arithmetic:

mov eax, [esp-4]

mov eax, [esp+4]


I know there are some hardware platforms that already have stacks that grow upwards.  Redefining the ABI for x86 would break using standard libraries, but inside your own application, this might enhance security.  I suppose it's even possible to use an 'upwards' stack for your application, and then when you call a 3rd party library, switch the stack pointer to a separate area where you have a standard downwards stack defined.  I imagine this would involve a lot of hacking around in gcc or llvm to make it work.  In an open OS like linux, maybe you could recompile the whole system to use upwards stacks.


Just a thought.  Downvote if it stinks! 








#5050660 C99: strict aliasing rule vs compatible pointers

Posted by on 06 April 2013 - 02:28 PM

I'm cleaning up some of my code by compiling with gcc -Wall and -Wstrict-aliasing=1 to look for unclean things I've done.



This is a fairly common pattern in my C code:


typedef struct generic_list_s {
	int type;
	struct generic_list_s* next;
} generic_list_t;

typedef struct {
	generic_list_t gen;
	char* str;
} string_t;

typedef struct {
	generic_list_t gen;
	int x;
} int_t;


The general idea is I have some generic structure that I reuse as the first element of larger, more specific structures.  The C standard allows you to cast a pointer back and forth between a struct and its first member.  A pointer to an int_t or a string_t in the above examples is also a pointer to a generic_list_t.


GCC however likes to complain.  If I compile a snippet like this:

	string_t* node0;
	int_t* node1;
	string_t* node2;
	node0 = mk_string("Hello");
	node1 = mk_int(123);
	node2 = mk_string("World");

	/* Below seems correct:
	  node1 is a pointer to an int_t.
	  int_t starts with generic_list_t, so
	  a pointer to int_t is also a pointer to generic_list_t
	  ( says ) */
	node0->gen.next = (generic_list_t*) node1;
	node1->gen.next = (generic_list_t*) node2;

With -Wstrict-aliasing=1, I get this warning:


warning: dereferencing type-punned pointer might break strict-aliasing rules


This makes sense, in that node1 is a pointer to an int_t, and node0->gen.next is technically a pointer to a generic_list_t.  I have two different types of pointers pointing to the same thing.  I can make it go away by doing this:


node0->gen.next = &(node1->gen);
node1->gen.next = &(node2->gen);


Which is more correct, although a pointer to the first element of a struct and a pointer to the struct are supposed to be compatible with each other.


The real problem comes with functions like:


void print_thing(generic_list_t* n)
{
	if (n->type == TYPE_STRING) {
		string_t* s = (string_t*) n; /* warning: dereferencing type-punned pointer might break strict-aliasing rules */
		printf("%s\n", s->str);
	}

	if (n->type == TYPE_INT) {
		int_t* i = (int_t*) n;  /* warning: dereferencing type-punned pointer might break strict-aliasing rules */
		printf("%d\n", i->x);
	}
}


How do I clean up these warnings?

#5045358 Thoughts on Nasm, etc..

Posted by on 21 March 2013 - 01:59 PM

When I was a kid, the progression to learning programming was BASIC first, then assembly.  You couldn't get a C compiler for free back then, and DOS came with 'debug', letting you write little assembly programs.  Later I moved on to C(++), Java, etc, and never went back to it.  But I do understand your fascination with it, and at times I think about getting into x64 assembly programming.  Maybe later.


Don't let others give you grief about it though.  I'm a hardcore C programmer, and even program C in an OO style (function pointers in structs for polymorphism, etc), and some people give me trouble for it.  I understand C++ is there, but for my own personal projects (not at work: at work I use whatever the rest of the team is using, because it's not my project) I choose C.


Keep doing what you're doing.  We need more assembly programmers, because somebody needs to fix the compiler when it spits out broken code, or write those high performance hardware drivers!  

#5027768 Anyone here a self-taught graphics programmer?

Posted by on 01 February 2013 - 12:40 AM

My first hardware accelerated application was a MS Word document!  No kidding!  Before I get to that, let me tell the story of learning to use opengl in basic, assembly and then C, and yes, in that order.  In early high school I played around a lot with qbasic and was writing simple wireframe 3d mazes with horrible performance.  I didn't have a C compiler, so I started playing around with debug.com, and started writing little assembly language routines to speed up certain slow things in basic.  Debug sucks; it can't even do labels, you have to specify the exact jump address.  You literally have to write JMP 0x322 and hope you put some code at 0x322.  So I wrote a qbasic program that reads in assembly source with labels, strips them out, and runs it through debug with dummy addresses (JMP 0x1234 or whatever).  It looks at the redirected output of debug to see what address the assembler said it was using for each line, figures out the labels, and then reassembles it a second time.  It was so cruddy.  While I was doing this, I also got involved in a high school robotics program called Botball, where you programmed lego robots in a language called Interactive C.


Back to graphics.  I wanted to try using GL, but like I said, I didn't have any compilers.  MS Word 2000 had a built-in version of VB, called VBA.  It turns out VBA can load and run functions from DLLs (scary), so I wrote a word VBA macro that loaded system32.dll, and called the function to get the application's window handle.  I played around with GDI, and got to draw dots and lines onto the word document window using only the win32 api.  So then I loaded opengl32.dll.  After crashing word several times, I managed to get a textured quad on the screen, put there by the video card!   I then shortly discovered that the Windows DDK for Windows 98 came with a FREE COPY of MASM.  So I started writing programs in the psychotic mix of dll's written in MASM that were loaded into MS Word's Visual Basic.  A teacher at high school saw what I was doing, and gave me a copy of Borland C++.  I was able to apply what I learned about writing PC programs in MASM and Basic to what I learned in Botball's 'Interactive C' and from there everything took off.  After I finished high school, I did EECS at UC Berkeley, and now I do software for a living.


I'm so glad visual studio has free editions.  I would have loved free visual studio as a kid.  Ubuntu would have also rocked.  Kids today can access this stuff so easily now.

#5026882 OpenGL Procedural Planet Generation - Quadtrees and Geomipmapping

Posted by on 29 January 2013 - 01:26 PM

Ok, I see what you mean.  When I get home tonight I can put up the function that creates subpatches.  For now, let's separate geometry from topology.  The topology is how vertices are connected.  The topology of every single patch is the same.  It is a rectangular array of 33x33 vertices, which have been pieced together into 32x32 quads.  In GL, the index buffer defines topology because it says which points go together to make triangles.  It doesn't matter that the patches have been scaled, rotated, distorted, whatever; the topology is still just a grid like 'graph paper.'  By setting a fixed patch size, we don't have to worry about different tessellation levels or anything like that.


Now to the geometry.  To make things easy, each point in the patch is identified by two integer coordinates, 'a' and 'b', both varying from 0-32.  You can think of it as a 33x33 pixel image.  Except each 'pixel' contains float x,y,z coordinates instead of a color. Now what you want to do is three things:


a. Define the source regions.  If I'm cutting a 33x33 pixel image up into quarters, each quarter will be 17x17.  Yes, the quarters will overlap their neighbors by 1 pixel.  This is correct, since these 'pixels' in this map are actually points, and we want to share an edge of points with the other quarters so there are no gaps.


b. Upscale each 17x17 region to a 33x33 map.  This is easy.  If you go along a row of 17 original source points and average every two adjacent values, you get 16 in-between values.  17+16 is 33 again.  You have reused 17 points from the previous generation and created 16 new points.  You can do this for all 17 source rows.  Now you have a 33x17 map.  Going vertically, there are 16 in-between rows you can average, creating a final map 33 points high.  I recommend NOT trying to literally reuse the same points (as in the same VBO) as the parent, but instead copying the positions into the child's VBO.  1 VBO per patch is reasonable to start with; maybe later adjacent quadtrees can share VBOs or something.


c. Displace.  When averaging and creating in-between values, this is the place to add displacement.



I'll post that bit of code up tonight.


I haven't worked on this in a few weeks, but this thread is inspiring me to find the time to fix the cracks between different detail levels!  Thanks!

#5026705 OpenGL Procedural Planet Generation - Quadtrees and Geomipmapping

Posted by on 29 January 2013 - 03:32 AM

 So currently my cube has thousands of vertices when i export it from
blender and load it into my application. I'd like to cut this loading
time a lot.

Oh, now I know what you mean when you said that you were having trouble deciding which vertices go to which of the 6 planes.  I think it would be a lot easier if you built the sphere inside the application instead of loading a sphere made somewhere else.  Before I show you how I build the sphere in-program, let me show you the struct I use first.  I removed a few things specific to my app, so it's pretty basic:


#define EDGE_LEFT    0
#define EDGE_RIGHT   1
#define EDGE_TOP     2
#define EDGE_BOTTOM  3

typedef struct quadarray_s {
	gx_vbuffer_t* vb;      //vertex buffer storing geometry for this patch
	vindex startindex;     //start and end draw index for this patch
	vindex endindex; 

	zint32 w;  		//2d array dimension.  w and h can both be 33 for example, for 33x33 patch
	zint32 h;

	struct quadarray_s* children[4];   //4 children patches to subdivide
	struct quadarray_s* parent;
	int self; 			   //which one of my parent's children am i?

	vec3 center;  //center of the quadarray
	float size;   //area metric       
	float dsize;  //displacement metric  

	struct quadarray_s* adjacent[4];  //the 4 adjacent neighbors at the same detail level	
	int adjacent_edge[4];       //inverse relationship from neighbor

} quadarray_t ;


Probably the most confusing part is tracking neighbors, but you can probably ignore that at the moment.  You can get started without it; there will just be cracks and seams between patches.


I create 6 quadarray patches, one for each face of the cube (warning: messy C code!):

for (face = 0; face < 6; face++) {  //6 sides of cube...

	//make quadarray

	qa = quadarray_mk(33,33);
	qa->size = .3;    //I set the 'size' of this patch to .3.  It's rather arbitrary; I just declared this patch has an area metric of .3
	qa->dsize = .01;  //I set the detail size of this patch to .01.  It's also arbitrary; it just affects how 'big' the displacements are

	// a and b will visit each vertex in the quadarray.
	for (a = 0; a < qa->w; a++) {		//0 to 'width'
		for (b = 0; b < qa->h; b++) {	//0 to 'height'
			vec3* vv = quadarray_get_vertex(qa, a, b);

			//convert each of the integer vertex coordinates to float -1 to 1

			fa = ((a - qa->w/2) / (float) (qa->w-1)) * 2;
			fb = ((b - qa->h/2) / (float) (qa->h-1)) * 2;

			//now decide on a plane depending on what face we are on

			switch (face) {
				case 0: //top Y=1
					vec3set(*vv, fa, 1, fb);
					break;
				case 1: //bottom Y=-1
					vec3set(*vv, -fa, -1, fb);
					break;
				case 2: //front Z=1
					vec3set(*vv, -fa, fb, 1);
					break;
				case 3: //back Z=-1
					vec3set(*vv, fa, fb, -1);
					break;
				case 4: //left X=-1
					vec3set(*vv, -1, fb, -fa);
					break;
				case 5: //right X=1
					vec3set(*vv, 1, fb, fa);
					break;
			}

			//normalize vv to unit length
			d = vec3dot(*vv, *vv);
			d = sqrt(d);
			vec3scale(*vv, 1/d);
		}
	}

	vector_add(active_quadarrays, qa);  /* Add to the vector of enabled quadarrays */
}


The above creates a perfect ball of radius 1 centered around <0,0,0>.  The active_quadarrays vector stores the current list of active quadarrays.  'Active' means current detail level.

The main algorithm I use is in the pseudocode here:


//once per render/update loop:

for qa = all quadarray_t* in active_quadarrays {

	//test for visibility and draw
	if ( qa is in camera frustum )
		draw(qa);

	if ( patch_too_small(qa, camera_position) )
		combine_patch(qa);

	if ( patch_too_big(qa, camera_position) )
		split_patch(qa);
}

split_patch (quadarray_t* qa) {
	remove qa from active_quadarrays

	if (!qa->children[0]) {
		//patch has not been split before, need to generate or load children

		qa->children[0] = ...
		qa->children[1] = ...
		qa->children[2] = ...
		qa->children[3] = ...
	}

	add qa->children[0], children[1], children[2], and children[3] to active_quadarrays
}

combine_patch (quadarray_t* qa) {

	//remove all siblings from active_quadarrays
	remove qa->parent->child[0]
	remove qa->parent->child[1]
	remove qa->parent->child[2]
	remove qa->parent->child[3]

	place qa->parent on active_quadarrays
}


I currently alpha-blend details in and out, so I am doing something more complicated than this: I have a 'fade in' and a 'fade out' list.  I hope to find some time to fix the cracks between detail levels and then clean up the code.  After that, I'll put it on github, and everyone can rip it apart and flame my archaic C style :)

#5026483 OpenGL Procedural Planet Generation - Quadtrees and Geomipmapping

Posted by on 28 January 2013 - 01:50 PM

I have also been working on/off on a planet generator.  While I make no claims to the 'idealness' of my method, it does seem to work.  You're on the right track for an adaptive quadtree, because that is what I'm using.  I looked into a bunch of terrain algorithms, and (ignoring older, more dynamic algorithms like ROAM), I've seen two main strategies:

  • Fixed number of patches, dynamic detail levels:  You have a 'small' outdoor environment.  The ground is divided into patches which can be preloaded into GL (or DX if you prefer).  Each patch can be drawn at different detail levels.  So, a given 'square' of ground might be drawn at 1x1 all the way to 32x32, or even 64x64 quads.  This seems fine for environments where the draw distance will always be limited.  If this is a game 'on foot' or a flying game with a low maximum altitude, this works fine.  You can even 'scroll' over a larger world by loading in extra patches where you need them.  The basic render loop for this just needs to decide which patches are visible (frustum cull) and then at what detail level.  I started with this one, and it seems to work great until you get too far up from the planet surface.


  • Fixed patch detail, dynamic number of patches:  This is what I'm using because it makes more sense for planets, where the scale and total amount drawn can change quite a bit.  I chose a fixed patch detail amount.  Suppose it's 32x32 quads.  All patches are drawn at the fixed patch detail level.  As detail is needed, patches are split into 4 child subpatches.  Detail is taken away by recombining the 4 children back into their parent patch.


I use the same basic approach you are taking: I start with a cube of quads that I have squished into a sphere.  


I do not understand this question: 

How will i extract the planes to subdivide them from vertex data?


Are you squishing a cube into a sphere, and then trying to decide which quads in the sphere belong to which plane?  I did it this way:


  • I am using opengl , so I assume the coordinate system of positive Y is 'up', positive X is 'right' and positive Z sticks out of the monitor poking me in the eye.
  • Decide the center of the planet is <0,0,0>.  And that the planet will fit in a box from <-1,-1,-1> to <1,1,1>.  That makes all the math easy.  You can move the planet around and resize it later.
  • Start with a working quadtree algorithm for a single plane, at Y=1.  The plane should cover from <-1,1,-1> to <1,1,1>.  A nice square.  No deformations yet, just draw a quadtree in wireframe.  As you get closer to and farther from the plane, it should split/recombine.
  • After that is working, 'bend' the square around the top of the sphere.  For each point, simply normalize it, so it is a distance of 1 from the center point of <0,0,0>.  This will create the top 'lid' of the planet.  You should have the top part of the ball.
  • Displace the points on the top of the ball.  Not in the 'y' direction, but in the 'up' direction relative to the inhabitants of the ball.  You will find this direction is the same direction as the 'normal' of the ball, and since the radius of the ball is 1, and it is centered at <0,0,0>, the position of your point IS also the direction you need to displace in.  Very convenient.
  • You should have a working 'top' of a planet, where you can fly around and get more/less detail where you need it.
  • Add in the other 5 planes.


That was the basic idea.  Do you have any specific questions?

#5007847 3D algorithm

Posted by on 06 December 2012 - 01:10 PM

If you want to do simple 3d lines and such, an easy way to get going is to try simple perspective. This is what I played around with before I went into matrix math. You will probably get into matrices at some point, but if you just want something basic, you can start with this.

3d without rotation or translation:

A single perspective transform is as easy as this:

//sw is screen width, sh is screen height
//zcutoff is the z plane cutoff (don't render anything closer to or behind it).  WHY?  z=0 is forbidden (division by zero).  z > 0 means the point is in front of the camera; z < 0 means the point is behind the camera.  zcutoff of .1, .01, etc. are reasonable.

int perspective(float x, float y, float z, int* sx, int* sy)
{
	 if (z < zcutoff)
		 return 0;   //refuse to transform

	 *sx = x * sw / z;		//scale X by screen width and distance.
								  //So, an object sw pixels wide at Z=1 is the width of the screen.
								  //Z=2, half the width of the screen.  Z=.5, twice the width of the screen.  etc.
	 *sy = y * sh / z;
	 return 1;
}

To draw a line in 3d:
void line3d( float x, float y, float z, float x2, float y2, float z2)
{
	int sx, sy, sx2, sy2;
	//only draws if both points are in front of camera
	//later, if you want to get fancy, if one is in front and the other is behind, you clip at z=zcutoff

	if (perspective(x,y,z, &sx, &sy) && perspective(x2,y2,z2, &sx2, &sy2))
		draw_line(sx, sy, sx2, sy2);   //your 2d line routine
}

With the above snippets, you should be able to draw a 3d perspective object from the point of view of the origin.

To move the camera around, just subtract the camera position from the coordinates:

int perspective(float x, float y, float z, int* sx, int* sy)
{
	 x -= camera_x;
	 y -= camera_y;
	 z -= camera_z;

	 if (z < zcutoff)
		 return 0;   //refuse to transform
	 *sx = x * sw / z;		//scale X by screen width and distance.
								  //So, an object sw pixels wide at Z=1 is the width of the screen.
								  //Z=2, half the width of the screen.  Z=.5, twice the width of the screen.  etc.
	 *sy = y * sh / z;
	 return 1;
}

With that, you should be able to move around the 3d environment, but view is constrained to always looking down the Z axis. But, its a start.

The last thing you can do is allow camera rotation about the y axis (like wolf3d).  It's been a while, but if http://www.siggraph....tran/2drota.htm is correct, then:

int perspective(float x, float y, float z, int* sx, int* sy)
{
	 float xr, yr, zr;

	 //translate to camera position
	 x -= camera_x;
	 y -= camera_y;
	 z -= camera_z;

	 //rotate 2D about y axis:
	 xr = x * cos(camera_angle) - z * sin(camera_angle);
	 zr = z * cos(camera_angle) + x * sin(camera_angle);
	 yr = y; // height does not change

	 if (zr < zcutoff)
		 return 0;   //refuse to transform
	 *sx = xr * sw / zr;	//scale X by screen width and distance.
								  //So, an object sw pixels wide at Z=1 is the width of the screen.
								  //Z=2, half the width of the screen.  Z=.5, twice the width of the screen.  etc.
	 *sy = yr * sh / zr;
	 return 1;
}

That should give you 4 degrees of freedom: you can move up/down, left/right, forward/back, and rotate about Y.  So, it's 'DOOM' controls.  You can add another rotation to look up/down, but at that point you should consider trying to understand matrices.

#5002492 Catering for multiple vertex definitions

Posted by on 19 November 2012 - 05:16 PM

This is the format I use for VBO data, in a plain C application:
[source lang="cpp"]
#define POS_COUNT 3
#define NORMAL_COUNT 3
#define TEXCOORD_COUNT 2
#define COLOR_COUNT 4

struct vertex_data_s {
	int this_size;       //size of this structure

	float* databuffer;   //holds all data

	//pointers into data buffer:
	float* pos;          //pointer to pos for 0th vertex
	float* normal;       //pointer to normal for 0th vertex
	float* color;        //pointer to color for 0th vertex
	float* texcoord;     //pointer to 0th texcoord for 0th vertex
	int texcoord_count;

	int stride;          //number of floats to next vertex
	int count;           //number of vertices
} vertex_data_t;

//In the above, pos, normal, color, texcoord are all staggered pointers
//into the same databuffer.  To access a particular type of data, it's just:

// assume 'd' is a vertex_data_t*
d->pos[ d->stride * n ];     //nth vertex position
d->normal[ d->stride * n ];  //nth normal
[/source]

Suppose I have data that has position and normal, but no color or texcoords.  d->color and d->texcoord will be null, and d->stride will be set to NORMAL_COUNT + POSITION_COUNT.

What if I want 3 textures?  Then d->stride can be NORMAL_COUNT + POSITION_COUNT + 3*TEXCOORD_COUNT, and the three texcoords are:

[source lang="cpp"]
d->texcoord[ d->stride * n ]
d->texcoord[ d->stride * n + TEXCOORD_COUNT ]
d->texcoord[ d->stride * n + TEXCOORD_COUNT * 2 ]

//in general:
d->texcoord[ d->stride * n + TEXCOORD_COUNT * tn ]      //for the 's'
d->texcoord[ d->stride * n + TEXCOORD_COUNT * tn + 1 ]  //for the 't'
[/source]

These all get simplified by some macros, so I don't have to think about it. I just have macros like vertex_position( my_vbo, n) to get a pointer to the nth 3d vector (pointer to 3 floats).

With macros, it makes it easy for me to change the underlying structure and recompile, without rewriting everything else.  For C++, you could use accessor functions instead of macros.  You can also return the correct data types if you do this.  For instance, when I use the vertex_position macro, the macro casts the float* into a vertex_3*, which in my app is just 3 floats.  So I can do vertex_position(my_vbo, n)->y or vertex_texcoord(my_vbo, n, tn)->y if I want.  But in C++, you can do all kinds of other things with accessor functions, such as check for valid values of n, tn, etc, so you can use the same idea, but refine it a bit and make it more crash proof.

This structure also translates easily to opengl: call glBufferData using a pointer to d->databuffer, with size d->stride * sizeof(float) * d->count.  All the data is transferred to gl (and probably the gfx card vram) in one quick call.  Then when it's time to draw, just call glVertexPointer / glAttribPointer with these pointers, give each one d->stride, and then draw.  Fast and easy.  If a particular buffer has no normals, colors, etc, its d->normal or d->color pointer will be null and stride will be smaller: no wasted space.

#5000342 C++ - Is Goto a Good Practice?

Posted by on 12 November 2012 - 04:01 PM

Goto still exists for a reason. At work, we have a large C codebase that has a few goto's in it. It's used as 'goto cleanup', to bail-out of a function, but allow for freeing objects. In C++ you would use an exception or just let the destructors do their job, in C there is no such luxury. You might also be able to make a case for using it in a C++ program if you need to manually free something that doesn't have a destructor ( maybe you were interfacing to something written in plain C), but you would probably want to wrap your C structs into a C++ object with a proper destructor instead.

I do see this sometimes, and I think this construct is worse than a goto:
[source lang="C++"]
do {
	stuff;
	if (!stuff) break;   //really just a 'goto' in disguise
	stuff;
	if (oh_crap) break;  //another goto
	for (i = 0; xxx; xxx) {
		if (whatever(i)) {
			outer_break = true;
			break;
		}
	}
	if (outer_break) break;  // well wasn't that awkward
	stuff;
} while (0);
free(things);
[/source]
As you can see above, break is really just as bad as goto, and if you need to break from inside an inner loop, it gets awkward.

The other acceptable use I see for goto is the gcc extension of 'computed gotos', where you can have label pointers and build a jump table.  This is handy if you are writing a bytecode interpreter, BUT it is really just the same thing as the jump table that a good compiler is supposed to make for a switch/case statement.

Other than these two cases, goto is almost always a mistake.

#4985883 wolfenstein 3D (How To)

Posted by on 01 October 2012 - 03:50 PM

What rendering API are you targeting?  If you are rendering in software (pixel by pixel), then you will want to do proper raycasting like the old games did.  There are many tutorials on how raycasting was done for wolf3d-style engines.

However, I would recommend you put a modern spin on it and draw with a hardware accelerated API. There are many advantages:
  • When you are raycasting, you aren't drawing, you are just building a list of walls or floors that are visible to the player. You can then sort these by texture/material or whatever and just draw it.
  • You can rotate on other axes for 'free'; just mess with the modelview matrix.
  • Texturing walls and floors is easy; you don't have to deal with constant-z texture mapping
  • You can use 3d models for characters instead of sprites
  • If you're clever, a bump or displacement-mapped wolf3d level might look pretty good. You will be stuck in a 2.5D maze, but that doesn't mean the brick wall can't look bumpy and metal surfaces can't be environment mapped.
  • When the engine gets advanced enough that you want to move on from raycasting to something else (like portals & sectors), you will have a bunch of code you can reuse.

#4979833 FX compiler using more temporary registers than necessary

Posted by on 13 September 2012 - 02:59 PM

With the /Od flag (optimizations disabled), I get the least number of temporary registers used. Without /Od or with /O0 or /O1, I get 50% more registers used.

The extra registers are probably being used to speed things along.  If the architecture of the shader units is anything like the architecture of any modern cpu, the instructions overlap each other in execution.  Suppose you have an expression like a*b * c*d.  You can do it with 1 temp register:

tmp = a * b
tmp = tmp * c
tmp = tmp * d

Or you can do it with 2 temp registers:

tmp = a*b
tmp2 = c*d
tmp = tmp * tmp2

Both are three multiplies.  The difference is that in the second one, the multiplies can overlap.  tmp = a*b and tmp2 = c*d have no dependencies on each other, so they can happen at the same time.  The price you pay for these kinds of speed optimizations is extra registers being used.