Ameise

Member Since 13 Jan 2007
Offline Last Active Jun 17 2014 12:47 PM
-----

Topics I've Started

Precompiled Bytecode format

15 February 2013 - 12:31 AM

I need to execute AngelScript bytecode in my own virtual machine, because I have to perform tasks during execution that the AngelScript VM is incapable of. Currently I am calling SaveByteCode on the module and then trying to parse the resulting bytecode. The problem so far is that I am unsure what the format of this output is... it is not simply a flat stream of bytecode instructions.

I wrote a very simple AS script:

 

void run ()
{
	print("RUN CALLED");
}

 

 

Which results in the following output:

 

01 00 00 00 00 00 01 66 6E 03 		? ? ? ? ? ? ? f n ? 
72 75 6E 40 4E 00 00 00 00 01 		r u n @ N ? ? ? ? ? 
00 00 00 0C 3F 3C 00 3D 00 3B 		? ? ? ? ? < ? = ? ; 
04 01 3D 01 04 01 3D 02 04 01 		? ? = ? ? ? = ? ? ? 
3D 03 3F 0A 00 01 01 6F 6E 06 		= ? ? ? ? ? ? o n ? 
73 74 72 69 6E 67 00 04 01 00 		s t r i n g ? ? ? ? 
02 06 01 01 0A 01 00 00 00 01 		? ? ? ? ? ? ? ? ? ? 
72 00 00 00 00 05 61 6E 10 5F 		r ? ? ? ? ? a n ? _ 
73 74 72 69 6E 67 5F 66 61 63 		s t r i n g _ f a c 
74 6F 72 79 5F 05 6F 72 01 00 		t o r y _ ? o r ? ? 
00 00 01 01 02 40 42 00 01 40 		? ? ? ? ? @ B ? ? @ 
4A 01 01 00 00 00 00 00 61 6E 		J ? ? ? ? ? ? ? a n 
07 5F 62 65 68 5F 30 5F 00 00 		? _ b e h _ 0 _ ? ? 
01 00 01 01 01 00 00 6F 72 01 		? ? ? ? ? ? ? o r ? 
00 00 61 6E 05 70 72 69 6E 74 		? ? a n ? p r i n t 
00 00 01 00 01 01 01 00 00 00 		? ? ? ? ? ? ? ? ? ? 
00 61 6E 07 5F 62 65 68 5F 31 		? a n ? _ b e h _ 1 
5F 00 00 00 00 00 00 6F 72 01 		_ ? ? ? ? ? ? o r ? 
00 00 6E 00 01 6E 0A 52 55 4E 		? ? n ? ? n ? R U N 
20 43 41 4C 4C 45 44 00 00 00 		  C A L L E D ? ? ? 

 

 


How would I go about loading this and other scripts? The AngelScript documentation doesn't go into detail about the specifics of the precompiled bytecode format, only the API side of things.
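
For reference, the save/load path I am using looks roughly like this (a minimal sketch only; FileStream and the SaveModule/LoadModule helpers are my own names, and the exact asIBinaryStream signatures vary between AngelScript versions):

// Minimal sketch of the save/load round trip via asIBinaryStream.
// Note: older AngelScript releases declare Read/Write as returning void
// rather than int; adjust to match your angelscript.h.
#include <angelscript.h>
#include <cstdio>

class FileStream : public asIBinaryStream
{
public:
	explicit FileStream(FILE *f) : m_file(f) {}

	int Write(const void *ptr, asUINT size)
	{
		if (size == 0) return 0;
		return fwrite(ptr, size, 1, m_file) == 1 ? 0 : -1;
	}

	int Read(void *ptr, asUINT size)
	{
		if (size == 0) return 0;
		return fread(ptr, size, 1, m_file) == 1 ? 0 : -1;
	}

private:
	FILE *m_file;
};

// Saving: mod is a module that has already been built.
bool SaveModule(asIScriptModule *mod, const char *path)
{
	FILE *f = fopen(path, "wb");
	if (!f) return false;
	FileStream stream(f);
	int r = mod->SaveByteCode(&stream);
	fclose(f);
	return r >= 0;
}

// Loading: mod is a fresh module on an identically configured engine.
bool LoadModule(asIScriptModule *mod, const char *path)
{
	FILE *f = fopen(path, "rb");
	if (!f) return false;
	FileStream stream(f);
	int r = mod->LoadByteCode(&stream);
	fclose(f);
	return r >= 0;
}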


Need advice on which architecture to go with

27 March 2012 - 10:01 PM

I am currently reworking my game's engine, and trying to improve the architecture further. The primary principles behind the architecture so far are parallelism and scalability.

The basic structure has two asynchronous tasks running: game processing and rendering. I've already implemented and tested the link between them, and it exhibits good latency and performance. The render thread can be at most one frame behind, due to how the game thread synchronizes data. Both threads are, of course, able to spawn their own dispatched tasks that are local to them.
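
To give an idea of the handoff, it boils down to something like this (a heavily simplified sketch; FrameState and FrameExchange are just illustrative names, and the real version avoids copying the whole state under the lock):

// Simplified sketch of the game->render handoff: the game thread publishes
// its finished frame state, and the render thread always consumes the most
// recent complete frame, so it is never more than one frame behind.
#include <mutex>

struct FrameState
{
	// Whatever the renderer needs: transforms, visibility lists, etc.
	double simulationTime = 0.0;
};

class FrameExchange
{
public:
	void Publish(const FrameState &state)	// game thread, once per tick
	{
		std::lock_guard<std::mutex> lock(m_mutex);
		m_latest = state;
	}

	FrameState Acquire() const	// render thread, once per frame
	{
		std::lock_guard<std::mutex> lock(m_mutex);
		return m_latest;
	}

private:
	mutable std::mutex m_mutex;
	FrameState m_latest;
};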

My consideration is further splitting the rendering thread into one or more game rendering threads and a UI thread. The game rendering thread(s) would draw into a render buffer, which the UI thread would lock and draw along with the game UI objects (shared across contexts via display-list sharing). My thinking is that this would allow the UI (menus, for instance) to keep working properly even if the game is slow or lagging. The downside is that it increases the maximum frame-behind state from one frame to two (due to the extra step). I am also unsure how well OpenGL implementations would take to this; technically, this sort of multi-threading is safe, but I doubt it will be efficient -- I am not confident that the drivers don't just wrap each function in a common critical section.

I'm curious what other people think about this.

Fastest Instancing method in OpenGL

08 November 2011 - 04:37 AM

What is currently the fastest way to pass per-object data, such as positions and bone offsets, into the pipeline so that an instanced drawing call (such as glDrawElementsInstanced) can access it? The issue I am seeing is that if one uses a texture/pixel buffer to pass the data, the dataset can get quite large for large numbers of objects -- too large to handle efficiently.

My target in regards to this is OpenGL 3.2.
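
For context, the texture-buffer variant I have been experimenting with looks roughly like this (a minimal sketch only; the InstanceBuffer struct and the one-mat4-per-instance layout are just illustrative, and the vertex shader would read the data with texelFetch on a samplerBuffer indexed by gl_InstanceID):

// Minimal sketch of feeding per-instance data through a texture buffer
// object (core since GL 3.1). Each instance gets four RGBA32F texels,
// i.e. one 4x4 world matrix.
#include <GL/glew.h>	// or whichever GL loader the project uses
#include <vector>

struct InstanceBuffer
{
	GLuint buffer  = 0;
	GLuint texture = 0;
};

void CreateInstanceBuffer(InstanceBuffer &ib)
{
	glGenBuffers(1, &ib.buffer);
	glGenTextures(1, &ib.texture);

	// Expose the buffer to shaders as a 1D array of RGBA32F texels.
	glBindTexture(GL_TEXTURE_BUFFER, ib.texture);
	glTexBuffer(GL_TEXTURE_BUFFER, GL_RGBA32F, ib.buffer);
}

// matrices: one 4x4 world matrix per instance, stored as 16 floats each.
void DrawInstances(const InstanceBuffer &ib, const std::vector<float> &matrices,
                   GLint samplerUniform, GLsizei indexCount, GLsizei instanceCount)
{
	// Re-upload this frame's per-instance data.
	glBindBuffer(GL_TEXTURE_BUFFER, ib.buffer);
	glBufferData(GL_TEXTURE_BUFFER, matrices.size() * sizeof(float),
	             matrices.data(), GL_STREAM_DRAW);

	// Bind the buffer texture to unit 0 and point the samplerBuffer at it.
	glActiveTexture(GL_TEXTURE0);
	glBindTexture(GL_TEXTURE_BUFFER, ib.texture);
	glUniform1i(samplerUniform, 0);

	glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT,
	                        0, instanceCount);
}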

Thank you!

EDIT: Mod may want to move this to the OpenGL forum?

Problems setting up SSAO

07 October 2011 - 05:07 AM

First off, I've already read the SSAO articles on GameDev and in a few other places.

After trying their shaders (which gave me either noise or strange results), I tried to write my own:


#extension GL_ARB_texture_multisample : enable

uniform sampler2DMS	normalMap;
uniform sampler2DMS	depthMap;

uniform vec2		screenResolution;

smooth in vec2 texCoord;

out vec4 vFragColor0;

vec4 getTexelI (sampler2DMS tex, int samples, ivec2 coord)
{
	vec4 res = vec4(0.0);	// accumulator must start at zero
	for (int i = 0; i < samples; ++i)
	{
		res += texelFetch(
			tex,
			coord,
			i
		);
	}
	return res * (1.0 / float(samples));
}

const float rayLength = 0.0000001;

const int numSamples = 4;

const vec3 s_directions[numSamples] = vec3[] (
	normalize(vec3(-1, 0, -1)),
	normalize(vec3(1, 0, -1)),
	normalize(vec3(0, -1, -1)),
	normalize(vec3(0, 1, -1))
);

vec3 adjustNormal (vec3 inNormal, vec3 bumpNormal)
{
	mat3 mat;
	vec3 left = bumpNormal;
	left.xy = left.yx;
	left.x = -left.x;
	vec3 up = cross(bumpNormal, left);
	left = cross(bumpNormal, up);
	mat[0] = left;
	mat[1] = bumpNormal;
	mat[2] = up;
	
	return mat * inNormal;
}

void main ()
{
	// Get the pixel coordinate as a texel coordinate
	ivec2 texelCoord = ivec2(
		int(texCoord.x * screenResolution.x),
		int(texCoord.y * screenResolution.y)
	);

	// Get screen-normal and depth
	vec3 centerNormal = getTexelI(normalMap, 4, texelCoord).rgb;
	float centerDepth = getTexelI(depthMap, 4, texelCoord).g;
	
	// Flip the distance from 1-0 to 0-1
	float distanceAdjusted = -(centerDepth - 1.0);

	// Decide how long the ray should be, given depth.
	vec2 distanceResAdjusted = vec2(
		distanceAdjusted * screenResolution.x,
		distanceAdjusted * screenResolution.y
	);
	
	float occlusion = 0.0;
	float samples = float(numSamples);
	
	for (int i = 0; i < numSamples; ++i)
	{
	
		// Get the sample direction.
		vec3 sampleDirection = s_directions[i];
		
		// Rotate the sample direction to be in the same rotation space as the screen normal.
		// I believe that this is where it is broken.
		sampleDirection = adjustNormal(centerNormal, sampleDirection);

		// Adjust the length of the Z coordinate, as the range is only 0 - 1.
		sampleDirection.z *= distanceAdjusted * rayLength;
		
		// Get the length of the offset in texel coordinates
		vec2 iSampleDirection = vec2(
			distanceResAdjusted.x * sampleDirection.x,
			distanceResAdjusted.y * sampleDirection.y
		);
		
		// Get the new offset to look up.
		vec2 texOffset = texelCoord + iSampleDirection;
		
		// Clamp the offset to be within the screen.
		// This causes some artifacts at the screen edges.
		texOffset.x = clamp(texOffset.x, 0, screenResolution.x);
		texOffset.y = clamp(texOffset.y, 0, screenResolution.y);
		
		// Get the height of the offset texel
		float sampleHeight = getTexelI(depthMap, 4, ivec2(texOffset)).g;
		
		// Compare them. If sampleHeight is less than compareHeight, add 1.0 to the occlusion value.
		float compareHeight = centerDepth - sampleDirection.z;
		
        occlusion += ceil(clamp(-(sampleHeight - compareHeight), 0.0, 1.0));
	}
	
	vFragColor0.rgb = vec3(0.0);
	vFragColor0.a = occlusion / samples;
}


There are obviously problems with it, but I have not been able to find them.

The inputs are:
normalMap - a multisampled framebuffer consisting of screen-space normals; X and Y map directly, and Z is depth (positive meaning towards the camera).
depthMap - a multisampled framebuffer consisting of Z-depth.
screenResolution - the resolution of the framebuffers, as floats (e.g., 1024x768).
texCoord - the texture coordinate from the drawn full-screen quad.

The only output that is needed is the alpha channel of the fragment - it controls how dark to draw the shading. The RGB channels are ignored.

The effect I am seeing is this:
[screenshot of the broken SSAO output]


This is obviously incorrect. Beyond that, the shading moves heavily with the camera. From my understanding of the algorithm, the shading should be relatively static, since it is relative to geometry depth (and hence position), so that should -not- happen -- it looks more like broken shadowing than anything. I am assuming that my 'adjustNormal' function is written incorrectly, but my brain is simply not in a state to debug it.

Can anyone assist me or offer some suggestions as to what to do? Thank you!

Fastest Occlusion method with OpenGL

06 October 2011 - 05:34 AM

I am currently torn between two different means of handling occlusion in OpenGL, and wanted to get some input before I actually implement either form.

As I see it, there are two (reasonable) means by which to do it:

Delayed Occlusion - You send out an occlusion query, and then read the result several frames later. This reduces or eliminates the stall by not requiring an immediate read from the graphics device, but rather one several frames later. The drawback is that it CAN still stall, and it also forces the occlusion result to be several frames old, causing objects to just 'appear' at the edge of the screen.

Deferred Occlusion - You create a second GL Context within the same process, but in a different thread. You give it its own framebuffer, or possibly use the depth buffer of the primary thread with writing disabled. You do all occlusion testing within that thread, and feed the results back to the primary thread. The advantage SHOULD be no stalling at all, as they use separate framebuffers. I, however, do NOT know how well OpenGL or the graphics device would handle this.
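
For clarity, the delayed variant I have in mind is the standard asynchronous query pattern, roughly like this (a sketch only; DrawBoundingVolume and the struct are placeholders for my own code):

// Sketch of a delayed occlusion query: issue the query against a bounding
// volume this frame, and only consume the result once the driver reports it
// is available, so the CPU never blocks on the readback.
#include <GL/glew.h>	// or whichever GL loader the project uses

void DrawBoundingVolume();	// placeholder: draws the object's bounding volume

struct OcclusionQuery
{
	GLuint id      = 0;
	bool   pending = false;
	bool   visible = true;	// assume visible until a result says otherwise
};

void IssueQuery(OcclusionQuery &q)
{
	if (q.id == 0)
		glGenQueries(1, &q.id);

	// Typically issued with color and depth writes disabled.
	glBeginQuery(GL_SAMPLES_PASSED, q.id);
	DrawBoundingVolume();
	glEndQuery(GL_SAMPLES_PASSED);
	q.pending = true;
}

void PollQuery(OcclusionQuery &q)
{
	if (!q.pending)
		return;

	GLuint available = 0;
	glGetQueryObjectuiv(q.id, GL_QUERY_RESULT_AVAILABLE, &available);
	if (!available)
		return;	// result not ready yet; keep using the previous visibility

	GLuint samples = 0;
	glGetQueryObjectuiv(q.id, GL_QUERY_RESULT, &samples);
	q.visible = (samples > 0);
	q.pending = false;
}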

Any thoughts?
