kalle_h

Member Since 05 May 2012
Online Last Active Today, 05:16 PM

Posts I've Made

In Topic: Matrix palettes in vertex shader

12 May 2013 - 05:53 PM

 


Also, I think you should consider encoding the matrix palette transforms as a quaternion rotation + translation, which takes up half as many registers.

 

That's a tradeoff as you're going to need to convert it to a matrix for each bone in your vertex shader, so you could end up with less storage but at a cost of a high additional per-vertex computational overhead (if hardware supported quaternion transforms it would be different, of course).

 

 

vec4 orientation = u_quaternionArray[index];

// 2 / |q|^2, so the quaternion does not need to be normalized
float invS = 2.0 / dot(orientation, orientation);
vec3 s = orientation.xyz * invS;
vec3 w = orientation.w   * s;       // (2wx, 2wy, 2wz) / |q|^2
vec3 x = orientation.x   * s;       // (2xx, 2xy, 2xz) / |q|^2
vec3 y = orientation.yyz * s.yzz;   // (2yy, 2yz, 2zz) / |q|^2

vec4 posAndScale = u_positionAndScaleArray[index]; // xyz = translation, w = uniform scale

// mat4 takes four columns: three scaled rotation columns, then the translation
mat4 objectToWorld = mat4(
    posAndScale.w * vec4(1.0 - (y.x + y.z), x.y - w.z, x.z + w.y, 0.0),
    posAndScale.w * vec4(x.y + w.z, 1.0 - (x.x + y.z), y.y - w.x, 0.0),
    posAndScale.w * vec4(x.z - w.y, y.y + w.x, 1.0 - (x.x + y.x), 0.0),
    vec4(posAndScale.xyz, 1.0));

The additional cost per vertex is not that big after all. I didn't even notice a performance drop on an iPhone 4S. It's quite GPU-friendly ALU code.
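
For anyone who wants to check the math off the GPU, here is a rough CPU-side sketch of the same construction in Python. The function names (object_to_world, transform_point) are made up for illustration, and the matrix is stored as four columns in the order the shader passes them to mat4.

```python
import math

def object_to_world(q, pos_and_scale):
    """Mirror the shader math. q = (x, y, z, w); pos_and_scale = (tx, ty, tz, scale).
    Returns four columns, in the order the shader passes them to mat4(...)."""
    qx, qy, qz, qw = q
    tx, ty, tz, scale = pos_and_scale
    # 2 / |q|^2, so q does not have to be unit length
    inv_s = 2.0 / (qx * qx + qy * qy + qz * qz + qw * qw)
    s = (qx * inv_s, qy * inv_s, qz * inv_s)
    w = tuple(qw * c for c in s)
    x = tuple(qx * c for c in s)
    y = (qy * s[1], qy * s[2], qz * s[2])  # orientation.yyz * s.yzz
    col0 = (1.0 - (y[0] + y[2]), x[1] - w[2], x[2] + w[1], 0.0)
    col1 = (x[1] + w[2], 1.0 - (x[0] + y[2]), y[1] - w[0], 0.0)
    col2 = (x[2] - w[1], y[1] + w[0], 1.0 - (x[0] + y[0]), 0.0)
    cols = [tuple(scale * c for c in col) for col in (col0, col1, col2)]
    cols.append((tx, ty, tz, 1.0))
    return cols

def transform_point(cols, p):
    """Compute M * vec4(p, 1) for a matrix stored as four columns."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(cols[c][r] * v[c] for c in range(4)) for r in range(3))

# Quaternion for a +90 degree turn about z, scale 2, translation (1, 2, 3)
h = math.sqrt(0.5)
m = object_to_world((0.0, 0.0, h, h), (1.0, 2.0, 3.0, 2.0))
print(transform_point(m, (1.0, 0.0, 0.0)))
```

In this sketch, a quaternion for +90 degrees about z sends (1, 0, 0) to roughly (0, -1, 0) before scale and translation, so the rotation block is the transpose of the usual textbook matrix. That may just reflect the quaternion convention in use, so it is worth checking against your own.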


In Topic: Is software rasterisation processor-heavy?

06 May 2013 - 12:57 PM

In the occlusion rasterizer that I'm using, my frame-buffer is only 1 bit per pixel (drawn to yet, or not drawn to). I render all the triangles for occluders and occludees from front-to-back, with occluders filling pixels, and occludees testing if all their pixels are filled or not (if any of their pixels aren't filled, the occludee is visible).
Using SSE, you can fill 128 pixels at a time with this algorithm, so the actual rasterization is not at all a bottleneck. I can run at pretty much any resolution without much difference in speed. The real bottleneck for me is actually transforming all of the vertices from model space into screen space (i.e. the 'vertex shader').
 
I break the frame-buffer into 'tiles', each of them 128 pixels wide and <height> pixels tall. Each of these tiles can then be independently rasterized by a different thread.
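
The 1-bit-per-pixel idea described above can be sketched roughly as follows. This is not the quoted poster's implementation: it assumes axis-aligned rectangles instead of triangles, uses a Python integer as a stand-in for a 128-bit SSE register, and the names (OcclusionTile, span_mask) are invented for the example.

```python
TILE_WIDTH = 128  # one bit per pixel, so one row fits a 128-bit register

def span_mask(x0, x1):
    """Bit mask with bits x0..x1-1 set (0 <= x0 <= x1 <= TILE_WIDTH)."""
    return ((1 << (x1 - x0)) - 1) << x0

class OcclusionTile:
    """One tile of the coverage buffer: a 128-bit row mask per scanline."""

    def __init__(self, height):
        self.rows = [0] * height

    def fill_rect(self, x0, y0, x1, y1):
        """Occluder: mark every covered pixel as drawn."""
        mask = span_mask(x0, x1)
        for y in range(y0, y1):
            self.rows[y] |= mask

    def is_rect_visible(self, x0, y0, x1, y1):
        """Occludee: visible if any covered pixel has not been drawn yet."""
        mask = span_mask(x0, x1)
        return any((self.rows[y] & mask) != mask for y in range(y0, y1))

tile = OcclusionTile(height=8)
tile.fill_rect(0, 0, 64, 8)                # occluder covers the left half
print(tile.is_rect_visible(10, 2, 20, 5))  # fully behind the occluder
print(tile.is_rect_visible(60, 2, 70, 5))  # pokes out past column 64
```

Each row update or test is a single OR or AND over the whole 128-pixel span, which is why the fill itself stays cheap. Since tiles share no state, each one could be handed to a different thread.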

How are you sorting the triangles? What about intersections? Or do you sort by max z and rasterize the whole triangle with that? Sounds like a great technique, but I'd like to hear more details.

In Topic: Is software rasterisation processor-heavy?

06 May 2013 - 12:54 AM

http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/

Sixteen blog posts about how to write an occlusion culling system and then optimize it. Pure gold.

In Topic: Libgdx camera - pixel issues

28 April 2013 - 04:52 PM

I have thought about that, but won't that cause problems with the physics when resizing?

Physics doesn't have to know anything about rendering or screen size. Physics uses meters, and when you need to render physics objects you just use those sizes as they are. It's the camera's responsibility to know the size of the physics world and the size of the screen, and to do the scaling between them.
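
A minimal sketch of that split, in plain Python rather than libGDX's own camera classes: the Camera class and its methods here are hypothetical, and it assumes the camera's bottom-left corner sits at the world origin.

```python
class Camera:
    """Maps world positions in meters to screen positions in pixels."""

    def __init__(self, view_width_m, view_height_m, screen_width_px, screen_height_px):
        self.view_width_m = view_width_m
        self.view_height_m = view_height_m
        self.resize(screen_width_px, screen_height_px)

    def resize(self, screen_width_px, screen_height_px):
        """Called when the window changes; the physics world is untouched."""
        self.screen_width_px = screen_width_px
        self.screen_height_px = screen_height_px

    def world_to_screen(self, x_m, y_m):
        """Scale a world position by the current pixels-per-meter ratio."""
        return (x_m * self.screen_width_px / self.view_width_m,
                y_m * self.screen_height_px / self.view_height_m)

body_position_m = (5.0, 2.5)            # owned by the physics step, in meters
camera = Camera(10.0, 5.0, 800, 400)    # 10 m x 5 m visible on 800 x 400 px
print(camera.world_to_screen(*body_position_m))
camera.resize(1600, 800)                # only the camera changes on resize
print(camera.world_to_screen(*body_position_m))
```

Resizing only changes the pixels-per-meter ratio inside the camera, so the body stays at (5.0, 2.5) meters and the simulation behaves the same at any resolution.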

In Topic: Libgdx camera - pixel issues

27 April 2013 - 02:44 PM

Just plan everything in meters and use the camera to scale the Box2D world to pixel units when rendering.
