Vertex Shaders and Textures for YAGSS

fastcall22


This is a continuation of the Yet Another Generic Space Shooter (YAGSS) game.

Most of the graphics programming I've learned has been done through software rendering and fixed-function OpenGL, so I wasn't sure what to expect when writing my first real vertex shader. The sprites themselves would be simple, only having x, y, rotation, and scale as properties, and a simple tinting effect.

It was confusing at first to get the input layout just right for the two buffers bound to the vertex stage: the vertex buffer and the instance buffer. It turns out that InstanceDataStepRate really matters for the per-instance elements. Get it wrong and you'll draw the same instance repeatedly. Oops.
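To illustrate why the step rate matters, here is a small C++ model of how a per-instance element could be selected for each drawn instance. This is not D3D11 code: instance_element_index is a hypothetical helper written for this example, and the handling of a rate of 0 (never advancing) is an assumption that should be checked against the D3D11 documentation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of per-instance element selection; not a D3D11 API.
// step_rate plays the role of InstanceDataStepRate: the number of instances
// drawn with the same element before advancing by one element.
std::size_t instance_element_index(std::size_t instance_id, std::size_t step_rate) {
    // Assumption: a rate of 0 never advances, so every instance reads element 0.
    if (step_rate == 0) {
        return 0;
    }
    return instance_id / step_rate;
}
```

Under this model, a rate of 1 gives every instance its own element, while a rate of 0 hands every instance the first element, which would match the "same instance drawn repeatedly" symptom.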

As for the vertex shader itself, I had the greatest idea ever: Why not have the GPU do the between-state interpolation? The properties are simple enough, and it would save bandwidth, as I'd only need to update the constant buffer containing the interpolation amount in between frames, rather than the entire instances' state every frame. To do this, I packed the sprite's position, rotation, and scale into a single float4. Doubling this, I have a previous and current state. From there, I can interpolate between the two states.

// buffers
cbuffer constants {
    float4 camera_transform; // (x,y,z,w) = (x,y,scale_x,scale_y)
    float4 time;             // (x,y,z,w) = (app_time,game_time,interp,0)
};

// types
struct in_instance {
    // vertex
    float2 pos : POSITION;
    float4 color : COLOR;

    // instance
    // 0 previous state, 1 current state (backwards for some reason, w/e)
    float4 transform[2] : TEXCOORD0; // (x,y,z,w) = (x,y,rotation,scale)
    float4 tint[2] : TEXCOORD2;
};

struct in_pixel {
    float4 pos : SV_POSITION;
    float4 color : COLOR;
};

// main
in_pixel main( in in_instance IN ) {
    // interpolate
    float interp = time.z;
    float4 t = lerp(IN.transform[1],IN.transform[0],interp);
    float4 tint = lerp(IN.tint[1],IN.tint[0],interp);

    // setup model transform
    float c = cos(t.z) * t.w;
    float s = sin(t.z) * t.w;
    float2 U = float2( c, s );
    float2 V = float2( -s, c );

    // transform
    float2 vpos = IN.pos;
    float2 pos = U*vpos.x + V*vpos.y + t.xy - camera_transform.xy;
    pos /= camera_transform.zw;

    // output
    in_pixel OUT;
    OUT.pos = float4(pos,0,1);
    OUT.color = float4(lerp(IN.color.rgb,tint.rgb,tint.a),IN.color.a);
    return OUT;
}

The first sprite shader. Textures not included.
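For anyone who wants to check the per-vertex work on the CPU, here is a minimal C++ sketch of the same steps: interpolate two packed (x, y, rotation, scale) states, then rotate, scale, and translate a model-space vertex. Float2, Float4, transform_vertex, and interpolated_vertex are names invented for this sketch, and the camera step is left out to keep it short.

```cpp
#include <cassert>
#include <cmath>

struct Float2 { float x, y; };
struct Float4 { float x, y, z, w; };  // (x, y, rotation, scale), packed as in the shader

// Component-wise linear interpolation, same convention as HLSL lerp(a, b, t).
Float4 lerp(const Float4& a, const Float4& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t,
             a.w + (b.w - a.w) * t };
}

// Rotate and scale vertex v by the state t, then translate by t.xy.
Float2 transform_vertex(const Float4& t, Float2 v) {
    const float c = std::cos(t.z) * t.w;
    const float s = std::sin(t.z) * t.w;
    const Float2 U{ c, s };   // rotated, scaled x axis
    const Float2 V{ -s, c };  // rotated, scaled y axis
    return { U.x * v.x + V.x * v.y + t.x,
             U.y * v.x + V.y * v.y + t.y };
}

// Interpolate between two states and transform one vertex with the result.
Float2 interpolated_vertex(const Float4& from, const Float4& to, float interp, Float2 v) {
    return transform_vertex(lerp(from, to, interp), v);
}
```

Note that the sin and cos are recomputed on every call, which mirrors the shader recomputing them for every vertex. Interpolating the rotation linearly also assumes the two angles do not straddle a wraparound.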

The update logic went something like this:

bool did_update = false;
for ( ticks_bucket += ticks_elapsed; ticks_bucket >= ticks_per_update; ticks_bucket -= ticks_per_update ) {
    update(time_per_update); // mutates instance array
    did_update = true;
}

if ( did_update )
    update_instance_buffer(); // copies instance array directly to vertex buffer

float interp = ticks_bucket / (float)ticks_per_update;
shader_constants.interp = interp;
update_constant_buffers();
render(interp);

Game loop excerpt (pseudo).
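The accumulator part of that loop can be isolated and tested on its own. The following is a minimal C++ sketch under the assumption that ticks are plain integers. FixedStepClock and StepResult are hypothetical names for this example, and the real update call is replaced by a counter.

```cpp
#include <cassert>
#include <cstdint>

struct StepResult {
    int updates;   // number of fixed updates that ran for this frame
    float interp;  // fraction of the way to the next update, in [0, 1)
};

// Hypothetical helper that wraps the ticks_bucket logic from the loop above.
struct FixedStepClock {
    std::int64_t ticks_per_update;
    std::int64_t ticks_bucket;

    explicit FixedStepClock(std::int64_t per_update)
        : ticks_per_update(per_update), ticks_bucket(0) {
        assert(per_update > 0);
    }

    StepResult advance(std::int64_t ticks_elapsed) {
        int updates = 0;
        for (ticks_bucket += ticks_elapsed;
             ticks_bucket >= ticks_per_update;
             ticks_bucket -= ticks_per_update) {
            ++updates;  // the real loop would call update(time_per_update) here
        }
        const float interp = static_cast<float>(ticks_bucket)
                           / static_cast<float>(ticks_per_update);
        return { updates, interp };
    }
};
```

With 10 ticks per update, a frame that took 25 ticks runs two updates and leaves 5 ticks in the bucket, so the renderer gets an interpolation amount of 0.5.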

What I didn't realize until rewriting the shader was that there was a significant performance penalty: the model's transform is rebuilt for every vertex of every instance, even though each sprite has only four vertices. I decided to see how it compared to a more "standard" approach. Here's what the second approach looked like:

// buffers
cbuffer constants {
    row_major float2x3 camera_transform;
    row_major float2x3 atlas_transform; // TODO: remove this, sprite atlas should be normalized floats
    float4 time; // (x,y,z,w) = (app_time,game_time,interp,0)
};

// types
struct in_instance {
    // vertex
    float2 pos : POSITION;
    float2 texcoord : TEXCOORD0;
    float4 color : COLOR;

    // instance
    row_major float2x3 transform : TEXCOORD1;
    row_major float2x3 tex_transform : TEXCOORD3;
    float4 tint : TEXCOORD5;
};

struct in_pixel {
    float4 pos : SV_POSITION;
    float4 color : COLOR;
    float2 texcoord : TEXCOORD;
};

// main
in_pixel main( in in_instance IN ) {
    // transform position
    in_pixel OUT;
    float2 pos = IN.pos;
    pos = mul(IN.transform, float3(pos,1));
    pos = mul(camera_transform, float3(pos,1));

    // transform texture
    float2 texcoord = IN.texcoord;
    texcoord = mul(IN.tex_transform, float3(texcoord,1));
    texcoord = mul(atlas_transform, float3(texcoord,1));

    // OUT.pos = float4(IN.camera_transform * float3(IN.transform*float3(IN.pos,1),1),0,1);
    OUT.pos = float4(pos,0,1);
    OUT.color = float4(lerp(IN.color.rgb,IN.tint.rgb,IN.tint.a),IN.color.a);
    OUT.texcoord = texcoord;
    return OUT;
}

Second attempt. Includes textures and spooky matrix math!

And the game loop:

// update
for ( ticks_bucket += ticks_elapsed; ticks_bucket >= ticks_per_update; ticks_bucket -= ticks_per_update )
    update(time_per_frame);

float interp = ticks_bucket / (float)ticks_per_update;
update_constant_buffer();
// entity_renderer maps each entity to an interpolated instance
copy(begin(entities),end(entities),begin(instances),entity_renderer(interp));
update_instance_buffer();
render(interp);

Compared to the first attempt (and before textures were added to the shader), I was able to push out 38% more untextured sprites, going from ~55000 to ~76000. It was then I decided that I probably won't even reach 3% of that number for this game. Oh well. Now about this second approach...
The interpolation is done inside the entity_renderer, taking entities and translating their interpolatable properties into instances to be sent to the GPU. The matrix-vector math was tricky, but by arranging it like so, I was able to shave off a row from each matrix:

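\[
\begin{bmatrix} U_x & V_x & d_x \\ U_y & V_y & d_y \end{bmatrix}
\begin{bmatrix} p_x \\ p_y \\ 1 \end{bmatrix}
=
\begin{bmatrix} U_x p_x + V_x p_y + d_x \\ U_y p_x + V_y p_y + d_y \end{bmatrix}
\]

Anyways, here's the engine now. Textures 'n' stuff: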
f70ncEu.png
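The 2x3 matrix arrangement described above can also be written out on the CPU. This is a minimal C++ sketch, not the engine's code: Vec2, Affine2x3, make_transform, and apply are names invented here, and the U and V columns are built the same way as in the first shader.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Row-major 2x3 affine matrix:
//   [ Ux Vx dx ]
//   [ Uy Vy dy ]
// The third row (0, 0, 1) of the full 3x3 matrix is implied and never stored.
struct Affine2x3 {
    float ux, vx, dx;
    float uy, vy, dy;
};

// Build the matrix from position, rotation (radians), and uniform scale.
Affine2x3 make_transform(float x, float y, float rotation, float scale) {
    const float c = std::cos(rotation) * scale;
    const float s = std::sin(rotation) * scale;
    return { c, -s, x,
             s,  c, y };
}

// Multiply by the column vector (p.x, p.y, 1); the result is a 2D point.
Vec2 apply(const Affine2x3& m, Vec2 p) {
    return { m.ux * p.x + m.vx * p.y + m.dx,
             m.uy * p.x + m.vy * p.y + m.dy };
}
```

Storing six floats instead of nine per matrix is where the saved row shows up, and the sin and cos are evaluated once per sprite here rather than once per vertex.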

I'm still playing with the blending modes. I may also end up ditching the glow effect that's baked into the textures. I might be able to save on fillrate by using a fullscreen shader effect rather than rendering a bunch of bloated sprites just for their glow.

In any case, I guess I should start actually building the game now...


4 Comments



I wonder if you can do that with one pass. Glow is additive, your shapes are not. But this AO effect is nice.

I should also mention that the performance comparison between the two shaders isn't exactly reliable, which is why I say I should "start actually building the game now." Then I'll have something to base the performance on.
