What I want to do is render a load of splines. The game I'm considering is based around line-art, and I want sharper images than I could reasonably get using textures (and nicer animation, too), so I want to actually draw the lines at runtime.
Happily, it's a pretty easy task, and there are some lovely opportunities for scalability. I should be able to get the benefit of hardware antialiasing as well.
Firstly, all splines, regardless of the number of segments actually composing them, will be broken down into cubic Bézier curves (four control points each). It's actually very easy to 'stitch together' a series of cubic curves into a spline; you get C0 continuity (no gaps) by making the end point of one curve equal to the start point of the next, and you get C1 continuity (smooth curve) by making the point3->point4 vector on one curve the same as the next curve's point1->point2 vector.
So, I need to write rendering code for a whole load of cubic Bézier curves. To give my artists some extra control, I also want to support variable thickness along the curves. Naturally, just as with C0/C1 continuity, I need to match thicknesses at the endpoints of neighbouring curves to get smooth connections, and I need to match thickness gradients to get a smooth rate of change.
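To make the stitching rule concrete, here's a minimal Python sketch (just the maths, not the runtime code). The function name continue_segment and the tuple layout are made up for illustration: a segment is a tuple of four (x, y) control points, and the caller only gets to choose the last two points of each new segment.

```python
# Sketch: append a four-control-point Bezier segment to a spline while
# keeping C0 and C1 continuity with the previous segment.
# A segment is a tuple of four (x, y) control points.

def continue_segment(prev, p3, p4):
    """Build the next segment from the previous one.

    C0: the new point1 is the previous point4 (no gap).
    C1: the new point1->point2 vector equals the previous
        point3->point4 vector (no kink).
    """
    _, _, a3, a4 = prev
    p1 = a4
    p2 = (a4[0] + (a4[0] - a3[0]), a4[1] + (a4[1] - a3[1]))
    return (p1, p2, p3, p4)


first = ((0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0))
second = continue_segment(first, (7.0, -2.0), (8.0, 0.0))
```

The cost of C1 continuity is visible here: the first two control points of every segment after the first are dictated by its predecessor.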
The variable thickness means that I can't just use line primitives, which is fine by me, because they're not terribly well optimised in hardware anyway. Instead I'll be using quads, possibly textured if I want to give my lines some extra smoothness at the edges (which might look better than antialiasing).
The basic principle I'm going to apply is this: given a set of four control points, and a vertex with a 'parameter' value attached, I can use that parameter value to interpolate between my four control points and calculate a position for the vertex. I'll also want to calculate the tangent vector at that position, rotate it by 90 degrees to get the curve normal, and multiply the normalised normal by half the thickness of the spline at that point (also interpolated from the control points) to offset the vertex to either side.
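Here's roughly what the per-vertex work looks like, sketched in Python rather than shader code. The names bernstein3 and curve_vertex are hypothetical, and I'm assuming thickness is stored as one value per control point and blended with the same weights as the positions. Note that the sideways offset runs along the normal (the tangent rotated by 90 degrees), and that each side gets half the thickness so the total width matches.

```python
import math


def bernstein3(t):
    """Cubic Bernstein weights for parameter t in [0, 1]."""
    s = 1.0 - t
    return (s * s * s, 3.0 * s * s * t, 3.0 * s * t * t, t * t * t)


def curve_vertex(points, thickness, t, side):
    """Position of one strip vertex.

    points:    four (x, y) control points
    thickness: four thickness values, one per control point (assumed layout)
    side:      +1.0 or -1.0, picks which edge of the line this vertex is on
    """
    b = bernstein3(t)
    x = sum(w * p[0] for w, p in zip(b, points))
    y = sum(w * p[1] for w, p in zip(b, points))
    width = sum(w * h for w, h in zip(b, thickness))

    # Derivative of a cubic Bezier: quadratic weights on the point differences.
    s = 1.0 - t
    d = (3.0 * s * s, 6.0 * s * t, 3.0 * t * t)
    tx = sum(w * (points[i + 1][0] - points[i][0]) for i, w in enumerate(d))
    ty = sum(w * (points[i + 1][1] - points[i][1]) for i, w in enumerate(d))

    length = math.hypot(tx, ty)
    if length == 0.0:
        return (x, y)  # degenerate tangent: leave the vertex on the centre line

    # Rotate the unit tangent by 90 degrees to get the normal.
    nx, ny = -ty / length, tx / length
    half = 0.5 * width * side
    return (x + nx * half, y + ny * half)
```

For a straight horizontal curve of thickness 2, the two vertices at any parameter value end up one unit above and one unit below the centre line.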
So, right here is a great opportunity for some scalability: the number of segments can be variable. The higher the number of segments, the smaller the gap between each parameter value, and the 'smoother' the curve.
How do I actually implement this? Vertices are going to need to be given to my vertex shader in pairs; both vertices in a pair will have the same parameter value, but will have another property (a side flag, say) set differently so that the shader will send them to opposite sides of the curve. I'll need (number of segments + 1) pairs of vertices. Could I render them as a triangle strip? Hmm... maybe, but I need to be careful if I want to render multiple curves in a single call, since a single strip would join the end of one curve to the start of the next.
Hardware instancing might be a good first step. I create a single buffer holding my pairs of vertices, and write all the curve control points into a dynamic VB. Then I set up the stream frequency so that the whole pair-buffer is run once for each curve's set of control points. If the dynamic VB is large enough to contain all of my curves, then I get to render all of them in a single call (as I think strips get cut between stream loops).
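As a sketch of what that pair-buffer could contain (Python, with made-up names build_pair_buffer and strip_triangles), each vertex is just a (parameter, side) tuple, emitted in triangle-strip order. The -1/+1 side flag is my assumption for the 'other property' that tells the two vertices in a pair apart.

```python
def build_pair_buffer(segments):
    """Return (parameter, side) vertices in triangle-strip order.

    There are segments + 1 pairs; both vertices of a pair share the same
    parameter and differ only in the side flag (-1.0 or +1.0).
    """
    verts = []
    for i in range(segments + 1):
        t = i / segments
        verts.append((t, -1.0))
        verts.append((t, +1.0))
    return verts


def strip_triangles(vertex_count):
    """Index triples a triangle strip implies (winding order ignored)."""
    return [(i, i + 1, i + 2) for i in range(vertex_count - 2)]


pairs = build_pair_buffer(4)
triangles = strip_triangles(len(pairs))
```

Each segment comes out as two triangles, so raising the segment count just means a longer buffer; the curve data itself doesn't change.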
However, hardware instancing isn't that often available. I could take a constant skinning approach: load my curve data into vertex shader constant registers, and then have a bunch of copies of the pair-buffer, with a 'curve index' on each vertex of a copy. I need to store a 2D position per control point, plus a thickness value, which is 12 floats per curve, so I can pack a curve into three vec4 registers. Vertex shader 1.1 guarantees 96 constant registers, so that's 32 curves at the very most, and more like 25-30 per batch once a few registers are set aside for transforms and the like... not nearly as nice as hardware instancing.
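Here's one way the packing could go, sketched in Python. The register layout (two registers of positions, one of thicknesses) is just one option I'm assuming, not something fixed, and pack_curve and curves_per_batch are hypothetical names. The batch-size arithmetic assumes the 96 constant registers that vertex shader 1.1 guarantees; how many get reserved for matrices and so on is a guess.

```python
REGISTERS_PER_CURVE = 3


def pack_curve(points, thickness):
    """Pack four (x, y) control points and four thickness values
    into three 4-component registers (assumed layout):

        register 0: x0 y0 x1 y1
        register 1: x2 y2 x3 y3
        register 2: w0 w1 w2 w3
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = points
    return [
        (x0, y0, x1, y1),
        (x2, y2, x3, y3),
        tuple(thickness),
    ]


def curves_per_batch(total_registers, reserved_registers):
    """How many curves fit once some registers are kept for other constants."""
    return (total_registers - reserved_registers) // REGISTERS_PER_CURVE


registers = pack_curve(((0, 0), (1, 2), (3, 2), (4, 0)), (1, 2, 2, 1))
```

With this layout the curve index on each vertex, multiplied by three, gives the first register of that vertex's curve.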
I could use a multiple stream approach, which kind of combines instancing with constant skinning: multiple copies of the basic pair data laid end-to-end in a static buffer, and repeated copies of the curve data (AAABBBCCC sort of repeat, not ABCABCABC) in a dynamic buffer, and use the input assembler to bring together the data for the shader... If my buffers are large enough I could do as many curves per call as I want.
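To pin down the AAABBBCCC layout, here's a small Python sketch of how the two streams would line up. The names build_static_stream and build_dynamic_stream are hypothetical, and the curves are just labels here; in practice each entry would be a curve's packed control data.

```python
def build_static_stream(pair_buffer, max_curves):
    """Static stream: the pair data laid end-to-end, once per curve slot."""
    return list(pair_buffer) * max_curves


def build_dynamic_stream(curves, verts_per_curve):
    """Dynamic stream: each curve's data repeated once per vertex,
    so vertex i of both streams describes the same strip vertex."""
    stream = []
    for curve in curves:
        stream.extend([curve] * verts_per_curve)
    return stream


pair_buffer = [(0.0, -1.0), (0.0, 1.0), (1.0, -1.0)]  # shortened for illustration
static = build_static_stream(pair_buffer, max_curves=3)
dynamic = build_dynamic_stream(["A", "B", "C"], len(pair_buffer))
```

The obvious cost is that the dynamic buffer carries a full copy of the curve data for every vertex, which is the price of not needing instancing or constant registers.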
Anyway, beyond that... I think I'm going to want to store curves as a linked list, because it makes sense to expose the curves to the rest of the code as splines instead of lists of cubic curves. When I flatten those splines into dynamic VBs or constant registers or whatever, I can break them into cubic curves at the same time.
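I haven't settled on what a list node holds, so treat this Python sketch as one possible shape rather than the design. I'm assuming each node (a hypothetical Knot) stores a point on the spline plus a single handle vector, and that flattening walks the list and emits one four-control-point curve per pair of neighbouring nodes. Using the same handle on both sides of a node gives the C0/C1 stitching for free.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Knot:
    """One node of the spline's linked list (assumed representation)."""
    pos: Tuple[float, float]
    handle: Tuple[float, float]  # outgoing tangent handle; mirrored on the way in
    next: Optional["Knot"] = None


def flatten(head):
    """Walk the list and return one 4-point curve per pair of neighbours."""
    segments = []
    node = head
    while node is not None and node.next is not None:
        a, b = node, node.next
        segments.append((
            a.pos,
            (a.pos[0] + a.handle[0], a.pos[1] + a.handle[1]),
            (b.pos[0] - b.handle[0], b.pos[1] - b.handle[1]),
            b.pos,
        ))
        node = b
    return segments


c = Knot((8.0, 0.0), (1.0, 2.0))
b = Knot((4.0, 0.0), (1.0, -2.0), c)
a = Knot((0.0, 0.0), (1.0, 2.0), b)
curves = flatten(a)
```

A spline of n nodes flattens to n - 1 curves, and the rest of the code only ever deals with the nodes.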