OpenGL Manually calculating projected texture coordinates

Hi all,

I'm trying to manually calculate the texture coordinates needed to project a texture onto a mesh. I'm writing code for the iPhone (OpenGL ES 1.1). For those of you who don't know, the glTexGen functions are not part of that spec, which means you can't simply do the old GL_EYE_LINEAR business and be done with it.

What makes this interesting is that the iPhone has a "known limitation": it only does perspective-correct interpolation for the S and T coordinates, which leaves me with a right mess.

My original plan was to set up the OpenGL texture matrix with the bias + view matrix of the projector and then supply the mesh vertices as the texture coordinates to be transformed. This requires no CPU processing of vertices and works beautifully on the simulator (which uses the desktop GL driver, which DOES interpolate STRQ correctly). On the device, however, any reasonably sized, non-aligned triangle makes the lack of Q-coordinate awareness blatantly visible.

The simple solution is to capitalize on the per-vertex Q division for S and T and increase the tessellation of the mesh. This works, but I'm worried it will put an unnecessary burden on the artist and hurt performance, and it's never going to be 100% correct.

So my next move was to calculate the texture coordinates on the CPU, but for some reason I can't get it to work: none of the projections come out looking remotely correct. Does anyone have code from a software renderer or similar that does the same thing?

My code:
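            VECTOR4 tmp;
            MATRIX m;
            // Combine the projector's matrix with the object's transform.
            MatrixMultiply(m, shadowMatrix, modelMatrix);
            for(size_t i=0; i<gb.getVertexCount(); ++i)
            {
                VECTOR3& vtx = *((VECTOR3*)gb.getVertexBufferPtr(i*gb.getVertexStride()));
                // Promote the position to homogeneous coordinates.
                tmp.x = vtx.x;
                tmp.y = vtx.y;
                tmp.z = vtx.z;
                tmp.w = 1.0f;
                MatrixVec4Multiply(tmp, tmp, m);
                // Divide by the projected w (the Q coordinate).
                tmp.x = tmp.x/tmp.w;
                tmp.y = tmp.y/tmp.w;
                tmp.z = tmp.z/tmp.w;
                // The [i] index was most likely eaten by the forum's italic tag.
                projTexCoords[i] = tmp;
            }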

I know that shadowMatrix and modelMatrix are 100% correct, as they are unchanged and work for the non-CPU coordinate generation.

I think the only different thing I do is scale and offset the projection matrix by half (because the projection matrix will place the origin at the center of the screen, but the texture origin is in the corner).

I assume "shadow" is your projection matrix?
Also, is "modelMatrix" a local-to-world matrix, or a local-to-view matrix?
TransformedPos = Pos * LocalToView

Proj = Camera.Projection
Proj.Scale ( 0.5, 0.5, 1 )
Proj.Translate( 0.5, 0.5, 0 );

UV = TransformedPos * Proj

Hi Hodgman,
shadowMatrix = Bias * Proj * View


MatrixMultiply(shadowMatrix, shadowMatrix, shadowBias);
// shadowBias is the typical scale-by-0.5, translate-by-0.5 matrix

modelMatrix is the object's local->world transform matrix.

I'm pretty stumped on this one.

