It all starts off as floating point (the height map, the vector, and the start point). At the moment I scale the vector with scale3d(vector,1.0/max(vector.x,vector.y)), then convert the x and y components to 8:8 fixed point for stepping through the map, and do the height map z comparison in floats:
p_loop:
paddw mm3,mm4 //add x and y for next element (fixed point)
addss xmm5,xmm6 //add z (float)
movd edx,mm3 //get index for height map lookup
shr edx,8 //drop the 8 fractional bits of x
shl dx,cl //shift the leftover y fraction bits off the top of dx
shr edx,cl //shift back so integer y sits directly above integer x
and edx,ebx //mask for wrapping coords
comiss xmm5,[esi+edx*4] //compare z with heightmap
jb p_loop //repeat if still lower
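For anyone who finds the assembly hard to follow, here is a rough C sketch of the same loop as I read it. It is not the original code: the names march, MAP_BITS and max_steps are made up for illustration, the map is assumed to be a square power-of-two array of floats, and I take absolute values when picking the larger of the x and y components so negative directions also work.

```c
#include <stdint.h>

#define MAP_BITS 8                    /* assumption: 256x256 height map */
#define MAP_SIZE (1 << MAP_BITS)
#define MAP_MASK (MAP_SIZE - 1)

/* Steps a ray through the map and returns the number of steps taken until
   z is no longer below the stored height (the same condition as the jb),
   or -1 if max_steps is reached or the ray has no x/y movement. */
int march(const float *map, float x, float y, float z,
          float dx, float dy, float dz, int max_steps)
{
    float ax = dx < 0.0f ? -dx : dx;
    float ay = dy < 0.0f ? -dy : dy;
    float m = ax > ay ? ax : ay;
    if (m == 0.0f)
        return -1;

    /* Scale so the larger of |dx|, |dy| becomes exactly 1 sample per step. */
    float s = 1.0f / m;

    /* 8:8 fixed point: high byte is the integer cell, low byte the fraction.
       uint16_t arithmetic wraps modulo 65536, like paddw on packed words. */
    uint16_t fx = (uint16_t)(int32_t)(x * 256.0f);
    uint16_t fy = (uint16_t)(int32_t)(y * 256.0f);
    uint16_t sx = (uint16_t)(int32_t)(dx * s * 256.0f);
    uint16_t sy = (uint16_t)(int32_t)(dy * s * 256.0f);
    float sz = dz * s;

    for (int i = 1; i <= max_steps; i++) {
        fx = (uint16_t)(fx + sx);
        fy = (uint16_t)(fy + sy);
        z += sz;

        /* Integer parts packed as y * width + x, masked to wrap. */
        unsigned ix = (fx >> 8) & MAP_MASK;
        unsigned iy = (fy >> 8) & MAP_MASK;
        if (!(z < map[(iy << MAP_BITS) | ix]))
            return i;
    }
    return -1;
}
```

The max_steps limit is not in the assembly loop; I added it so the sketch cannot spin forever on a ray that never reaches the exit condition.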
So either the x step or the y step (whichever is larger) is equal to 1 and the other is fractional. I've tried increasing the step to 2 and then, when the comparison passes, stepping back 1 to do a final check. Although not perfect, that's adequate and does give me some speed-up, but I'm wondering if there are better ways of going about this.
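The step-by-2 idea could look roughly like this in C. Again this is only a sketch under my own assumptions: march_coarse, sample and max_steps are hypothetical names, the steps are passed in already converted to 8:8 fixed point, and the map is a square power-of-two float array. It also shows why the approach is not perfect: a feature only one sample wide that sits on a skipped position is never tested.

```c
#include <stdint.h>

#define MAP_BITS 8                    /* assumption: 256x256 height map */
#define MAP_SIZE (1 << MAP_BITS)
#define MAP_MASK (MAP_SIZE - 1)

/* Height lookup from 8:8 fixed-point coordinates, wrapping at the edges. */
static float sample(const float *map, uint16_t fx, uint16_t fy)
{
    unsigned ix = (fx >> 8) & MAP_MASK;
    unsigned iy = (fy >> 8) & MAP_MASK;
    return map[(iy << MAP_BITS) | ix];
}

/* sx, sy: 8:8 step per single sample; sz: z step per single sample.
   Advances two samples at a time; once the comparison passes, checks the
   skipped sample one step back. Returns the hit position counted in single
   steps, or -1 if max_steps is reached. */
int march_coarse(const float *map, uint16_t fx, uint16_t fy, float z,
                 uint16_t sx, uint16_t sy, float sz, int max_steps)
{
    uint16_t sx2 = (uint16_t)(sx * 2);
    uint16_t sy2 = (uint16_t)(sy * 2);
    float sz2 = sz * 2.0f;

    for (int i = 2; i <= max_steps; i += 2) {
        fx = (uint16_t)(fx + sx2);
        fy = (uint16_t)(fy + sy2);
        z += sz2;

        if (!(z < sample(map, fx, fy))) {
            /* Final check: did the skipped sample already pass? */
            uint16_t bx = (uint16_t)(fx - sx);
            uint16_t by = (uint16_t)(fy - sy);
            if (!((z - sz) < sample(map, bx, by)))
                return i - 1;
            return i;
        }
    }
    return -1;
}
```

The inner loop does half as many lookups and compares as single stepping, at the cost of one extra lookup per ray for the back check.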
It seems like a similar problem to voxel landscapes but I can't use the method of vertical scanning used there.