I'm writing a game which could loosely be described as a strategy game, with a typical camera - it looks down onto the world from a high angle with a perspective projection. The player will interact with the game by tapping or clicking things on the "ground" and will also be able to move the camera a bit. For various reasons I'm writing my own engine instead of using a ready-made one. It's based on OpenGL ES 2.0 (or an equivalent subset of OpenGL for PCs) with glm for the maths.

With the help of a diagram and schoolboy trigonometry I managed to come up with an equation for the minimum zfar value to pass to glm::perspective so that the ground is still rendered at its farthest from the camera (at the top of the screen). What I don't know is how to work out which point on the ground a given point on the screen corresponds to. I think glm::unProject will be useful, but the trouble is I don't have a meaningful z-coordinate to plug in to it. I know you can read the depth buffer to get this, but I also want to work out which bits of the ground are visible (at the four corners of the viewport), both to limit the camera's movement and to work out which bits of the terrain are outside the view and don't need to be rendered. So it would be better if I could do this before rendering anything.

I thought I could get a line joining the same NDC X and Y on the near and far clipping planes, then use my diagram/equation to work out where this line crosses my ground plane, but it didn't work. I think the main problem is that I assumed a linear mapping of Z between world space and device space, and I don't think this is the case for a perspective projection. I'm also not sure which Z values to use for the near and far planes as input to glm::unProject. Is it 1.0 for far and -1.0 for near?
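To make the idea concrete, here is a minimal sketch of the intersection step on its own. It assumes the ground is the horizontal plane y = groundY (my setup may differ) and that the two world-space points on the near and far planes have already been obtained. As far as I can tell, glm::unProject takes window coordinates rather than NDC, so with default settings the depth input would be 0.0 for the near plane and 1.0 for the far plane. Once both points are in world space the line between them is straight, so the interpolation below is linear even though depth in device space is not. Vec3 and intersectGround are hypothetical names used only for illustration; glm::vec3 would play the same role.

```cpp
#include <cmath>

// Hypothetical minimal vector type; glm::vec3 would be used in practice.
struct Vec3 { float x, y, z; };

// Intersect the world-space segment nearPt -> farPt with the horizontal
// plane y = groundY (an assumption about how the ground is laid out).
// Returns false if the segment is parallel to the plane or does not
// reach it between the two points; otherwise writes the hit into `hit`.
bool intersectGround(const Vec3& nearPt, const Vec3& farPt,
                     float groundY, Vec3& hit)
{
    const float dy = farPt.y - nearPt.y;
    if (std::fabs(dy) < 1e-6f) {
        return false;                      // segment runs parallel to the ground
    }
    const float t = (groundY - nearPt.y) / dy;
    if (t < 0.0f || t > 1.0f) {
        return false;                      // ground is outside the near/far range
    }
    hit.x = nearPt.x + t * (farPt.x - nearPt.x);
    hit.y = groundY;
    hit.z = nearPt.z + t * (farPt.z - nearPt.z);
    return true;
}
```

Running this for the four viewport corners would give the visible ground quadrilateral before anything is rendered. A corner whose segment never reaches the ground (for example, one looking above the horizon) returns false and would need separate handling.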

Rambling on a bit, I had a lot of trouble understanding the perspective divide. Am I right in thinking this is an extra step OpenGL performs automatically after the matrix transformations, and that it just converts [x, y, z, w] into [x/w, y/w, z/w]? And that an orthographic projection matrix leaves w at 1 while a perspective one sets it to z? If so, z from which "space"?
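To pin down what I mean, here is a small sketch of the divide written out by hand. It assumes the standard right-handed OpenGL perspective matrix with NDC depth in [-1, 1], which I believe is what glm::perspective builds by default. Under that assumption the clip-space w comes out as the negated eye-space (camera-space) z. projectToClip and perspectiveDivide are hypothetical helper names, not glm functions.

```cpp
#include <cmath>

// Hypothetical minimal types; glm::vec3 / glm::vec4 would be used in practice.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Multiply an eye-space point (w = 1) by a standard OpenGL perspective
// matrix, written out term by term. Assumes a right-handed eye space
// looking down -z and NDC depth in [-1, 1].
Vec4 projectToClip(const Vec3& eye, float fovyRadians, float aspect,
                   float zNear, float zFar)
{
    const float f = 1.0f / std::tan(fovyRadians * 0.5f);
    Vec4 clip;
    clip.x = (f / aspect) * eye.x;
    clip.y = f * eye.y;
    clip.z = -(zFar + zNear) / (zFar - zNear) * eye.z
             - (2.0f * zFar * zNear) / (zFar - zNear);
    clip.w = -eye.z;   // w is the distance in front of the camera
    return clip;
}

// The perspective divide: clip space -> normalized device coordinates.
Vec3 perspectiveDivide(const Vec4& clip)
{
    return Vec3{ clip.x / clip.w, clip.y / clip.w, clip.z / clip.w };
}
```

With zNear = 1 and zFar = 100, a point on the near plane lands at NDC z = -1 and a point on the far plane at +1, but a point halfway between them in eye space lands at roughly 0.98. That is the non-linear Z mapping I suspected was breaking my earlier attempt.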