I can give you some simple formulas to work with. We'll start with a very simple situation, where we measure everything in pixels and the camera is an eye at a fixed position (0, 0, -1000), looking towards the origin. Our resolution is 1024x768. We can just think of this situation as having the screen contained between (-512, -384, 0) and (+512, +384, 0). So a point (x, y, 0) on the screen plane maps to the pixel (512+x, 384-y). The sign of y flips because pixel rows are counted from the top of the screen downward.

If that's clear, we just need to know how to scale things that are farther away from or closer to the eye than the screen. The answer is that point (x,y,z) maps to pixel (512+x*1000/(z+1000), 384-y*1000/(z+1000)). You can convince yourself that this is correct by making a picture of the situation (I suggest from above, to get the formula for the x coordinate) and using triangle similarity.
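Here's a minimal Python sketch of that formula, in case seeing it as code helps. The names (`project`, `EYE_DISTANCE` and so on) are mine, and the check for points at or behind the eye is an extra assumption on my part, since the formula breaks down when z + 1000 is zero or negative.

```python
EYE_DISTANCE = 1000.0  # the eye sits at (0, 0, -1000)
HALF_WIDTH = 512       # half of the 1024-pixel width
HALF_HEIGHT = 384      # half of the 768-pixel height


def project(x, y, z):
    """Map a 3D point to pixel coordinates.

    Returns None for points at or behind the eye (an added guard,
    not part of the formula itself).
    """
    depth = z + EYE_DISTANCE  # distance from the eye along the view axis
    if depth <= 0:
        return None
    scale = EYE_DISTANCE / depth  # 1.0 on the screen plane, smaller farther away
    return (HALF_WIDTH + x * scale, HALF_HEIGHT - y * scale)
```

A point on the screen plane keeps its size (scale is 1), and a point at z = 1000 is twice as far from the eye, so its offsets from the screen center are halved.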

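Now you need to know how to handle other camera positions and angles. Let's handle yaw first. Yaw turns the camera left and right, which is a rotation around the vertical (Y) axis, so only x and z change. You can rotate the whole world around the origin in the XZ plane by using these formulas:

x' = x * cos(angle) - z * sin(angle)

z' = x * sin(angle) + z * cos(angle)

(The same formulas applied to x and y instead rotate in the XY plane, which is roll: tilting the view around the direction you are looking.)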

Do that to every point in the scene and your whole world has now rotated around the origin. If you want the rotation to happen around any other point, first subtract the coordinates of the center of rotation, then rotate, then add them back. In particular, you can now perform rotations around the camera, which has the same effect as rotating the camera by the same angle in the opposite direction.
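The rotation and the subtract/rotate/add trick can be sketched like this. It works on any pair of coordinates, so the same helper serves whichever plane you are rotating in. The function names are hypothetical, and angles are assumed to be in radians.

```python
import math


def rotate(a, b, angle):
    """Rotate the coordinate pair (a, b) around the origin by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (a * c - b * s, a * s + b * c)


def rotate_about(a, b, center_a, center_b, angle):
    """Rotate (a, b) around an arbitrary center: subtract, rotate, add back."""
    ra, rb = rotate(a - center_a, b - center_b, angle)
    return (ra + center_a, rb + center_b)
```

For example, rotating the pair (2, 1) by a quarter turn around the center (1, 1) should land on (1, 2), up to floating-point rounding.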

For pitch you can do the same thing in the YZ plane. The order of rotations does matter. You probably want to apply pitch first and then yaw, but if that doesn't do what you want, reverse the order.
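To see that the order really matters, here is a small experiment you can run. It assumes pitch mixes y and z (the YZ plane) and yaw mixes x and z (a turn around the vertical axis). The function names and the sign conventions are my own choices for illustration, not the only valid ones.

```python
import math


def pitch(point, angle):
    """Rotate a point in the YZ plane (x is left alone)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x, y * c - z * s, y * s + z * c)


def yaw(point, angle):
    """Rotate a point in the XZ plane (y is left alone); assumed convention."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - z * s, y, x * s + z * c)


quarter = math.pi / 2
start = (1.0, 0.0, 0.0)

pitch_then_yaw = yaw(pitch(start, quarter), quarter)  # ends up near (0, 0, 1)
yaw_then_pitch = pitch(yaw(start, quarter), quarter)  # ends up near (0, -1, 0)
```

The same two quarter turns send the same starting point to two different places, so if the view looks wrong, swapping the two calls is the first thing to try.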

Moving the camera around is pretty easy: once again, you move the whole world in the opposite direction, by subtracting the camera's displacement from every point.
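As a rough sketch, moving the camera then looks like this. Here `camera_offset` is a hypothetical name for how far the camera has moved from its starting position at (0, 0, -1000), and the projection is the same perspective formula as before, repeated so the snippet runs on its own. The guard for points behind the eye is my addition.

```python
EYE_DISTANCE = 1000.0  # default eye position is (0, 0, -1000)
HALF_WIDTH = 512
HALF_HEIGHT = 384


def project(x, y, z):
    """Perspective projection for the default camera."""
    depth = z + EYE_DISTANCE
    if depth <= 0:
        return None  # at or behind the eye: nothing sensible to draw
    scale = EYE_DISTANCE / depth
    return (HALF_WIDTH + x * scale, HALF_HEIGHT - y * scale)


def project_from_moved_camera(point, camera_offset):
    """Shift the world opposite to the camera's movement, then project."""
    x, y, z = (p - c for p, c in zip(point, camera_offset))
    return project(x, y, z)
```

If the camera moves 100 pixels to the right, a point that used to sit 100 pixels right of center now appears horizontally centered, which is what you would expect.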

That's the closest I can get to "simplified math". I hope it helps. But you should really really try to find a gentle introduction to the algebra of matrices and vectors and try to understand the formulas you've found in other places.