Thinking about this for a minute, I don't think you can implement the whole chain of operations using matrices and the standard vertex processing stages. I believe the core of the problem is the perspective division stage, which introduces the perspective effect by dividing by the W component (the depth, for a standard perspective projection matrix). The same divisor is applied to both the X and the Y axes, so you can either have no perspective or perspective along both axes.
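To spell that out, here is the fixed-function step written down. The notation is mine, not from the original question: subscript e is eye space, c is clip space, n is normalized device coordinates, and P is the projection matrix.

```latex
\[
\begin{pmatrix} x_c \\ y_c \\ z_c \\ w_c \end{pmatrix}
= P \begin{pmatrix} x_e \\ y_e \\ z_e \\ 1 \end{pmatrix},
\qquad
(x_n,\; y_n,\; z_n) = \left( \frac{x_c}{w_c},\; \frac{y_c}{w_c},\; \frac{z_c}{w_c} \right)
\]
```

For a standard perspective matrix, w_c = -z_e, so x_n and y_n are both scaled by the same 1/w_c. A matrix alone cannot give the two axes different divisors, because the division happens after the matrix multiply and uses a single value.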
You would have to implement your own vertex processing. This should be trivial with a vertex shader, since you can just do the division yourself: divide only the X, Z and W components of the transformed vertex by W, and leave Y untouched. OpenGL's own perspective division then has no effect, since the vertex already has a unit W component.
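To check the arithmetic outside a shader, here is a rough sketch of the same idea in plain Python. The function names (`perspective`, `transform`, `divide_keep_y`) and the matrix parameters are made up for illustration, and the projection matrix is assumed to be the usual OpenGL-style perspective matrix. In a real vertex shader the last step would be done on the clip-space position before writing it out.

```python
import math

def perspective(fovy_deg, aspect, near, far):
    """OpenGL-style perspective matrix, stored as rows, for column vectors."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def transform(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def divide_keep_y(clip):
    """Divide x, z and w by w and leave y alone, so the result has w == 1.

    The later hardware divide by w then changes nothing, and only the
    x axis ends up with a perspective effect.
    """
    x, y, z, w = clip
    return [x / w, y, z / w, 1.0]

# Two eye-space points with the same x and y but different depths.
P = perspective(90.0, 1.0, 1.0, 10.0)
near_point = divide_keep_y(transform(P, [1.0, 1.0, -2.0, 1.0]))
far_point = divide_keep_y(transform(P, [1.0, 1.0, -4.0, 1.0]))
print(near_point)
print(far_point)
```

The two printed points share the same y but have different x: the farther point is pulled toward the centre horizontally only, which is the one-axis perspective being asked about.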
I can think of how this would work visually, but not the steps needed to produce the transform matrix. What I see is an orthographic frustum with the "near" side squished vertically to make a trapezoidal prism.
Thanks Brother Bob, you're right. I was initially thinking that this would be possible to construct just by modifying the projection matrix. Doing the perspective divide in the vertex shader only on the x, z and w components did the trick!