It's fairly straightforward to render a mesh to each face of a texture cube; the restriction is that you must perform 6 draw calls, because each face of the texture cube has to be set as the render target in turn. In our engine we use a RenderTargetCube in combination with CubeMapFace, and set an individual view matrix for each face; we do this to perform shadow mapping for point lights. It incurs 6 draw calls like I said, but our engine is optimized to only draw relevant geometry, so it's very fast. In the following, MyRenderTargetCube can be cast to a TextureCube:
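Here's a minimal sketch of that six-pass setup in XNA 4.0. Note that `lightPosition` and `DrawScene` are placeholders for your own light and scene-drawing code, and the face-direction mapping assumes XNA's right-handed convention (Vector3.Forward is -Z):

```csharp
// One cube render target for the point light's shadow map.
RenderTargetCube shadowCube = new RenderTargetCube(
    GraphicsDevice, 512, false, SurfaceFormat.Single, DepthFormat.Depth24);

// 90-degree FOV with a 1:1 aspect ratio so the six frusta cover the full sphere.
Matrix projection = Matrix.CreatePerspectiveFieldOfView(
    MathHelper.PiOver2, 1f, 0.1f, 1000f);

foreach (CubeMapFace face in Enum.GetValues(typeof(CubeMapFace)))
{
    GraphicsDevice.SetRenderTarget(shadowCube, face);
    GraphicsDevice.Clear(Color.White);

    // Look along the axis that corresponds to this face; the Y faces need a
    // different up vector to avoid a degenerate look-at.
    Vector3 direction;
    Vector3 up = Vector3.Up;
    switch (face)
    {
        case CubeMapFace.PositiveX: direction = Vector3.Right; break;
        case CubeMapFace.NegativeX: direction = Vector3.Left; break;
        case CubeMapFace.PositiveY: direction = Vector3.Up; up = Vector3.Backward; break;
        case CubeMapFace.NegativeY: direction = Vector3.Down; up = Vector3.Forward; break;
        case CubeMapFace.PositiveZ: direction = Vector3.Backward; break; // XNA +Z points backward
        default:                    direction = Vector3.Forward;  break;
    }
    Matrix view = Matrix.CreateLookAt(lightPosition, lightPosition + direction, up);

    DrawScene(view, projection); // one draw pass per face
}
GraphicsDevice.SetRenderTarget(null);
// RenderTargetCube derives from TextureCube, so shadowCube can now be bound to a shader.
```

Since RenderTargetCube already inherits TextureCube in XNA 4.0, no explicit cast is needed when passing it to an Effect parameter.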
I still maintain that the technique you are using will likely drive you bonkers later, with the restrictions it imposes and the extra overhead that may not be apparent yet. But it's not my place to tell you what to do, and to tell you the truth, I am intrigued to find out if you manage to get it working.
Oh hang on, I think I misinterpreted what you were asking; let's take another stab at it. So you have one huge cube map that surrounds the scene, and you want to render distant meshes onto that one cube map so distant meshes don't have to be rendered individually.
In this case, for outdoor scenes where the ground is flat, this may work, but for uneven ground and indoor scenes you'll likely run into problems with depth and perspective. To add a little irony, as you move through the scene you would likely have to rebuild that one cube map often as things get closer or further away, which means any performance gain from the technique will be close to nullified.
Although this partly contradicts what I said previously, a lot of the previous post still applies. Did you know, for example, that in Half-Life 2 all geometry that would permanently be far away from the player was very low poly? That's another very helpful and common technique that could offer what you are after.
It took me a few minutes to figure out what you were asking, but I think I understand now, and no, it's not really practical. This is how I understand it:
You want distant meshes to instead be drawn as cubes, each with an individual cube map representing the view of the mesh from each face (6 in total).
Using whichever faces the camera can see, you want the mesh to be "reconstructed" so it looks like a facsimile of the mesh (like a billboard).
If I have understood it correctly, this means there are some unfortunate flaws as follows:
This means you would be storing a cube map per mesh that is subject to this technique, which will likely consume far more VRAM than a large number of vertices and indices. Here is a quick comparison:
A cube map with 512 x 512 faces in 32-bit colour: 512 * 512 * 6 faces * 4 bytes = approximately 6 megabytes per cube map.
Versus 300,000 vertices * 20 bytes stride = 5.72 megabytes per 100,000 unique triangles, where each vertex is a VertexPositionTexture.
Looking at that, I think most people would rather distribute 100,000 triangles between meshes than store 6 MB per single mesh; it's far more efficient.
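To make the comparison above concrete, here is a small stand-alone snippet that computes both figures (the 20-byte stride is the VertexPositionTexture layout: 12 bytes of position plus 8 bytes of texture coordinates):

```csharp
using System;

class VramEstimate
{
    // Bytes for one cube map: six square faces at the given resolution and bytes per pixel.
    static long CubeMapBytes(int faceSize, int bytesPerPixel)
    {
        return (long)faceSize * faceSize * 6 * bytesPerPixel;
    }

    // Bytes for a vertex buffer: vertex count times the vertex stride.
    static long VertexBufferBytes(int vertexCount, int strideBytes)
    {
        return (long)vertexCount * strideBytes;
    }

    static void Main()
    {
        // 512 x 512 faces, 32-bit colour (4 bytes per pixel).
        long cube = CubeMapBytes(512, 4);            // 6,291,456 bytes
        // 300,000 vertices at a 20-byte stride (~100,000 triangles).
        long verts = VertexBufferBytes(300000, 20);  // 6,000,000 bytes

        Console.WriteLine("{0:F2} MB per cube map", cube / (1024.0 * 1024.0));
        Console.WriteLine("{0:F2} MB per 100,000 triangles", verts / (1024.0 * 1024.0));
    }
}
```

So the cube map comes out at 6.00 MB and the vertex buffer at roughly 5.72 MB, matching the figures above.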
I have seen a technique talked about by Renaud Bédard (the guy who programmed Fez) where a Texture3D is linearly sampled in the shader; technically this could be used to achieve what you asked, but it is complicated and very shader heavy. It should be noted he used the technique to store many tiles in one texture, so in his case, although a 3D texture is cubic in pixel count (far larger than a cube map), it was a worthwhile trade-off versus texture swapping.
Even if you do elect to use this technique, it means that on start-up of the scene you will need to make the user wait while each face is rendered for each cube map. The waiting time may not be noticeable with a few cube maps in the scene, but I bet it becomes more and more undesirable the more cube maps you throw into the mix.
I think your idea is intuitive, but at the same time there is a big part of me crying out that this technique is like trying to open a door with your shoulder blades. There are time-tested techniques for dealing with distant objects, such as frustum culling, back-face culling, fog (oldie but goldie), BSP trees, and many more. I bet you are familiar with many of them, but it's worth noting that these techniques are common because people find them reliable.
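For instance, frustum culling is only a few lines in XNA. This is just a sketch; `sceneMeshes` and the mesh type with its `BoundingSphere` and `Draw` members are placeholders for whatever your engine uses:

```csharp
// Build the frustum from the current camera matrices, then skip anything outside it.
BoundingFrustum frustum = new BoundingFrustum(view * projection);

foreach (var mesh in sceneMeshes) // placeholder collection of scene meshes
{
    // BoundingFrustum.Intersects(BoundingSphere) returns true if any part
    // of the sphere is inside or touching the frustum.
    if (frustum.Intersects(mesh.BoundingSphere))
        mesh.Draw();
}
```

Distant meshes that fall outside the view frustum cost you nothing per frame, which gets at the same goal as your cube-map idea with far less machinery.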
Anyway, down to brass tacks: when all else fails, remember that the most common technique is often the best.
Many of the questions you asked have already been answered many many times before so I recommend you try the search features of the forum, however here is a brief summary of the answers you'll find:
XNA is no longer supported, correct; however, it works and will continue to work for years to come. So if you are new to coding games, stick with XNA until you are happy to move on.
MonoGame would be the next stage *after* building your game. It's very straightforward (not easy) to port to, as long as the game is optimized and does not rely on Windows-only features. Also, as they are still working on the content pipeline, it's very useful to still have XNA installed.
XNA doesn't actually have any plugins; you can get 3rd party libraries that give you extra features, but no plugins that directly tamper with the API. Which 3rd party libraries are you planning on using? (Note: most of the popular ones, like physics engines etc., provide a DLL that works on MonoGame too nowadays.)
Try not to dwell on porting before your game is built; it's a mistake I and others have made over and over in the past, and it tends to lead to games never getting built. But that's just my two cents.