for(int i=0;i<1000;i++) mRenderer->getDevice()->DrawPrimitive(D3DPT_TRIANGLESTRIP,0,2);
You are on the right track. API calls are among the most expensive things you can do per frame, so the trick is to make far fewer of them. Once you are rendering quads the right way, the next step is to group quads and render them in a single call. This is called batching. Since you are fairly new to 3D rendering, here are some hints on where to look if you want to render a few million sprites at once:
1. Use indexed primitives, that is, an array of vertices plus an array of indices into those vertices.
2. Use batching: put multiple sprites into a single vertex array (10 sprites = 40 vertices) and draw them with a single indexed draw call. Either use a triangle list (6 indices per quad) or connect the quads into one triangle strip. The triangle list is much easier, and the performance cost is small.
3. Use a texture atlas, that is, put multiple sprites on a single larger texture. Then group all sprites that use the same texture into a single batch and draw it.
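As a rough illustration of points 1 and 2, here is a minimal sketch of how the CPU-side arrays for a batch could be built. The `Sprite` and `SpriteVertex` structs and both function names are hypothetical, made up for this example, and the vertex order assumes screen coordinates with y pointing down; you may need to flip the winding for your cull mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: position plus one texture coordinate.
struct SpriteVertex {
    float x, y, z;
    float u, v;
};

// Hypothetical sprite: screen rectangle plus its rectangle inside the texture.
struct Sprite {
    float x, y, w, h;
    float u0, v0, u1, v1;
};

// Emits 4 vertices per sprite in the order:
// top-left, top-right, bottom-left, bottom-right.
std::vector<SpriteVertex> buildVertices(const std::vector<Sprite>& sprites) {
    std::vector<SpriteVertex> out;
    out.reserve(sprites.size() * 4);
    for (const Sprite& s : sprites) {
        out.push_back({s.x,       s.y,       0.0f, s.u0, s.v0});
        out.push_back({s.x + s.w, s.y,       0.0f, s.u1, s.v0});
        out.push_back({s.x,       s.y + s.h, 0.0f, s.u0, s.v1});
        out.push_back({s.x + s.w, s.y + s.h, 0.0f, s.u1, s.v1});
    }
    return out;
}

// Emits 6 indices per quad (two triangles) for a triangle list.
std::vector<std::uint16_t> buildQuadIndices(std::size_t quadCount) {
    // 16-bit indices can address at most 65536 vertices, i.e. 16384 quads.
    assert(quadCount * 4 <= 65536);
    std::vector<std::uint16_t> out;
    out.reserve(quadCount * 6);
    for (std::size_t q = 0; q < quadCount; ++q) {
        const std::size_t base = q * 4;
        // First triangle: top-left, top-right, bottom-left.
        out.push_back(static_cast<std::uint16_t>(base + 0));
        out.push_back(static_cast<std::uint16_t>(base + 1));
        out.push_back(static_cast<std::uint16_t>(base + 2));
        // Second triangle: bottom-left, top-right, bottom-right.
        out.push_back(static_cast<std::uint16_t>(base + 2));
        out.push_back(static_cast<std::uint16_t>(base + 1));
        out.push_back(static_cast<std::uint16_t>(base + 3));
    }
    return out;
}
```

With n sprites, you would copy these arrays into a vertex buffer and an index buffer and then issue one call along the lines of `DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, 4 * n, 0, 2 * n)` instead of n separate `DrawPrimitive` calls. The index array only depends on the quad count, so it can be built once and reused.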
Again, the trick is to get rid of as many API calls as you can!
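To make the texture atlas point a bit more concrete, here is a small sketch of grouping sprites by texture so that each group becomes one batch. `SpriteRef`, `textureId` and `groupByTexture` are hypothetical names for this example only; a real renderer would likely key on the texture pointer or a handle instead of an int.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical sprite record: which atlas texture it samples from,
// and its index in some sprite array owned elsewhere.
struct SpriteRef {
    int textureId;
    int spriteIndex;
};

// Groups sprite indices by texture. Each map entry is one batch,
// so the number of entries is the number of draw calls needed.
std::map<int, std::vector<int>> groupByTexture(const std::vector<SpriteRef>& sprites) {
    std::map<int, std::vector<int>> batches;
    for (const SpriteRef& s : sprites) {
        // Assumption of this sketch: valid texture ids are non-negative.
        assert(s.textureId >= 0);
        batches[s.textureId].push_back(s.spriteIndex);
    }
    return batches;
}
```

For each batch you would set the texture once, fill the vertex buffer with that batch's sprites, and draw them with a single indexed call. The fewer distinct textures your sprites use (which is what the atlas buys you), the fewer batches and draw calls you end up with.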