Display screen remotely

58 comments, last by David Lake 10 years, 8 months ago

I have a program I made for remotely controlling PCs that uses BitBlt, bitmaps, and a bit of processing and compression, and I would like to improve the way it captures and displays the desktop using SlimDX, if that can be used to increase performance.

At the moment the capture side is very heavy on the CPU. I use the thread pool for threading and split the screen capture and the processing/sending of frames into different threads, but I would like to reduce CPU utilization without compromising performance if possible.

The main problem is I can't control the login screen; I would like to know if there's a way around that.

Also, I need a faster way (than GDI+) to scale the images before sending; I'm hoping SlimDX can speed this up using the GPU.

Is there a relatively easy way to use SlimDX to capture the screen and display it remotely that's at least as fast as using BitBlt, GDI+ and bitmaps?


Use DirectX to scale your image, not GDI; it will be super fast. I have a similar working project and that's what I do, except I use OpenGL, but it's almost the same thing.

For the CPU, I don't really see any way other than adding a sleep before or after you capture the screen. It might be a bit slower, but keep in mind that your image is going to be streamed over the network anyway, which isn't that fast. In fact, I multithreaded mine so that after sending a screenshot, it starts a thread to make the next one while the current one is being sent. Keep in mind, though, that multithreading adds some complexity like synchronisation, but I think it's easier in C# than in C++, which is what I used.
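As a rough illustration of that overlap, here is a minimal C# sketch. CaptureFrame and SendFrame are hypothetical stand-ins for whatever capture/compression and networking code you already have, not anyone's actual implementation:

```csharp
using System;
using System.Threading.Tasks;

// Sketch of "capture the next frame while sending the current one".
class FramePipeline
{
    public Func<byte[]> CaptureFrame;   // e.g. BitBlt + compression (hypothetical)
    public Action<byte[]> SendFrame;    // e.g. write to a network stream (hypothetical)

    public void Run(Func<bool> keepRunning)
    {
        byte[] current = CaptureFrame();                 // grab the first frame up front
        while (keepRunning())
        {
            // Capture the next frame on a worker thread...
            Task<byte[]> next = Task.Run(() => CaptureFrame());
            // ...while this thread compresses/sends the current one.
            SendFrame(current);
            current = next.Result;                       // block until the background capture finishes
        }
        SendFrame(current);                              // flush the last captured frame
    }
}
```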

good luck

EDIT: Oh, I didn't see that you'd already multithreaded that. Experiment by adding some Sleep calls somewhere; it might help.

Thanks. I can easily slow it down, but ultimately I want it to go faster without using two whole cores.

I'm looking for code samples for speedy transfer of screen capture, possibly with SlimDX, over the network.

I have tried it before, but it was too slow because I had to turn it into a Drawing.Bitmap. If there's a way to avoid System.Drawing altogether and convert the sprite (or whatever it was) into a byte array to send over the network, that might make it faster.

Also, I'm a complete n00b when it comes to any sort of GPU programming, but if I could process and compress the frames on the GPU that would be nerdgasmtastic!

I don't know C# very well, but in C++ I have direct access to the bitmap buffer memory, so I can't say. I'm using Delphi for the interface part and a C++ DLL to do the hard work (networking, screenshots, hooks, compression).

I'm not sure if compression on the GPU is feasible; I'm using zlib to compress mine, but the real speed-up isn't that. In fact, I use a quadtree: basically, I split the image in four, 4 or 5 levels deep, then check which blocks have changed and send only the parts that changed. The quadtree is used to send differently sized parts of the image to the other side, so most of the time I only need to update some parts of the texture, and if nothing changed it just sends an empty-screenshot message. It's a bit complex, but it's the best optimization I've found yet.
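For illustration only, here is a minimal C# sketch of that dirty-block idea: recursively split the frame into quadrants and emit only the leaf rectangles whose pixels changed. The raw 32-bit BGRA byte[] layout, the names, and the block-size cutoff are assumptions, not the poster's actual code:

```csharp
using System.Collections.Generic;
using System.Drawing;

static class DirtyRegions
{
    // prev and curr are raw 32-bit frames of width * height * 4 bytes.
    public static List<Rectangle> Find(byte[] prev, byte[] curr, int width, int height, int maxDepth = 5)
    {
        var changed = new List<Rectangle>();
        Visit(prev, curr, width, new Rectangle(0, 0, width, height), maxDepth, changed);
        return changed;     // empty list == "empty screenshot" message
    }

    static void Visit(byte[] prev, byte[] curr, int width, Rectangle r, int depth, List<Rectangle> outRects)
    {
        if (!BlockChanged(prev, curr, width, r))
            return;                                   // identical block: send nothing

        if (depth == 0 || r.Width <= 16 || r.Height <= 16)
        {
            outRects.Add(r);                          // changed leaf block: send it
            return;
        }

        int hw = r.Width / 2, hh = r.Height / 2;      // split into four quadrants
        Visit(prev, curr, width, new Rectangle(r.X,      r.Y,      hw,           hh),            depth - 1, outRects);
        Visit(prev, curr, width, new Rectangle(r.X + hw, r.Y,      r.Width - hw, hh),            depth - 1, outRects);
        Visit(prev, curr, width, new Rectangle(r.X,      r.Y + hh, hw,           r.Height - hh), depth - 1, outRects);
        Visit(prev, curr, width, new Rectangle(r.X + hw, r.Y + hh, r.Width - hw, r.Height - hh), depth - 1, outRects);
    }

    static bool BlockChanged(byte[] prev, byte[] curr, int width, Rectangle r)
    {
        for (int y = r.Y; y < r.Y + r.Height; y++)
        {
            int row = (y * width + r.X) * 4;          // 4 bytes per pixel
            for (int i = row; i < row + r.Width * 4; i++)
                if (prev[i] != curr[i]) return true;
        }
        return false;
    }
}
```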

It's a bit like what you do when optimizing a terrain mesh with a quadtree, but with an image instead. I also tried using mp1 compression to send frames, like a movie, and it worked, but the image is blurry and it's slower than my other method, so I don't see any reason to use it.

Let DirectX do the scaling for you. And make sure to replace, not recreate, the texture each time you receive a screenshot; it's way faster that way.
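In SlimDX (Direct3D 9) terms, the usual pattern is to create one dynamic texture up front and lock/fill it for each received frame instead of constructing a new Texture every time. This is a hedged, untested sketch; the class name, the BGRA byte[] input and the frame dimensions are assumptions:

```csharp
using SlimDX;
using SlimDX.Direct3D9;

class RemoteScreen
{
    readonly Texture screenTex;     // created once, reused for every frame
    readonly int width, height;

    public RemoteScreen(Device device, int width, int height)
    {
        this.width = width;
        this.height = height;
        // Dynamic texture in the default pool so it can be locked with Discard each frame.
        screenTex = new Texture(device, width, height, 1,
                                Usage.Dynamic, Format.A8R8G8B8, Pool.Default);
    }

    // Overwrite the existing texture with a received 32-bit BGRA frame.
    public void Update(byte[] bgraPixels)
    {
        DataRectangle rect = screenTex.LockRectangle(0, LockFlags.Discard);
        int rowBytes = width * 4;
        for (int y = 0; y < height; y++)
        {
            rect.Data.Position = (long)y * rect.Pitch;        // rows may be padded to Pitch
            rect.Data.Write(bgraPixels, y * rowBytes, rowBytes);
        }
        screenTex.UnlockRectangle(0);
        // Afterwards, draw a window-sized quad/sprite with screenTex; the GPU handles the scaling.
    }
}
```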

The main problem with using the GPU for this is that the transfers to and from the GPU are relatively slow - the CPU can read and write memory faster. This means unless the processing required is expensive then it probably won't help compared to some optimized CPU code. However if you want to go that way you may find http://forums.getpaint.net/index.php?/topic/18989-gpu-motion-blur-effect-using-directcompute/ useful, there's some source code available there.

For data compression I'd go with using the xor operator on pairs of adjacent frames. The result will be zeros where they are identical. You can then apply zlib to the results, which should compress all those zeros really well. Reconstructing at the other end is done with xor too. As the xor operator is really cheap that should be reasonably quick to do even on a CPU.
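As a rough C# illustration of that idea (using DeflateStream as a stand-in for zlib, and assuming both frames are equal-length raw byte buffers; the names are made up for the example):

```csharp
using System.IO;
using System.IO.Compression;

static class FrameDelta
{
    // XOR the new frame against the previous one; unchanged bytes become zero.
    public static byte[] Xor(byte[] prev, byte[] curr)
    {
        var delta = new byte[curr.Length];
        for (int i = 0; i < curr.Length; i++)
            delta[i] = (byte)(prev[i] ^ curr[i]);
        return delta;
    }

    // Long runs of zeros compress very well with a deflate-style compressor.
    public static byte[] Compress(byte[] data)
    {
        using (var ms = new MemoryStream())
        {
            using (var ds = new DeflateStream(ms, CompressionMode.Compress, leaveOpen: true))
                ds.Write(data, 0, data.Length);
            return ms.ToArray();
        }
    }

    // Receiver side: decompress the delta, then XOR it against the stored previous
    // frame to reconstruct the new one (prev ^ (prev ^ curr) == curr).
    public static byte[] Apply(byte[] prev, byte[] delta) => Xor(prev, delta);
}
```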

To cut down CPU load make sure you have a frame rate limiter in there. There's no point processing more than 60 frames per second, and you can probably get away with far less than that.
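A simple way to cap the capture rate is a sleep-based limiter like the sketch below; the 60 fps target is just the figure mentioned above, and the class is hypothetical:

```csharp
using System.Diagnostics;
using System.Threading;

// Call WaitForNextFrame() once per capture-loop iteration;
// it sleeps away whatever time is left in the frame budget.
class FrameLimiter
{
    readonly double targetMs;
    readonly Stopwatch clock = Stopwatch.StartNew();

    public FrameLimiter(double targetFps) { targetMs = 1000.0 / targetFps; }

    public void WaitForNextFrame()
    {
        double remaining = targetMs - clock.Elapsed.TotalMilliseconds;
        if (remaining > 1)
            Thread.Sleep((int)remaining);   // yield the CPU instead of spinning
        clock.Restart();
    }
}

// Usage: var limiter = new FrameLimiter(60); then limiter.WaitForNextFrame() in the capture loop.
```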

I store the previous frame and zero the ARGB of any unchanged pixels (XOR), then compress twice with QuickLZ. This makes highly compressible frames much smaller than a single compression pass. All the processing is done by an optimized DLL built using the Intel C++ compiler.

Then, when displaying the frame (in a PictureBox with GDI+), I simply draw it over the previous one, and since the alpha of the unchanged pixels is zero, the unchanged pixels of the previous frame remain (ingenious, I know!).
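Roughly, the display side of that trick looks like this in GDI+. This is only a sketch: it assumes the delta arrives as a 32bpp ARGB Bitmap where unchanged pixels are fully transparent, and the class name is made up:

```csharp
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;

class DeltaCanvas
{
    readonly Bitmap composite;   // holds the last fully reconstructed frame

    public DeltaCanvas(int width, int height)
    {
        composite = new Bitmap(width, height, PixelFormat.Format32bppArgb);
    }

    // Draw the delta over the retained frame; alpha-zero pixels keep the old values.
    public Bitmap Apply(Bitmap deltaFrame)
    {
        using (Graphics g = Graphics.FromImage(composite))
        {
            g.CompositingMode = CompositingMode.SourceOver;   // blend, don't overwrite
            g.DrawImage(deltaFrame, 0, 0, composite.Width, composite.Height);
        }
        return composite;        // assign to pictureBox.Image (or invalidate) to show it
    }
}
```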

What's the fastest way to capture the screen, scale it, and get the frame into a byte array using the GPU, then display it without using GDI? I find GDI slows down when rendering at a high resolution such as 1920x1200, even on an i7 3820 at 4.3 GHz.

Oh, and as for frame rate, I like it to be as fast as possible; that's why I don't use Remote Desktop. And if it did go over 60 FPS, I don't know how to limit it to exactly that.


For data compression I'd go with using the xor operator on pairs of adjacent frames. The result will be zeros where they are identical. You can then apply zlib to the results, which should compress all those zeros really well. Reconstructing at the other end is done with xor too. As the xor operator is really cheap that should be reasonably quick to do even on a CPU.

That's actually a very good idea. Dunno why I didn't think about it before...

If anything that uses the GPU is slower, then won't using it for scaling be slow?

GPUs are not slower than CPUs; they are just more optimized for parallel tasks and for working with vectors and 3D/graphics/texture stuff, while CPUs are better at serial operations. Most compression algorithms are serial by nature, I think. I think what Adam_42 meant is that it takes time to transfer the data to be compressed from normal memory to GPU memory and back, and that time could instead be used to compress on the CPU, rendering GPU compression useless.

I can't really tell you why scaling a texture using the GPU is faster, but it is; it's one of the things the GPU is good at. Also, think about it: isn't it better to send pictures of a fixed size and render them at whatever size you want on the other side, rather than scale first and then be stuck with that size on the other side? I prefer the first solution. This way you can resize the window that draws the screenshot and DirectX will scale it for you effortlessly; all you have to do is draw a quad the size of the render window and the texture will stretch with it automatically. If you want the picture not to be distorted, it's a little more work, since you have to draw black borders, but it's not complicated to compute either. (In fact, it's not the borders you must draw; rather, you adjust the quad size so it leaves some area black, or whatever you set the background colour to.)
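For the undistorted case, the quad size is just a fit-inside computation. A small sketch (not the poster's code; the helper name is made up):

```csharp
using System;
using System.Drawing;

static class Letterbox
{
    // Largest rectangle with the image's aspect ratio that fits inside the window,
    // centred; the uncovered area stays whatever the clear colour is.
    public static Rectangle FitQuad(int imageW, int imageH, int windowW, int windowH)
    {
        float scale = Math.Min((float)windowW / imageW, (float)windowH / imageH);
        int w = (int)(imageW * scale);
        int h = (int)(imageH * scale);
        return new Rectangle((windowW - w) / 2, (windowH - h) / 2, w, h);
    }
}
```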

PS: Sorry if I'm not explaining this very well, but English is not my native language.

GPUs are not slower than CPUs; they are just more optimized for parallel tasks and for working with vectors and 3D/graphics stuff, while CPUs are better at serial operations.

Most compression algorithms are serial, I think.

I think he meant the data transfer. GPUs are good for games because the frame stays on the GPU. For his application he would need to constantly send the frame to the GPU, scale it, and then copy it back to the CPU for transfer, which may be too slow (PCI-E has very real bandwidth limits) and can have nontrivial latency.

There is always some overhead in doing GPU stuff, which is why in some cases it is best to let the CPU do it because the cost of getting the GPU involved exceeds the benefits.


