Image Processing Server

Started by
11 comments, last by alnite 6 years, 6 months ago

Hello,

I hope this is the right forum.  A product I am working on will have a computer with camera attached that will be taking real-time video in grayscale.  The frames will need to be sent over a fast network to an "image processing server" that will have multiple GPUs.  Each frame will be assigned to the next available GPU for processing.  Then the output will be sent to another computer in the network.  The idea is for the frames to be processed in real time for computer vision type application.

I don't have that much experience with network programming, but I think this should be pretty simple?  Would I just create a basic client/server program using TCP/IP or UDP to send the image byte data and a small amount of metadata (image dimensions, frame #, timestamp, etc.)?  Speed is the most important thing so that we can process in real time.  Any suggestions on protocols and design?
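To make the "basic client/server over TCP" idea concrete, here is a minimal sketch of one way the byte data and metadata could be framed. The header layout (`HEADER`) and the function names are assumptions made up for illustration, not an established format. Since TCP is a byte stream with no message boundaries, the receiver has to loop until it has read a whole header and then a whole image.

```python
import socket
import struct

# Hypothetical wire format: a fixed-size header followed by raw 8-bit pixels.
# ">" = network byte order; H = uint16 width, H = uint16 height,
# I = uint32 frame number, d = float64 timestamp in seconds.
HEADER = struct.Struct(">HHId")

def pack_frame(width, height, frame_no, timestamp, pixels):
    """Serialize one grayscale frame (one byte per pixel) with its metadata."""
    if len(pixels) != width * height:
        raise ValueError("expected one byte per pixel")
    return HEADER.pack(width, height, frame_no, timestamp) + pixels

def recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        buf.extend(chunk)
    return bytes(buf)

def recv_frame(sock):
    """Receive one frame: header first, then width * height pixel bytes."""
    width, height, frame_no, timestamp = HEADER.unpack(recv_exact(sock, HEADER.size))
    pixels = recv_exact(sock, width * height)
    return width, height, frame_no, timestamp, pixels

# Loopback demonstration with a connected socket pair (no real network needed).
sender, receiver = socket.socketpair()
sender.sendall(pack_frame(4, 2, 7, 1.5, bytes(range(8))))
frame = recv_frame(receiver)
sender.close()
receiver.close()
```

In a real deployment the socket pair would be replaced by a listening server socket and a connecting client; the framing logic stays the same.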

-----Quat

An HTTP-like (or actual HTTP) protocol would be a good choice here, as you could send the metadata in the headers and the image data in the request body.
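As a rough sketch of what "metadata in the headers, image in the body" could look like, the snippet below builds (but does not send) a POST request with Python's standard library. The URL and the `X-Frame-*` header names are hypothetical; any naming scheme the client and server agree on would do.

```python
import urllib.request

def build_frame_request(url, pixels, width, height, frame_no, timestamp):
    """Build a POST whose body is the raw image and whose headers carry metadata."""
    headers = {
        "Content-Type": "application/octet-stream",
        # Hypothetical custom headers, chosen only for this example.
        "X-Frame-Width": str(width),
        "X-Frame-Height": str(height),
        "X-Frame-Number": str(frame_no),
        "X-Frame-Timestamp": repr(timestamp),
    }
    return urllib.request.Request(url, data=pixels, headers=headers, method="POST")

# Placeholder host name; nothing is sent until urlopen() is called on the request.
request = build_frame_request(
    "http://imgserver.example/frames", bytes(8), 4, 2, 7, 1.5
)
```

Passing `request` to `urllib.request.urlopen()` would perform the upload; the library adds the `Content-Length` header itself.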

 

Niko Suni

For real-time processing, the delivery guarantees that HTTP inherits from TCP (acknowledgements, retransmissions) are an overhead you might not want. UDP does, superficially, seem like the way to go.

You're taking a video stream and converting it to individual still images?  Most video formats encode a frame as a set of changes relative to earlier frames, and most mainstream video formats are lossy. Extracting individual frames and sending those across the wire will be both processor intensive and bandwidth intensive.

At higher resolutions and framerates, even in grayscale, you can be looking at a gigabit per second or more. If you're on a lower resolution like NTSC's 720x480 @ 8 bits per pixel, you're still looking at roughly 80 Mbps at 30 frames per second unless you run it through a video compressor. The choice between UDP and TCP will largely be irrelevant. Unless you've got a thumbnail video or your video stream is a slideshow, the protocol overhead will be dwarfed by the volume of data involved.
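The raw data rate of uncompressed video is simply width x height x bits per pixel x frames per second, so it is easy to check for any particular camera. A small sketch, where the frame rates and the list of formats are assumptions picked for illustration:

```python
def raw_bitrate_bps(width, height, bits_per_pixel, fps):
    """Uncompressed video data rate in bits per second (no protocol overhead)."""
    return width * height * bits_per_pixel * fps

# Example formats; the frame rates here are assumed, not taken from the thread.
formats = {
    "720x480 @ 30 fps": (720, 480, 30),
    "1920x1080 @ 60 fps": (1920, 1080, 60),
    "3840x2160 @ 60 fps": (3840, 2160, 60),
}

for name, (width, height, fps) in formats.items():
    mbps = raw_bitrate_bps(width, height, 8, fps) / 1e6
    print(f"{name}: {mbps:.1f} Mbps for 8-bit grayscale")
```

Comparing the result against the usable capacity of the link (for example gigabit Ethernet) shows quickly whether uncompressed frames are feasible.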

But let's ignore the volume of data involved in a video stream, and assume you've got that figured out....

 

You're asking a question about how to move data from place to place.

In the real world when you need to move things you don't start by laying pavement and designing vehicles; you use existing roads and existing vehicles.  But your question about transport-layer communications is much like starting with pavement design.

Communications is mostly a solved problem; you shouldn't spend your time solving it again.

There are plenty of message passing and data streaming tools out there. At the most basic, you could maintain an open HTTP request/response stream to each machine as was mentioned earlier. Both SOAP and REST are easy, well-established approaches, and there are tools that make these work as easily as calling a long-running function. Or there are other tools. If you're following more of a high-performance cluster environment, MPI could handle the work of passing images around. If you're following the current trendy software, then Apache has several projects, such as Spark or Flink, that can probably handle that part for you instead. If for some reason none of the hundreds of existing tools and technologies work for you, and you MUST implement your own version at the transport layer, I'd choose TCP over UDP in this case, simply because you don't need to re-invent the handling that ensures your packets arrive in the correct order across the line.  But really, if you're reduced to working at that level, you're writing the wrong code.

No sense re-inventing the wheel when there are so many existing communications tools that easily do all the work for you.

 

Thanks for the replies.  I did a little more research and was curious where the socket APIs fit in.  Those seem to be built on top of transport protocols, or are sockets still considered too low-level?  I like that sockets are a mostly portable API, as we may use Linux.

HTTP sounds simple and that it could work.  I found this MS Rest API https://msdn.microsoft.com/en-us/library/jj950081.aspx and they even have an example of pushing a chunk of data to the server, which is pretty much what I need.  I have a question though for HTTP.  Can the server push data to a specific client, or must the client put in a request to get the image processing output?

So basically, I'm leaning towards sockets (TCP/IP) or HTTP, as they seem like the simplest for what I need to do.  Would one be significantly faster than the other?  Or is HTTP likely using sockets under the hood?

-----Quat

PDNTSPA.  Physical, Data Link, Network, Transport, Session, Presentation, Application.  Those are the seven layers of the OSI model of network communications.

UDP and TCP are the typical Transport layer choices, although other protocols exist over IP (the Network layer).  The Transport layer is the layer the ordinary socket API works at (stream sockets for TCP, datagram sockets for UDP).

HTTP is all the way out on the Application level. So are SOAP and REST. So are protocols like FTP, BitTorrent, and many others.   They do ultimately use TCP as the transport protocol, and so use Sockets to do it.  But that's down the stack a bit.

If you really want to you can write your own communications system at the transport layer. It is time consuming, but can be done. Many games do this, and sometimes there are good reasons to do it, particularly since many games need to build their protocols around the Session layer, such as keeping a bunch of users synced together as a play-group, possibly tunneling and using forwarding techniques to keep the group together between games.

But in this case, I don't see any compelling reasons to work at the socket level. There are many existing protocols and tools that do all that work for you. You'll be transferring enormous blocks of data, and that will dwarf any transport-layer choice.  Use the tool of your choice to handle a long-running series of data transfers.  Don't reinvent the wheel for this.

Send the video stream to your "Image Processing Server".

Split the video into individual frames there.

Assign frames to GPUs.

 

You will save a lot of bandwidth this way.

Video compression would create latency and quality loss. 

Niko Suni

Are all of the cameras and processing cards in the same local network (such as in a car or in a closet) and do you control all the traffic on the systems?

If you control all the traffic, then TCP and TCP-based protocols (such as HTTP) could work fine, because packet loss will be very rare, and your "real time" requirements won't be impacted.

If the system requires some kind of shared link (internet uplink, shared switch with other application data, office LAN, or what have you), then there is some risk that TCP-based protocols will sometimes "hiccup" because of packet loss and re-send intervals. If, when packet loss happens, you'd rather just have the next frame of data than wait for the previous frame, then you need to use some protocol on top of UDP.
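A minimal sketch of the "rather have the next frame than wait for the previous one" policy on top of UDP is shown below. A frame is too big for one datagram, so it is split into numbered chunks, and the receiver abandons an incomplete frame as soon as a chunk from a newer frame arrives. The chunk header layout, the 1400-byte payload size, and the class name are all assumptions for illustration, and frame-number wrap-around is ignored.

```python
import struct

# Hypothetical chunk header: uint32 frame number, uint16 chunk index, uint16 chunk count.
CHUNK_HEADER = struct.Struct(">IHH")
MAX_PAYLOAD = 1400  # assumed to fit within a typical Ethernet MTU

def split_frame(frame_no, data, max_payload=MAX_PAYLOAD):
    """Split one frame into datagrams, each carrying its position in the frame."""
    count = max(1, -(-len(data) // max_payload))  # ceiling division
    return [
        CHUNK_HEADER.pack(frame_no, i, count) + data[i * max_payload:(i + 1) * max_payload]
        for i in range(count)
    ]

class LatestFrameAssembler:
    """Reassembles only the newest frame seen; older, incomplete frames are dropped."""

    def __init__(self):
        self.frame_no = -1
        self.chunks = {}
        self.done = False

    def feed(self, datagram):
        """Return (frame_no, data) once a frame is complete, otherwise None."""
        frame_no, index, count = CHUNK_HEADER.unpack_from(datagram)
        if frame_no < self.frame_no:
            return None  # late chunk from a frame we already gave up on
        if frame_no > self.frame_no:
            # A newer frame has started: discard whatever was half-assembled.
            self.frame_no, self.chunks, self.done = frame_no, {}, False
        if self.done:
            return None  # duplicate chunk of a frame already delivered
        self.chunks[index] = datagram[CHUNK_HEADER.size:]
        if len(self.chunks) < count:
            return None
        self.done = True
        return frame_no, b"".join(self.chunks[i] for i in range(count))
```

Each datagram returned by `split_frame` would go out through `sendto()` on a UDP socket, and each datagram from `recvfrom()` would be passed to `feed()`.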

When going over UDP, there are also encodings that let you send a few extra packets per frame and, in exchange, recover from some number of lost packets. (Look into forward error correction, erasure codes, self-healing protocols, and so forth.)
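The simplest encoding of this kind is a single XOR parity packet per group: send the group plus one extra packet that is the XOR of all of them, and any one lost packet can be rebuilt from the rest. This is only a toy sketch with made-up function names; it assumes equal-length packets (pad the last one if needed) and that the receiver knows which position was lost. Production schemes such as Reed-Solomon codes tolerate more than one loss.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def add_parity(packets):
    """Return the packets followed by one parity packet (XOR of all of them)."""
    parity = bytes(len(packets[0]))
    for packet in packets:
        parity = xor_bytes(parity, packet)
    return list(packets) + [parity]

def recover(received):
    """Rebuild the data packets from a group in which at most one entry is None.

    `received` has the same layout as the output of add_parity, with None
    standing in for a lost packet. The parity packet itself is not returned.
    """
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost packet")
    repaired = list(received)
    if missing:
        present = [packet for packet in received if packet is not None]
        rebuilt = bytes(len(present[0]))
        for packet in present:
            rebuilt = xor_bytes(rebuilt, packet)
        repaired[missing[0]] = rebuilt
    return repaired[:-1]

# Example: three data packets, the second one is lost in transit.
group = add_parity([b"abcd", b"efgh", b"ijkl"])
damaged = [group[0], None, group[2], group[3]]
restored = recover(damaged)
```

The cost is one extra packet per group, so smaller groups mean more overhead but better protection.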

The rest of your problem isn't really possible to give good advice about unless we have a lot more information about your specific system hardware and requirements.

 

enum Bool { True, False, FileNotFound };

Don't handle the transport by yourself.

Use a message queue. That will take care of managing connections, fail overs, etc...

It will allow you to do things such as taking a server offline or adding more servers to a role.

You can even dynamically change the roles of different nodes in your cluster.
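The pattern a message queue gives you can be sketched in-process with Python's standard `queue` module: producers put frames on a queue, and whichever worker is free next takes the next frame. This is only a stand-in to show the shape of the idea. The worker threads here play the role of the GPUs or servers, `run_workers` and `process` are hypothetical names, and a real message queue product would add the networking, connection management, and failover mentioned above.

```python
import queue
import threading

def run_workers(frames, num_workers, process):
    """Hand each (frame_no, data) item to the next free worker; collect results."""
    work = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            item = work.get()  # blocks until a frame (or a stop marker) is available
            if item is None:
                break  # stop marker: no more frames for this worker
            frame_no, data = item
            output = process(data)
            with lock:
                results[frame_no] = (worker_id, output)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for item in frames:
        work.put(item)
    for _ in threads:
        work.put(None)  # one stop marker per worker
    for thread in threads:
        thread.join()
    return results

# Example: 20 tiny "frames", 4 workers, and a placeholder processing function.
frames = [(i, bytes([i])) for i in range(20)]
results = run_workers(frames, 4, lambda data: data[0] * 2)
```

Adding capacity is a matter of starting more workers that read from the same queue; nothing on the producer side has to change.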

 

But one serious remark, regardless of what you choose to do:

If a frame has multiple stages of processing, it might be wiser to do all of the stages on a single machine.

For example: if you are processing 10 frames on 10 machines and you dedicate each machine to a different stage, you are transmitting each frame 10 times = 100 I/O operations.

However, if instead you send each frame to a separate machine, and keep it there, you will only be doing 10 I/O operations.

Your run time should remain the same because you are still processing 10 frames in parallel, but your network will be much less stressed.

Also, you will utilize your machines better in the case that one of the steps takes less time than the others. If each machine is dedicated to a specific step (instead of being dedicated to a frame), your entire pipeline will only be as efficient as your slowest step.
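Both points can be checked with a little arithmetic. The sketch below counts network transfers for the two layouts and compares steady-state throughput. The stage timings are invented for illustration, and transfer time is ignored.

```python
def transfers_stage_per_machine(num_frames, num_stages):
    """Each frame is sent to every stage's machine in turn."""
    return num_frames * num_stages

def transfers_frame_per_machine(num_frames):
    """Each frame is sent once and stays on its machine for all stages."""
    return num_frames

def throughput_stage_per_machine(stage_seconds):
    """A pipeline completes frames only as fast as its slowest stage."""
    return 1.0 / max(stage_seconds)

def throughput_frame_per_machine(stage_seconds, num_machines):
    """Each machine runs all stages for its own frame, in parallel with the others."""
    return num_machines / sum(stage_seconds)

# The example from the post: 10 frames, 10 stages.
print(transfers_stage_per_machine(10, 10), "transfers with one stage per machine")
print(transfers_frame_per_machine(10), "transfers with one frame per machine")

# Hypothetical timings: two fast stages and one slow one, on three machines.
stages = [1.0, 1.0, 4.0]
print(throughput_stage_per_machine(stages), "frames/s with one stage per machine")
print(throughput_frame_per_machine(stages, 3), "frames/s with one frame per machine")
```

With these made-up timings the two fast machines in the staged layout sit idle most of the time waiting for the slow stage, which is the utilization problem described above.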


