
Distributing CPU load onto the network?


13 replies to this topic

#1 slayemin   Members   -  Reputation: 2796


Posted 27 February 2012 - 02:43 PM

Has anybody tried offloading CPU load to a networked computer within a game? Obviously the first step would be to avoid it if possible by writing efficient code or using optimal algorithms. But, assuming everything has been optimized and you've used multithreading to take advantage of all possible CPU cores, would there be an advantage in distributing the load onto a network?

Semi-fictional example: I have a game which uses an entity component model. Objects are composed of components and use a messaging system to communicate with each other. If I have a spare computer on a 1000 Mbps LAN which isn't doing anything, I could use it to run some of the components (like physics or AI processing). The only concern I can think of is the difference in latency between CPU->Memory and CPU->Network; they'd be orders of magnitude apart. But, if I'm going for a target framerate of 30-60 fps, and maximizing the local CPU produces 10-15 fps, offloading CPU work onto the network would free up some local CPU resources and give +X extra fps, even with the added network latency, possibly achieving the target framerate of 30-60 fps. The only other possible issue would be maintaining synchronized game states... But the payoff would be that you'd be able to add a TON of horsepower to whatever you need to calculate.

Granted, the added complexity this would bring to a game would require a long, hard think on cost vs. benefits before going forward...
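To put a rough number on that CPU->Memory vs. CPU->Network gap, a minimal round-trip probe along these lines can measure the latency budget any offloaded component would have to fit inside each frame. This is a sketch only; the port, payload size, and host are arbitrary placeholders, not anything from the original post:

# Minimal TCP round-trip latency probe (illustrative sketch). Run it with
# --serve on the spare LAN machine, then run the client against that IP.
import argparse
import socket
import time

PORT = 5000            # placeholder port
PAYLOAD = b"x" * 64    # tiny message, roughly the size of a small component update

def serve(host="0.0.0.0"):
    with socket.socket() as srv:
        srv.bind((host, PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            while data := conn.recv(len(PAYLOAD)):
                conn.sendall(data)                    # echo straight back

def probe(host, rounds=1000):
    with socket.socket() as s:
        s.connect((host, PORT))
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(rounds):
            s.sendall(PAYLOAD)
            s.recv(len(PAYLOAD))                      # wait for the echo
        avg_ms = (time.perf_counter() - start) / rounds * 1000.0
        print(f"average round trip over {rounds} messages: {avg_ms:.3f} ms")

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--serve", action="store_true", help="run the echo side")
    parser.add_argument("--host", default="127.0.0.1", help="echo server address")
    args = parser.parse_args()
    serve() if args.serve else probe(args.host)

Whatever the probe reports is a floor: real component traffic also pays serialization, scheduling, and contention costs on top of it.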

Eric Nevala

Indie Developer | Dev blog



#2 slicer4ever   Crossbones+   -  Reputation: 3947


Posted 27 February 2012 - 02:59 PM

I believe the problem with doing this is mostly network latency. Even on a local network you're generally looking at ~1-2 ms of lag, and a game waiting on a physics simulation step is most likely going to bottleneck badly while it waits for the network to respond. That's not counting the problem of the network suddenly going down and timing out.

So, if you were to offload real-time work onto another computer, you're going to need to employ some serious predictive algorithms to smooth out gameplay while waiting for whatever data the network is sending back, and to integrate it seamlessly.

I think the best example of this being possible is the OnLive service. Granted, it only takes input and does all the work on a single server, but it does show that real-time work over a network interface can be done.

So, looking at this realistically: if your game loop is running at 60 fps, each frame has a budget of roughly 16 ms, which means your networked computer needs to finish and send its data back within that window to maintain a consistent 60 frames. I think it's possible, but definitely not easy.
Check out https://www.facebook.com/LiquidGames for some great games made by me on the Playstation Mobile market.

#3 Ravyne   GDNet+   -  Reputation: 7802


Posted 27 February 2012 - 04:23 PM

The first rule of client-server security is "Never trust the client." -- In any multiplayer game, the server has to be responsible for anything that really matters; otherwise you open the door to all kinds of hacks -- big and small. For example, when one client moves, his local instance goes ahead making its best guess as to what the server will say the outcome is, but the other clients don't see it until the server vets those movements, and if the originating client falsified a movement that the server doesn't agree with, the server will force it back to where the server says it should be.

There are also latency issues, of course -- especially in light of most home internet connections being asymmetrical (far less upload bandwidth and greater latency than download). It's just going to be quicker to do the calculation on the server than it'll be to wait for a client to do some chunk of work. You'd be better off offloading to some hardware local to the server -- maybe a GPU or other processing card for suitable tasks, or another machine on a fast backplane network -- if you really had to distribute the workload off-CPU. The cheapest and easiest solution, given all the costs involved, is usually to just throw more MHz, CPU cores, and RAM at the problem, or to optimize/re-architect the code.

Now, you can offload some stuff to *each* client -- if you have a projectile or particle effect, you only need to send its initial state and let the clients calculate it for themselves. There's no risk here because ultimately the server will still decide who might take damage from it, so it's no big deal if a client wants to fool themselves. If you can ensure that your game is deterministic (Halo 3 and Reach do this; it's also the basis for the in-game recordings), then you can do a lot of this kind of thing.
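That "send the initial state and let every client step it identically" approach only holds up if the simulation is a pure function of its inputs. Here is a minimal sketch of the idea (illustrative only -- the constants and function names are placeholders, not from any particular engine): a fixed timestep makes the trajectory depend only on the starting state and the step count.

# Deterministic projectile stepping: every machine that receives the same
# initial state and uses the same fixed timestep computes the same path,
# so the server only needs to send (position, velocity) once.
# Illustrative sketch; the constants are placeholders.

FIXED_DT = 1.0 / 60.0            # simulation step, independent of render rate
GRAVITY = (0.0, -9.81, 0.0)

def step(pos, vel, dt=FIXED_DT):
    # Semi-implicit Euler: update velocity first, then position.
    vel = tuple(v + g * dt for v, g in zip(vel, GRAVITY))
    pos = tuple(p + v * dt for p, v in zip(pos, vel))
    return pos, vel

def simulate(pos, vel, steps):
    # Pure function of the initial state: same inputs -> same trajectory.
    for _ in range(steps):
        pos, vel = step(pos, vel)
    return pos, vel

# Two "clients" starting from the same fired state agree exactly
# (on identical hardware/float settings, which a real engine must enforce).
fired_pos, fired_vel = (0.0, 1.5, 0.0), (30.0, 10.0, 0.0)
assert simulate(fired_pos, fired_vel, 120) == simulate(fired_pos, fired_vel, 120)

On top of this, real engines have to lock down floating-point modes and avoid anything timing-dependent, or the clients slowly drift apart.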

#4 slayemin   Members   -  Reputation: 2796


Posted 27 February 2012 - 04:47 PM

I was thinking more along the lines of single player games, so network security wouldn't be as big of an issue.

Consider these commercial games and how this could be applied:
Supreme Commander: You could play HUGE maps with up to 8 players. You could set the maximum number of units to something like 1,500, which gets split evenly among the total number of players. Unfortunately, Supreme Commander starts to choke really badly when you get a high number of units on the screen (~1-3 fps). It made long games close to unplayable... and you'd want to use nukes since they clear out a bunch of units all at once, thus increasing the frame rate. If you could offload the CPU processing for the physics and AI to a networked computer sitting next to you, you could share the CPU burden and get a more decent framerate.
Total War series: These games have an RTS component where you have armies fighting against each other, with thousands of units on each side. The individual AI for fighting units is actually pretty bad (I've seen a single swordsman take on 20+ guys because of choke points and glitching movement). The problem is that the AI has to be simple enough that it can be instanced thousands of times for each unit on the battlefield, otherwise it hogs the CPU. The other problem with the Total War series is that you have a limit on how large your armies can be (sorry Genghis Khan, you can only have a couple hundred horse archers because of memory and CPU limits). If you had no memory and CPU limits, you could have better AI code and more units on the battlefield at once. You'd then only be limited by how many polygons your GPU could render.

Though, if you start putting a huge number of units into a game world and you distribute the memory and CPU load across the network, that's probably going to bog down the network with all the data flowing back and forth, possibly bringing us back to square one... Fiber optics? I suppose I can code up a proof of concept to see how feasible it is.

Eric Nevala

Indie Developer | Dev blog


#5 hplus0603   Moderators   -  Reputation: 5532


Posted 27 February 2012 - 07:59 PM

Multi-computer single-player games? That sounds cool, but probably not very practical :-)

It's hard enough to make a game that scales from an Atom-based netbook that's 5 years old to the latest 20-core dual-Xeon monsters. Trying to also throw networking into the mix sounds like quite a challenge to me, and I'm not sure the return on investment (in terms of game appeal to the market) would be there.
enum Bool { True, False, FileNotFound };

#6 Ravyne   GDNet+   -  Reputation: 7802


Posted 27 February 2012 - 08:11 PM

It could be done across a local network, sure. If you take competition out of it then fairness is a non-issue.

But then, thinking along a different line, all of a sudden your system requirements are not one 3.0 GHz dual-core CPU, but two 2.6 GHz dual-core CPUs, or whatever. We're in an age where a lot of people have a home network, true -- but how many have computers sitting on all the time, or how many will be willing to fire up another machine just to play a game? Of those, how many people have "secondary" machines with sufficient CPU and RAM resources to be helpful, and how many of those *won't* be occupied by some other user or task?

It's not a technical hurdle to do it, per se, but there are huge accessibility hurdles. Most users have one primary machine that falls somewhere along the line of bleeding-edge to border-line ancient, and maybe a laptop kicking around. If you're lucky, both will be relatively modern, but not infrequently one of those is outdated or under-powered enough to be relatively useless.

#7 slayemin   Members   -  Reputation: 2796


Posted 27 February 2012 - 09:54 PM

:) I have 5+ computers and I always feel it's such a shame to have 4 computers idling while the main computer is under heavy load (I usually turn them off). If a game were designed to distribute some of its CPU load to unused computers, it would boost performance and the gameplay experience.

*chuckle* and of course, the hardest of the hardcore gamers would probably go out and buy a second powerhouse computer for these bonus benefits. The algorithm for deciding how to split the resources would certainly have to take into account the computing power of the connected clients and adjust resource allocation as necessary to maximize performance. (or alternatively, leave it up to the user to decide)

I think the entity-component model would probably synergize the best with this kind of setup. I think I may have to think about this some more and give this a try to see how it works.

Eric Nevala

Indie Developer | Dev blog


#8 Ravyne   GDNet+   -  Reputation: 7802


Posted 28 February 2012 - 12:50 PM

And the first rule of game design is "Don't assume *you* are the target audience" :)

#9 slayemin   Members   -  Reputation: 2796


Posted 29 February 2012 - 10:56 AM

Touché.

I'm finding that I'm more interested in tackling the hard problems that few people have been able to solve or just haven't thought of. This would be one of them, hence the interest :) Maybe if I figure this out, I'll write an article on it so other people can replicate it or know that it's not worth pursuing.

Eric Nevala

Indie Developer | Dev blog


#10 wodinoneeye   Members   -  Reputation: 857


Posted 01 March 2012 - 05:53 AM

Quoting slayemin's original post:

"Has anybody tried offloading CPU load to a networked computer within a game? ... assuming everything has been optimized and you've used multithreading to take advantage of all possible CPU cores, would there be an advantage in distributing the load onto a network?"



Some AI tasks that might work: pathfinding or terrain/projectile collision plotting, which work off static map data (no state changes) and take just position points/vectors as inputs, with paths or collision results sent back (all pretty low throughput, but with a lot of data crunching).
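As a rough illustration of how little has to cross the wire for that kind of task, here is a sketch of a possible request/reply layout (field names and formats are my own, purely illustrative): the static map lives on both machines, so only coordinates travel.

# Sketch of a low-bandwidth pathfinding offload message. The static map is
# pre-loaded on both machines, so the wire only carries grid coordinates.
# Field layout is illustrative, not from the thread.
import struct

REQUEST = struct.Struct("<IHHHH")    # request_id, start_x, start_y, goal_x, goal_y
WAYPOINT = struct.Struct("<HH")      # x, y
REPLY_HEADER = struct.Struct("<IH")  # request_id, waypoint count

def pack_request(request_id, start, goal):
    return REQUEST.pack(request_id, *start, *goal)

def pack_reply(request_id, path):
    header = REPLY_HEADER.pack(request_id, len(path))
    return header + b"".join(WAYPOINT.pack(x, y) for x, y in path)

def unpack_reply(data):
    request_id, count = REPLY_HEADER.unpack_from(data)
    offset = REPLY_HEADER.size
    path = [WAYPOINT.unpack_from(data, offset + i * WAYPOINT.size) for i in range(count)]
    return request_id, path

# A 50-waypoint path is ~200 bytes on the wire: lots of CPU work on the far
# side (the actual A* search), almost no bandwidth or replicated state.
reply = pack_reply(7, [(x, x) for x in range(50)])
request_id, path = unpack_reply(reply)
print(len(reply), "bytes for", len(path), "waypoints")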

An issue might be: how much game-world state data needs to be replicated (and kept up to date) for the second computer to be able to make use of it, without having to do lots of slow network fetches of data (or causing a lot of network overhead).

AI for numerous additional opponents (as an optional feature for a solo game, using the player's machine as the 'master')... More powerful AI in a more complex game can eat a lot of that extra CPU (again, the second box needs sufficient game state replicated via the network).

For a client-server game where a client gets replicated data fed from the server, you could have an AI NPC node (machine) use the same client-type feed and have it play additional opponents/allies (fed the subset of game state for each NPC's particular location).


I was actually looking at a design that is sort of the reverse: using the client machine in an MMORPG to run AI for a team of NPCs that stay with your avatar (so they make use of the world state ALREADY being sent for the avatar's area). The client would send the commands for the team NPCs to the server, subject to the same cheat-proofing validation.

A separate feature (for an AI-heavy game) was to have AI nodes on players' machines left running (a pseudo-cloud) to handle a lot of extra NPCs in a cityscape. The problem with that is you cannot count on those machines always being available, and you have to deal with dropouts. The 'cheat' security aspect probably wouldn't matter, as the data being processed wouldn't be tied to any known player (no benefit to cheating) and tasks get shuffled quite frequently. (Like the others, there would be server validation... it would discover improper 'griefing' commands and could ban the user, or better, the NPCs are nerfed and can't do much harm anyway -- they're mostly there for background effect in the user's game experience.)

It's possible that most clients, being mostly GPU-bound running the player's 3D rendering, would have several free cores, and MOST clients could help run part of the world simulation in the immediate area of the player to reuse the game-state feed (but only if the machine had sufficient extra CPU capacity).
--------------------------------------------Ratings are Opinion, not Fact

#11 Antheus   Members   -  Reputation: 2397


Posted 01 March 2012 - 10:27 AM

For a sufficiently fast network it might work out better to do frame rendering on multiple machines. It adds latency, but various practical experiments show that users tolerate up to ~7 frames of delay without noticing.

Regardless of how you partition the work, at a certain scale the GPU will become the bottleneck; it's pretty easy to saturate one today even with a single core. The only advantage such a cluster would have is the ability to spend the equivalent of multiple frames producing a single one, effectively giving you a 10-15 fps compute budget while still presenting at 60 fps.

All of this assumes completely identical hardware; scaling across heterogeneous hardware while keeping latency in this range is not viable.

#12 frob   Moderators   -  Reputation: 22242


Posted 01 March 2012 - 05:49 PM

I'm still struggling with the original question.


Has anybody tried offloading CPU load to a networked computer within a game? Obviously the first step would be to avoid it if possible by writing efficient code or using optimal algorithms. But, assuming everything has been optimized and you've used multithreading to take advantage of all possible CPU cores, would there be an advantage in distributing the load onto a network?


Yes, it has been done many times. More on that below.


A game needs a minimum spec. If that minimum spec is a single machine, then fully functional and optimized means that it only needs a single machine to be fully functional. That is enough for it.

If the minimum spec is a single machine there are really only two things that it would help with:

* The simulated content is cool to have but not critical to gameplay, such as VFX, and huge particle systems.
* OR the simulated content is critical to gameplay --- meaning the game is not a real-time simulator --- and you are trying to reduce processing time.

In the first case of it being non-critical to gameplay, about the only thing I can think of that would help are physics-driven particle systems.

In the second case, your simulation doesn't meet your description:

But, if I'm going for a target framerate of 30-60 fps, and maximizing the local CPU produces 10-15fps, offloading CPU work onto the network would free up some local CPU resources and give +X extra fps, even with the added network latency, possibly achieving the target framerate of 30-60fps.


At that frame rate, no, it will not help.

That is not the point of the "OR" statement above.

That second point is a common feature for simulators of games like go and chess.


These games are effectively searching min/max trees, and the games are not solved. The simulators could easily consume multiple compute-days of CPU time and still be working on an ideal move. In that situation having a large network of CPUs (aka supercomputer) can be very helpful.
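For that kind of search, the work parallelizes naturally at the root: each machine (or process) takes a disjoint subset of the candidate moves, searches them independently, and a coordinator keeps the best result. A toy sketch of that root-splitting idea, with local processes standing in for networked machines and a placeholder evaluation instead of a real alpha-beta search:

# Root-splitting a game-tree search: each worker evaluates a disjoint set of
# candidate root moves and the coordinator keeps the best. Local processes
# stand in for networked machines; evaluate_move is a placeholder, not a
# real minimax/alpha-beta implementation.
from multiprocessing import Pool

def evaluate_move(move, depth=20):
    # Placeholder for a deep search rooted at `move`: burn some CPU and
    # return a fake score so the distribution pattern is visible.
    score = 0
    for i in range(depth * 100_000):
        score = (score + move * i) % 9973
    return move, score

def best_move(candidate_moves, workers=4):
    with Pool(workers) as pool:
        results = pool.map(evaluate_move, candidate_moves)  # fan out the root moves
    return max(results, key=lambda r: r[1])

if __name__ == "__main__":
    move, score = best_move(list(range(16)))
    print(f"best root move: {move} (score {score})")

Because each root move is searched without waiting on the others, the multi-second (or multi-day) budget hides the network latency entirely, which is exactly why this workload suits distribution and a 16 ms frame does not.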

Check out my book, Game Development with Unity, aimed at beginners who want to build fun games fast.

Also check out my personal website at bryanwagstaff.com, where I write about assorted stuff.


#13 slayemin   Members   -  Reputation: 2796


Posted 02 March 2012 - 01:21 PM

A game needs a minimum spec. If that minimum spec is a single machine, then fully functional and optimized means that it only needs a single machine to be fully functional. That is enough for it.

If the minimum spec is a single machine there are really only two things that it would help with:

* The simulated content is cool to have but not critical to gameplay, such as VFX, and huge particle systems.
* OR the simulated content is critical to gameplay --- meaning the game is not a real-time simulator --- and you are trying to reduce processing time.

In the first case of it being non-critical to gameplay, about the only thing I can think of that would help are physics-driven particle systems.


You'd have to build a GUI with a slider to adjust the game complexity (like how many units to allow simultaneously in a battle). The slider would be set to the best fit spec of the current machine by default so you could play the game on a single computer. The minimum requirement is that the game be highly playable on a single computer with the average hardware. If you wanted to ramp up the spec beyond the capacity for a single computer, you could add another computer to the cluster to offload some of the CPU workload. I suppose if you were really smart, you could do some load balancing calculations to figure out how much horsepower the networked computer has and what it can handle (holy added complexity, batman!). It would probably be a really hard sell in a professional studio.

The physics-driven particle systems can probably be handled by a GPU these days, so no need to offload that. I was thinking more of processing AI, or collision detection, or something that doesn't necessarily have to be done locally (unlike rendering or input). I've written code for clustered computing before (30 computers computing Mandelbrot sets), but the requirements for that are a bit different from running a real-time game.

I guess the best way to find the answer is to just code it up and see how it works out.

Eric Nevala

Indie Developer | Dev blog


#14 slayemin   Members   -  Reputation: 2796


Posted 06 March 2012 - 09:33 PM

Just an update for anyone who is interested: I've hit a major milestone and also realized that this problem is a lot more complex than I initially thought. Here's the approach I'm taking.

I decided to go with an entity-component model using message passing as a means to communicate between entities. I can then break my game up based on the different component groups. Some components obviously can't be migrated to a remote machine (like rendering and input components), so a flag needs to be set on a component pool.

So here is a brief overview of the classes and a description:
Entity: This is an instanced object in the game. Its attributes and behaviors are defined by the components which compose it.
EntityTemplate: I define templates of my objects in an XML file so that I can quickly instance them
Component: This is a generic class inherited by actual components
ComponentTemplate: Much like the entity template, but just templates of a component.
ComponentSystem: This is a generalized collection of a specific group of components
ComponentSystemPool: This is a collection of ComponentSystems
MessageRouter: There needs to be a way for component pools to communicate with other component pools, even if they are in a remote thread or remote host. This class is responsible for routing messages appropriately.

Example: I have a physics system, rendering system, targeting system, and input system. All the systems have a pool of their respective components. I want to divide the application in half, so I create two component system pools:
ComponentSystemPool1 contains: {PhysicsSystem, TargetingSystem}
ComponentSystemPool2 contains: {RenderingSystem, InputSystem}
Since the rendering and input have to remain local, they can't be migrated. So, physics and targeting get sent off to a different computer/thread.
Main computer runs: {ComponentSystemPool2}
Second computer runs: {ComponentSystemPool1}
The message router receives an application message from the main game update loop to advance to the next frame. The message router has to keep everyone synchronized, so it sends a message to all the component systems to advance to the next frame and requests a callback when they're done with it. Once a callback has been received from everyone, we're ready to go to the next frame. When the player wants to quit the game, the message router has to send a system message to quit the application on all the connected clients before letting the game end.
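That frame-advance handshake is essentially a barrier: broadcast "advance", then wait until every completion callback has arrived. A minimal sketch of the pattern, with threads standing in for remote hosts (class and method names here are illustrative, not the actual classes described above):

# Frame-sync barrier in the spirit of the MessageRouter handshake described
# above: broadcast "advance", wait until every system reports completion.
# Threads stand in for remote hosts; names are illustrative.
import threading
import time

class FrameBarrier:
    def __init__(self, system_count):
        self.cond = threading.Condition()
        self.expected = system_count
        self.done = 0

    def system_finished(self):
        # Callback each component system invokes when its step completes.
        with self.cond:
            self.done += 1
            if self.done == self.expected:
                self.cond.notify_all()

    def wait_for_frame(self):
        # Block the main loop until every system has checked in, then reset.
        with self.cond:
            while self.done < self.expected:
                self.cond.wait()
            self.done = 0

def fake_system(name, barrier, step_time):
    time.sleep(step_time)                 # pretend to simulate one step
    print(f"{name} finished its step")
    barrier.system_finished()

barrier = FrameBarrier(system_count=2)
for name, step_time in [("PhysicsSystem", 0.005), ("TargetingSystem", 0.002)]:
    threading.Thread(target=fake_system, args=(name, barrier, step_time)).start()
barrier.wait_for_frame()
print("frame complete, advance to the next one")

The same structure extends to remote hosts by replacing the threads with connections whose completion messages invoke system_finished().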

As a proof of concept, I have physics components and rendering components defined, contained within physics systems and rendering systems. The physics component pool is stored and run on a separate thread from the main game. The rendering components have to use the messaging system to request entity position information from the physics components. The messaging system has to know where to find all of the component pools and how to send a message to them. Components don't care about where other components are located; they just want to send a message and get a reply as fast as possible. So, they send a message to their component manager, which then figures out whether the message can stay local to the system pool or has to be sent up to the message router. The components currently spin-lock until they receive a response (I'll probably have to think of a better solution, since a thread could be waiting a while).
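The dispatch decision itself can stay small: a component addresses a message by target system, the local pool delivers it directly if it owns that system, and otherwise hands it to the router, which knows which host owns what. A rough sketch of that logic follows (illustrative names and addresses only; the remote branch is stubbed where a socket write and a waitable reply handle would go, which would also replace the spin lock):

# Sketch of the local-vs-remote dispatch described above. Names and the
# placeholder address are illustrative; the remote send is stubbed where a
# socket write would go in a real implementation.

class MessageRouter:
    def __init__(self):
        self.remote_hosts = {}              # target system name -> host address

    def register_remote(self, system_name, host):
        self.remote_hosts[system_name] = host

    def forward(self, system_name, message):
        host = self.remote_hosts[system_name]
        # Real code would serialize `message`, write it to the connection for
        # `host`, and hand back a handle the caller can wait on (instead of
        # spin-locking). Stubbed here.
        print(f"forwarding {message!r} for {system_name} to {host}")
        return None

class ComponentSystemPool:
    def __init__(self, local_systems, router):
        self.local_systems = local_systems  # dict: system name -> handler callable
        self.router = router

    def send(self, target_system, message):
        handler = self.local_systems.get(target_system)
        if handler is not None:
            return handler(message)         # stays in this pool: plain function call
        return self.router.forward(target_system, message)

router = MessageRouter()
router.register_remote("PhysicsSystem", "192.168.1.42")   # hypothetical spare machine
pool = ComponentSystemPool({"RenderingSystem": lambda m: f"rendered: {m}"}, router)

print(pool.send("RenderingSystem", "draw entity 12"))     # handled locally
pool.send("PhysicsSystem", "position of entity 12?")      # routed to the remote host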

There's a bit of an unforeseen problem though: What happens to the game if a remote machine gets disconnected and that remote machine was processing something vital like physics and position information? The current thought I'm entertaining is to use a "hot spare" of a component system which is hosted on another machine. The hot spare mirrors the current component system each frame and waits for a failover event to happen. Since a hot spare is being maintained, it would improve performance and make sense to use duplicate data if it exists. "Read" messages could be thrown at the hot spare if the main pool is too busy -- but this also creates a new problem: What if the main instance gets a write message and the mirror hasn't been updated yet and receives a read request? The read result would be at least one frame behind, or worse, depending on latency. I guess this would ultimately need some pretty slick automated load balancing. It'd be nice to have an algorithm which can measure CPU and memory load, figure out if it would increase application performance to migrate a component system pool, and then do whatever is best.

Hopefully someone else gets inspired to try something like this and uses a similar approach :)

Eric Nevala

Indie Developer | Dev blog




