hplus0603

Member Since 03 Jun 2003
Offline Last Active Today, 04:19 PM

Posts I've Made

In Topic: Hundreds of timers...

Today, 12:22 PM

I thought the key system 'thread' overhead was the locks and such of making a system level call


You don't need locks just because you have threads; it depends on your data structures.
Similarly, you don't avoid the need for locks by using fibers, unless you apply very strict rules about when fiber-switching functions may be called. Those rules end up being things like "don't do I/O in fibers," which is too restrictive for most cases.
The benefit of fibers is that re-scheduling to another fiber in user space is cheaper than re-scheduling to another thread through the kernel. That's it. If you are drowning in context switches, then fibers may help. But if that's the case, you're probably threading your application too finely anyway.

Mechanisms (data) for 'fibers' don't have to be on the stack


You are in a fiber. You make a function call. Where does the state of the current function get saved? Where do locals for the new function you call get allocated?
Separately, that other function may itself call a yielding function. Where do the locals of every function on the call stack live while execution switches to another fiber?
Of course fibers need stack space. You may allocate it for each fiber from the heap or from some other reserved memory region, just as the OS allocates stack space for each thread from unallocated address space.
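
To make the stack requirement concrete, here is a minimal sketch using the POSIX ucontext API (one of several fiber mechanisms; Windows has CreateFiber instead). The stack size and names are illustrative only; the point is that the fiber's call stack is ordinary memory you hand it, here taken from the heap:

    #include <ucontext.h>
    #include <cstdlib>
    #include <cstdio>

    static ucontext_t main_ctx;   // where execution resumes when the fiber finishes
    static ucontext_t fiber_ctx;  // the fiber's own saved context

    static void fiber_func() {
        // Locals of this function (and of anything it calls) live on the
        // heap-allocated stack installed below, not on the thread's own stack.
        printf("running in fiber\n");
        // Returning resumes uc_link, i.e. main_ctx.
    }

    int main() {
        const size_t kStackSize = 64 * 1024;
        void* stack = malloc(kStackSize);    // the fiber's stack comes from the heap

        getcontext(&fiber_ctx);
        fiber_ctx.uc_stack.ss_sp = stack;
        fiber_ctx.uc_stack.ss_size = kStackSize;
        fiber_ctx.uc_link = &main_ctx;       // where to go when fiber_func returns
        makecontext(&fiber_ctx, fiber_func, 0);

        swapcontext(&main_ctx, &fiber_ctx);  // switch to the fiber, then come back
        printf("back in main\n");
        free(stack);
        return 0;
    }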

In Topic: In depth articles about fast paced multiplayer network architecture

28 June 2016 - 11:02 AM

That video is pretty good at explaining the challenges that a physics network game faces.
It doesn't really talk about the solutions at an implementation level, though.
If you haven't analyzed network games and the role of latency in physical simulation, it'll do a good job of introducing the challenges.
The mechanisms they describe are covered in more programmer-oriented detail in the Gaffer articles ("zen of network physics".)
The additional detail of how they choose to assemble the pieces (which weapons use which method) is a good illustration of how physics engines, network engines, and game design all interrelate to make a good game!

In Topic: Hundreds of timers...

28 June 2016 - 10:49 AM

Also If you can avoid Threads and their overhead, do so - use fibers/microfibers (a software threading mechanism versus system resource hard threads)


The main cost of a system thread is the memory for the stack. You don't avoid that cost for a fiber. There are some other costs in pre-emptive threads, mainly having to do with the cost of kernel entry for context switch, but the difference in cost between fibers and threads is not an order of magnitude (except perhaps in some very synthetic cases.)
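
For comparison, system threads let you choose that stack budget too. A small sketch, assuming POSIX threads (the 64 KiB figure is arbitrary); the default reservation is typically megabytes of address space, but only the pages actually touched consume physical memory:

    #include <pthread.h>
    #include <cstdio>

    static void* worker(void*) {
        // Everything this thread calls uses the 64 KiB stack requested below.
        printf("worker running\n");
        return nullptr;
    }

    int main() {
        pthread_attr_t attr;
        pthread_attr_init(&attr);
        // Request a small stack instead of the default reservation.
        pthread_attr_setstacksize(&attr, 64 * 1024);

        pthread_t tid;
        pthread_create(&tid, &attr, worker, nullptr);
        pthread_join(tid, nullptr);
        pthread_attr_destroy(&attr);
        return 0;
    }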

In Topic: Hundreds of timers...

27 June 2016 - 09:49 AM

Read from main memory will take 100+ clocks


{writes} ... are around 1 clock


Ah! You're looking at thread-blocking stall time. I'm looking at overall system throughput.

What will actually happen when you write is that a cache line will be allocated for the write (which needs bus arbitration on multi-CPU systems), and the bytes will be put into the right part of that cache line (or, for unaligned writes, cache lines). Then a read will be issued to fill the rest of the cache line. Finally, once the cache line gets evicted, the memory bus will be used again to write the data back out.
At a minimum, this takes twice as long as a read. And if your data structures (or heap) put values that different CPUs want to update on the same cache line (false sharing), they will end up ping-ponging that cache line between caches, costing more than 10x the "base" cost.
Similarly, unaligned writes that span cache line boundaries may cause significant additional slowdowns, depending on the specific (micro-)architecture. (Some RISC architectures disallow unaligned access for this reason.)
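
As an illustration of the ping-pong case, here is a hedged sketch (the 64-byte line size is an assumption; it varies by part): two threads hammer adjacent counters that share a cache line, versus counters padded onto separate lines.

    #include <atomic>
    #include <thread>

    // Shares one cache line: both counters fit within the same 64 bytes,
    // so every write forces the line to bounce between the two cores.
    struct Shared {
        std::atomic<long> a{0};
        std::atomic<long> b{0};
    };

    // Padded: each counter gets its own cache line, so the writes don't interfere.
    struct Padded {
        alignas(64) std::atomic<long> a{0};
        alignas(64) std::atomic<long> b{0};
    };

    template <typename T>
    void hammer(T& t) {
        std::thread t1([&] { for (long i = 0; i < 10'000'000; ++i) t.a.fetch_add(1); });
        std::thread t2([&] { for (long i = 0; i < 10'000'000; ++i) t.b.fetch_add(1); });
        t1.join();
        t2.join();
    }

    int main() {
        Shared s;  hammer(s);   // typically several times slower...
        Padded p;  hammer(p);   // ...than this, on most multi-core parts
        return 0;
    }

On most multi-core hardware the padded version finishes noticeably faster for the same amount of work; the difference is the cache-line ping-pong cost showing up directly.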

Btw: When a hardware thread stalls waiting for a read to complete, a hyper-threaded CPU will switch to another hardware thread on the same core. Hopefully, that thread was previously stalled waiting for a cache line that has since loaded, so the core can still do useful work.

The recommendation is still: If you can do a compare and a branch to avoid a write, that will improve your overall throughput if you are memory bound (which almost all workloads are.)
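
To make that concrete, a small hypothetical example (Entity and its flags field are made up for illustration): the compare is a cheap, usually well-predicted branch, and skipping the store keeps the cache line clean (and shareable across cores) when nothing changed.

    struct Entity {
        unsigned flags = 0;
    };

    // Unconditionally storing always dirties the cache line holding 'flags'.
    inline void set_flags_always(Entity& e, unsigned f) {
        e.flags = f;
    }

    // Compare first: only write (and only dirty the line) when the value changes.
    inline void set_flags_if_changed(Entity& e, unsigned f) {
        if (e.flags != f) {
            e.flags = f;
        }
    }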

I think we're somewhat below the useful level of abstraction for timers based on priority queues, though :-)
And, finally: If you "step" each entity in your world each game tick anyway, you might as well just have an array of current attached timers inline in each object, and compare the current game time to the time in those timers; they don't need a queue separate from the collection of all game entities.
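
A minimal sketch of that layout, with made-up names (Entity, Timer, kMaxTimers, GameTime): each entity carries a small inline array of timers, and the regular per-tick step just compares them against the current game time, so no separate priority queue is involved.

    #include <vector>

    using GameTime = double;        // seconds since game start (assumption)

    struct Timer {
        GameTime fires_at = 0;
        int      action   = 0;      // what to do when it fires
        bool     active   = false;
    };

    constexpr int kMaxTimers = 4;   // small, fixed per-entity budget

    struct Entity {
        Timer timers[kMaxTimers];   // inline; no heap, no separate queue

        void step(GameTime now) {
            for (Timer& t : timers) {
                if (t.active && now >= t.fires_at) {
                    t.active = false;
                    on_timer(t.action);
                }
            }
            // ... regular per-tick simulation for this entity ...
        }

        void on_timer(int /*action*/) { /* game-specific */ }
    };

    void step_world(std::vector<Entity>& entities, GameTime now) {
        for (Entity& e : entities) {
            e.step(now);   // you're touching every entity each tick anyway
        }
    }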

In Topic: In depth articles about fast paced multiplayer network architecture

26 June 2016 - 04:25 PM

I'm not aware of any such comprehensive resources, unfortunately. Mainly because all of the different solutions, put together, make up the better part of a game engine; where all the pieces do exist together, they form an engine (Source or Unreal or whatever.)
And the few resources that do exist for game engines typically don't use networking as their main illustration, because networking is only used by a subset of games, AND it's pretty complex (as you are finding!)

The meta-advice I can give is to separate concerns between different pieces of code. In object-oriented code, that means aiming for the "single responsibility principle." In a more functional approach, keeping each function small and composing functions to achieve the end goal gets you the same result.
