Community Reputation

139 Neutral

About pronpu

  1. We heard you!   Following the previous discussion, we've re-written the simulation to use actors running on top of true lightweight threads. No more callbacks! Read about the new spaceships simulation here.
  2.   I don't think so. The whole approach, as outlined in the blog post, is that scalability comes from making parallelization and querying one and the same. If you say you have one or the other, or even both separately -- that has nothing to do with what we provide. We say that IF you have real-time db queries AND the db uses knowledge about the domain and the queries in order to parallelize your business logic THEN you get significant scaling. We're not trying to solve the problem of querying spatial data. We're solving the problem of scaling by utilizing the database.    The main point of our approach is this: we're not utilizing parallelism to speed up database queries. We're utilizing the database to parallelize (and speed up) your business logic.        Yeah, I'm probably being silly, but I don't think I'm ignorant. After spending a lot of time talking to potential customers in various markets, some patterns emerge.       All our middleware is server-side only. That's where you find the scaling issues we're trying to solve.     True. That's another reason why we're not trying harder to convince game devs to buy our stuff. For indie developers our software is free anyhow (depending on the game world size).
  3.   I absolutely agree, yet expectation or demand is often a function of what's available. I think it was Henry Ford who said, "if I were to ask my customers what they wanted they'd say they want a faster horse". I don't think there was demand for video games before there were any, or for 3D shooters before Wolfenstein. New technology just opens new horizons for the more adventurous, imaginative early adopters. Then it needs a killer app for the late adopters to realize its potential.   I don't think EVE's success, or its amount, is simply a function of the architecture. I'm sure that if someone came up with a great game that employs a seamless world, a lot of others will follow. Still, that game would have to be great for many reasons; obviously just having a seamless world won't be enough.
  4.   Yes, that's referring to Nphos's simple solution. As for Darkstar -- there was a prototype, but research was far from finished. At no point in the project, throughout its various incarnations, have they released even an alpha version of multinode. They believed they would be able to achieve terrific scaling, but all efforts on multinode were halted once Oracle defunded the project.     Yes, there would be problems, but also some great benefits to be gained. Nevertheless, we've given up all efforts to persuade game developers of anything. Turing and von Neumann themselves could not have persuaded a game developer. But the lucky developer who will try our stuff will get some great results, and we'll be all the more happy for him. That's not our business plan, though. If you give it a try you'll be happy that you did; if not -- well, I don't try everything, either.     Yes, but ease of use is different from sustaining habits. You're talking about habits, not ease, and, like we both said, game developers are conservative :) Many "web developers that don't know what a TLB is" were able to quickly wrap their heads around Clojure, Erlang, Go and Node and mostly reaped the benefits of better productivity, even at the price of changing their habits. But, to each his own.     Yes, because they only know what they're getting. I'd take that further. Players are more tolerant of games that haven't progressed much in terms of features and abilities in something like a decade; this particularly applies to MMOs. Google and Facebook are able to process in realtime several orders of magnitude more than they were a decade ago. MMOs operate at pretty much the same scale.    If you'd care to hear my thesis about it, here it is, and it's completely different from yours, about risk and investments. The reason for conservatism is that AAA games, especially MMOs, are prohibitively expensive to produce, and the software development is a mere fraction of the cost. 
It almost all goes to content. For this reason, incumbents like Blizzard and CCP have little to fear from small startups with a better software architecture threatening their position. A startup could build an MMO architecture far superior to Blizzard's, but it doesn't matter because they won't have the resources to produce as much content and compete with WoW or EVE. There are very few incumbents, all of them co-existing happily together. And that is why the big studios have little reason to constantly innovate when it comes to architecture. To push MMO studios to innovate you'd have to come up with a game that raises player expectations, and that would only happen if players like it, which, in turn, will only happen if the content is rich. So it won't come from a startup. Without this expectation from their customers, innovation happens at a glacial pace. No one is in a hurry.    This is completely different in industries where you don't have such high content production costs, and so even the incumbents are constantly fearful of newcomers; so they must innovate. Engineers there are constantly looking for new techniques and approaches that would give them an edge over the competition. 
  5.   Cool. I actually did not know that.     I'm afraid this is what many game developers believe but it is patently false. True, most web companies (one company one vote) match your perception, but in terms of general effort (one developer one vote), engineers at Twitter, Amazon, Facebook and Google know quite well how to explain the role of the L1 cache, and are extremely knowledgeable about concurrency issues. They know the cost of a store fence and a read fence, the cost of a CAS and of a task switch. They tackle latency and scaling issues that those at Blizzard and CCP haven't even dreamed of. They know how to take advantage of garbage collectors and JITs, and can tell you exactly what machine code the JVM generates in what circumstances. They use read-write locks with striped counters that recognize contention and grow to accommodate it. They choose wait-free queues based on trade-offs between raw performance and garbage generation. They use fork-join, Clojure reducers and immutable data structures. They use parallel data structures. They use Erlang or Akka for actors. Believe me, they know how to reduce latencies to microseconds when they need it, and they need it more than you think. I've even seen someone experiment with eventual consistency at the CPU core level vs. CASs. I don't know about all game companies, but I can assure you that Twitter and Google know how to get more out of each of their cores than Blizzard or CCP (having spoken to Blizzard and CCP engineers).      Yes, but the server side, as well as the middleware mentioned by @hplus0603 above, is little more than a networking framework. It's analogous to Apache or Nginx in the web industry, or to RTI in defense. It doesn't help you with scheduling and concurrency.      Of course. :) Nice architecture, though. I especially liked the delayed messages bit. But, correct me if I'm wrong, at the end of each cycle all threads are effectively stopped and only one core is used, right? 
Also, how do you ensure that each thread has a similar workload? That's fine if you have 8 cores, but what if you have 100? Stopping 100 threads and waiting for stragglers has a huge impact on scalability.   Let me tell you how SpaceBase works internally. Each R-Tree node (a sub-region of space, dynamically resized by the number of objects) is an actor; only fork-join is used for actor scheduling, and it employs work-stealing queues. Work-stealing queues ensure that no core ever waits for others: every message produced by an actor goes into a thread-local queue, but when a thread runs out of messages, it steals one from another thread's queue, trying to steal a task (message) that is likely to generate a large number of subtasks (other messages), so that cross-thread stealing doesn't happen often.      It appears that your actor architecture was designed with two core requirements (other than the main one of using multi-core): developer habits and determinism, and it seems that you've done a great job at satisfying the two, at the cost of some performance and scaling, I guess. Here's how those maligned "web startups" do it in the performance-sensitive bits: you have an in-memory cyclical buffer per thread that records events along with a high-resolution timestamp (obtaining the timestamp is the expensive operation here, but recording is only turned on during debugging). When a bug occurs, the buffers from all threads are merged.    If your environment protects you from races, you usually don't even need that. All you need to do is replay the particular inputs a single task received.     I agree with that wholeheartedly! Unfortunately, when your library does scheduling, it inevitably becomes a "framework"... 
in most languages. Clojure is the exception: because it has abandoned the traditional OO model and relies on a uniform data access pattern, and because it's so beautifully (without impacting practicality) functional, it has the advantageous property of making any framework a library. All objects are the same, you don't need to inherit anything, and mixins can be, well, mixed in if necessary. It also protects you from data races and can be extremely fast. If you don't want frameworks, use Clojure and they will all magically disappear.        Nice! Reminds me of this.     Maybe an optimistic locking model degrades significantly when distributed across the network and maybe not, but Darkstar (or RedDwarf as it was later called) never tested that. It never reached that stage, and only ran simple tests distributing their single-node code. Check your sources...     ... and that's a reductionist view of what I outlined in my original blog post. SpaceBase is spatial, but we're working on other products using the same approach which aren't. You can do the same.     Perhaps game devs are conservative for good reason, and that's why we're not trying to sell them our products any more... I posted here because I thought some might be interested in the approach. You never know, an independent game developer might decide to use our tech for a new game and enjoy the performance boost. It's free for small instances anyway.
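
For anyone who wants to see the per-thread event buffer mentioned above in concrete form, here is a minimal Python sketch. It is only an illustration of the general idea, not code from SpaceBase or from any of the companies named; `ThreadTrace`, `record` and `merged` are made-up names, and the buffer capacity is arbitrary. Each thread appends to its own bounded buffer without taking a lock, and the buffers are merged by timestamp only when someone asks for them.

```python
import threading
import time
from collections import deque


class ThreadTrace:
    """Hypothetical per-thread cyclic event buffers, merged by timestamp on demand."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.enabled = True              # would normally be switched on only while debugging
        self._local = threading.local()  # holds each thread's own buffer
        self._buffers = []               # all buffers, kept so they can be merged later
        self._lock = threading.Lock()    # guards buffer registration only, not recording

    def _buffer(self):
        buf = getattr(self._local, "buf", None)
        if buf is None:
            buf = deque(maxlen=self.capacity)  # a full buffer overwrites its oldest event
            self._local.buf = buf
            with self._lock:
                self._buffers.append(buf)
        return buf

    def record(self, event):
        """Append (timestamp, thread name, event) to the calling thread's buffer."""
        if self.enabled:
            stamp = time.perf_counter_ns()
            self._buffer().append((stamp, threading.current_thread().name, event))

    def merged(self):
        """Return all retained events ordered by timestamp.

        Call this once the recording threads are quiescent (for example after a
        failure has stopped them); copying a deque while it is being appended
        to is not safe.
        """
        with self._lock:
            snapshot = [list(buf) for buf in self._buffers]
        return sorted((e for buf in snapshot for e in buf), key=lambda e: e[0])


trace = ThreadTrace(capacity=4)


def worker():
    for step in range(10):
        trace.record(step)


threads = [threading.Thread(target=worker, name=f"w{k}") for k in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
timeline = trace.merged()  # at most 4 events per thread survive, interleaved by time
```

The design choice worth noting is that recording never contends with other threads; the only shared step is registering a new buffer, which happens once per thread.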
  6.   What I meant was that concurrency is essentially distribution across nodes with a relatively slow interconnect between them, just like a network. I don't understand how the two can be considered different. As I said before, all computer architecture is built on the idea of local nodes with relatively slow interconnects. Any use of the comm bus, be it between cores, NUMA sockets or cluster nodes, should be minimized in order to increase scaling. The comm bus has two properties that hinder performance: low throughput and high latency. Eventual consistency tries to bypass the latency limitation. However, eventually consistent systems sometimes sacrifice consistency for no good reason. They are architecturally lazy when it comes to object placement, employing a very simple scheme: usually a DHT. With such a bad strategy, sacrificing consistency gives latency benefits. But smarter object placement schemes can achieve the same (amortized, not worst case) latency while maintaining consistency.      Is it (I'm asking seriously)? It is certainly true on the client side with game engines, but which server-side frameworks are in common use in games?      The demo was purposefully written to simulate asynchronous requests coming over the network, each being processed as it arrives with no global "ticks".     The same could be achieved with SpaceBase (though that was not the intention in the demo). You could collect all requests and then process them in parallel, deterministically. I'm not sure this property is always so important (or important at all), though. BTW, what actor implementation have you used?      Which frameworks are used for that?     Of course. Software level messaging abstraction is implemented on top of a shared RAM abstraction, which is, in turn, implemented on top of hardware level messaging. 
The messages are different, but the point is that shared state is always an abstraction on top of messaging, and computer architecture is such that this cross-"node" messaging should be minimized. We can call it the "shared-state/message-passing duality". Sometimes it's best to work with one abstraction, and sometimes with the other. However, a shared-state framework can certainly improve performance in many cases, as it has more knowledge about the domain semantics to apply optimizations. 
  7.   Well, I stand behind my claim that nowadays performance is gained by relinquishing control. At the end of the day you'll have to trust the framework to do the right thing. We take care of the edge cases for you.     What you describe sounds a whole lot like our open-source in-memory data grid, Galaxy, that serves as our low-level distribution layer.      Except if you use a transactional framework that takes care of all that for you.   Every shared-state concurrency/distribution (we can agree it's the same problem) is an abstraction on top of message passing, that translates contention to messaging. In the CPU, that messaging layer is called cache-coherence, and uses variants of the MESI messaging protocol. Now there are two questions to be asked: 1) is the shared-state abstraction useful (or not, or maybe even harmful), and 2) what is the impact on performance?   As for 1), I'd claim that it's not only useful, but often necessary, and, furthermore, messaging architectures often re-implement shared state, only they do it in an error-prone manner. If, for example, you have a seamless world, you do wish to model a shared state (your world). You could implement the shared state on top of messaging yourself, or use a pre-packaged abstraction. The reason shared state causes more trouble "out-of-the-box" than messaging is that messaging makes communication explicit, while shared state hides it. But, if you could use a shared state abstraction that takes care of races and deadlocks for you, then this problem, at least, is solved. You might be also persuaded that shared-state is useful by the fact that two of the best known actor systems, namely Erlang and Akka, both provide shared state in addition to actors. Erlang has ETS (and Mnesia), and Akka has limited STM. The designers of both platforms have realized that message passing alone is insufficient (or degrades at times to hand-rolled implementation of shared-state on top of it). 
Also, they implement shared state at a lower level (at least on a single machine) than their messaging API so it performs better.   And as for 2), I'd say this. Many transactional frameworks (databases, STM) turn a correctness problem into a performance problem. They guarantee correctness but at a high performance cost because they try to be general and address every possible data-access pattern. The aforementioned Project Darkstar took that same path. We, however, take a different path. We do not provide general transactions but only transactions that are likely within some domain. Our first database/scheduling framework, SpaceBase, assumes a spatial object domain, and allows transactions that span some contiguous region of space. This assumption not only makes internal scheduling decisions simpler (for example, we make sure that we never have to roll back conflicting transactions), it also gives tremendous performance (or scaling, rather), far better than you would be able to achieve by re-implementing all of this yourself on top of messaging (at least not without a lot of effort).   The reason for that is that the framework/database is able to organize the objects in such a way as to minimize contention and therefore messaging.      Again, shared-state is simply an abstraction on top of message passing. NUMA's cross-node messages are relatively slow, so the aim should be to reduce the number of such messages regardless of the abstraction you use. I claim that the database can do a fine job at reducing messaging, and I've proved that in this highscalability article. The idea is that if you use the right kind of data structure (a b-tree in the article), communication is minimized.    All modern computer systems are organized as layers of fast-access caches communicating among them via some slow communication buses. The same problem, BTW, applies not only to NUMA but to CPU cores at a lower level, and to cluster nodes at a higher level. 
A b-tree data structure, like that in the example, makes contention at higher-level tree nodes exponentially less likely than contention at lower nodes, and contention at higher nodes is exponentially more expensive (slower comm channel). The two cancel each other out exactly, making each operation O(1) with a very, very low constant (<< 1) in terms of cross-node communication, no matter the "layer" you're at (CPU cache, NUMA or cluster).      Of course. That's still an AOT compiler, not a JIT, as it compiles the shader before it runs for the first time. I was referring to the merits of a true profiling JIT, one that performs optimizations based on the running profile of the code. I thought that somehow you've done that with shaders.
  8.   This actually sounds like some very constructive criticism. Thank you, and we'll get to work on that!   Naturally, many new products are a bit rough around the edges at first, though I don't think it's that high friction. You issue a query, you get the results passed to a callback, and within that callback the results are locked (for reading or writing, depending on the type of the query). You process the results in the callback and issue further queries if necessary.  But I agree it's a bit cumbersome, especially in Java/C++. We will be integrating a continuations library that will make the callback implicit and the whole thing look more single-threaded. Hopefully this will help reduce the friction...
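
To make the query-and-callback flow described above a little more tangible, here is a rough Python sketch. To be clear, this is not the SpaceBase API: `TinyStore`, `insert` and `query` are hypothetical names, and the sketch takes one exclusive, reentrant lock for the whole store, whereas the post describes results being locked for reading or writing depending on the query type.

```python
import threading


class TinyStore:
    """Hypothetical in-memory store that hands locked query results to a callback."""

    def __init__(self):
        self._items = {}                # name -> (x, y, mutable value dict)
        self._lock = threading.RLock()  # reentrant, so a callback may issue nested queries

    def insert(self, name, x, y, **value):
        with self._lock:
            self._items[name] = (x, y, value)

    def query(self, box, callback):
        """Call callback(results) with every item inside box = (x0, y0, x1, y1).

        The lock is held until the callback returns, so the callback can read
        or modify the results without racing other queries.
        """
        x0, y0, x1, y1 = box
        with self._lock:
            results = {name: value
                       for name, (x, y, value) in self._items.items()
                       if x0 <= x <= x1 and y0 <= y <= y1}
            return callback(results)


store = TinyStore()
store.insert("ship-a", 1, 1, hp=10)
store.insert("ship-b", 2, 2, hp=10)
store.insert("ship-c", 50, 50, hp=10)


def damage_nearby(results):
    # Runs while the results are locked; mutations are visible to later queries.
    for value in results.values():
        value["hp"] -= 3
    return sorted(results)


hit = store.query((0, 0, 5, 5), damage_nearby)
```

The caller never touches an entity outside a callback, which is what lets the store decide when, and under which lock, the code runs.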
  9.   That's right. Erlang and Clojure have similar constructs as well, and Clojure is generally faster than Go, if the code is optimized for performance. Both of these languages also have protections from data races that are absent in Go, but Go is great.     I meant to say that a GPU core is simple in the sense that the time it takes to execute a given piece of code is simply related to the number of instructions. This is far from being the case with modern CPUs.     It's not slower, quite the opposite. But taking full advantage of it in a large piece of software is no longer achievable by handwriting assembly. There are too many things to take into consideration, and it's better to put our faith in the JIT or the optimizing compiler.     Yeah, sorry. I may have been carried away.     Cool. What do you use to do it?     This is generally true until you hit a wall. You run your program "on the inside" of the operating system and let it schedule threads for you because scheduling is usually best left to the framework. It's the same for micro-threaded environments like Erlang and Go. Because the main contribution of this approach is letting the "database" take charge of scheduling for the purpose of multicore utilization, I don't quite see how it can be implemented as simple functions. A scheduler, by definition, calls your code and not the other way around.     And yet, when it comes to utilizing multi- and many-core hardware, games are years behind. I've spoken to a senior developer at a very well-known AAA MMO studio, who told me they don't allow their developers to write multi-threaded code. Another architect at a different MMO studio told me that only now they're realizing that they're going to have to abandon the serial game-loop way of thinking, and this is very hard for them to accept.    
I think that game programmers, unlike their peers in other industries, still try to hold on to the illusion that by somehow being "close-to-metal" (which is an illusion in and of itself, IMHO) and controlling everything they would somehow utilize hardware better. I believe this is a direct result of what you were saying: game developers always needed to make the most out of a given, fixed set of hardware. But while in the past this could have only been achieved by controlling the entire software stack, today the opposite holds: to utilize modern hardware you must relinquish control.     Absolutely. I see no other way to reduce communication. You just have to make sure that a node isn't overloaded when too much action is concentrated in the area it owns. The right kind of data structure would automatically take care of this kind of load balancing.   The trick to good concurrency is the realization that interesting concurrency must manage contention, and contention equals communication, and communication is slow. Disallowing contention altogether severely limits the application, but a good data structure can minimize communication. I did some calculations here (I may have posted a link to that here before).
  10.   That's a valid point. We're working on an API that transforms the callbacks into continuations that would appear single-threaded to the programmer.     Future blog posts will address that. In the meantime, hopefully the user guide and demo code will help, and you can ask us more about it on our new forum.     Project Darkstar was shut down before reaching the point of a multi-node distribution.      SpaceBase only supports spatial queries. Future products will be built for other kinds of queries as well. You can be sure we won't recommend them before they're good enough for military use, let alone games.     Ah, well, unfortunately you are right about that. I hope I will not be offending too many people here by saying this: we've come to realize game developers are the most conservative class of developers in existence. SpaceBase grew out of a project developed for the Israeli Air Force for a mission-critical, real-time C4I system. We're talking to people developing hard real-time missile interception systems, and they are much more open to innovation than game developers. Fact is, companies like Boeing are doing on-board avionics on the JVM, while some game developers are still convinced that they can produce better assembly code than a compiler or a JIT. Some have only recently come to terms with the benefits of an operating system :)   What we've seen is this: game developers' culture was formed at a time when every last inch of performance had to be squeezed out of old-gen CPUs in order to render impressive graphics. Now, game devs have quite mastered the GPU, and they think their games are fast because of the impressive graphics they've been able to achieve on the GPU. But the GPU is very much like the old CPUs: it's extremely simple. CPUs, with their pipelines, instruction-level parallelism, multiple pending cache misses and instruction re-ordering are extremely complex. There is no more "close to metal" when it comes to the CPU. 
Even the metal is more akin to software than to actual metal. But while other disciplines have come to rely on JIT and middleware, game developers are suspicious of both.    For some reason, when you show a new technology to someone at, say, Twitter, they'll immediately start thinking how they can take advantage of it. When you show it to a game developer, he'll say, "where's the demo", and when you show him a demo he'll say, "well, that doesn't quite do what my game does". It's not a matter of games being more challenging. It's about game developer culture of suspicion with a super-heavy dose of NIH syndrome.    The sad result is, and I know it will sound controversial, that game performance now lags behind the performance of apps in most other disciplines. It just doesn't appear that way because the graphics are so fast.   And this is why we've come to realize that convincing a game developer of anything is an exercise in futility. The truly challenging stuff takes place in the military and augmented reality communities, while games are about a decade behind state-of-the art. Like I said, it's a cultural thing, and there's little we can do to change that. Just about the only way to convince game developers to adopt a new technology is to not let them know it can be beneficial to games and let them think they've secretly uncovered some arcane knowledge.   That's the end of my rant. All I can do is put our stuff out there and if some adventurous game developer wants to try it, they'll see how helpful it is and they'll build a better game. If not - well, we make our money from other industries. You can only do so much to argue the theoretical merit of some technology or approach over another. There are always arguments and counterarguments. There's a point in the discussion when the developer will have to get his or her hands dirty and try the new technology out. Only then can the discussion continue.   
I truly admire the game development community, and I remember the time when it was at the forefront of software development and showed others the way. I hope it gets to be that way again. In the meantime, I would really enjoy hearing your suggestions, reservations and questions and will try to address them, but in order to make the discussion fruitful - you're just going to have to try the stuff first. 
  11.   It is different in that it's not a database that lets you run stored procedures, but one that runs your entire application. That's why I said you can think of it as writing your entire application as stored procedures, but stored procedures are usually thought of as something separate from your app; as a useful tool. Often (as in the case of Redis, too, I believe), they need to be written in a different programming language than the rest of your app. Here, your entire code is written as callbacks (or lambdas or closures -- whatever you want to call them) which you pass to the database, which schedules and runs them.        Once you have lots of messages between shards then your architecture isn't really sharded: you have a sharded database, and then you need lots and lots of developers -- like they have at Facebook or Google or Amazon -- to work around the sharding with complicated messaging code. I say, let the database handle that for you. It knows at which node each piece of information is best stored, so as to minimize messaging. You can read more about our data grid here, where I show how messaging is minimized.         Ah, here you hit the nail on the head. The database can only be smart about partitioning and scheduling if the data structure captures the domain use. In a spatial setting, most transactions will span a contiguous region of space, and the database can exploit this locality. Geospatial hashing gives a rather poor approximation of this locality. That is why I do not envision a single data structure that can do all I think a database can and should do for all applications. Different types of data and different data-access patterns would require a different data structure. But once you pick the right one, you won't need all those hordes of scalability experts to work around the limitations of a "stupid" database.   As for simulations, it's true the demo happens to be a simulation, but that's incidental. 
Like I say in the article, there are better ways to write a simulation using better simplifying assumptions. But the demo is written naively using the database and still gives good performance. I would say, though, that coming to terms with the suggested approach requires thinking differently about what a database is and what it should do. I mention that in the past the database was also considered to be the core of the application, but we've since moved away from that point of view, and started regarding the database as a simple data store. For example, for low latency applications, I do not think that the database even needs to be durable, or even use a disk at all. 
  12. Yep, that's me.   Well, I only mentioned stored procedures to show that this approach has grounding in the past. Obviously, we're not talking PL/SQL here, and there is nothing to differentiate those "stored procedures" from regular application code. It is true, however, that you need to use callbacks and lambdas. Those look particularly bad in Java (at least until Java 7), but would be much more natural in, say, Clojure. But conceptually, I see no other choice for writing scalable code. As soon as you submit your code to be executed on some thread pool, you're using a lambda. Making this natural is only a matter of syntax. If you don't use lambdas at all (and in most modern languages every function is a lambda anyway), you're basically back to the old C loop, and back to partitioning your domain into fixed, disjoint regions, each run by a single thread, and that can't scale.   So, whether or not callback-based code always complicates systems, or only does so with "old" languages -- callbacks are here to stay. I hope more languages will transform them syntactically to continuations, as those appear more like single-threaded functions and might be easier for the programmer to reason about.   I don't think the problem in the demo is embarrassingly parallel at all. Entities, in this case spaceships, are contended all the time and require locking. If you can think of something that would seem less embarrassing :), I'd love to know what it is so we can try to tackle it.   And yes, you can only access entities from within visitors in order to allow the database to schedule your code (and prevent deadlocks). The idea is that the database wants to know what data you wish to process, and then decides how to schedule it, and whether it's in conflict with other transactions. But, again, this wouldn't seem as much of a problem if callbacks were represented syntactically as continuations.   And as for non-spatial queries, those will be supported by our future products. 
SpaceBase only runs spatial queries. I am very much interested to hear what would be the concurrency challenges for that domain. 
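
The idea in this reply, that you tell the database which data a piece of code will touch and it decides how to schedule that code and whether it conflicts with other transactions, can be sketched in a few lines of Python. This is a deliberately naive illustration and not how SpaceBase schedules anything: regions are one-dimensional half-open intervals, and `overlaps`, `plan_rounds` and `run_all` are hypothetical helpers. Callbacks whose regions do not overlap share a round and run in parallel on a thread pool; overlapping ones are pushed to different rounds, so nothing ever needs rolling back.

```python
from concurrent.futures import ThreadPoolExecutor


def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


def plan_rounds(regions):
    """Greedily group region indices into rounds containing no overlapping pair."""
    rounds = []
    for i, region in enumerate(regions):
        for batch in rounds:
            if not any(overlaps(region, regions[j]) for j in batch):
                batch.append(i)
                break
        else:
            rounds.append([i])  # conflicts with every existing round
    return rounds


def run_all(transactions, workers=4):
    """Run (region, callback) pairs; return callback results in input order.

    Note: conflicting transactions are serialized, but not necessarily in the
    order they were submitted.
    """
    regions = [region for region, _ in transactions]
    results = [None] * len(transactions)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in plan_rounds(regions):
            futures = {i: pool.submit(transactions[i][1]) for i in batch}
            for i, future in futures.items():
                results[i] = future.result()  # finish the round before starting the next
    return results


plan = plan_rounds([(0, 10), (5, 15), (20, 30), (10, 20)])
```

A real scheduler would avoid the barrier between rounds (that is the straggler problem discussed elsewhere in this thread); the sketch only shows how declared regions let the scheduler, rather than the caller, detect conflicts.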
  13. Hi. Yesterday I published a new blog post on our company blog: http://blog.paralleluniverse.co/post/44146699200/spaceships It showcases a 10K-spaceship battle simulation we've built on top of our spatial DB, which was built in Java. The point of the post is not to show how fast the simulation can run, because there are faster ways to do it. The idea is to show how even a naive solution running on top of the right kind of database can achieve decent performance and good scaling. I think some of you may find this interesting.
  14. Well, if the two characters just see each other, then there's no reason why each wouldn't get updates about the other from the other's node. If they interact then, the way Galaxy works, after the first interaction they would both be handled on the same node. So the scenario you describe could really only happen if the two interact indirectly, say by passing a ball between them back and forth. In that case, the ball would migrate from one node to the other and back. That would entail one network roundtrip for each pass. But how fast could the two characters pass the ball between them?
  15. Well, after an object is updated (once or multiple times), a read from any other node would require a network roundtrip. Items that are merely close to one another "across the border" won't cause network IO, even if clients need to see items on both sides. You can just send the data to the client from both nodes. It's only when the items interact that inter-node IO would be required. Now, if one node initiates a transaction, all items in that transaction are transferred to that node (to be "owned" by it), and then they stay there. So, for example, if a character stands right on the border and touches an object on one side, that object will be transferred to the character's node; if the character then touches an object on the other side, this, too, will be migrated to the same node, and any further interaction with those two objects will no longer require network IO. So, potentially you'll have a problem if two characters stand on either side of the border, kicking, say, a ball between them back and forth. But how fast could that happen? Would your game allow them to pass the ball hundreds of times a second?
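
The ownership rule described in this reply is easy to play with in a toy model. The Python below is only a sketch under simplifying assumptions, not Galaxy's implementation or API: `Cluster` and `transact` are hypothetical names, every item has exactly one owning node, and a "transfer" stands in for one item crossing the network.

```python
class Cluster:
    """Toy model: each item is owned by one node; a transaction pulls its items in."""

    def __init__(self, placement):
        self.owner = dict(placement)  # item -> node that currently owns it
        self.transfers = 0            # total cross-node item moves so far

    def transact(self, node, items):
        """Run a transaction on `node` touching `items`; return how many items moved."""
        moved = [item for item in items if self.owner[item] != node]
        for item in moved:
            self.owner[item] = node   # the item stays on the initiating node afterwards
        self.transfers += len(moved)
        return len(moved)


# A character on the border touches objects on both sides of it.
border = Cluster({"hero": "A", "crate": "A", "lamp": "B"})
first_touch = border.transact("A", ["hero", "lamp"])           # lamp migrates to A
later_touch = border.transact("A", ["hero", "crate", "lamp"])  # everything is local now

# Two characters on different nodes kick a ball back and forth.
pitch = Cluster({"p1": "A", "p2": "B", "ball": "A"})
passes = [pitch.transact("B", ["p2", "ball"]),
          pitch.transact("A", ["p1", "ball"]),
          pitch.transact("B", ["p2", "ball"])]
```

In this model the border case settles after a single migration, while the ball costs one transfer per pass, which is why the only question that matters is how many passes per second the game allows.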