pronpu

Posted 28 February 2013 - 12:27 PM

I'd like to see you address how your solution is different from other fast NoSQL databases, like Redis.

 

It is different in that it's not a database that lets you run stored procedures, but one that runs your entire application. That's why I said you can think of it as writing your entire application as stored procedures. But stored procedures are usually thought of as something separate from your app: a useful tool. Often (as in the case of Redis, too, I believe), they need to be written in a different programming language than the rest of your app. Here, your entire code is written as callbacks (or lambdas or closures -- whatever you want to call them) which you pass to the database, which schedules and runs them.
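
To make the callback idea concrete, here is a minimal sketch in Python. It is not our actual API: `ToyStore`, `submit` and the per-key locking are hypothetical names and choices made up for illustration. The point is only that the application hands its logic to the store, and the store decides when and where to run it.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class ToyStore:
    """Hypothetical in-memory store that runs caller-supplied callbacks itself."""

    def __init__(self, workers=4):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the lock table only
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def submit(self, key, callback):
        """Schedule callback(old_value) -> new_value to run against one key."""
        def task():
            # The store, not the caller, serializes access to the key.
            with self._lock_for(key):
                new_value = callback(self._data.get(key))
                self._data[key] = new_value
                return new_value
        return self._pool.submit(task)

    def close(self):
        self._pool.shutdown(wait=True)


# Application code is just a set of callbacks handed to the store.
store = ToyStore()
futures = [store.submit("hits", lambda old: (old or 0) + 1) for _ in range(100)]
results = sorted(f.result() for f in futures)
store.close()
final_hits = store._data["hits"]
```

The caller never takes a lock or picks a thread. That is the part of the work the database takes over in this model.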

 

 

One shard per bus is still totally feasible and a fine way to scale. (If not, then please don't tell Facebook or Google or Amazon or ... because they'd be disappointed :-)

 

Once you have lots of messages between shards, your architecture isn't really sharded: you have a sharded database, and then you need lots and lots of developers -- like they have at Facebook or Google or Amazon -- to work around the sharding with complicated messaging code. I say, let the database handle that for you. It knows at which node each piece of information is best stored, so as to minimize messaging. You can read more about our data grid here, where I show how messaging is minimized.

 

 

 

I realize that you claim your database is "spatial," but any number of "spatial" approaches can be re-written as traditional key/value (or tuple space, or whatever) equivalents.

 

Ah, here you hit the nail on the head. The database can be smart about partitioning and scheduling only if the data structure captures the domain use. In a spatial setting, most transactions will span a contiguous region of space, and the database can exploit this locality. Geospatial hashing gives a rather poor approximation of this locality. That is why I do not envision a single data structure that can do all I think a database can and should do for all applications. Different types of data and different data-access patterns would require different data structures. But once you pick the right one, you won't need all those hordes of scalability experts to work around the limitations of a "stupid" database.
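
A small illustration of why hashing approximates locality poorly, assuming a Z-order (Morton) bit interleaving as a stand-in for a geospatial hash. The 8x8 grid, the `morton` helper and the chosen region are all made up for this example.

```python
def morton(x, y, bits=3):
    """Interleave the bits of x and y into one Z-order key (x in even bits)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


# A contiguous 2x2 region in the middle of an 8x8 grid: only 4 cells.
region = [(x, y) for x in (3, 4) for y in (3, 4)]
codes = sorted(morton(x, y) for x, y in region)

# Number of keys a single 1-D range scan must cover to include the region.
span = codes[-1] - codes[0] + 1
```

Here the four neighbouring cells get keys 15, 26, 37 and 48, so one range scan over the hashed key has to cover 34 keys to fetch 4 cells. A structure that understands two-dimensional regions can address exactly those cells.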

 

As for simulations, it's true the demo happens to be a simulation, but that's incidental. As I say in the article, there are better ways to write a simulation using better simplifying assumptions. But the demo is written naively using the database and still gives good performance. I would say, though, that coming to terms with the suggested approach requires thinking differently about what a database is and what it should do. I mention that, in the past, the database was also considered the core of the application, but we've since moved away from that point of view and started regarding the database as a simple data store. For example, for low-latency applications, I do not think that the database even needs to be durable, or even to use a disk at all.

