Jump to content
  • Advertisement
Sign in to follow this  
CalvinCoder

Unity Designing distributed systems

This topic is 4306 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hey guys! I am working on the architecture/design of a distributed system and I am not 100% sure of how to attack some of my problems. So I figured out I wanted to make a post to this great community to hopefully get some nice input :) And gosh, this will be a long post so I really hope someone has patient to read it =/ Basically, the system I am working on is a system for providing highly interactive and some non-interactive services for mobile phone users, in addition to provide billing services for external systems (like external streaming servers etc). I have already implemented the system today and it has been running for soon 2 years, with a few quirks here and there but in overall it is quite stable. However, today, the system is implemented as one application with the exception of inbound MMS receivers/connectors which are implemented as servlets, running under Tomcat. The current system is event based, i.e. when an inbound SMS is received on one of the inbound SMS connectors, the message is first converted to an internal message format and then an InboundSMSEvent is posted to the system and whatever module that is interested in inbound SMS pick the event up. Basically, this is the service subsystem that does and it is the service manager that listens for inbound SMS events and dispatches the message further to correct service based on keywords written in the text message (SMS) etc. The connector subsystem (or tier if you want to) is responsible for communicating with all external systems. The definition of a connector is basically a component or module that handles the inbound or outbound communication with the different operators, whether it is messages like SMS, MMS or system specific message formats etc. The protocols or message services supported are SMPP, JMS, HTTP/HTTPS (SOAP, XML) and maybe others as new protocols or message services must be supported. The connector tier is also responsible for translating between operator specific message formats or protocol specific message formats and the internal message format used by the system. A service is a component or module that does something for a user of the service like a chat service, auction service etc. Even though the system works very well today, there are many drawbacks. The system does not scale very well, if one thing crashes it might crash the whole system and whenever a new update is needed for example for service A, the whole system must be restarted which resets the current state of all services and the rest of the system (the latter can easily be solved by persistence of course but you see my point). So a new architecture is needed to make the system more scalable and get rid of some of the other issues too. The system will be distributed. The system I am designing now shall be the “core platform” where new connectors and services will be easily be added as new telecom operators shall be supported or new services will be launched. The new system will follow very much of the previous system design, except many parts will be made distributed. The system is implemented in JAVA and I have decided to go for RMI but any other suggestions is welcome  But my main problem in this all is to actually figure out how to make the system distributed to make it as scaleable as possible, flexible, good performance and stable and avoid single point of failure and bottlenecks. The message flow is pretty easy; a message originates as an SMS or MMS from a mobile phone to the operators and then to our system or the message originates from another external system and then is received by the connector tier, which converts it to internal message format and passes it further to the service tier. And vica versa, the message is generated by one of the services, is passed through the system to the connector tier which is then responsible to send it further using the correct connector (i.e. using the correct operator). For telecom operators’ connectors, there can be only one inbound connector but in some cases it can be 1 or many outbound connectors. Today, the connector manager is responsible for starting up each connector which in turn registerate itself with the connector manger using the country code + a unique id for the country as its id. The same for service system, the service manager is responsible for starting up each service which in turn registerate itself with the service manager using a unique service/channel id. There are other requirements to consider too, which is today handled individually by each service which is far from ideal. And that is for example special rules each operator might have. For instance, the operators in Norway have rules that if a user receives 20 premium messages (MT’s) from the system without sending a message back (MO), he shall be logged out of the service/system. Another rule is that a user can totally only be billed for max 5000 NOK pr month so each service must be aware of how much other services has billed. A own subsystem tier will handle this operator rules in the new architecture. My main problem is that I can clearly see how to make this distributed and yet avoid bottlenecks, single point of failure, good performance and stability. On top of that, the services must know where to find the connector manager or connector managers, the connectors must know where to find the service manager or service managers, the connector manager must know where to find all outbound connectors and service manager must know where to find all services. But this post already starts to get extremely long, so let me try to explain a specific scenario. Let’s say a service shall send the same message to 1500 users. The service can either generate 1500 messages; post them to the connector manager which dispatches the message further to correct connector. This might or might not put unnecessary load on the system. Instead, the service can generate 1 message containing 1500 recipients and pass it to the connector manager which in turn will can either sort the recipients based on country and operator for each user, generate 1 message for each combination of country+operator and add multiple recipients if the operator supports that or break it up in single message if the operator does not support multiple recipients in the message formats. Though sending 1 message to multiple recipients (if operator supports it) creates new problems; message logging. But that is of course also solvable. Another way is that the connector manager itself is responsible for creating 1 message for each recipient and passes it on to the correct connector. At least, this takes away load from service to connector manager. But then, what if the connector manager gets overloaded or crashes? Well, then we can run several instances of a connector manager. But how will a service know if a connector manager is down or overloaded and how will it find next connector manager to use etc Will it for example be ok to use a system where one connector manager is assigned as master and if master is overloaded it will direct request from service to next slave connector manager will such redirection create extra load on the system? Extra load it will put but will it affect the system too much? I really cant see any other solution to it…… Or if master has crashed, another connector manager takes over as master? Well, if you have read all the way down to here, I am happy  And hopefully you get the idea of my problems and hopefully someone can give me some good input  Thank you for your time!!

Share this post


Link to post
Share on other sites
Advertisement
Guest Anonymous Poster
Quote:
Original post by CalvinCoder
The system is implemented in JAVA and I have decided to go for RMI but any other suggestions is welcome 


ever heard of "Erlang" (wikipedia/google)-the project you are describing seems like the natural fit for it, in fact Erlang was specifically invented for this sort of applications (distributed, scalable, high-reliability ...) and even for the very domain you are working in.

So, you may want to check out some of the wikipedia references and maybe do a quick forum search for Erlang here at gamedev.net to see what people think about it.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Also, note that there's no need to implement all of the new design in Erlang: rather, Erlang has been designed with high interoperability in mind, thus it has various mechanisms available that allow it to work with modules/programs implemented in other languages - such as for example, C, C++, Python or Java - i.e. by using CORBA, RMI or other RPC methods (possibly custom ones,likewise you can link in native libraries, too).

So, there should be hardly any reasons not to check out Erlang in your specific case, as you will be able to retain much of your current design and functionality while simply progressing step by step using Erlang.

Besides, your specs are pretty specific, you may find that these things are better dealt with in a Erlang forum (newsgroup/mailing list), given that the users there are not only extremely familiar with Erlang itself, but also the domain you are working in, you should be able to expect lots of support - of course, you should make sure to first consult the docs & references, to actually get some basic understanding of Erlang in the first place.

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
ever heard of "Erlang" (wikipedia/google)-the project you are describing seems like the natural fit for it, in fact Erlang was specifically invented for this sort of applications (distributed, scalable, high-reliability ...) and even for the very domain you are working in.


Jepp, sure :) I've been working with telecom for several years which is a field that uses Erlang a lot, so I am aware of Erlang. Though switching to Erland is not something I think we will do and that's why I havent considered it. We have already invested a good amount of time and money into our current system which we can reuse almost every bit of code.

Quote:
Original post by Anonymous Poster
Also, note that there's no need to implement all of the new design in Erlang: rather, Erlang has been designed with high interoperability in mind, thus it has various mechanisms available that allow it to work with modules/programs implemented in other languages - such as for example, C, C++, Python or Java - i.e. by using CORBA, RMI or other RPC methods (possibly custom ones,likewise you can link in native libraries, too).


Yes, I will dig further down in Erland and look for ideas on how to adpot the various mechanisms into our appliacation.

Thanks for your reply, much appreciated :)

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
(different ap) I'd suggest looking into the recently open sourced Terracota java virtualization system. It's supposed to make it easy to distribute any threaded java application (like tomcat).

I know our java application server were I work handles millions of transactions a day, but it's backed by a mainfram os/390. And it's not telecom, but finance.

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
Also, note that there's no need to implement all of the new design in Erlang: rather, Erlang has been designed with high interoperability in mind, thus it has various mechanisms available that allow it to work with modules/programs implemented in other languages - such as for example, C, C++, Python or Java - i.e. by using CORBA, RMI or other RPC methods (possibly custom ones,likewise you can link in native libraries, too).

While I'm a pretty big Erlang fanboy, I have to correct this statement.
C/C++/Python/Java can only be used by Erlang in 3 ways. C Built-in-function you have to program yourself (good luck on that, since Erlang is very strict), through piped message passing (for a limited language subset), or through network message passing (basically having an erlang server locally with another binary talking to it localhost).

http://erlang.org/doc/doc-5.5.2/doc/tutorial/part_frame.html

Share this post


Link to post
Share on other sites
Guys,

thank you very much for your replys and feedback. It is really appreciated. :)

My main concern is really not which languages to use. I am very aware of the nature of Erlang, though I dont have huge experience with it.

My main concern is really how to divide my application into several nodes and how each node should work and communicate with other nodes..... Of course, communication will be done using RMI, I mean things like what is responsbile for doing what to keep a stable distributed system with good performance and avoiding single point of failures and bottlenecks)...

I know for example Erlang can help calculating some of the issues based on how much traffic will pass in and out, however, the nature of this system is not as easy as for example a typical call center so it is a difficult task.

Like the connector subsystem, if we for instance simplify it and says it only supports inbound and outbound SMS, I have some basic ideas on how I want to structure the application but I still cant see cleary if my toughts are good or bad.

Basicly, the inbound connectors is not so much to do about, for each operator it can only be one inbound any how. But the translation of operator specific message format (for example from JMS message) to our internal message format (which is XML based) can maybe be an idea to distribute but for low traffic that will just make it slower but for high traffic (whatever that will be in number of incoming messages) it might be a good idea to be able to distribute the translation even though each translator (for each inbound message) will anyhow be started in its own thread. And since each connector anyhow runs as a standalone process, we can dediced one single computer for one single inbound connector which should be able to handle a pretty high load.

Its the outbound connectors that will be kept busy really. And for outbound connectors, we can run several connectors for one operator. Though, there must be a connector manager (or maybe a OutboundMessageDispatcher is a better name), than must construct a operator specific message for each user and pass it on to the connector so this message dispatcher will do a heavy job to in some cases. And its here I am a bit stuck, to figure out how to design it properly for keeping stabiity and good performance while avoiding bottlenecks and single point of failures. Obviously, single point of failure for this sceneario can be abvoided by using several message dispatchers, but it is here my real question is on "how do I better do that?" .. By assigning one message dispatcher as master which redirects requests to slaves when overloaded and have a system for reassigning a new master if original master goes down (maybe a network heartbeat system or someting).....

So basicly, if someone has any feedback on this matter, I would be more than greateful :)


Cheers,

Eirik Moseng,
CTO
Digiment Games
http://www.digimentgames.com

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!