Communication between languages

Started by
19 comments, last by wood_brian 11 years, 9 months ago

Anyway, it looks like users have to write code to populate Protobuf's generic container at run-time.


It sounds like you are confusing the three layers of Protobuf. These are:

- The IDL (interface description language), which describes the semantics of the data.
- The wire layout, which describes what the actual bits are on the wire for a particular data structure.
- The tools and libraries that implement protobuf in a particular language.

Note that more than one implementation of bindings can (and often does) exist for a particular language, and part of implementing bindings is deciding how the user uses the library and what code gets generated. We've developed Protobuf bindings for Erlang, PHP, and JavaScript, and each makes different decisions about how to expose the data structures to the native language.
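
To make the wire-layout layer concrete, here is a minimal Python sketch of how two fields end up as bytes. It hand-rolls the encoding and does not use any Protobuf library; the helper names (`encode_varint`, `encode_int_field`, `encode_string_field`) and the sample field numbers and values are assumptions for illustration, and only non-negative integers are handled.

```python
def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    # Key = (field_number << 3) | wire_type; wire type 0 means "varint".
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    payload = text.encode("utf-8")
    # Wire type 2 means "length-delimited": key, then byte length, then the bytes.
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload


print(encode_int_field(1, 150).hex())           # 089601
print(encode_string_field(2, "testing").hex())  # 120774657374696e67
```

Any binding, in any language, that produces these same bytes for the same message is compatible on the wire, regardless of which containers it exposes to the user.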
enum Bool { True, False, FileNotFound };
IIUC, you're saying language bindings provide the code to populate Protobuf's generic container, so users don't have to write it themselves. That's better than what I was thinking, but what about the run-time aspect?
You have two options, basically: use the provided protobuf bindings for your language of choice (which means using their provided container code) or write the bindings yourself to comply with the IDL and use whatever containers you want.

So if you're genuinely in a situation where you need the speed, you can directly store data transmitted via protobufs into whatever representation you want. In 90% of situations, where getting the job done is more important, you just use the provided bindings and wrap them in whatever way makes sense.


I'm not sure what's unclear about this?

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]


[quote]You have two options, basically: use the provided protobuf bindings for your language of choice (which means using their provided container code) or write the bindings yourself to comply with the IDL and use whatever containers you want.

So if you're genuinely in a situation where you need the speed, you can directly store data transmitted via protobufs into whatever representation you want. In 90% of situations, where getting the job done is more important, you just use the provided bindings and wrap them in whatever way makes sense.[/quote]


I'm not sure that getting it done the faster way development-time-wise offers a good foundation for reworking it later if you decide to shift gears.


[quote]I'm not sure what's unclear about this?[/quote]

It's clear with your explanation here. I guess I think the extra code that has to be generated, built, loaded, and run is a bigger deal than some people do.
[quote]I'm not sure that getting it done the faster way development-time-wise offers a good foundation for reworking it later if you decide to shift gears.[/quote]

The fact that you manage all your external (protocol) data using a well-defined IDL, a well-defined wire format, and a tool chain that supports many languages, is a FANTASTIC start on being able to optimize/improve the parts that matter, once you have an actual system where you can measure what matters.

[quote]I think the extra code that has to be generated, built, loaded, and run is a bigger deal than some people do.[/quote]

Two things:

1. In many languages, you can parse Protobuf descriptions at runtime, and interpret them rather than compile them. Python, for example, can easily do this, as can JavaScript and most other truly dynamic languages. The Protobuf in-memory representation even supports transparent version up/down-shifting, by storing unrecognized fields in a "copy-forward" format. This means you can upgrade your Protobuf IDL files (which means the network protocol) at runtime. Sometimes, that's quite valuable, and other times, that's probably a bad idea. Network wires are typically slower than the CPUs interpreting the data, so the overhead of interpretation is often not important in a profile of the running system.

2. When you know that CPU cost is important, you want dedicated marshaling/demarshaling code that is compiled. You could write this code manually, and it would have to be built, debugged, loaded, and run to be able to understand the protocol. Unfortunately, manual marshaling is a very bug-prone way to develop. Instead, you can generate the code from the IDL files. This doesn't generate appreciably more or worse code than the hand-coded version, but you can be certain that there are no marshaling bugs. Additionally, implementing changes is as simple as re-running "make," rather than, for example, having to update half a dozen different source files that touch the data.
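
Both points can be sketched in a few lines of Python. This is a toy, assuming messages made only of non-negative varint fields; it does not use the real Protobuf library, and the schemas (`OLD_SCHEMA`, `NEW_SCHEMA`), field names, and helper functions are hypothetical. The first half interprets a schema held as plain data and carries unrecognized fields forward; the second half generates decoder source from the same schema, which is roughly what a code generator does ahead of time.

```python
def read_varint(data, pos):
    """Return (value, next_pos) for the base-128 varint starting at data[pos]."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def write_varint(value):
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


# Point 1: the schema is ordinary data, so it can be loaded or replaced at run time.
OLD_SCHEMA = {1: "player_id"}               # hypothetical message, version 1
NEW_SCHEMA = {1: "player_id", 2: "score"}   # version 2 adds a field


def decode(schema, data):
    """Interpret data against schema; carry unrecognized fields along untouched."""
    known, unknown, pos = {}, [], 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        if key & 7 != 0:
            raise ValueError("this sketch handles varint fields only")
        value, pos = read_varint(data, pos)
        if key >> 3 in schema:
            known[schema[key >> 3]] = value
        else:
            unknown.append((key >> 3, value))
    return known, unknown


def encode(schema, known, unknown=()):
    """Re-encode known fields plus any unknown ones that were carried forward."""
    numbers = {name: number for number, name in schema.items()}
    fields = [(numbers[name], value) for name, value in known.items()] + list(unknown)
    return b"".join(write_varint(n << 3) + write_varint(v) for n, v in sorted(fields))


# Point 2: generate specialized decoder source from the schema instead of interpreting it.
def generate_decoder(schema, name):
    lines = [f"def {name}(data):",
             "    msg, pos = {}, 0",
             "    while pos < len(data):",
             "        key, pos = read_varint(data, pos)",
             "        value, pos = read_varint(data, pos)"]
    for number, field in sorted(schema.items()):
        lines.append(f"        if key == {number << 3}: msg[{field!r}] = value")
    lines.append("    return msg")
    return "\n".join(lines)


wire = encode(NEW_SCHEMA, {"player_id": 7, "score": 300})
known, unknown = decode(OLD_SCHEMA, wire)      # an old reader only understands player_id
relayed = encode(OLD_SCHEMA, known, unknown)   # but still passes score along intact

namespace = {"read_varint": read_varint}
exec(generate_decoder(NEW_SCHEMA, "decode_v2"), namespace)  # stand-in for a build step
decode_v2 = namespace["decode_v2"]
print(known, unknown, relayed == wire, decode_v2(wire))
```

In the interpreted path, changing the protocol means swapping the schema dictionary at run time; in the generated path, it means regenerating the function, which corresponds to the re-run-"make" step.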
enum Bool { True, False, FileNotFound };


[quote]a well-defined wire format,[/quote]

I don't think there's anything preventing the development of a well-defined wire format for this.


[quote]and a tool chain that supports many languages, is a FANTASTIC start on being able to optimize/improve the parts that matter, once you have an actual system where you can measure what matters.[/quote]

Perhaps it would be possible to use languages' built-in serialization support more in distributed systems development rather than having this functionality duplicated in CORBA implementations, Protocol Buffers, Thrift, etc. A different approach might be helpful in terms of making it easier for more languages to be used in distributed systems.
That's a great theory, until you want to write one half of your distributed system in Erlang and the other in Stackless Python.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

[quote]Perhaps it would be possible to use languages' built-in serialization support[/quote]

Then again, perhaps it wouldn't. Each language serializes differently, and almost all languages are too verbose with their "native" serialization for games purposes.
If you want to try this, I suggest trying to find a way to load a Python Pickle into a Java Serialization stream, or perhaps make a DCOM RPC into an Erlang port. I'd be very interested in hearing how that goes!
enum Bool { True, False, FileNotFound };


[quote]Then again, perhaps it wouldn't. Each language serializes differently, and almost all languages are too verbose with their "native" serialization for games purposes.[/quote]


When you say each language serializes differently, that goes back to my initial question.
How is built-in serialization verbose?


[quote]If you want to try this, I suggest trying to find a way to load a Python Pickle into a Java Serialization stream, or perhaps make a DCOM RPC into an Erlang port. I'd be very interested in hearing how that goes![/quote]

From a practical perspective I think it makes sense to focus on scenarios where one of the languages is C++... thus the idea to adapt my middle tier to work with whatever language that Java code generator is written in. I found getting away from a web-based front end to be liberating.
[quote]How is built-in serialization verbose?[/quote]

By including too much information compared to what you can get away with when you have lots of domain specific knowledge.
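
As a rough illustration of that difference, here is a Python sketch comparing a generic, self-describing format (`pickle`) with a fixed layout that both ends have agreed on (`struct`). The `update` message and the `<H3f` layout are made up for the example; a real game would pick its own fields and precision.

```python
import pickle
import struct

# Hypothetical game update: an entity id plus a 3D position.
update = {"entity_id": 42, "x": 1.5, "y": -2.0, "z": 10.25}

# Generic serialization has to describe itself: key names, value types, container shape.
generic = pickle.dumps(update)

# With domain knowledge, both ends agree on the layout ahead of time:
# a little-endian unsigned 16-bit id followed by three 32-bit floats.
LAYOUT = struct.Struct("<H3f")
packed = LAYOUT.pack(update["entity_id"], update["x"], update["y"], update["z"])

print(len(generic), "bytes pickled vs", len(packed), "bytes packed")
```

The packed form is 14 bytes because the field names, types, and ordering live in the code on both ends rather than in every message.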

[quote]I found getting away from a web-based front end to be liberating.[/quote]

Is this the same web-based front end that you last year suggested would be much better than everything that had come before it?
You may want to go back and re-review the discussion that was had about that a year or two ago. If I remember right, a lot of the same information that's in this thread was provided at that time, too.
enum Bool { True, False, FileNotFound };

This topic is closed to new replies.
