Do you need to serialize and deserialize large amounts of data quickly and simply? Are XML, JSON, or other serialization mechanisms too slow or too heavy for you? There is a nice solution to these problems: Google Protocol Buffers. If you don’t know what it is or how it works, read on.
Protocol Buffers in a nutshell
Official Protocol Buffers page says: “Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.”
The reasons for releasing this API are also interesting. Google gives the following:
- Protocol buffers are used by practically everyone inside Google. They have many other projects they would like to release as open source that use protocol buffers, so to do this, Google needed to release protocol buffers first. In fact, bits of the technology have already found their way into the open – if you dig into the code for Google AppEngine, you might find some of it.
- Google would like to provide public APIs that accept protocol buffers as well as XML, both because it is more efficient and because they’re going to convert that XML to protocol buffers on their end anyway.
- People outside Google might find protocol buffers useful.
- Getting protocol buffers into a form Google were happy to release was a fun 20% project.
My reason for choosing Protocol Buffers is simple: I need something like “XML, but smaller, faster, and simpler” :).
Why “think XML, but smaller, faster, and simpler”?
Before I started using Protocol Buffers, I wanted to know how they differ from XML. I found a description of the main differences in this overview. Protocol Buffers:
- are simpler and generate data access classes that are easier to use programmatically
Working with Protocol Buffers is much easier than working with XML. You only need to create a proto file (similar to an XML Schema) and generate a class file with the Protocol Buffers compiler. From then on, you use Protocol Buffers simply by creating or reading the appropriate objects. See the example below.
Proto file that defines the structure of the data:
message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}
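Usage of the generated Protocol Buffers class in Java (generated messages are immutable, so they are created through a builder):
// create the object through its builder and set values
Person person = Person.newBuilder()
    .setName("Maciej")
    .setId(1)
    .build();
// read values
int id = person.getId();
String name = person.getName();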
Usage of the XML format in Java:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

// create document (newDocumentBuilder() can throw ParserConfigurationException)
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.newDocument();
// create root element
Element root = document.createElement("Person");
document.appendChild(root);
// add child element
Element nameNode = document.createElement("name");
root.appendChild(nameNode);
Text nameTextNode = document.createTextNode("Maciej");
nameNode.appendChild(nameTextNode);
// read value
String name = nameTextNode.getData();
You can see that Protocol Buffers are simpler and more intuitive to use than XML.
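To give a feeling for what a "generated data access class" looks like, here is a small hand-written sketch. It is not the code the Protocol Buffers compiler produces: the name PersonSketch, the use of IllegalStateException, and the whole layout are my own simplifications. It only mirrors the general shape of such a class: an immutable message, a builder with chained setters, and a check that required fields are set before the object is built.

```java
// Hypothetical, hand-written stand-in for a generated message class.
// The real generated code is much larger and also handles serialization.
final class PersonSketch {
    private final String name;
    private final int id;
    private final String email; // null when the optional field is not set

    private PersonSketch(Builder b) {
        this.name = b.name;
        this.id = b.id;
        this.email = b.email;
    }

    static Builder newBuilder() {
        return new Builder();
    }

    String getName() { return name; }
    int getId() { return id; }
    boolean hasEmail() { return email != null; }
    String getEmail() { return email == null ? "" : email; }

    static final class Builder {
        private String name;
        private Integer id;   // boxed so that "not set" can be detected
        private String email;

        Builder setName(String name) { this.name = name; return this; }
        Builder setId(int id) { this.id = id; return this; }
        Builder setEmail(String email) { this.email = email; return this; }

        PersonSketch build() {
            // name and id are marked "required" in the proto file
            if (name == null || id == null) {
                throw new IllegalStateException("required fields name and id must be set");
            }
            return new PersonSketch(this);
        }
    }

    public static void main(String[] args) {
        PersonSketch p = PersonSketch.newBuilder().setName("Maciej").setId(1).build();
        System.out.println(p.getName() + " " + p.getId() + " hasEmail=" + p.hasEmail());
    }
}
```

The point of the sketch is that application code only ever touches typed getters and setters; there is no tree of nodes to walk as with the XML DOM.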
- are 3 to 10 times smaller and 20 to 100 times faster
As far as I am concerned, performance is the most important advantage of Protocol Buffers. Besides the Google overview, I found a very interesting site with performance tests. The conclusion is rather simple: Protocol Buffers serialization is smaller and faster than XML, JSON, Binary, and other serialization mechanisms.
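To see where the size difference comes from, the sketch below encodes the Person message from the proto file by hand, following the Protocol Buffers wire format (a tag byte built from field number and wire type, then a varint or a length-prefixed string), and compares it with an XML rendering of the same data. The class WireSizeDemo, its method names, and the XML element layout are my own assumptions for illustration; in real code the generated class does the encoding, and one tiny message is not a benchmark.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only meant to illustrate the wire format.
class WireSizeDemo {
    // Varint: 7 payload bits per byte, high bit set on every byte except the last.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes name (field 1) and id (field 2) of the Person message.
    // Sketch only: handles non-negative ids and skips the optional email.
    static byte[] encodePerson(String name, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (1 << 3) | 2);          // field 1, wire type 2 (length-delimited)
        writeVarint(out, nameBytes.length);      // string length in bytes
        out.write(nameBytes, 0, nameBytes.length);
        writeVarint(out, (2 << 3) | 0);          // field 2, wire type 0 (varint)
        writeVarint(out, id);
        return out.toByteArray();
    }

    // Assumed XML layout for the same data, without declaration or whitespace.
    static byte[] encodeXml(String name, int id) {
        String xml = "<Person><name>" + name + "</name><id>" + id + "</id></Person>";
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int protoSize = encodePerson("Maciej", 1).length;
        int xmlSize = encodeXml("Maciej", 1).length;
        System.out.println("protobuf bytes: " + protoSize);
        System.out.println("xml bytes:      " + xmlSize);
    }
}
```

For the sample values this gives 10 bytes for the Protocol Buffers encoding against 46 bytes for the XML string. Field names never appear in the binary output, only the field numbers from the proto file, which is where most of the saving comes from.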
Something for .NET fans
I currently work in C#, so I needed a .NET version of Protocol Buffers. I was worried when I read that there are only Java, C++, and Python APIs. But after a few minutes I found the Third-Party Add-ons for Protocol Buffers site, with links to three C# implementations:
I didn’t have time to test them all, but after reading this blog post I chose dotnet-protobufs by Jon Skeet, because it is “close to the original Google PB spirit, in the sense that you have to explicitly call methods to serialize/deserialize”. This feature is very important to me because it lets me learn Protocol Buffers from an official tutorial, in my case the Java tutorial.
Summary: If you need a simple and fast data serialization mechanism, Google Protocol Buffers are a good choice. Please note that the serialized output is less readable than XML or JSON.
Share your opinion and experience with us below or meet us on Twitter: @GOYELLO.
Thanks to Karol Świder for helping me write this blog post.
You may also want to look at a next-generation parser such as vtd-xml
http://vtd-xml.sf.net
Thanks!
I wonder if anyone has compared AMF3 with GPB on the Flash platform.
I added this to my college report
Thanks
jaine