Google Protocol Buffers = “think XML, but smaller, faster, and simpler”

Do you need to serialize and deserialize huge amounts of data quickly and simply? Is XML, JSON or another serialization mechanism too slow or too heavy for you? There is a nice solution for these problems. It is called Google Protocol Buffers. If you don’t know what it is or how it works, read on.

Protocol Buffers in a nutshell

The official Protocol Buffers page says: “Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.”

The reasons for releasing this API are also interesting. Google gives the following:

  • Protocol buffers are used by practically everyone inside Google. They have many other projects they would like to release as open source that use protocol buffers, so to do this, Google needed to release protocol buffers first. In fact, bits of the technology have already found their way into the open – if you dig into the code for Google AppEngine, you might find some of it.
  • Google would like to provide public APIs that accept protocol buffers as well as XML, both because it is more efficient and because they’re going to convert that XML to protocol buffers on their end anyway.
  • People outside Google might find protocol buffers useful.
  • Getting protocol buffers into a form Google were happy to release was a fun 20% project.

Why I chose Protocol Buffers is simple: I need something like “XML, but smaller, faster, and simpler” :).

Why “think XML, but smaller, faster, and simpler”?

Before I started using Protocol Buffers, I wanted to know how they differ from XML. I found a description of the main differences in this overview. Protocol Buffers:

  • are simpler and generate data access classes that are easier to use programmatically
    Manipulating Protocol Buffers is much easier than manipulating XML. You only need to create a proto file (similar to an XML Schema) and generate a class file with the Protocol Buffers compiler. From then on you can use Protocol Buffers in a very simple way, by creating or reading the appropriate objects. See the example below.
    Proto file that defines the structure of the data:

    message Person {
        required string name = 1;
        required int32 id = 2;
        optional string email = 3;
    }

    Using the generated Protocol Buffers class in Java:

    // generated message classes are immutable, so values are set on a Builder
    Person person = Person.newBuilder()
            .setName("Maciej")             // set values
            .setId(1)
            .build();                      // create object
    
    int id = person.getId();               // read values
    String name = person.getName();

    Using the XML format in Java:

    // needs javax.xml.parsers.* and org.w3c.dom.*;
    // newDocumentBuilder() can throw ParserConfigurationException
    // create document
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document document = builder.newDocument();
    
    // create root element
    Element root = document.createElement("Person");
    document.appendChild(root);
    
    // add child element
    Element nameNode = document.createElement("name");
    root.appendChild(nameNode);
    Text nameTextNode = document.createTextNode("Maciej");
    nameNode.appendChild(nameTextNode);
    
    // read value
    String name = nameTextNode.getData();

    You can see that Protocol Buffers are simpler and more intuitive to use than XML.

  • are 3 to 10 times smaller and 20 to 100 times faster
    As far as I am concerned, performance is the most important advantage of Protocol Buffers. Besides the Google overview, I found a very interesting site with performance tests. The conclusion is rather simple: Protocol Buffers serialization is smaller and faster than XML, JSON, binary and other serialization mechanisms.
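To get a feeling for where the size difference comes from, here is a minimal sketch that encodes the Person message from the example above by hand, following the documented Protocol Buffers wire format (a tag byte per field, varints for integers, a length prefix for strings), and compares the result with an equivalent XML string. The class name WireSizeSketch, its helper methods and the XML layout are my own assumptions for illustration; this is not the code generated by the Protocol Buffers compiler, and a single tiny message says nothing about the 3 to 10 times figure in general.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, written only to illustrate the wire format.
public class WireSizeSketch {

    // Base-128 varint: 7 payload bits per byte, high bit set while more bytes follow.
    // Assumption: value is non-negative (real protobuf encodes negative int32 differently).
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes Person { name = 1 (string), id = 2 (int32) }; the optional email is left unset.
    static byte[] encodePerson(String name, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);

        writeVarint(out, (1 << 3) | 2);            // field 1, wire type 2 (length-delimited)
        writeVarint(out, nameBytes.length);        // length prefix
        out.write(nameBytes, 0, nameBytes.length); // UTF-8 payload

        writeVarint(out, (2 << 3) | 0);            // field 2, wire type 0 (varint)
        writeVarint(out, id);
        return out.toByteArray();
    }

    // An XML rendering of the same data (layout chosen for this sketch).
    static String toXml(String name, int id) {
        return "<Person><name>" + name + "</name><id>" + id + "</id></Person>";
    }

    public static void main(String[] args) {
        byte[] binary = encodePerson("Maciej", 1);
        byte[] xml = toXml("Maciej", 1).getBytes(StandardCharsets.UTF_8);
        System.out.println("protobuf wire format: " + binary.length + " bytes");
        System.out.println("XML: " + xml.length + " bytes");
    }
}
```

The binary form carries only field numbers, lengths and values, while the XML repeats every element name twice. For a message this small, that repetition is most of the XML output.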

Something for .NET fans

Currently I work in C#, so I needed a .NET version of Protocol Buffers. I was worried when I read that there are only Java, C++ and Python APIs. But after a few minutes I found the Third-Party Add-ons for Protocol Buffers site, with links to three C# implementations:

  1. dotnet-protobufs by Jon Skeet,
  2. Proto#,
  3. protobuf-net.

I didn’t have time to test them all, but after reading this blog post I chose dotnet-protobufs by Jon Skeet, because it is “close to the original Google PB spirit, in the sense that you have to explicitly call methods to serialize/deserialize”. This matters to me because I can learn Protocol Buffers from an official tutorial, in my case the Java tutorial.

Summary: If you need a simple and fast data serialization mechanism, Google Protocol Buffers are a good choice. Please note that the serialized output is less readable than XML or JSON.


Thanks to Karol Świder for helping me write this blog post.

Aspire Blog Team

