Intersec Object Packer – Part 1: the basics

This post is an introduction to a useful tool here at Intersec, a tool that we call IOP: the Intersec Object Packer.

IOP is our take on the IDL (Interface Description Language) approach. It is a method to serialize structured data for use in our communication protocols and data storage technologies. It is used to transmit data over the network in a safe manner, to exchange data between different programming languages, or to provide a generic interface to store (and load) C data on disk. IOP provides data integrity checking and backward compatibility.

The concept of IDL is not new. Many such languages are available, such as Google Protocol Buffers or Thrift. IOP itself isn’t new either: its initial version was written in 2008 and it has evolved a lot during its almost decade-long life. However, IOP has proven to be solid and sufficiently well designed that it has not needed any backward-incompatible change during that period.

IOP package description

The first thing to do with IOP is to declare the data structures in the IOP description language. With those definitions, our IOP compiler will automatically create all the helpers needed to use these IOP data structures in different languages and to allow serialization and deserialization.

Data structure declaration is done in a C-like syntax (actually, it is almost the D language syntax) and lives inside a .iop file. As a convention, we use CamelCase in our .iop files (which differs from the coding rules of our .c files).

Let’s look at a quick example:

Here we are. An IOP object with two fields: an id (as an integer) and a name (as a string). Obviously, it is possible to create much more complex structures. To do so, here is the list of available types for our structure fields.

Basic types

IOP allows several low-level types to be used to define object members. One can use the classics:

  • int/uint (32-bit signed/unsigned integer)
  • long/ulong (64-bit signed/unsigned integer)
  • byte/ubyte (8-bit signed/unsigned integer)
  • short/ushort (16-bit signed/unsigned integer)
  • bool
  • double
  • string

and also the types:

  • bytes (a binary blob)
  • xml (for an XML payload)
  • void (to specify a lack of data)
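
Because the integer widths above are fixed, C code that mirrors them would naturally rely on the <stdint.h> types rather than on plain int or long, whose sizes depend on the platform. Here is a small sketch of that idea. The ex_* alias names are hypothetical and chosen for illustration only; they are not identifiers generated by IOP.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical aliases mirroring the IOP integer types listed above.
 * These names are illustrative only, not part of IOP. */
typedef int8_t   ex_byte;
typedef uint8_t  ex_ubyte;
typedef int16_t  ex_short;
typedef uint16_t ex_ushort;
typedef int32_t  ex_int;
typedef uint32_t ex_uint;
typedef int64_t  ex_long;
typedef uint64_t ex_ulong;

/* Compile-time checks (C11) that each alias has the documented width. */
static_assert(sizeof(ex_byte)  * CHAR_BIT == 8,  "byte must be 8 bits");
static_assert(sizeof(ex_short) * CHAR_BIT == 16, "short must be 16 bits");
static_assert(sizeof(ex_int)   * CHAR_BIT == 32, "int must be 32 bits");
static_assert(sizeof(ex_long)  * CHAR_BIT == 64, "long must be 64 bits");
```

With such aliases, a 64-bit IOP long keeps the same width on every platform, which a plain C long would not guarantee.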

Complex types

Four complex data types are also available for our fields.

Structures

The structure describes a record containing one or more fields. Each field has a name and a type. To see what it looks like, let’s add an address to our user data structure:

Of course, there is no theoretical limit on the number of struct “levels”. A struct can have a struct field which itself contains a struct field, and so on.

Classes

A class is an extendable structure type. A class can inherit from another class, creating a new type that adds new fields to the ones present in its parent class.

We will see classes in more detail in a separate article.

Unions

A union is a list of possibilities. Its description is very similar to that of a structure: it has typed fields, but only one of the fields is defined at a time. The name union is inherited from C, since the concept is very similar to C unions. However, IOP unions are tagged, which means we do know which of the fields is defined.
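
To make the "tagged" part concrete, here is a minimal C sketch of the general technique: a plain C union paired with a tag that records which member is defined. The contact type, its members and its helper functions are hypothetical names invented for this illustration; this is not the code our compiler generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag recording which member of the union is currently defined. */
typedef enum { CONTACT_EMAIL, CONTACT_PHONE } contact_tag;

/* Hypothetical tagged union: exactly one member is valid at a time. */
typedef struct {
    contact_tag tag;
    union {
        const char *email;
        int64_t     phone;
    } u;
} contact;

contact contact_from_email(const char *email)
{
    contact c;

    c.tag = CONTACT_EMAIL;
    c.u.email = email;
    return c;
}

contact contact_from_phone(int64_t phone)
{
    contact c;

    c.tag = CONTACT_PHONE;
    c.u.phone = phone;
    return c;
}

/* Returns the phone number, or -1 when another member is defined. */
int64_t contact_get_phone(const contact *c)
{
    assert(c != NULL);
    return c->tag == CONTACT_PHONE ? c->u.phone : -1;
}
```

Checking the tag before reading a member is what separates this from a bare C union, where reading the wrong member silently reinterprets the bytes.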

Example:

Enumeration

The last type that can be used is the enumeration. Here again, an enum is similar to the C enum. It defines several literal keys associated with integer values. Just like the C enum, the IOP enum supports the whole integer range for its values.

Example:

Member constraints

Now that we have all the types we need for our custom data structure fields, it’s time to add some features to them in order to gain flexibility. These features are called constraints, and they act as qualifiers for IOP fields. For now, there are 4 different constraints: optional, repeated, with a default value, and the implicit mandatory constraint.

Mandatory

By default, a member of an IOP structure is mandatory. This means it must be set to a valid value in order for the structure instance to be valid. In particular, you must guarantee the field is set before serializing/deserializing the object. By default, mandatory members are value fields in the generated C structure: the value is inlined in the structure type and is copied. There are, however, some exceptions to this rule, but we will see them later.

The example is pretty simple:

Optional members

An optional member is indicated by a ? following the data type. The packers/unpackers allow these members to be absent without generating an error.

Repeated members

A repeated member is a field that can appear zero or more times in the structure (often represented by an array in programming languages). As such, a repeated field is also optional (it can be present 0 times). A repeated member is indicated by a “[]” following the data type.

In the next example, you can consider the repeatedInteger field as a list of integers.

With default value

A field with a default value behaves like a mandatory member that is allowed to be absent. When the member is absent, the packer/unpacker always sets the member to its default value.
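
The behaviour can be sketched in C as follows. This is only an illustration of the idea under simple assumptions, not our actual unpacker: the settings type, the settings_fill helper and the default of 3 are all hypothetical, and a NULL pointer stands for "the member was absent from the input".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical default value for the retries member. */
#define SETTINGS_DEFAULT_RETRIES 3

/* The member is a plain value: after unpacking it always holds something. */
typedef struct {
    int32_t retries;
} settings;

/* Fills `out` from the decoded input. `wire_retries` is NULL when the
 * member was absent, in which case the default value is used. */
void settings_fill(settings *out, const int32_t *wire_retries)
{
    assert(out != NULL);
    out->retries = wire_retries ? *wire_retries : SETTINGS_DEFAULT_RETRIES;
}
```

The code reading the structure never has to test for presence, which is the practical difference from an optional member.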

A member with a default value is indicated by setting the default value after the field declaration.

Moreover, arithmetic expressions can be used with integer (and enum) member types, like this:

IOP packages

The last thing to know to be able to write our first IOP file is about packages.

An IOP file corresponds to an IOP package. Basically, the package is a kind of namespace for the data structures you are declaring. The filename must match the package name. Every IOP file must define its package name like this:

A package can also be a sub-package, like this:

Finally, you can import objects from another package by specifying the package name before the type:

How to use IOP

Before going on to more complicated features of IOP, let’s see a simple example of how to use the custom data structures that we just declared.

When compiling our code, a first pass is done on our IOP files using our own compiler. This compiler parses the .iop files and generates the corresponding C source files, which provide helpers to serialize/deserialize our data structures. Here again, we will see it in more detail soon :)

Let’s see an example of code which is using IOP. First, let’s assume we have declared a new IOP package:

This will create several C files containing the type descriptors used for data serialization/deserialization as well as the C type declarations:

Not very different from the IOP file, right? Still, a few things stand out:

  • The opt_i32_t type for zip_code. This is how we handle optional fields. It is a structure containing a 32-bit integer plus a boolean indicating whether the field is set or not.
  • The structure names are now in snake_case instead of CamelCase. The name of the package is added as a prefix to each structure, and there is a __t suffix too. This helps to recognize IOP structures when we meet one in our C code.
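
As an illustration of the "value plus presence flag" layout described for optional fields, here is a minimal C sketch. The ex_opt_i32 type, its member names and its helpers are hypothetical and written for this example; the real opt_i32_t may be laid out and manipulated differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical optional 32-bit integer: a value and a "set" flag. */
typedef struct {
    int32_t value;
    bool    is_set;
} ex_opt_i32;

/* Builds an optional that holds `value`. */
ex_opt_i32 ex_opt_i32_set(int32_t value)
{
    return (ex_opt_i32){ .value = value, .is_set = true };
}

/* Builds an optional that holds nothing. */
ex_opt_i32 ex_opt_i32_clear(void)
{
    return (ex_opt_i32){ .value = 0, .is_set = false };
}

/* Returns the stored value, or `fallback` when the field is not set. */
int32_t ex_opt_i32_get_or(const ex_opt_i32 *opt, int32_t fallback)
{
    assert(opt != NULL);
    return opt->is_set ? opt->value : fallback;
}
```

Keeping the flag next to the value means no integer has to be reserved as a "missing" marker, so the whole 32-bit range stays usable.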

All the code generated by our compiler will be available through a user.iop.h file.

Now let’s play with it in our code:

Here we are. IOP gave us the superpower of packing/unpacking data structures in a binary format in two simple function calls. These binary packed structures can be used for disk storage. But as we will see in a future article, we also use them for our network communications.
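
For readers who want a feel for what "packing" and "unpacking" mean in general, here is a hand-written C sketch of a binary round trip. It does not use the IOP API or the IOP wire format: the user_rec type, the user_pack/user_unpack functions and the layout (little-endian id, little-endian name length, then the name bytes) are all assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record; `name` must be NUL-terminated. */
typedef struct {
    int32_t id;
    char    name[32];
} user_rec;

/* Writes a 32-bit value in little-endian byte order. */
static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Reads a 32-bit little-endian value. */
static uint32_t get_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Packs `u` into `buf`. Returns the number of bytes written,
 * or 0 when the buffer is too small. */
size_t user_pack(const user_rec *u, uint8_t *buf, size_t cap)
{
    size_t len;

    assert(u != NULL && buf != NULL);
    len = strlen(u->name);
    if (cap < 8 + len) {
        return 0;
    }
    put_u32(buf, (uint32_t)u->id);
    put_u32(buf + 4, (uint32_t)len);
    memcpy(buf + 8, u->name, len);
    return 8 + len;
}

/* Unpacks `buf` into `out`. Returns 0 on success,
 * -1 when the input is truncated or the name is too long. */
int user_unpack(const uint8_t *buf, size_t size, user_rec *out)
{
    uint32_t len;

    assert(buf != NULL && out != NULL);
    if (size < 8) {
        return -1;
    }
    len = get_u32(buf + 4);
    if (len >= sizeof(out->name) || size - 8 < len) {
        return -1;
    }
    /* Two's complement conversion back to a signed id. */
    out->id = (int32_t)get_u32(buf);
    memcpy(out->name, buf + 8, len);
    out->name[len] = '\0';
    return 0;
}
```

The length checks in user_unpack are the kind of integrity checking a packer has to perform before trusting bytes read from disk or from the network. Writing them by hand for every structure is exactly the work that generated code saves.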

Next time, we will talk about inheritance for our IOP objects!