Protocol Buffers DataReader Extensions for .NET

.NET, as a mostly statically typed platform, has a lot of really good options for serializing statically typed objects. Protocol Buffers, MessagePack, JSON, BSON, XML, SOAP, and the BCL’s own proprietary binary serialization are all great for CLR objects, whose fields are fixed by their type definitions.

However, for data that is tabular in nature, there aren’t as many options. In my past two jobs I’ve had a need to serialize data:

  • That is tabular – not necessarily CLR DTOs.
  • Where the schema is unknown before it is deserialized – each data set can have totally different columns.
  • In a way that is streamable, so entire data sets do not have to be buffered in memory at once.
  • That can be as large as hundreds of thousands of rows/columns.
  • In a reasonably performant manner.
  • In a way that could potentially be read by different platforms.
  • Into as small a number of bytes as possible.

Protocol Buffers DataReader Extensions for .NET was born out of these needs. It’s powered by Marc Gravell’s excellent Google Protocol Buffers library, protobuf-net, and it packs data faster and smaller than the equivalent DataTable.WriteXml output.

Usage is very easy. Serializing a data reader to a stream:

DataTable dt = ...;

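// Verbatim string: in a regular literal, "\f" is a form-feed escape.
// File.Create truncates an existing file; OpenWrite would leave stale trailing bytes.
using (Stream stream = File.Create(@"C:\foo.dat"))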
using (IDataReader reader = dt.CreateDataReader())
{
    DataSerializer.Serialize(stream, reader);
}
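
Because the serializer takes any IDataReader, the DataTable step can be skipped when the data comes from a database. The sketch below is a rough illustration under a few assumptions: QueryExport and Export are hypothetical names, the connection is already open, and DataSerializer is assumed to live in the ProtoBuf.Data namespace. Only the Serialize(stream, reader) call comes from the usage shown above.

```csharp
using System.Data;
using System.IO;
using ProtoBuf.Data; // assumption: namespace that contains DataSerializer

public static class QueryExport
{
    // Hypothetical helper: runs a query and writes its result to the output stream.
    // Assumes the caller has already opened the connection.
    public static void Export(IDbConnection connection, string sql, Stream output)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = sql;

            // The reader is handed straight to the serializer, so no
            // intermediate DataTable is built in this method.
            using (IDataReader reader = command.ExecuteReader())
            {
                DataSerializer.Serialize(output, reader);
            }
        }
    }
}
```

Handing the reader over directly is what should keep memory use flat for large result sets, since nothing in this helper holds on to rows after they are read.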

Loading a data table from a stream:

DataTable dt = new DataTable();

using (Stream stream = File.OpenRead(@"C:\foo.dat"))
using (IDataReader reader = DataSerializer.Deserialize(stream))
{
    dt.Load(reader);
}
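
To try both halves together without touching the file system, the two snippets above can be combined into an in-memory round trip. This is a minimal sketch, not part of the library: RoundTripExample and the sample table are made up for illustration, and the ProtoBuf.Data namespace is an assumption. The Serialize and Deserialize calls are used exactly as shown above.

```csharp
using System;
using System.Data;
using System.IO;
using ProtoBuf.Data; // assumption: namespace that contains DataSerializer

public static class RoundTripExample
{
    // Serializes a table into a memory buffer, then loads it back into a new table.
    public static DataTable RoundTrip(DataTable source)
    {
        using (var buffer = new MemoryStream())
        {
            using (IDataReader reader = source.CreateDataReader())
            {
                DataSerializer.Serialize(buffer, reader);
            }

            buffer.Position = 0; // rewind so deserialization starts at the first byte

            var copy = new DataTable();
            using (IDataReader reader = DataSerializer.Deserialize(buffer))
            {
                copy.Load(reader);
            }
            return copy;
        }
    }

    public static void Main()
    {
        // Illustrative data only.
        var table = new DataTable("People");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, "Ada");
        table.Rows.Add(2, "Grace");

        DataTable copy = RoundTrip(table);
        Console.WriteLine("{0} columns, {1} rows", copy.Columns.Count, copy.Rows.Count);
    }
}
```

Note that the schema is never declared on the receiving side: the new DataTable starts empty and picks up its columns from the deserialized reader.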

It works with IDataReaders, DataTables and DataSets (even nested DataTables). You can download protobuf-net-data from NuGet, or grab the source from the GitHub project page.
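
For a DataSet, one way to reuse the same two calls is to go through the BCL's own DataSet.CreateDataReader() and DataSet.Load(), which expose each table as a separate result set. The sketch below assumes the serialized stream preserves those result sets in order. DataSetExample is a hypothetical name and ProtoBuf.Data is an assumed namespace; the library may also offer more direct DataSet overloads, which are not shown here.

```csharp
using System.Data;
using System.IO;
using ProtoBuf.Data; // assumption: namespace that contains DataSerializer

public static class DataSetExample
{
    // Round-trips every table in a DataSet through a memory buffer.
    // tableNames gives the names to assign to the loaded tables, in order.
    public static DataSet RoundTrip(DataSet source, params string[] tableNames)
    {
        using (var buffer = new MemoryStream())
        {
            // CreateDataReader() yields one result set per table in the DataSet.
            using (IDataReader reader = source.CreateDataReader())
            {
                DataSerializer.Serialize(buffer, reader);
            }

            buffer.Position = 0; // rewind before reading back

            var copy = new DataSet();
            using (IDataReader reader = DataSerializer.Deserialize(buffer))
            {
                // Load() fills one table per result set, creating tables as needed.
                copy.Load(reader, LoadOption.OverwriteChanges, tableNames);
            }
            return copy;
        }
    }
}
```

A call such as DataSetExample.RoundTrip(ds, "Customers", "Orders") would then name the loaded tables explicitly, because the names are supplied by the caller rather than read from the stream in this sketch.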

3 thoughts on “Protocol Buffers DataReader Extensions for .NET”

  1. Fantastic work — thank you! Great to see someone with the same problems delivering solutions. Have you considered talking to Marc about rolling this back into the core protobuf-net source?

  2. Thanks Chad. Marc does know about this project, but I suspect that because data tables are not part of the Protocol Buffers spec, it’s probably not something he would want to roll into the main library.

  3. That’s simply excellent Richard!

    I was discussing very recently the absurdity of creating custom serialisers when Google has protobuf – and while looking online for .NET related implementations I came across this post (and your github code).

    Many thanks!
