Picobuf

Picobuf is a minimal replacement for google.golang.org/protobuf that focuses on small binary size.

It does not support the whole Protocol Buffers feature set; features are added as the need arises.

Currently the implementation is still in beta; the API can change.

How?

The usage is quite similar to the official implementation.

First, ensure that the protoc protobuf compiler is installed.

Second, install the picobuf tool with:

go install storj.io/picobuf/protoc-gen-pico@latest

Finally, add the go generate line that generates your picobuf encoding:

//go:generate protoc -I. --pico_out=paths=source_relative:. example.proto

The example.proto looks like an ordinary proto file:

syntax = "proto3";
option go_package = "example.test/pb";

package example;

message Person {
  string  name = 1;
  Address address = 2;
}

message Address {
  string street = 1;
}
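To make the wire format concrete, here is a minimal hand-written sketch of how the Person message above is laid out in the standard Protocol Buffers encoding. This is not the code that protoc-gen-pico generates; the Go types and the helper names (appendVarint, appendBytesField, encodeAddress, encodePerson) are assumptions made for illustration only.

```go
package main

import "fmt"

// Address and Person mirror the messages in example.proto.
// They are hand-written stand-ins, not picobuf output.
type Address struct {
	Street string
}

type Person struct {
	Name    string
	Address *Address
}

// appendVarint appends v using protobuf's base-128 varint encoding.
func appendVarint(buf []byte, v uint64) []byte {
	for v >= 0x80 {
		buf = append(buf, byte(v)|0x80)
		v >>= 7
	}
	return append(buf, byte(v))
}

// appendBytesField appends a length-delimited field (wire type 2):
// the tag, the payload length, then the payload itself.
func appendBytesField(buf []byte, field uint64, data []byte) []byte {
	buf = appendVarint(buf, field<<3|2)
	buf = appendVarint(buf, uint64(len(data)))
	return append(buf, data...)
}

// encodeAddress encodes an Address; proto3 omits empty strings.
func encodeAddress(a *Address) []byte {
	var buf []byte
	if a.Street != "" {
		buf = appendBytesField(buf, 1, []byte(a.Street))
	}
	return buf
}

// encodePerson encodes a Person, nesting the Address as field 2.
func encodePerson(p *Person) []byte {
	var buf []byte
	if p.Name != "" {
		buf = appendBytesField(buf, 1, []byte(p.Name))
	}
	if p.Address != nil {
		buf = appendBytesField(buf, 2, encodeAddress(p.Address))
	}
	return buf
}

func main() {
	p := &Person{Name: "Ann", Address: &Address{Street: "Main"}}
	// prints "0a 03 41 6e 6e 12 06 0a 04 4d 61 69 6e"
	fmt.Printf("% x\n", encodePerson(p))
}
```

Because every call here is a direct, statically known function call, nothing in this style of encoder needs reflection.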

picobuf also supports some non-standard features:

syntax = "proto3";
option go_package = "example.test/pb";

package example;

message Timestamp {
  int64 seconds = 1;
  int32 nanos = 2;
}

message Person {
  string name = 1;

  Timestamp created_at = 5 [
    // "always_present" makes the field a non-pointer.
    // This can reduce allocation overhead, but the field's presence information is lost.
    (pico.field).always_present = true,
    // "custom_type" serializes the field as a different Go type that was not generated by picobuf.
    (pico.field).custom_type = "time.Time",
    // "custom_serialize" selects a different type to perform the serialization.
    // For example, we cannot add methods to `time.Time`, so it needs custom serialization.
    (pico.field).custom_serialize = "storj.io/picobuf/picoconv.Timestamp"
  ];
}

Take a look at picoconv.Timestamp for an example of how to implement custom serialization.

Why?

This library grew out of the need to create smaller binaries. The official library, at the time of writing, relies on calling methods via reflection, which switches the compiler into conservative mode. In conservative mode much of the dead code elimination does not work, and hence the binaries get bloated.
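As an illustration, here is a minimal sketch of the kind of reflective method call in question. It assumes, based on our understanding of the Go toolchain, that a method lookup by a name known only at run time is what prevents unused exported methods from being discarded. Greeter and callByName are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
)

// Greeter has two exported methods; a static call graph would show
// that only one of them is ever used.
type Greeter struct{}

func (Greeter) Hello() string { return "hello" }
func (Greeter) Bye() string   { return "bye" }

// callByName looks up a method by a name that is only known at run time.
// Since any exported method could be requested, the toolchain cannot
// prove that any of them is unused.
func callByName(v interface{}, name string) string {
	m := reflect.ValueOf(v).MethodByName(name)
	if !m.IsValid() {
		return ""
	}
	return m.Call(nil)[0].String()
}

func main() {
	fmt.Println(callByName(Greeter{}, "Hello")) // prints "hello"
}
```

A direct call such as Greeter{}.Hello() gives the same result without this cost, which is the style of code picobuf aims to generate.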

In addition, picobuf has optimizations and custom type support, which make it possible to use a different data representation from the official implementation.

Should I use it?

As a rule of thumb, you are probably better off using the official implementation. There are plenty of features this implementation doesn't support, and in most cases the official implementation is completely fine.

Here are a few cases where you might consider using picobuf:

  • You need a small binary, for example:
    • WASM module
    • C library
    • embedded systems (TinyGo)
  • Custom serialization
    • this makes it possible to avoid some allocations
  • Custom types
    • this makes it possible to avoid some annoying conversions across the codebase

Note that you only get the binary size benefit when the binary does not include the official protobuf in some other way, or any other library that makes the compiler go into conservative mode (e.g. text/template or html/template).

Conversely, there are cases where it doesn't make sense to use picobuf:

  • You need features such as:
    • Any messages
    • reflection
    • JSON
    • descriptors

Benchmark

sizebench contains tests for the binary size.

go1.19.5
goos: linux
goarch: amd64
// A basic "fmt.Println" executable size.
ExeBase               1     1814.766 KB
// A basic "fmt.Println" executable size, but contains code
// that switches the compiler into conservative mode.
ExeBaseConserv        1     2549.444 KB
// Picobuf generated binary size (non-conservative).
ExePico/Sml           1     2000.987 KB      186.221 KB-Δ
// Picobuf generated binary size (conservative).
ExePicoConserv/Sml    1     2700.553 KB      151.109 KB-Δ
// Protobuf generated binary size (conservative).
ExeProto/Sml          1     4978.815 KB     2429.371 KB-Δ

The message definitions are in msg-sml.proto.

As can be seen, the protobuf binary is ~3 MB larger than the picobuf binary in non-conservative mode, and ~2.3 MB larger than the picobuf binary in conservative mode.
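The two differences can be reproduced from the table with simple subtraction. This small sketch only restates the benchmark numbers above; the constant and function names are made up for illustration.

```go
package main

import "fmt"

// Executable sizes in KB, copied from the benchmark output above.
const (
	exePico        = 2000.987
	exePicoConserv = 2700.553
	exeProto       = 4978.815
)

// diffKB returns how much larger a is than b, in KB.
func diffKB(a, b float64) float64 {
	return a - b
}

func main() {
	fmt.Printf("protobuf vs picobuf (non-conservative): %.3f KB\n", diffKB(exeProto, exePico))
	fmt.Printf("protobuf vs picobuf (conservative):     %.3f KB\n", diffKB(exeProto, exePicoConserv))
}
```

The results are roughly 2978 KB and 2278 KB, which round to the ~3 MB and ~2.3 MB quoted above.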