Kaitai Struct

A new way to develop parsers for binary structures.

Declarative: describe the very structure of the data, not how you read or write it

Language-neutral: write once, use in all supported languages:

  • C++/STL
  • C#
  • Java
  • JavaScript
  • Perl
  • PHP
  • Python
  • Ruby

... more to come

Packed with tools and samples: includes a compiler, an IDE, a visualizer and library of format specs

Free & open source: feel free to use, modify and join the project

Reading and writing binary formats is hard, especially if it's interchange format that should work across multitude of platforms and languages.

Have you ever found yourself writing repetitive, error-prone and hard-to-debug code that reads binary data structures from file / network stream and somehow represents them in memory for easier access?

Kaitai Struct tries to make this job easier — you only have to describe binary format once and then everybody can use it from their programming languages — cross-language, cross-platform.

What is Kaitai Struct?

Kaitai Struct is a declarative language used for describe various binary data structures, laid out in files or in memory: i.e. binary file formats, network stream packet formats, etc.

The main idea is that a particular format is described in Kaitai Struct language (.ksy file) and then can be compiled with ksc into source files in one of the supported programming languages. These modules will include a generated code for a parser that can read described data structure from a file / stream and give access to it in a nice, easy-to-comprehend API.

Using KS in your project

Typically, using formats described in KS in your project, involves the following steps:

  • Describe the format — i.e. create a .ksy file
  • Use visualizer to debug the format and ensure that it parses data properly
  • Compile .ksy file into target language source file and include that file into your project
  • Add KS runtime library for your particular language into your project (don’t worry, it’s small and it’s there mostly to ensure readability of generated code)
  • Use generated class(es) to parse your binary file / stream and access its components

Check out documentation for more information.

meta:
  id: tcp_segment
  endian: be
seq:
  - id: src_port
    type: u2
  - id: dst_port
    type: u2
  - id: seq_num
    type: u4
  - id: ack_num
    type: u4

public class TcpSegment extends KaitaiStruct {
    // ...
    private void _read() throws IOException {
        this.srcPort = _io.readU2be();
        this.dstPort = _io.readU2be();
        this.seqNum = _io.readU4be();
        this.ackNum = _io.readU4be();
    }
    // ...

Quick start

Consider this simple .ksy format description file that describes header of a GIF file (a popular web image format):

meta:
  id: gif
  file-extension: gif
  endian: le
seq:
  - id: header
    type: header
  - id: logical_screen
    type: logical_screen
types:
  header:
    seq:
      - id: magic
        contents: 'GIF'
      - id: version
        size: 3
  logical_screen:
    seq:
      - id: image_width
        type: u2
      - id: image_height
        type: u2
      - id: flags
        type: u1
      - id: bg_color_index
        type: u1
      - id: pixel_aspect_ratio
        type: u1

It declares that GIF file usually has .gif extension and uses little-endian integer encoding. The file itself starts with two blocks: first comes header and then comes logical_screen:

  • “Header” consists of “magic” string of 3 bytes (“GIF”) that identifies that it’s a GIF file starting and then there are 3 more bytes that identify format version (87a or 89a).
  • “Logical screen descriptor” is a block of integers:
    • image_width and image_height are 2-byte unsigned ints
    • flags, bg_color_index and pixel_aspect_ratio take 1-byte unsigned int each

This .ksy file can be compiled it into gif.cpp / Gif.cs / Gif.java / Gif.js / Gif.pm / Gif.php / gif.py / gif.rb and then one can instantly load .gif file and access, for example, it’s width and height.

std::ifstream ifs("path/to/some.gif", std::ifstream::binary);
kaitai::kstream ks(&ifs);
gif_t g = gif_t(&ks);

std::cout << "width = " << g.logical_screen()->image_width() << std::endl;
std::cout << "height = " << g.logical_screen()->image_height() << std::endl;
Gif g = Gif.FromFile("path/to/some.gif");

Console.WriteLine("width = " + g.LogicalScreen.ImageWidth);
Console.WriteLine("height = " + g.LogicalScreen.ImageHeight);
Gif g = Gif.fromFile("path/to/some.gif");

System.out.println("width = " + g.logicalScreen().imageWidth());
System.out.println("height = " + g.logicalScreen().imageHeight());
var g = new Gif(someArrayBuffer);

console.log("width = " + g.logicalScreen().imageWidth());
console.log("height = " + g.logicalScreen().imageHeight());
my $g = Gif->from_file("path/to/some.gif");

print("width = ", $g->logical_screen()->image_width(), "\n");
print("height = ", $g->logical_screen()->image_height(), "\n");
$g = Gif::fromFile("path/to/some.gif");

print("width = " . $g->logicalScreen()->imageWidth() . "\n");
print("height = " . $g->logicalScreen()->imageHeight() . "\n");
g = Gif.from_file("path/to/some.gif")

print "width = %d" % (g.logical_screen.image_width)
print "height = %d" % (g.logical_screen.image_height)
g = Gif.from_file("path/to/some.gif")

puts "width = #{g.logical_screen.image_width}"
puts "height = #{g.logical_screen.image_height}"
Of course, this example shows only very limited subset of what Kaitai Struct can do. Please refer to documentation for more insights.

Downloading and installing

There is an official .deb repository available for Debian / Ubuntu-based distributions. The repository is hosted at BinTray and signed with BinTray GPG key (379CE192D401AB61), so it's necessary to import that key first if your box haven't used any BinTray repositories beforehand:

echo "deb https://dl.bintray.com/kaitai-io/debian jessie main" | sudo tee /etc/apt/sources.list.d/kaitai.list
sudo apt-key adv --keyserver hkp://pool.sks-keyservers.net --recv 379CE192D401AB61
sudo apt-get update
sudo apt-get install kaitai-struct-compiler

Requirements

  • .deb-based Linux distribution (Debian, Ubuntu, etc)

There is a formula now available within Homebrew that you can use to install kaitai-struct-compiler:

brew install kaitai-struct-compiler

Requirements

Windows versions are avalable as MSI format installer. If you want a portable version that requires no installation, download our universal .zip build instead.

Download — stable v0.7, 7.0 MiB

Download — latest development (unstable) build

Requirements

"Universal" builds are downloadable as a .zip file that includes all the required .jar files bundled and launcher scripts for Linux / Mac OS X / Windows systems. No installation required, one can just unpack and run it.

Download — stable v0.7, 6.7 MiB

Requirements

If you prefer to build your tools from source, or just want to see how KS works, the easiest way to check out whole project is to download main (umbrella) project repository that already includes all other parts as sub-modules. Use:

git clone --recursive https://github.com/kaitai-io/kaitai_struct.git

Note the --recursive option.

Alternatively, one can check out individual sub-projects that consitute Kaitai Struct suite. See GitHub project page for details.

Requirements

Licensing

Kaitai Struct is free and open-source software, licensed under the following terms: