The Kaitai project is happy to announce the release of a new major version of Kaitai Struct, a declarative markup language for describing binary data structures such as binary file formats and network stream packets.
The basic idea of Kaitai Struct is that a particular format can be described using the Kaitai Struct language (in a .ksy file), which can then be compiled using kaitai-struct-compiler into source files in one of the supported programming languages. These modules include generated code for a parser that can read the described data structure from a file or stream and provide access to its contents through a nice, easy-to-comprehend API.
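To make the idea concrete, here is a minimal hand-written sketch of the kind of parser class you would otherwise get from the compiler. It is not actual compiler output and does not use the Kaitai Struct runtime: the format (a 4-byte magic followed by two little-endian 16-bit integers), the class name `ToyImageHeader`, and its fields are all hypothetical, chosen only for illustration.

```python
import io
import struct


class ToyImageHeader:
    """Hypothetical format: 4-byte magic b"TOYI", then width and height as u2le."""

    def __init__(self, stream):
        # Check the fixed magic bytes first.
        magic = stream.read(4)
        if magic != b"TOYI":
            raise ValueError("unexpected magic: %r" % magic)
        # Two unsigned 16-bit little-endian integers.
        self.width, self.height = struct.unpack("<HH", stream.read(4))

    @classmethod
    def from_bytes(cls, buf):
        # Convenience constructor that wraps a bytes object in a stream.
        return cls(io.BytesIO(buf))


header = ToyImageHeader.from_bytes(b"TOYI\x40\x01\xf0\x00")
print(header.width, header.height)  # prints "320 240"
```

With Kaitai Struct, the same information would live in a .ksy description, and the compiler would produce the equivalent class for each supported target language.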
This release finally brings serialization support for Java and Python! It adds decent support for Rust, thanks to Oleh Dolhov and Vitaly Reshetyuk. Many fixes to the import functionality were added, so if something related to imports didn’t work before, try it now. It also brings numerous improvements to the Web IDE, in particular the ability to show a partial object tree if a parsing error occurs, which greatly facilitates reverse engineering and debugging (see the previous blog post for more details).
Many of the improvements in this version were supported by the NLnet Foundation.
This is the last version of Kaitai Struct to support Python 2.7 and Ruby 1.9.3 - 2.3. Future versions will require at least Python 3.4 (or possibly even higher, see #821) and Ruby 2.4.
-w/--read-write: serialization support, currently only for Java and Python (see Serialization guide)
  implies --no-auto-read, so _read() must always be called manually to parse from a stream
  _check() performs consistency checks - must be called on each object after the last change to its seq fields or instances, otherwise _write() will throw a ConsistencyNotCheckedError
  _write()
  _invalidate{Inst}() (Java) / _invalidate_{inst}() (Python) for each value instance inst allow invalidating (forgetting) the cached value so that the instance can obtain a new value
--zero-copy-substream {true|false} (default is true): zero-copy substreams, currently only for Java and Ruby (#44)
  omits _raw_* fields from the generated code - if you need them, use --zero-copy-substream false
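The call order described for serialization (_read(), then changes, then _check(), then _write()) can be illustrated with a toy class. This is a hand-written stand-in, not generated code: `ToyRecord`, its fields, the local `ConsistencyNotCheckedError` class, and the way it tracks changes are assumptions made for this sketch. Real generated classes work with the Kaitai Struct runtime and differ in detail.

```python
import io
import struct


class ConsistencyNotCheckedError(Exception):
    """Local stand-in for the error named in the release notes."""


class ToyRecord:
    """Toy format: one length byte followed by that many body bytes."""

    def __init__(self, stream=None):
        self._io = stream
        self._checked = False
        self.length = 0
        self.body = b""

    def __setattr__(self, name, value):
        # In this toy, any change to a public field invalidates the last check.
        if not name.startswith("_"):
            object.__setattr__(self, "_checked", False)
        object.__setattr__(self, name, value)

    def _read(self):
        # Nothing is parsed until _read() is called explicitly.
        (self.length,) = struct.unpack("B", self._io.read(1))
        self.body = self._io.read(self.length)

    def _check(self):
        # Consistency check: the length field must match the body size.
        if self.length != len(self.body):
            raise ValueError("length does not match body size")
        self._checked = True

    def _write(self, out):
        # Refuse to write an object that was not checked after its last change.
        if not self._checked:
            raise ConsistencyNotCheckedError("call _check() before _write()")
        out.write(struct.pack("B", self.length))
        out.write(self.body)


rec = ToyRecord(io.BytesIO(b"\x03abc"))
rec._read()              # manual parse
rec.body = b"hello"      # modify fields
rec.length = 5
rec._check()             # must come after the last change
out = io.BytesIO()
rec._write(out)
print(out.getvalue())    # prints b'\x05hello'
```

Changing a field after `_check()` and then calling `_write()` raises the toy `ConsistencyNotCheckedError`, which mirrors the rule stated above.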
valid/in-enum: true validates that the parsed value is defined in the enum specified by the enum key
type: strz in combination with encoding: UTF-16{BE,LE} or encoding: UTF-32{BE,LE} now properly terminates the string on a 2-byte or 4-byte null character (#187)
to-string in a type definition can be used to provide a concise human-readable string representation of the object (#732)
  toString() (or similar), __str__() in Python, to_s in Ruby, Display trait in Rust
  ksv), but not yet in the Web IDE, which still uses the -webide-representation key for this purpose
f"foo={foo}": only strings and integers can be interpolated, formatting options are not yet supported (#1073)
encoding key, warn against using unknown encodings (#393)
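The type: strz change for UTF-16 and UTF-32 listed above matters because a single-byte or unaligned search for zero bytes can cut such a string in the wrong place. The sketch below shows the idea of an aligned multi-byte terminator search. The function name `terminate_multi_sketch` is hypothetical, and this is not the runtime library's implementation.

```python
def terminate_multi_sketch(data: bytes, term: bytes) -> bytes:
    """Cut data at the first terminator found on a boundary of len(term) bytes."""
    unit = len(term)
    for i in range(0, len(data) - unit + 1, unit):
        if data[i:i + unit] == term:
            return data[:i]
    # No terminator found: return everything.
    return data


# "a" + U+0100 in UTF-16LE is 61 00 00 01, so bytes 1..2 look like a
# 2-byte null if the search ignores code unit boundaries.
raw = "a\u0100".encode("utf-16-le") + b"\x00\x00" + "zz".encode("utf-16-le")

print(raw.find(b"\x00\x00"))  # prints 1 (unaligned false match)
text = terminate_multi_sketch(raw, b"\x00\x00").decode("utf-16-le")
print(text == "a\u0100")      # prints True
```

The same approach applies to UTF-32 with a 4-byte terminator.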
_root and _parent in recursive invocations of the top-level type in the same .ksy spec (#1089)
_root and _parent incorrectly passed to imported nested types (compiler#283)
_parent type (#961)
contents (#1011)
error: unable to find type ... for one of the imported types (#951)
meta/ks-opaque-types: true when using imports (#295)
--ksc-json-output: preserve input .ksy paths in output JSON keys exactly without slash normalization (#507)
ValidationNotInEnumError exception, which is thrown if the valid/in-enum: true validation fails
bytesTerminateMulti and readBytesTermMulti methods needed for type: strz + encoding: UTF-16/UTF-32 support to all runtime libraries (#187)
netstandard2.0 (csharp@7b1ac6d) - fixes KaitaiStruct.Runtime.CSharp v0.10.0 contains indirect vulnerable references (csharp#20)
_fetchInstances() (Java) / _fetch_instances() can be used to recursively fetch all parse instances so that the input stream can be closed; this is especially useful with serialization when reading from one file and writing to another
align_to_byte() / alignToByte(), which ensures proper alignment to a byte boundary after using bit-sized integers (type: bX), instead of the compiler often inserting them incorrectly (#1070)
List instead of ArrayList - this is a potentially breaking change (#1116)
Uint8Array, not number[] (ec064e3)
encoding: UTF-8 (lua#12)
KaitaiStructError instead of raising generic exceptions (python#80)
.as<> (f65fd5b)
valid and contents (bfdd54a)
enum_val.to_i (#802)
enum_val.to_i now works even if enum_val represents a value not defined in the enum (#815)
bytes.to_s(encoding) so that it always returns UTF-8 strings (be695f5)
TypeError: {ImportedType} is not a constructor when loading a .ksy specification with imports for the first time since loading the Web IDE, support circular imports (webide#169)
0x.. is no longer incorrectly parsed as a constant (e.g. pos: 0x1 + offset is not interpreted as pos: 0x1ffe)
0b... is no longer parsed as 0
-webide-representation on imported types (webide#163)
_unnamed* fields created by omitting id in seq fields (#1064)
kaitai-struct-compiler now returns the compiler object itself instead of a constructor function (called KaitaiStructCompiler). This is a breaking change, so make sure to adapt your code: replace (new KaitaiStructCompiler()).compile(...) with KaitaiStructCompiler.compile(...) (compiler#222)
encoding key
ksv, ksdump
ksdump: include _unnamed* fields created by omitting id in seq fields (#1064)