The Kaitai project is happy to announce the release of a new major version of Kaitai Struct, a declarative markup language for describing binary data structures such as binary file formats and network stream packets.
The basic idea of Kaitai Struct is that a particular format can be
described using the Kaitai Struct language (in a .ksy file), which can
then be compiled using kaitai-struct-compiler into source files in
one of the supported programming languages. These modules include
the generated code for a parser that can read the described data structure
from a file or stream and provide access to its contents through a nice,
easy-to-comprehend API.
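To make the idea concrete, here is a minimal sketch of what such a parser does. The format, the class name `DemoHeader`, and its fields are hypothetical, and the class is written by hand for illustration; it is not actual kaitai-struct-compiler output and does not use the Kaitai Struct runtime library.

```python
import io
import struct

# Hypothetical format, which a .ksy file might describe roughly as:
#   seq:
#     - id: magic
#       contents: "KS"
#     - id: version
#       type: u2le
#     - id: name_len
#       type: u1
#     - id: name
#       type: str
#       size: name_len
#       encoding: ASCII


class DemoHeader:
    """Hand-written stand-in for a generated parser class (illustration only)."""

    def __init__(self, stream):
        # Fixed magic bytes, comparable to a `contents` check.
        magic = stream.read(2)
        if magic != b"KS":
            raise ValueError(f"unexpected magic: {magic!r}")
        # Little-endian unsigned 16-bit integer.
        (self.version,) = struct.unpack("<H", stream.read(2))
        # One-byte length prefix followed by an ASCII string of that length.
        (self.name_len,) = struct.unpack("<B", stream.read(1))
        self.name = stream.read(self.name_len).decode("ascii")


data = b"KS" + struct.pack("<H", 3) + bytes([5]) + b"hello"
header = DemoHeader(io.BytesIO(data))
print(header.version, header.name)
```

With a real .ksy file, the compiler would generate a comparable class, so the parsing logic above would not have to be written or maintained by hand.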
This release finally brings serialization support for Java and Python! It adds decent support for Rust, thanks to Oleh Dolhov and Vitaly Reshetyuk. Many fixes to the import functionality were added, so if something related to imports didn’t work before, try it now. It also brings numerous improvements to the Web IDE, in particular the ability to show a partial object tree if a parsing error occurs, which greatly facilitates reverse engineering and debugging (see the previous blog post for more details).
Many of the improvements in this version were supported by the NLnet Foundation.
This is the last version of Kaitai Struct to support Python 2.7 and Ruby 1.9.3 - 2.3. Future versions will require at least Python 3.4 (or possibly even higher, see #821) and Ruby 2.4.
-w/--read-write: serialization support, currently only for Java and Python (see Serialization guide)
                --no-auto-read, so _read() must always be called manually to parse from a stream
                _check() performs consistency checks - must be called on each object after the last change to its seq fields or instances, otherwise _write() will throw a ConsistencyNotCheckedError
                _write()
                _invalidate{Inst}() (Java) / _invalidate_{inst}() (Python) for each value instance inst allow invalidating (forgetting) the cached value so that the instance can obtain a new value
--zero-copy-substream {true|false} (default is true): zero-copy substreams, currently only for Java and Ruby (#44)
                _raw_* fields from the generated code - if you need them, use --zero-copy-substream false
valid/in-enum: true validates that the parsed value is defined in the enum specified by the enum key
type: strz in combination with encoding: UTF-16{BE,LE} or encoding: UTF-32{BE,LE} now properly terminates the string on a 2-byte or 4-byte null character (#187)
to-string in a type definition can be used to provide a concise human-readable string representation of the object (#732)
                toString() (or similar), __str__() in Python, to_s in Ruby, Display trait in Rust
                ksv), but not yet in the Web IDE, which still uses the -webide-representation key for this purpose
f"foo={foo}": only strings and integers can be interpolated, formatting options are not yet supported (#1073)
encoding key, warn against using unknown encodings (#393)
_root and _parent in recursive invocations of the top-level type in the same .ksy spec (#1089)
_root and _parent incorrectly passed to imported nested types (compiler#283)
_parent type (#961)
contents (#1011)
error: unable to find type ... for one of the imported types (#951)
meta/ks-opaque-types: true when using imports (#295)
--ksc-json-output: preserve input .ksy paths in output JSON keys exactly without slash normalization (#507)
ValidationNotInEnumError exception, which is thrown if the valid/in-enum: true validation fails
bytesTerminateMulti and readBytesTermMulti methods needed for type: strz + encoding: UTF-16/UTF-32 support to all runtime libraries (#187)
netstandard2.0 (csharp@7b1ac6d) - fixes KaitaiStruct.Runtime.CSharp v0.10.0 contains indirect vulnerable references (csharp#20)
_fetchInstances() (Java) / _fetch_instances() can be used to recursively fetch all parse instances so that the input stream can be closed; this is especially useful with serialization when reading from one file and writing to another
align_to_byte() / alignToByte(), which ensures proper alignment to a byte boundary after using bit-sized integers (type: bX), instead of the compiler often inserting them incorrectly (#1070)
List instead of ArrayList - this is a potentially breaking change (#1116)
Uint8Array, not number[] (ec064e3)
encoding: UTF-8 (lua#12)
KaitaiStructError instead of raising generic exceptions (python#80)
.as<> (f65fd5b)
valid and contents (bfdd54a)
enum_val.to_i (#802)
enum_val.to_i now works even if enum_val represents a value not defined in the enum (#815)
bytes.to_s(encoding) so that it always returns UTF-8 strings (be695f5)
TypeError: {ImportedType} is not a constructor when loading a .ksy specification with imports for the first time since loading the Web IDE, support circular imports (webide#169)
0x.. is no longer incorrectly parsed as a constant (e.g. pos: 0x1 + offset is not interpreted as pos: 0x1ffe)
0b... is no longer parsed as 0
-webide-representation on imported types (webide#163)
_unnamed* fields created by omitting id in seq fields (#1064)
kaitai-struct-compiler now returns the compiler object itself instead of a constructor function (called KaitaiStructCompiler). This is a breaking change, so make sure to adapt your code: replace (new KaitaiStructCompiler()).compile(...) with KaitaiStructCompiler.compile(...) (compiler#222)
encoding key
ksv, ksdump
                ksdump: include _unnamed* fields created by omitting id in seq fields (#1064)
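The _check() / _write() contract described in the serialization item above can be illustrated with a toy model. Everything in this sketch is hand-written and hypothetical: the class `DemoRecord`, its single `value` field, and the local `ConsistencyNotCheckedError` class only mimic the names used above. It is not generated code and does not use the Kaitai Struct runtime. Treating a freshly read object as unchecked is also an assumption of this toy, not a statement about the real runtime.

```python
import io
import struct


class ConsistencyNotCheckedError(Exception):
    """Toy stand-in for the runtime exception named in the notes above."""


class DemoRecord:
    """Hand-written model of a read-write object holding one u2le field."""

    def __init__(self):
        self._value = 0
        self._checked = False

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value
        self._checked = False  # any change invalidates the previous check

    def _read(self, stream):
        # Parsing is explicit here, as with --no-auto-read.
        (self._value,) = struct.unpack("<H", stream.read(2))
        self._checked = False  # assumption of this toy model

    def _check(self):
        # Consistency check: the value must fit into an unsigned 16-bit field.
        if not 0 <= self._value <= 0xFFFF:
            raise ValueError("value does not fit into u2le")
        self._checked = True

    def _write(self, stream):
        if not self._checked:
            raise ConsistencyNotCheckedError("call _check() before _write()")
        stream.write(struct.pack("<H", self._value))


record = DemoRecord()
record._read(io.BytesIO(b"\x01\x00"))
record.value = 0x1234  # last change to a field

try:
    record._write(io.BytesIO())  # no _check() since the last change
    refused = False
except ConsistencyNotCheckedError:
    refused = True

record._check()
out = io.BytesIO()
record._write(out)
```

In this model the first write attempt is refused because the object was modified after the last check, and the second succeeds once _check() has been called.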
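The strz fix for UTF-16 and UTF-32 listed above comes down to looking for a multi-byte terminator instead of a single zero byte. The following standalone sketch shows one way to do that, assuming the terminator is only matched on code-unit boundaries. The function name `bytes_terminate_multi` is borrowed from the method names mentioned above, but this is an illustrative reimplementation, not the runtime library's code.

```python
def bytes_terminate_multi(data: bytes, term: bytes) -> bytes:
    """Return data up to the first terminator found on a unit boundary."""
    unit = len(term)
    # Step through the buffer one code unit at a time so that zero bytes
    # straddling two adjacent characters are not mistaken for a terminator.
    for i in range(0, len(data) - unit + 1, unit):
        if data[i:i + unit] == term:
            return data[:i]
    return data  # no terminator found: keep everything


# "A", U+0100 and "B" in UTF-16LE are 41 00 | 00 01 | 42 00, so the bytes
# at offsets 1 and 2 are both zero without forming a null character.
raw = "A\u0100B".encode("utf-16-le") + b"\x00\x00" + b"junk"

naive_offset = raw.find(b"\x00\x00")  # unaligned search finds the false match
text = bytes_terminate_multi(raw, b"\x00\x00").decode("utf-16-le")
```

Here the unaligned search stops at offset 1, in the middle of a character, while the aligned search returns the full three-character string.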