diff options
-rw-r--r-- | eval/src/vespa/eval/tensor/serialization/format.txt | 42 |
1 files changed, 42 insertions, 0 deletions
diff --git a/eval/src/vespa/eval/tensor/serialization/format.txt b/eval/src/vespa/eval/tensor/serialization/format.txt new file mode 100644 index 00000000000..8c5d3b331d2 --- /dev/null +++ b/eval/src/vespa/eval/tensor/serialization/format.txt @@ -0,0 +1,42 @@ +This file explains how the typed binary formats of serialized tensors +for different archetypes (sparse[1], dense[2] and mixed[3]) can be +interpreted as a single unified binary format. The description below +uses data types defined by document serialization (nbostream) combined +with some comments and python-inspired flow-control. The mixed[3] +binary format is defined in such a way that it overlays as +effortlessly as possible with both existing formats. + +//----------------------------------------------------------------------------- + +byte: type (1:sparse, 2:dense, 3:mixed) + bit 0 -> 'sparse' + bit 1 -> 'dense' + (mixed tensors are tagged as both 'sparse' and 'dense') + +if ('sparse'): + 1_4_int: number of mapped dimensions -> 'n_mapped' + 'n_mapped' times: (sorted by dimension name) + small_string: dimension name + +if ('dense'): + 1_4_int: number of indexed dimensions -> 'n_indexed' + 'n_indexed' times: (sorted by dimension name) + small_string: dimensions name + 1_4_int: dimensions size (must be at least 1) -> 'size_i' + +if ('n_mapped' > 0 || !'dense'): + 1_4_int: number of named dense sub-spaces -> 'n_blocks' +else: + 'n_blocks' = 1 (a single dense space) + +'n_blocks' times: + 'n_mapped' times: + small_string: dimension label (same order as dimension names) + prod('size_i') times: (product of all indexed dimension sizes) + double: cell value (last indexed dimension is nested innermost) + +//----------------------------------------------------------------------------- + +Note: A tensor with no dimensions should not be serialized as +sparse[1], but when it is, it will contain an integer indicating the +number of cells. |