author | Haavard <havardpe@yahoo-inc.com> | 2017-05-03 08:38:58 +0000 |
---|---|---|
committer | Haavard <havardpe@yahoo-inc.com> | 2017-05-03 08:38:58 +0000 |
commit | 603635be530e6fa6d0115fc5a9f0a0b1edb66a40 (patch) | |
tree | 1b7eafadfe851dce186fd24ea8fe9da3e1d4fc9f /eval/src | |
parent | 9b3b15a8f6ea14cd0d81d4714b253761cd4309d4 (diff) |
account for sparse tensors without dimensions
Diffstat (limited to 'eval/src')
-rw-r--r-- | eval/src/vespa/eval/tensor/serialization/format.txt | 17 |
1 files changed, 11 insertions, 6 deletions
diff --git a/eval/src/vespa/eval/tensor/serialization/format.txt b/eval/src/vespa/eval/tensor/serialization/format.txt
index 9d0a387c36a..02db6114ab2 100644
--- a/eval/src/vespa/eval/tensor/serialization/format.txt
+++ b/eval/src/vespa/eval/tensor/serialization/format.txt
@@ -4,10 +4,9 @@ interpreted as a single unified binary format. The description below
 uses data types defined by document serialization (nbostream) combined
 with some comments and python-inspired flow-control. The mixed[3]
 binary format is defined in such a way that it overlays as
-effortlessly as possible with both existing formats. The only thing
-needed to go from sparse[1] or dense[2] binary formats to the mixed[3]
-format for a specific tensor is to add a single byte indicating there
-are no dimensions of the other kind (mapped/indexed).
+effortlessly as possible with both existing formats.
+
+//-----------------------------------------------------------------------------
 
 byte: type (1:sparse, 2:dense, 3:mixed)
     bit 0 -> 'sparse'
@@ -15,7 +14,7 @@ byte: type (1:sparse, 2:dense, 3:mixed)
     (mixed tensors are tagged as both 'sparse' and 'dense')
 
 if ('sparse'):
-    1_4_int: number of mapped dimensions -> ''n_mapped'
+    1_4_int: number of mapped dimensions -> 'n_mapped'
     'n_mapped' times: (sorted by dimension name)
         small_string: dimension name
 
@@ -25,7 +24,7 @@ if ('dense'):
         small_string: dimensions name
         1_4_int: dimensions size (must be at least 1) -> 'size_i'
 
-if ('n_mapped > 0'):
+if ('n_mapped > 0' || !'dense'):
     1_4_int: number of named dense sub-spaces -> 'n_blocks'
 else:
     'n_blocks' = 1 (a single dense space)
@@ -35,3 +34,9 @@ else:
         small_string: dimension label (same order as dimension names)
     prod('size_i') times: (product of all indexed dimension sizes)
         double: cell value (last indexed dimension is nested innermost)
+
+//-----------------------------------------------------------------------------
+
+Note: A tensor with no dimensions should not be serialized as
+sparse[1], but when it is, it will contain an integer indicating the
+number of cells.
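
To illustrate the changed `'n_blocks'` rule, here is a minimal Python sketch of a decoder that follows the flow described in format.txt. It is not the Vespa implementation. The `Reader` class and the `decode_tensor` function are hypothetical names, and the byte-level encodings of `1_4_int` and `small_string` are assumptions about nbostream, since the patch does not spell them out. The count field for indexed dimensions and the per-block loop are not visible in the hunks, so they are assumed to mirror the mapped case.

```python
import struct
from math import prod


class Reader:
    """Minimal big-endian byte reader (hypothetical stand-in for nbostream)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def take(self, n: int) -> bytes:
        chunk = self.data[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("unexpected end of input")
        self.pos += n
        return chunk

    def int_1_4(self) -> int:
        # ASSUMPTION: values below 0x80 use one byte; larger values use four
        # big-endian bytes with the top bit set as a marker.
        first = self.take(1)[0]
        if first < 0x80:
            return first
        return struct.unpack(">I", bytes([first & 0x7F]) + self.take(3))[0]

    def small_string(self) -> str:
        # ASSUMPTION: a 1_4_int length followed by that many UTF-8 bytes.
        return self.take(self.int_1_4()).decode("utf-8")

    def double(self) -> float:
        # ASSUMPTION: 8-byte IEEE 754 double in network byte order.
        return struct.unpack(">d", self.take(8))[0]


def decode_tensor(data: bytes) -> dict:
    r = Reader(data)
    kind = r.take(1)[0]
    sparse = bool(kind & 1)  # bit 0 -> 'sparse'
    dense = bool(kind & 2)   # bit 1 -> 'dense'

    mapped = []
    if sparse:
        mapped = [r.small_string() for _ in range(r.int_1_4())]

    indexed = []
    if dense:
        # ASSUMPTION: a 1_4_int count precedes the (name, size) pairs.
        for _ in range(r.int_1_4()):
            name = r.small_string()
            indexed.append((name, r.int_1_4()))

    # Rule changed by this commit: a tensor not tagged 'dense' always carries
    # a block count, even when it has no mapped dimensions.
    if mapped or not dense:
        n_blocks = r.int_1_4()
    else:
        n_blocks = 1  # a single dense space

    block_size = prod(size for _, size in indexed)  # empty product is 1
    blocks = []
    for _ in range(n_blocks):
        labels = [r.small_string() for _ in mapped]
        cells = [r.double() for _ in range(block_size)]
        blocks.append((labels, cells))
    return {"mapped": mapped, "indexed": indexed, "blocks": blocks}
```

Under these assumptions, a sparse[1] tensor without dimensions is encoded as the type byte, a zero mapped-dimension count, the block count, and then one double per block. That block count is the "integer indicating the number of cells" mentioned in the note added by the patch.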