author: Haavard <havardpe@yahoo-inc.com> 2017-05-03 08:38:58 +0000
committer: Haavard <havardpe@yahoo-inc.com> 2017-05-03 08:38:58 +0000
commit: 603635be530e6fa6d0115fc5a9f0a0b1edb66a40
tree: 1b7eafadfe851dce186fd24ea8fe9da3e1d4fc9f /eval/src
parent: 9b3b15a8f6ea14cd0d81d4714b253761cd4309d4
account for sparse tensors without dimensions
Diffstat (limited to 'eval/src')
-rw-r--r-- eval/src/vespa/eval/tensor/serialization/format.txt | 17
1 file changed, 11 insertions, 6 deletions
diff --git a/eval/src/vespa/eval/tensor/serialization/format.txt b/eval/src/vespa/eval/tensor/serialization/format.txt
index 9d0a387c36a..02db6114ab2 100644
--- a/eval/src/vespa/eval/tensor/serialization/format.txt
+++ b/eval/src/vespa/eval/tensor/serialization/format.txt
@@ -4,10 +4,9 @@ interpreted as a single unified binary format. The description below
uses data types defined by document serialization (nbostream) combined
with some comments and python-inspired flow-control. The mixed[3]
binary format is defined in such a way that it overlays as
-effortlessly as possible with both existing formats. The only thing
-needed to go from sparse[1] or dense[2] binary formats to the mixed[3]
-format for a specific tensor is to add a single byte indicating there
-are no dimensions of the other kind (mapped/indexed).
+effortlessly as possible with both existing formats.
+
+//-----------------------------------------------------------------------------
byte: type (1:sparse, 2:dense, 3:mixed)
bit 0 -> 'sparse'
@@ -15,7 +14,7 @@ byte: type (1:sparse, 2:dense, 3:mixed)
(mixed tensors are tagged as both 'sparse' and 'dense')
if ('sparse'):
- 1_4_int: number of mapped dimensions -> ''n_mapped'
+ 1_4_int: number of mapped dimensions -> 'n_mapped'
'n_mapped' times: (sorted by dimension name)
small_string: dimension name
@@ -25,7 +24,7 @@ if ('dense'):
small_string: dimensions name
1_4_int: dimensions size (must be at least 1) -> 'size_i'
-if ('n_mapped > 0'):
+if ('n_mapped > 0' || !'dense'):
1_4_int: number of named dense sub-spaces -> 'n_blocks'
else:
'n_blocks' = 1 (a single dense space)
@@ -35,3 +34,9 @@ else:
small_string: dimension label (same order as dimension names)
prod('size_i') times: (product of all indexed dimension sizes)
double: cell value (last indexed dimension is nested innermost)
+
+//-----------------------------------------------------------------------------
+
+Note: A tensor with no dimensions should not be serialized as
+sparse[1], but when it is, it will contain an integer indicating the
+number of cells.
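
The effect of the changed 'n_blocks' condition can be illustrated with a minimal Python sketch. It is not taken from the Vespa code base: the names SPARSE_BIT, DENSE_BIT, stores_block_count_old and stores_block_count_new are hypothetical, and the dense bit is assumed to be bit 1, which follows from the type values 2:dense and 3:mixed. The sketch only models the decision of whether an explicit block count is present in the stream; it does not decode 1_4_int or small_string values.

```python
# Hypothetical constants for the type byte described in format.txt.
SPARSE_BIT = 0x01  # bit 0 -> 'sparse'
DENSE_BIT = 0x02   # assumption: bit 1 -> 'dense' (type 2 is dense, type 3 is mixed)


def stores_block_count_old(type_byte: int, n_mapped: int) -> bool:
    """Rule before this patch: if ('n_mapped > 0')."""
    return n_mapped > 0


def stores_block_count_new(type_byte: int, n_mapped: int) -> bool:
    """Rule after this patch: if ('n_mapped > 0' || !'dense')."""
    dense = bool(type_byte & DENSE_BIT)
    return n_mapped > 0 or not dense


if __name__ == "__main__":
    # (type byte, number of mapped dimensions) for a few representative tensors.
    cases = {
        "sparse, no dimensions": (SPARSE_BIT, 0),
        "sparse, 2 mapped dimensions": (SPARSE_BIT, 2),
        "dense only": (DENSE_BIT, 0),
        "mixed, no mapped dimensions": (SPARSE_BIT | DENSE_BIT, 0),
        "mixed, 1 mapped dimension": (SPARSE_BIT | DENSE_BIT, 1),
    }
    for name, (type_byte, n_mapped) in cases.items():
        old = stores_block_count_old(type_byte, n_mapped)
        new = stores_block_count_new(type_byte, n_mapped)
        print(f"{name}: old={old} new={new}")
```

Under these assumptions the two rules differ only for a tensor tagged as sparse[1] with zero mapped dimensions, which is the case the added note describes: the stream then still carries an integer giving the number of cells.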