author | Tor Brede Vekterli <vekterli@yahooinc.com> | 2021-10-29 15:41:14 +0200 |
---|---|---|
committer | Tor Brede Vekterli <vekterli@yahooinc.com> | 2021-11-01 15:15:10 +0100 |
commit | bac8ab58a18d25db1871d3e933cb0cc018be5439 (patch) | |
tree | e1bdd02017a1bfdb1390442247e0a9351b748ec5 /vespajlib/abi-spec.json | |
parent | 1163edf3b7d94e9581a6670fc6b725e056e87023 (diff) |
Use UTF-8 bytewise ordering for StringResultNode comparisons
The C++ backend uses `memcmp` ordering of UTF-8 strings for its
`StringResultNode` instances and expects the container to feed it
nodes in the same order. However, the Java code used `String` internally,
which compares UTF-16 code units instead of UTF-8 octets. The two
orderings can disagree, particularly in the presence of
surrogate pairs.
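The mismatch can be reproduced with a small standalone sketch. It is illustrative only and not part of this change: the class name `Utf8OrderingDemo` and the helper `compareUtf8Bytewise` are hypothetical, and `Arrays.compareUnsigned` (Java 9+) is assumed to be an acceptable stand-in for `memcmp`-style ordering, with a shorter prefix sorting first.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8OrderingDemo {

    // Hypothetical helper: orders two strings by their UTF-8 encoding,
    // treating each byte as unsigned, as memcmp would on the encoded data.
    static int compareUtf8Bytewise(String a, String b) {
        byte[] ab = a.getBytes(StandardCharsets.UTF_8);
        byte[] bb = b.getBytes(StandardCharsets.UTF_8);
        return Arrays.compareUnsigned(ab, bb);
    }

    public static void main(String[] args) {
        // U+FF5E: a single UTF-16 code unit 0xFF5E, UTF-8 bytes EF BD 9E
        String bmp = "\uFF5E";
        // U+1F600: surrogate pair D83D DE00, UTF-8 bytes F0 9F 98 80
        String astral = "\uD83D\uDE00";

        // UTF-16 code unit order: 0xFF5E > 0xD83D, so bmp sorts AFTER astral
        System.out.println(bmp.compareTo(astral) > 0);            // true
        // UTF-8 byte order: 0xEF < 0xF0, so bmp sorts BEFORE astral
        System.out.println(compareUtf8Bytewise(bmp, astral) < 0); // true
    }
}
```

Any BMP character above the surrogate range (U+E000 through U+FFFF) compared against a supplementary character shows the same reversal.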
Java `StringResultNode` now uses a raw UTF-8 byte array as its value
backing, which has the added benefit that (de-)serializing is
effectively a no-op. Some extra `String` roundtrip work is now needed
to support the various type-erased `ResultNode` functionality, but
this is not expected to be called in a hot path.
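As a rough sketch of the approach described above, a value type backed by raw UTF-8 bytes could look like the following. This is not the actual `StringResultNode` code: the class `Utf8BackedValue` and its method names are hypothetical, and it only illustrates the trade-off of storing bytes directly while decoding to `String` on demand.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not the real StringResultNode.
public final class Utf8BackedValue implements Comparable<Utf8BackedValue> {

    private final byte[] utf8;

    // Encoding happens once, when a value is created from a String.
    public Utf8BackedValue(String value) {
        this.utf8 = value.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes read off the wire can be stored as-is, with no decoding step.
    public Utf8BackedValue(byte[] rawUtf8) {
        this.utf8 = rawUtf8.clone();
    }

    // Serialization can emit the stored bytes directly.
    public byte[] rawBytes() {
        return utf8.clone();
    }

    // The String roundtrip is paid only by callers that need a String.
    public String asString() {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Unsigned bytewise comparison, assumed here to match memcmp-style ordering.
    @Override
    public int compareTo(Utf8BackedValue other) {
        return Arrays.compareUnsigned(utf8, other.utf8);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Utf8BackedValue && Arrays.equals(utf8, ((Utf8BackedValue) o).utf8);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(utf8);
    }
}
```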
Diffstat (limited to 'vespajlib/abi-spec.json')
-rw-r--r-- | vespajlib/abi-spec.json | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/vespajlib/abi-spec.json b/vespajlib/abi-spec.json
index c426195bc37..5eeee267cf6 100644
--- a/vespajlib/abi-spec.json
+++ b/vespajlib/abi-spec.json
@@ -3428,6 +3428,7 @@
 "protected static com.yahoo.vespa.objects.Identifiable deserializeOptional(com.yahoo.vespa.objects.Deserializer)",
 "protected static boolean equals(java.lang.Object, java.lang.Object)",
 "public void visitMembers(com.yahoo.vespa.objects.ObjectVisitor)",
+"protected static byte[] getRawUtf8Bytes(com.yahoo.vespa.objects.Deserializer)",
 "protected java.lang.String getUtf8(com.yahoo.vespa.objects.Deserializer)",
 "protected void putUtf8(com.yahoo.vespa.objects.Serializer, java.lang.String)",
 "public bridge synthetic java.lang.Object clone()"