Optimize the serialization of large primitive floating-point arrays #61
Hi, thanks for the hint, I wasn't aware of this. I can't remember benchmarks where the JDK was faster (probably new in 1.8?). The current implementation writes each value individually:

```java
@Override
public void writeFDouble(double value) throws IOException {
    writePlainLong(Double.doubleToLongBits(value));
}
```

Apparently ObjectOutputStream.doublesToBytes is native and private :(. Another alternative would be to use Unsafe. I will have to benchmark the differences .. I am offline for one week, so stay tuned ;-) .. if you could provide a small benchmark reproducer meanwhile .. even better :)
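Since ObjectOutputStream.doublesToBytes is private and native, a portable way to get a similar bulk conversion is a DoubleBuffer view over a ByteBuffer. This is only a sketch of the idea, not FST's code; the helper name mirrors the JDK method but is otherwise hypothetical, and Unsafe.copyMemory would be another route (at the cost of native byte order).

```java
import java.nio.ByteBuffer;

public class BulkDoubles {
    // Hypothetical helper mirroring the private native
    // ObjectOutputStream.doublesToBytes: one bulk put through a
    // DoubleBuffer view instead of a Double.doubleToLongBits call
    // plus byte shifting per element. Big-endian by default,
    // matching JDK serialization.
    static byte[] doublesToBytes(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES);
        buf.asDoubleBuffer().put(values); // bulk copy of all 64-bit patterns
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] out = doublesToBytes(new double[] {1.0, 2.0, 3.0});
        System.out.println(out.length); // 24
    }
}
```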
Thanks for the instant feedback! Here is a small benchmark that I ran on my laptop. The larger the primitive array, the bigger the difference. With an extreme size of 100M doubles, for instance (which requires a few gigabytes of heap to run), I get the following result:
Code below:
Thanx! First analysis: it's not the float/double implementation but object size.

```java
public static void main(String[] args) throws Exception {
    while (true) {
        long start, elapsed;
        double[] obj = new double[10_000];
        // doubles are 8 bytes each, so size the buffer accordingly
        ByteArrayOutputStream bout = new ByteArrayOutputStream(8 * obj.length);

        start = System.nanoTime();
        for (int n = 0; n < 10000; n++) {
            try (ObjectOutputStream oos = new ObjectOutputStream(bout)) {
                oos.writeObject(obj);
            }
            bout.reset();
        }
        elapsed = System.nanoTime() - start;
        System.out.println("STD :" + elapsed / 1000000L + "ms");

        FSTConfiguration conf = FSTConfiguration.createDefaultConfiguration();
        start = System.nanoTime();
        for (int n = 0; n < 10000; n++) {
            bout.write(conf.asByteArray(obj));
            bout.reset();
        }
        elapsed = System.nanoTime() - start;
        System.out.println("FST :" + elapsed / 1000000L + "ms");
    }
}
```

yields:

Note also that writing plain number arrays is a rather unrealistic benchmark (near-zero optimization potential). It's real-world object graphs where FST outperforms other serializers.
I understand that serializing very large arrays is far from the common usage of the library, and that it probably deserves custom code. One example though: my project is a distributed database, and it frequently serializes data that is a mix of standard objects together with arrays of numbers, for instance when nodes exchange partial results as part of a distributed query, or when the transaction log is written to disk or pushed to replicas. Maybe the usage of buffers in FST is not the only cause of the slightly lower performance: in my original test case, if you replace

with

the difference is much smaller than with doubles:
hm .. probably a mixture of both. The internal buffer being written needs to grow and get copied several times. This could be avoided, as I could signal the expected size when detecting a large "blob" array, so the buffer is allocated with the correct final size. I already have an idea on how to add a special path for large primitive arrays ...
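The grow-and-copy overhead described above can be sketched with a toy byte sink. This is not FST's actual buffer code; the names GrowableBuffer, ensureCapacity, and copies are hypothetical, and the point is only that one up-front size hint replaces a whole series of doublings.

```java
import java.util.Arrays;

public class PreallocDemo {
    // Minimal growable byte sink mimicking the repeated
    // grow-and-copy behavior described above (hypothetical names).
    static class GrowableBuffer {
        byte[] data;
        int pos;
        int copies; // counts how many grow-and-copy operations happened

        GrowableBuffer(int initialCapacity) { data = new byte[initialCapacity]; }

        // Size hint: grow once to the expected final size
        void ensureCapacity(int expected) {
            if (expected > data.length) {
                data = Arrays.copyOf(data, expected);
                copies++;
            }
        }

        void write(byte b) {
            if (pos == data.length) {          // overflow: double and copy
                data = Arrays.copyOf(data, data.length * 2);
                copies++;
            }
            data[pos++] = b;
        }
    }

    public static void main(String[] args) {
        int payload = 1 << 20; // 1 MiB, e.g. a large double[] blob

        GrowableBuffer naive = new GrowableBuffer(64);
        for (int i = 0; i < payload; i++) naive.write((byte) i);

        GrowableBuffer hinted = new GrowableBuffer(64);
        hinted.ensureCapacity(payload);        // signal expected size up front
        for (int i = 0; i < payload; i++) hinted.write((byte) i);

        System.out.println("naive copies:  " + naive.copies);  // 14
        System.out.println("hinted copies: " + hinted.copies); // 1
    }
}
```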
argh, just had to do some last-minute investigation .. this check-in (363d820) fixes the issue for double and float arrays. FST is now ~10-15% ahead for your test case. I'll release when back from vacation. It implicitly also optimizes buffer preallocation ("ensureReadahead()"). The fix is not complete, as all arrays should get optimized ..
Well that was fast! I checked out your master branch and confirm that my use case of serializing a large array of doubles now runs a bit faster with FST than with the JDK. Obviously I was wrong to stare at how doubles are split into bytes, but luckily you found the real optimization ;) Looking forward to the upcoming release(s). Enjoy your trip, this is probably one of the best weeks of the year to spend time in Italy...
thx, had a great time in Italy (though it's tough for me to stay away from computers for a whopping full week ;) .. I even tried to program using AIDE on a crappy tablet ..). Thanks for your analysis, report, and fix validation :-).
Awesome library that works out of the box, and then works even better when you tweak it.
In the scope of my personal usage I have found only one weakness: the serialization of large primitive arrays is slower than standard Java serialization, and much slower for double and float.
That's because FST writes the primitive elements one by one, converting them to bytes one at a time, while Java serialization uses esoteric intrinsic native methods such as java.io.ObjectOutputStream.doublesToBytes and java.io.ObjectOutputStream.floatsToBytes.
Could we consider using those intrinsics in FST too?
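To make the one-by-one versus bulk distinction concrete, here is a minimal sketch of both conversion paths. This is not FST's or the JDK's actual code: perElement imitates the per-value doubleToLongBits-plus-shifting approach, and the ByteBuffer bulk path stands in for the native intrinsic; both produce the same big-endian bytes.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class ConversionPaths {
    // Per-element path, as the issue describes FST doing:
    // one doubleToLongBits call plus byte-by-byte shifting per value.
    static byte[] perElement(double[] values) {
        byte[] out = new byte[values.length * Double.BYTES];
        int p = 0;
        for (double v : values) {
            long bits = Double.doubleToLongBits(v);
            for (int shift = 56; shift >= 0; shift -= 8) {
                out[p++] = (byte) (bits >>> shift);   // big-endian byte order
            }
        }
        return out;
    }

    // Bulk path: one ByteBuffer bulk put, closer in spirit to the
    // native ObjectOutputStream.doublesToBytes intrinsic.
    static byte[] bulk(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES);
        buf.asDoubleBuffer().put(values);
        return buf.array();
    }

    public static void main(String[] args) {
        double[] data = {1.5, -2.25, Double.NaN};
        System.out.println(Arrays.equals(perElement(data), bulk(data))); // true
    }
}
```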