-
Notifications
You must be signed in to change notification settings - Fork 830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add simple, queue based kryo pool #230
Conversation
Cool. Simple and clean :) I would soft-reference the pooled instances, to release them when GC reclaims space. This would keep the simplicity of the artifact (zero-configuration) while becoming GC-friendly. Regards, -----Original Message----- I'm submitting this as a proposal for a simple kryo pool, didn't want to push it directly into master. Some words on the suggested pool... Kryo instances are created using a The included
KryoPool usage example: KryoFactory factory = new KryoFactory() {
public Kryo create () {
Kryo kryo = new Kryo();
// configure kryo
return kryo;
}
};
KryoPool pool = new KryoPool(factory);
Kryo kryo = pool.borrow();
// do s.th. with kryo here, and afterwards release it
pool.release(kryo);
// or use a callback to work with kryo
String value = pool.run(new KryoCallback() {
public String execute(Kryo kryo) {
return kryo.readObject(input, String.class);
}
}); You can merge this Pull Request by running: git pull https://github.com/magro/kryo master Or you can view, comment on it, or merge it online at: -- Commit Summary --
-- File Changes --
-- Patch Links -- https://github.com/EsotericSoftware/kryo/pull/230.patch Reply to this email directly or view it on GitHub: |
And if you go this way, for me is always preferrable to soft-reference the hole backing queue than to softreference eache queue entry, as th GC has less work to do. Reclaim all or nothing. Tumi -----Original Message----- Cool. Simple and clean :) I would soft-reference the pooled instances, to release them when GC reclaims space. This would keep the simplicity of the artifact (zero-configuration) while becoming GC-friendly. Regards, -----Original Message----- I'm submitting this as a proposal for a simple kryo pool, didn't want to push it directly into master. Some words on the suggested pool... Kryo instances are created using a The included
KryoPool usage example: KryoFactory factory = new KryoFactory() {
public Kryo create () {
Kryo kryo = new Kryo();
// configure kryo
return kryo;
}
};
KryoPool pool = new KryoPool(factory);
Kryo kryo = pool.borrow();
// do s.th. with kryo here, and afterwards release it
pool.release(kryo);
// or use a callback to work with kryo
String value = pool.run(new KryoCallback() {
public String execute(Kryo kryo) {
return kryo.readObject(input, String.class);
}
}); You can merge this Pull Request by running: git pull https://github.com/magro/kryo master Or you can view, comment on it, or merge it online at: -- Commit Summary --
-- File Changes --
-- Patch Links -- https://github.com/EsotericSoftware/kryo/pull/230.patch Reply to this email directly or view it on GitHub: |
Good point on using soft references! On 07/21/2014 04:13 PM, Tumi wrote:
If there are kryo instances that grow rather big I'd say soft So I'd prefer to soft reference single kryo instances. Cheers,
inoio gmbh - http://inoio.de |
Yes you are right! The only difference would be a doing single nullity check versus a loop while polling nulled references in borrow(), but in practice there is no difference. So better soft reference instances if you prefer P.S: Many years ago I came to this article on IBM developerworks (umm, so much time since 2006), http://www.ibm.com/developerworks/library/j-jtp01246/index.html#2.1 , section “A poor man’s cache”, but you are right that in this case we are not talking about thousand of pooled instances. Tumi De: Martin Grotzke [mailto:[email protected]] Good point on using soft references! On 07/21/2014 04:13 PM, Tumi wrote:
If there are kryo instances that grow rather big I'd say soft So I'd prefer to soft reference single kryo instances. Cheers,
inoio gmbh - http://inoio.de — |
+1 from me. Nice idea and pretty simple implementation. |
BTW, should there be an option to set a max size of the pool in a constructor, i.e. created a size-bound pool? |
I think in this case there is no need of that parametrization (and the small added complexity in borrow and release): there should a number of pooled instances <= the max number of threads (except for leaks or bad usages), so that limitatios is already configured in yout thread pool, if any. (My opinion) De: romix [mailto:[email protected]] BTW, should there be an option to set a max size of the pool in a constructor, i.e. created a size-bound pool? — |
I'm new for kryo. |
@mindwind I think it is described on the GitHub Kryo home page, but I can explain it again: |
@romix thanks, i will read the home page more carefully. |
I just changed the pool to cache kryo instances using SoftReferences... |
In borrow(), instead of making a single poll() , I think that you should loop while the soft-referenced polled instance is not null (as GC may have reclaimed one instance but not the next). Cheers, -----Original Message----- I just changed the pool to cache kryo instances using SoftReferences... Reply to this email directly or view it on GitHub: |
@serverperformance Good point! I just changed this to while((ref = queue.poll()) != null) {
if((res = ref.get()) != null) {
return res;
}
} |
@magro Looks good. You should merge it, I'd say. And please add a corresponding section into the docs on our homepage. |
@romix Good point, I just added a section to the README. To you want to have a quick look at it? |
@magro I had a quick look. Here are my quick comments:
|
@@ -531,6 +532,36 @@ The serializers Kryo provides use the call stack when serializing nested objects | |||
|
|||
**Kryo is not thread safe. Each thread should have its own Kryo, Input, and Output instances. Also, the byte[] Input uses may be modified and then returned to its original state during deserialization, so the same byte[] "should not be used concurrently in separate threads**. | |||
|
|||
## Pooling Kryo instances | |||
|
|||
Because the creation/initialization of `Kryo` instances is rather expensive, in a multithreaded scenario you should use the `KryoPool`. Here's an example that shows how to use it: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say most of the time you'd want to use ThreadLocal and have a Kryo per thread. You'd only need a pool if you don't want a Kryo per thread for whatever reason. What is a reason you would not want a Kryo per thread? A thread is pretty heavyweight. I would guess you could have a Kryo per thread in just about any situation and be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, ThreadLocal is also techically possible. But to be honest I don't like ThreadLocals that much because it feels like a global variable that a component accesses, violating IoC / reducing control. Additionally it depends on the ThreadPool that's used and the threading strategy (what if Kryo/the ThreadLocal used from within an I/O event loop running with a single thread only?). So IMHO using ThreadLocals is more complex and the user needs to be sure in which invironment he uses a ThreadLocal.
Still, I'd mention ThreadLocal as a possible solution in the README.
Ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if Kryo/the ThreadLocal used from within an I/O event loop running
with a single thread only?
Seems like it would work fine? If you have a single thread, you don't need
more than one Kryo instance.
So IMHO using ThreadLocals is more complex and the user needs to be sure
in which invironment he uses a ThreadLocal.
Agreed, but only slightly. As long as both ways are mentioned I'm good with
the pool.
@magro BTW, Martin, this PR reminds me of another issue, we discussed at some point (e.g. #182). Current PR is about pooling Kryo instances. But it would be also cool to have a nice and flexible way to create Kryo instances with the same configuration. Do we want to provide something for it as well? |
Looks perfect to me, nice work! -----Original Message----- @serverperformance Good point! I just changed this to while((ref = queue.poll()) != null) {
if((res = ref.get()) != null) {
return res;
}
} Reply to this email directly or view it on GitHub: |
@romix Thanks for your suggestions, I improved the README (now as separate commit) and also mentioned ThreadLocal now. Still/again room for improvements? Re configuration of Kryo instances, do you have s.th. in mind like Chill's KryoInstantiator that you mentioned in #182? Yes, we could provide this OOTB as well. I can look into this when I'm looking at memcached-session-manager and kryo2 again (which can take some time to be honest ;-)). |
@magro README is OK now, IMHO Re configuration of Kryo instances, I don't have anything concrete at hand. One idea could be really to borrow Chill's approach. Or we could follow what you suggested in #182 and provide a builder API or something like that. Overall:
|
Kryo instances are created using a `KryoFactory` that's passed to the pool. The pool also allows to run callbacks by passing a kryo instance. The pool uses a `ConcurrentLinkedQueue` to manage kryo instances. The included `KryoPoolBenchmarkTest` (with `ITER_CNT = 100000`) shows the following output for me (excerpt): >>> With pool (average): 8 ms >>> Without pool (average): 2,105 ms KryoPool usage example: ```java KryoFactory factory = new KryoFactory() { public Kryo create () { Kryo kryo = new Kryo(); // configure kryo return kryo; } }; KryoPool pool = new KryoPool(factory); Kryo kryo = pool.borrow(); // do s.th. with kryo here, and afterwards release it pool.release(kryo); // or use a callback to work with kryo String value = pool.run(new KryoCallback() { public String execute(Kryo kryo) { return kryo.readObject(input, String.class); } }); ```
Add simple, queue based kryo pool
Late to the party, just wanted to share two thoughts
My users chain factories together, end users want their KryoFactory to start with a Kryo created by the library, but all registration needs to happen after default serializers are added so that the correct serializer is chosen. So first create is called, and the chain of factories installs default serializers, and then registration is called when the chain of factories registers types. With only a single create method, the library's Factory has to do registration inside create() and so the end user's Factory cannot change the default serializer for those types. |
Can you go into more detail regarding the impact on the jvm (and share some docs/numbers perhaps)?
Can't you just use your internal KryoFactory and chains of factories inside the newly introduced c.e.k.pool.KryoFactory? |
Ping @brianfromoregon |
Hi, I think you don't need to change any of what you've produced for me, I just wanted to share the use case early on. Yes I think an interface is good practice anyway (Kryo should have an interface imo). Yes I could chain my factories inside of your factory, it would then just mean I have two KryoFactory types in my app when ideally there would be one. Thanks! |
@brianfromoregon Would you use the pool if it wouldn't use SoftReferences? Which change would you suggest regarding the KryoFactory? |
Hi, I might use the pool at home but at work I key Kryos by name so wouldn't need the pool. For the KryoFactory I would suggest keeping it simple like it is now, and keeping its use constained to the pool package. Because, KryoFactory overlaps in a small way with what I believe to be a large gap in the Kryo world: versioning. So I'd avoid committing code in that space ("how do we create and version Kryos") until a strategic direction is agreed upon. I attended a great talk at oscon last week about designing APIs for reuse. I agreed with his argument that the key is a message format that can be extended (added to) over time. And for that, a good approach is having a "bag" of key/value pairs which can be added to PLUS not imposing structure in the message, keeping it as just data and letting reader decide structure they wish to impose. Anyway, it got me thinking about Kryo and the serialized forms I'm persisting to file/grid today. They are NOT extensible at all by default, just adding a field to a class breaks it. :( So with Kryo I can't use any of the advice he gave.... so how do we version with Kryo. I don't know, like i said it feels like a big gap that something like protobuf has a better answer for. |
On Tue, Jul 29, 2014 at 1:39 AM, Brian Harris [email protected]
Kryo doesn't limit you, you can write serializers that support versioning -Nate |
Why is TaggedFieldSerializer not the default if it has such high recommendation? Surely for out of the box behavior a slight performance hit would be worth the compatibility gain? I will definitely consider switching to it as default. |
Because it requires tagging fields with an annotation. On Tue, Jul 29, 2014 at 7:49 AM, Brian Harris [email protected]
|
I think also your guess about performance is not quite wrong. Many people use Kryo for in-flight data. Compatibility is often not an issue for them. If they would need compatibility, they would probably have used protostuff-runtime or avro (which both don't need any pre-defined schemas and can serialize almost any classes out of the box) or they would go for something like thrift, protobuffers, avro and other schema-based approaches. I think if Kryo is to be seriously used for such scenarios where compatibility and versioning is so important, then we should eventually consider something like schema-based solutions (either predefined or dynamically generated). Another dimension is where and how you keep the meta-data, i.e. your schema.
My conclusion: the problem is not as straight forward as it may seem. We need a good discussion resulting in reasonable strategy if we want to move forward in that direction. |
On Tue, Jul 29, 2014 at 10:11 AM, romix [email protected] wrote:
Kryo is already very good at versioning. Most applications only need One of my apps has released more than 200 versions over the years and any The drawback to TaggedFieldSerializer is that it requires field One reason to not use TaggedFieldSerializer is if you need forward I don't think it is very interesting to compete with protobuf, which Using a schema has it's own drawbacks. First, you need to write and Using a schema also means you need to smush your data to fit in the schema.
-Nate |
@nate: Good points and explanations about TaggedFieldSerializer. Regarding schema-based solutions: You realize that AVRO and protostuff-runtime have a mode, where no pre-defined schema is required? It is generated dynamically during serialization based only on class-files, i.e. it is very similar to what Kryo does + it writes a schema (for each object graph). As a result, any avro instance and any protostuff-runtime instance can read the data serialized this way. Regarding usage of TaggedFieldSerializer: From experience, I can tell you that DB people would almost certainly not accept non-schema based solution. This is just a fact of life ;-) |
BTW, Avro would write this whole automatically derived schema in front of the serialized data whereas protostuff-runtime would more or less embed the information in front of each field similar to TaggedFieldSerializer. I think recent versions of Hazelcast serialization also started using something similar. |
On Tue, Jul 29, 2014 at 11:09 AM, romix [email protected] wrote:
I don't see the DB as a driving factor for how serialization is done. Kryo Cheers, |
Nathan, TaggedFieldSerializer sounds great, except asking my users to annotate their objects with @tag is not going to fly. Would it be possible to have a serializer which does the same thing as taggedfieldserializer (puts the varint with each field) except its written to assume that all fields are "annotated"/"tagged"? Then I could change my default serializer to this new thing and everyone gets backward compatibility out of the box. |
On Tue, Jul 29, 2014 at 6:43 PM, Brian Harris [email protected]
-Nate |
This discussion about CompatibleFieldSerializer & alternatives is really interesting and useful, but it is getting more and more off-topic for this PR. I'd suggest to create a dedicated issue and continue this discussion there. BTW, when it comes to a 3rd party classes and there is no way to annotate them in a usual way, one trick I used in some projects was to allow for "external" annotations. In this case, annotations are not provided inside class files, but elsewhere (config files, DB, etc) in a declarative way. Such external definitions would contain class names, names of the annotated fields and annotations themselves (i.e. the class name of an annotation and its parameters). Of course, we introduced a simple interface which abstracted away how annotations were accessed at the low-level (i.e. from a class file or from such a declarative description). This way we could "annotate" even third-party libs we were using. |
As you've probably seen I've submitted another pull request that changes the KryoPool to an interface and makes SoftReferences optional. Regarding the discussion about versioning: +1 for @romix' suggestion, it would be great if we could continue the discussion or on the mailing list or a dedicated issue (and not stop it here :-)). |
I'm submitting this as a proposal for a simple kryo pool, didn't want to push it directly into master. Some words on the suggested pool...
Kryo instances are created using a
KryoFactory
that's passed to the pool.The pool uses a
ConcurrentLinkedQueue
to manage kryo instances.The pool also allows to run callbacks by passing a kryo instance.
The included
KryoPoolBenchmarkTest
(withITER_CNT = 100000
) showsthe following output for me (excerpt):
KryoPool usage example: