Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage mechanism #155

Merged
merged 18 commits into from
Jan 12, 2021
Merged

storage mechanism #155

merged 18 commits into from
Jan 12, 2021

Conversation

ZacLN
Copy link
Contributor

@ZacLN ZacLN commented May 27, 2020

Custom storage format. @davidanthoff I don't know anything about serialization so this certainly isn't optimised but is ~6x faster than JSON storage.

@davidanthoff davidanthoff added the enhancement New feature or request label May 29, 2020
@davidanthoff davidanthoff added this to the Triage milestone May 29, 2020
@davidanthoff
Copy link
Member

Hehe, that is pretty good :)

@davidanthoff
Copy link
Member

This will hopefully also fix #54 and julia-vscode/LanguageServer.jl#750.

@davidanthoff
Copy link
Member

@ZacLN also, do you think there is much more to be done here, or is this pretty close and ready?

@pfitzseb
Copy link
Member

In the interest of potential other consumers of this: Can we give a JSON3 based solution a go? I feel like JSON (de)serialization shouldn't be this slow.

@davidanthoff
Copy link
Member

If we were to use JSON3.jl, I think it would have to be limited to the LS process, right? JSON3 brings along a fair number of deps, so not a good fit for the symbol server process. But I also think that would be ok, because we are mostly worried about deserialization speed, right? It would mean an additional 3 git submodules... We should also make sure that everything in JSON3 works properly with static compilation. Also, the LS process then would depend on both JSON and JSON3, which is a bit weird, but I guess also not the end of the world?

I guess the other consideration is how stable is JSON3's design? Just not clear to me, but I wouldn't want to take a dep on a package that is work/design in progress.

@ZacLN
Copy link
Contributor Author

ZacLN commented Jun 15, 2020

I think I must have timed JSON 3 before writing this. From my recollection the interesting part of their approach is that they deparse on demand which doesn't help with our use case

@davidanthoff
Copy link
Member

I started reading through the JSON3 docs. Not finished yet, but my thoughts so far: we definitely don't want to use something that is a view into the JSON, I think. That seems not good for our use case because we know that we want to load everything, and more importantly, that means we would have to keep the memory mapped file open on Windows, and that can cause all sort of trouble that I think we should avoid at all cost (assuming that the design is a view into a memory mapped file).

I haven't finished reading through the struct API, but at least the option that requires us to use mutable types I think we should also avoid: I would very much want to keep the option to move to a more and more immutable design in the whole LS space, and that seems to be a pretty fundamental way to prevent that.

Overall, the design of it seems very complicated, involved and low level, and I'm not sure we should take a dep on something like that here...

@pfitzseb
Copy link
Member

The nice thing about JSON3 is that it should allow us to skip the whole Dict-handling stuff necessary for JSON.jl -- we can simply go from a String or IOBuffer or whatever to Julia objects. That should give us a massive performance increase (but to be fair I've never tested that, so who knows).

@davidanthoff
Copy link
Member

My vote would be to go with this here for now, we can always reconsider later, but this seems to be simple, fast and more or less dependency less.

I think once we start to index in the cloud we might want to also save everything in JSON files, so that if we decide to change file format in the future, we don't have to reindex everything, but could just load the JSON files and write them out to our own cache file format v2.

Copy link
Member

@davidanthoff davidanthoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ZacLN I couldn't find the code where this actually is integrated into the load and save routine of the symbol server? Or did I just somehow miss this?

src/server.jl Outdated Show resolved Hide resolved
src/server.jl Outdated Show resolved Hide resolved
src/utils.jl Outdated Show resolved Hide resolved
@ZacLN ZacLN marked this pull request as ready for review September 3, 2020 09:14
@ZacLN
Copy link
Contributor Author

ZacLN commented Jan 11, 2021

Alright @davidanthoff, @pfitzseb, @aviatesk , this is functional and good to go. Do we want to retain the suffix for these files - jstore - which reads as a pretty crap and uninformative to me. We'll also need to remember to update the persistant storage folder (from 2 to 3) within vscode when the whole vscode story is updated.

@davidanthoff
Copy link
Member

Fantastic! We should probably first finish the CSTParser update across all the repos, and then merge this, right?

@ZacLN
Copy link
Contributor Author

ZacLN commented Jan 12, 2021

Either way is fine with me. Are we going to introduce this to LanguageServer/vscode (and so into the wild) at the same time as the CSTParser update? We'll likely get a fair few bugs from both but I don't they'll intersect in a way that'll make things more difficult.

@davidanthoff
Copy link
Member

I think that primarily depends on our availability to fix things quickly. I think if we all have a bit of time at hand in the next weeks (I do) then we could just push it out all at the same time?

@ZacLN
Copy link
Contributor Author

ZacLN commented Jan 12, 2021

Agreed, let's get them out as soon as we can then

@davidanthoff
Copy link
Member

Do we want to retain the suffix for these files - jstore - which reads as a pretty crap and uninformative to me.

I'm indifferent :) No one will ever see these, so I don't think it matters much.

We'll also need to remember to update the persistant storage folder (from 2 to 3) within vscode when the whole vscode story is updated.

Yes, I'll make sure to do that when I update the submodules in the VS Code repo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants