Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Big performance bottleneck with default comparator #330

Closed
Delapouite opened this issue Oct 24, 2012 · 7 comments
Closed

Big performance bottleneck with default comparator #330

Delapouite opened this issue Oct 24, 2012 · 7 comments

Comments

@Delapouite
Copy link
Contributor

Hi

The FilesCollection has a default comparator : [date:-1, name:1]

The issue is that during the first steps of the generation process, the onChange event keeps triggering the sorting of the collection over and over which is a very costly operation over heaps of entries (thousands of models).

By simply removing this comparator, the Parsing phase time was divided by more than 10, going from 50 seconds to only 3. The Contextualizing step went from 15 minutes to only 2min.

Significant improvements that are such a relief to my Docpad experience !

Is the comparator really mandatory in those initial phases or should collections be sorted only once on the contextualizeAfter event, before the rendering ?

@balupton
Copy link
Member

Wow. What a fantastic find! Great work!

This makes me think, sorting is important, and it is important for live collections to stay up to date, however not all the time is that important. Perhaps we can pause and resume live collections after we've done processing batches. Kind of combining the best of non-live and live together. This would surely bring huge improvements to everyone that uses queryengine.

For DocPad, that would mean, pausing live collections at the start, then resuming them just before the rendering process. I'll think into this some more. But we can happily wipe that default comparator for any default collections in DocPad, I doubt anyone would use those collections directly.

@balupton
Copy link
Member

Just took a look and I couldn't find the default comparator you mention. Is that defined in the docpad configuration file or in docpad itself? Perhaps my eyes are rusty ;)

Also, turns out QueryEngine already does support the pause resume stuff we're after. collection.live(false) to pause and collection.live(true) to resume. This should work well :)

So I think in perhaps generatePrepare we pause all the collections, and then in render event we resume them. Your thoughts? If you're inclined to do a pull, I'll happily accept :)

@balupton
Copy link
Member

Did some work on the above suggestion on https://github.com/bevry/docpad/tree/dev-suspendcollections - however there is some major problems with this:

  • live must be re-enabled before generateRender
  • add events that modify things do not care for the reset event which is what .query() will fire

Not sure how we can optimise this further, as the above requires some nasty hacks.

@Delapouite
Copy link
Contributor Author

That's definitely a problem not easy to tackle, because we also have to make sure it doesn't break plugin behavior that hook early in the process.

For instance I rely a lot on the relatedPlugin which does its magic on the parseAfter event, setting it's own comparator on sub relatedDocuments collections. By removing the default comparator, and reactivating the live on the generateRender, the order is all shuffled.

@balupton
Copy link
Member

@Delapouite the work I've been doing on queryengine should see massive improvements here - http://jsperf.com/queryengine/3

@balupton
Copy link
Member

Give DocPad v6.10.0 a try. It handles things a bit differently so you should see good performance boosts.

balupton added a commit that referenced this issue Nov 23, 2012
- v6.12.0 November 23, 2012
	- When creating new documents or files, if it is inside an unknown
path we will now default to creating a document intead of a file
	- We now send growl notifications when errors occur
		- Thanks to [Luke Hagan](https://github.com/lhagan) for [pull request
#346](#346) and [issue
#343](#343)
	- We now error and provide suggestions when an extension transform
doesn't do anything
		- Thanks to [Farid Neshat](https://github.com/alFReD-NSH), [Elias
Dawson](https://github.com/eliasdawson) and [Steve
Trevathan](https://github.com/kidfribble) for [issue
#192](#192)
	- Watching stability has been improved signifcantly
		- Thanks to [ashnur](https://github.com/ashnur) for [issue
#283](#283)
	- Parser headers that don't include spacing now work again (e.g.
`---cson` instead of `--- cson`)
		- Thanks to [bobobo1618](https://github.com/bobobo1618) for [issue
#341](#341)
	- Removed default comparator on `FilesCollection` due to performance
improvement it provides
		- Thanks to [Bruno Héridet](https://github.com/Delapouite) for [issue
#330](#330)
	- Added
		- `Document::parseFileDirectory(opts,next)`
		- `Document::parseDocumentDirectory(opts,next)`
		- `FilesCollection::fuzzyFindOne(data)`
@balupton
Copy link
Member

Released to DocPad v6.12

balupton added a commit that referenced this issue Oct 23, 2013
- v6.12.0 November 23, 2012
	- When creating new documents or files, if it is inside an unknown
path we will now default to creating a document intead of a file
	- We now send growl notifications when errors occur
		- Thanks to [Luke Hagan](https://github.com/lhagan) for [pull request
#346](#346) and [issue
#343](#343)
	- We now error and provide suggestions when an extension transform
doesn't do anything
		- Thanks to [Farid Neshat](https://github.com/alFReD-NSH), [Elias
Dawson](https://github.com/eliasdawson) and [Steve
Trevathan](https://github.com/kidfribble) for [issue
#192](#192)
	- Watching stability has been improved signifcantly
		- Thanks to [ashnur](https://github.com/ashnur) for [issue
#283](#283)
	- Parser headers that don't include spacing now work again (e.g.
`---cson` instead of `--- cson`)
		- Thanks to [bobobo1618](https://github.com/bobobo1618) for [issue
#341](#341)
	- Removed default comparator on `FilesCollection` due to performance
improvement it provides
		- Thanks to [Bruno Héridet](https://github.com/Delapouite) for [issue
#330](#330)
	- Added
		- `Document::parseFileDirectory(opts,next)`
		- `Document::parseDocumentDirectory(opts,next)`
		- `FilesCollection::fuzzyFindOne(data)`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants