Unloadable indices for Elasticsearch 2.3.4 #1

Open
wants to merge 1 commit into
base: 2.3.4
from

Conversation

m31collision
Owner

This commit adds unloadable "phantom" indices, mainly to reduce heap usage when you need to store a huge amount of data and search latency does not matter.

An index consists of shards, which are distributed across the nodes in the cluster. Unloading works per node only: each node has its own phantom indices manager, so an index as a whole can be in a partially loaded state.

When an index is requested to become unloadable:

  • the entire index goes read-only, so the data will not change and translogs are not needed
  • phantom shards replace the standard ones and come into play
  • they cache all static stats, since the data is read-only and the stats cannot change
  • they unload the indexed data (this just closes Lucene's SearcherManager)
  • profit!
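
The conversion steps above can be sketched roughly as follows. This is an illustrative Python model, not the code in this commit: `FakeSearcher` and `PhantomShard` are hypothetical names, and `FakeSearcher` only stands in for Lucene's SearcherManager.

```python
class FakeSearcher:
    """Hypothetical stand-in for Lucene's SearcherManager."""

    def __init__(self, doc_count):
        self.doc_count = doc_count
        self.closed = False

    def close(self):
        self.closed = True


class PhantomShard:
    """Read-only shard that caches its stats and drops its searcher."""

    def __init__(self, searcher):
        # The index is read-only from now on, so no translog is needed.
        self.read_only = True
        # Data cannot change any more, so stats captured once stay valid.
        self.cached_stats = {"docs": searcher.doc_count}
        self._searcher = searcher
        self.unload()

    @property
    def loaded(self):
        return self._searcher is not None

    def unload(self):
        # Unloading is just closing the searcher and dropping the reference.
        if self._searcher is not None:
            self._searcher.close()
            self._searcher = None

    def stats(self):
        # Served from the cache, so no load is required to answer stats calls.
        return self.cached_stats
```

The point of the sketch is that stats requests never force a reload; only searches do.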

When a search request arrives:

  • the manager unloads enough other shards on the current node to make room
  • the target shard loads its data again
  • the search can be performed

Further search requests load more phantom shards, but once the limit is reached, new requests block until other search requests finish. As a result, the fewer phantom shards are permitted to be loaded at once, the more requests wait on unloaded shards, and the longer each search request takes to complete.
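
The load/unload/block behaviour can be modelled roughly like this. It is a minimal Python sketch under stated assumptions, not the manager from this commit: `PhantomManager`, `acquire`, and `release` are hypothetical names, the eviction order (least recently used first) is an assumption, and shard sizes are passed in by the caller rather than measured.

```python
import threading
from collections import OrderedDict


class PhantomManager:
    """Per-node heap budget for loaded phantom shards (illustrative only)."""

    def __init__(self, max_heap_bytes):
        self.max_heap_bytes = max_heap_bytes
        self._used = 0
        # shard_id -> [size, active_searches]; oldest entry first (assumed LRU).
        self._loaded = OrderedDict()
        self._cond = threading.Condition()

    def acquire(self, shard_id, size):
        """Block until shard_id is loaded, then pin it for one search."""
        if size > self.max_heap_bytes:
            raise ValueError("shard can never fit into the budget")
        with self._cond:
            while True:
                entry = self._loaded.get(shard_id)
                if entry is not None:
                    entry[1] += 1
                    self._loaded.move_to_end(shard_id)
                    return
                self._evict_idle(size)
                if self._used + size <= self.max_heap_bytes:
                    self._loaded[shard_id] = [size, 1]
                    self._used += size
                    return
                # Limit reached and everything loaded is busy: wait.
                self._cond.wait()

    def release(self, shard_id):
        """Mark one search on shard_id as finished and wake up waiters."""
        with self._cond:
            self._loaded[shard_id][1] -= 1
            self._cond.notify_all()

    def _evict_idle(self, needed):
        # Unload shards with no running search until `needed` bytes fit.
        for sid in list(self._loaded):
            if self._used + needed <= self.max_heap_bytes:
                break
            size, active = self._loaded[sid]
            if active == 0:
                del self._loaded[sid]
                self._used -= size

    def loaded_shards(self):
        with self._cond:
            return list(self._loaded)
```

A search would wrap its work in `acquire(...)` and `release(...)`. With a small budget most calls end up in `wait()`, which is the latency trade-off described above.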

Phantom mode can be turned off for an index when new data needs to be indexed.

How to use:

  • The boolean index setting "index.phantom" controls whether an index is unloadable
  • The cluster setting "index.phantom.max_heap_size" (byte size or percentage) limits how many phantom shards can be loaded per node; the rest stay unloaded
  • Deleting a phantom index works despite the read-only mode
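
As a rough illustration of the settings above, the following Python helpers build the request method, path, and JSON body without sending anything. The setting names come from this PR. The helper names and the index name `logs-2016` are hypothetical, and it is an assumption that the heap limit can be changed at runtime through `/_cluster/settings` rather than only in the node configuration.

```python
import json


def phantom_index_request(index, enabled):
    """Request that turns phantom mode on or off for one index."""
    body = {"index": {"phantom": enabled}}
    return ("PUT", f"/{index}/_settings", json.dumps(body))


def phantom_heap_limit_request(limit):
    """Request that sets the per-node heap limit for loaded phantom shards.

    `limit` is a byte size such as "2gb" or a percentage such as "20%".
    Assumption: the setting is dynamically updatable via cluster settings.
    """
    body = {"persistent": {"index.phantom.max_heap_size": limit}}
    return ("PUT", "/_cluster/settings", json.dumps(body))


if __name__ == "__main__":
    for method, path, body in (
        phantom_index_request("logs-2016", True),
        phantom_heap_limit_request("20%"),
    ):
        print(method, path, body)
```

Turning phantom mode off again is the same index request with `enabled=False`.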

Stats affected/added:

  • _cat/indices has a new column with +/- that shows which indices are phantom
  • _cat/shards has a new column with +/- that shows which shards are loaded; it is empty for standard shards
  • _nodes/stats has a new section with phantom manager stats
  • _stats/phantom is added; it shows phantom shard stats, additionally summarized per index
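
For anyone scripting against the new _cat columns, the markers could be interpreted roughly as below. This is only a sketch: the function names are hypothetical, and reading "-" as "unloaded" for shards and as "not phantom" for indices is an inference from the description above, not something checked against the actual output.

```python
def shard_state(marker):
    """Interpret the new _cat/shards column for one shard."""
    marker = marker.strip()
    if marker == "+":
        return "phantom, loaded"
    if marker == "-":
        return "phantom, unloaded"  # assumed meaning of "-"
    if marker == "":
        return "standard"
    raise ValueError(f"unexpected marker: {marker!r}")


def index_is_phantom(marker):
    """Interpret the new _cat/indices column for one index."""
    marker = marker.strip()
    if marker not in ("+", "-"):
        raise ValueError(f"unexpected marker: {marker!r}")
    return marker == "+"  # assumed: "+" phantom, "-" standard
```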

* indices can be made unloadable (or standard again) by putting the setting "index.phantom" with the value true (or false) into the index settings
* memory used by unloadable indices on the current node is limited by the setting "index.phantom.max_heap_size" (percentage or byte size)
* unloadable indices cannot index new data, so they are read-only
* they can still be deleted
* stats are available in _cat/indices, _cat/shards, _nodes/stats, _stats, _stats/phantom