EZP-26807: First implementation for SolrCloud support #86

pspanja · 2016-12-22T13:56:24Z

https://jira.ez.no/browse/EZP-26807

This targets Solr search engine v2.0

This implements support for SolrCloud. Support for running Solr in standalone mode is removed, meaning that from now on the Solr search engine must be used with a Solr backend running in cloud mode.

`implicit` routing

The approach taken here uses the implicit document router to control where documents are indexed. This is achieved by specifying a special field in the indexed document that holds the identifier of the shard it is to be indexed in. For that to work, the collection must be created with these parameters:

router.name=implicit
router.field=<field name>
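As a rough illustration of these parameters, the sketch below builds a Solr Collections API `CREATE` request URL for an implicitly routed collection and tags a document with its target shard. The collection name, shard names, config name, base URL and the routing field name `router_field_s` are assumptions made for this example, not values taken from this PR.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, shards, router_field, config_name):
    """Build a Collections API CREATE URL for an implicitly routed collection."""
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "implicit",
        # With the implicit router, shard names are listed explicitly.
        "shards": ",".join(shards),
        "router.field": router_field,
        "collection.configName": config_name,
        # Assumption: a single local node hosts every shard (Solr 6.x parameter).
        "maxShardsPerNode": len(shards),
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def route_document(document, router_field, shard):
    """Return a copy of the document with the routing field set to the target shard."""
    routed = dict(document)
    routed[router_field] = shard
    return routed


# Hypothetical values, for illustration only.
url = create_collection_url(
    "http://localhost:8983/solr", "ezplatform", ["eng", "ger"], "router_field_s", "ezconfig"
)
doc = route_document({"id": "content-42-eng"}, "router_field_s", "eng")
print(url)
print(doc)
```

The resulting URL could be requested with any HTTP client once the named configuration has been uploaded to ZooKeeper.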

Unfortunately, implicit routing is not supported by the bundled Solr start script, which can only create a collection with the compositeId router, meaning that setting up the cloud for local development will be a little more involved. The Solr initialization script for Travis provided here can be used as an example. For more information see:

Previously, running Solr in standalone/multicore mode made it possible to use a separate schema per core. Usually that meant a dedicated language analysis was configured per core. With Solr running in cloud mode, a collection and all its shards must share the same configuration. That means we will need to handle multiple (per-language) full text fields in the same schema. That is yet to be implemented in this PR. For more details on the available options see Semantic & Multilingual Strategies in Lucene/Solr. For possible future support for dynamic analyzers see SOLR-6492.
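One way to picture per-language full text fields in a single schema is sketched below. This is not what the PR implements (that part is still a TODO). The `text_<language>` naming scheme and the helper names are hypothetical. `defType=edismax` and `qf` are standard eDisMax query parameters.

```python
def fulltext_field_name(language_code):
    """Map a language code to a hypothetical per-language full text field name."""
    return "text_" + language_code.lower().replace("-", "_")


def edismax_params(query, language_codes):
    """Build eDisMax parameters that search all per-language full text fields."""
    return {
        "defType": "edismax",
        "q": query,
        # qf takes a space-separated list of fields to query.
        "qf": " ".join(fulltext_field_name(code) for code in language_codes),
    }


params = edismax_params("hello", ["eng-GB", "ger-DE"])
print(params["qf"])
```

Each such field would need its own analyzer chain in the shared schema, which is the configuration cost the paragraph above describes.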

`compositeId` routing

With compositeId routing, the exact destination shard for a document is not strictly controlled. We provide the shard key and Solr chooses the exact shard itself. For us, that means it would not be possible to direct a document in a specific language to a shard dedicated to that language.
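For comparison, a minimal sketch of how a shard key is supplied under compositeId routing: the key is prefixed to the document ID with a `!` separator, and Solr hashes it to pick the shard. The helper name and the example values are assumptions made for illustration.

```python
def composite_id(shard_key, document_id):
    """Prefix a document ID with a shard key, as the compositeId router expects."""
    if "!" in shard_key:
        # The separator is reserved, so reject keys that contain it.
        raise ValueError("shard key must not contain '!'")
    return f"{shard_key}!{document_id}"


# Documents sharing a key are placed together, but Solr picks which shard.
doc_id = composite_id("eng", "content-42")
print(doc_id)
```

Unlike the implicit router, the key here only groups documents. It does not name the shard they end up in.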

While this might not fit a multilingual setup, it would be desirable for a single-language setup. The benefit of compositeId routing is that shards can be split, which is not possible when using implicit routing. While the theoretical document limit per shard is of no concern for us (more than 2 billion documents), being able to split shards is still practical, as a Solr node works best if it has enough memory to cache its data.

Implementing support for compositeId routing is left for future improvement.

TODOs

Update schema and full text search to use multiple full text fields
Adjust integration tests once "Do not call getSubtreeLocationsCount from provider" (ezpublish-kernel#1871) is fixed

pspanja · 2016-12-23T12:44:59Z

Now added an issue and some description. Any feedback welcome.

andrerom · 2017-03-08T13:09:47Z

bin/.travis/init_solr.sh


 download() {
    case ${SOLR_VERSION} in
-        4.10.4|6.3.0 )
+        6.3.0 )


rebase needed to add 6.4.1

adamwojs · 2019-05-08T07:01:15Z

Closed in favor of #137

pspanja mentioned this pull request Dec 22, 2016

Do not call getSubtreeLocationsCount from provider ezsystems/ezpublish-kernel#1871

Closed

pspanja force-pushed the solr-cloud-support-2.0 branch from f396b67 to 7b5d46c Compare December 23, 2016 08:48

pspanja added 6 commits December 23, 2016 12:27

First implementation for SolrCloud support

21352f9

Resolved todo

683a86e

Fix return hint

1bad9bf

Simplify indexing not to use nested arrays

2828b44

Remove tests for entry endpoint defaulting

553c5e6

Update unit tests: endpoint -> shard

755396a

pspanja force-pushed the solr-cloud-support-2.0 branch from 7b5d46c to 755396a Compare December 23, 2016 11:30

pspanja changed the title ~~[WIP] First implementation for SolrCloud support~~ EZP-26807: First implementation for SolrCloud support Dec 23, 2016

pspanja and others added 3 commits December 23, 2016 13:50

Removed Solr 4.10.4 from Travis builds

b11f2e7

TMP: use eDisMax for full text search

02561df

[Composer] Bump version to 2.0 and allow kernel 7

a73ea40


adamwojs closed this May 8, 2019

andrerom deleted the solr-cloud-support-2.0 branch May 8, 2019 07:26
