EZP-30463: As an Developer, I want to use Solr Cloud #137

Merged: 1 commit, Jun 21, 2019

@adamwojs adamwojs commented Apr 18, 2019

JIRA: https://jira.ez.no/browse/EZP-30463

Description

This PR introduces support for Solr Cloud. More information: https://lucene.apache.org/solr/guide/6_6/solrcloud.html

Configuration

From now on, the user can specify the data distribution strategy for a connection via the distribution_strategy option. The possible values are standalone and cloud.

The default value is standalone, for backward compatibility.

Example Solr Cloud configuration:

ez_search_engine_solr:
    endpoints:
        main:
            dsn: '%solr_dsn%'
            core: '%solr_main_core%' 
        en:
            dsn: '%solr_dsn%'
            core: '%solr_en_core%'
        fr:
            dsn: '%solr_dsn%'
            core: '%solr_fr_core%'
        # ...
    connections:
        default:
            distribution_strategy: cloud
            entry_endpoints:
                - main
                - en
                - fr
                # - ...
            mapping:
                translations:
                    eng-GB: en
                    fre-FR: fr
                    # ...
                main_translations: main

Cluster screenshot:
[Image: screenshot taken 2019-05-21 at 13:33:54]

Document routing

This solution uses the default Solr Cloud document routing strategy: compositeId. Unlike the implicit strategy, it does not require eZ Platform to know the list of shards.

More information: https://lucene.apache.org/solr/guide/6_6/shards-and-indexing-data-in-solrcloud.html#ShardsandIndexingDatainSolrCloud-DocumentRouting
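
As a rough illustration of the routing choice, a collection using the compositeId router could be created through the Solr Collections API. This is a sketch and not part of the PR: the host, collection name, config set name, and shard and replica counts are placeholder assumptions, while `action=CREATE`, `numShards`, `replicationFactor`, `collection.configName` and `router.name` are standard Collections API parameters.

```shell
# Placeholder base URL (assumption); point it at any node of your cluster.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build a Collections API URL that creates a collection routed by compositeId.
# Arguments: collection name, config set name, number of shards, replication factor.
create_collection_url() {
    printf '%s/admin/collections?action=CREATE&name=%s&collection.configName=%s&numShards=%s&replicationFactor=%s&router.name=compositeId\n' \
        "$SOLR_URL" "$1" "$2" "$3" "$4"
}

# Print the request instead of sending it; "main" and "ezconfig" are hypothetical names.
create_collection_url "main" "ezconfig" 2 2

# To actually send it:
#   curl -s "$(create_collection_url main ezconfig 2 2)"
```

With compositeId, Solr hashes each document ID into a shard's hash range, so the client only needs the collection name. The implicit router would instead require the shard names to be listed when the collection is created and a target shard to be chosen at indexing time.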

Language specific analysis

The configuration is based on the multi-core setup, so any language-specific analysis options can be specified at the collection level.

@adamwojs adamwojs self-assigned this Apr 18, 2019
@andrerom
Contributor

☁️FTW😉

bin/.travis/init_solr.sh (outdated, resolved)
bundle/DependencyInjection/Configuration.php (outdated, resolved)
bin/.travis/init_solr.sh (outdated, resolved)
.travis.yml Outdated
@@ -18,7 +18,8 @@ matrix:
env: TEST_CONFIG="phpunit-integration-legacy-solr.xml" SOLR_VERSION="6.5.1" CORES_SETUP="shared"
- php: 7.2
env: TEST_CONFIG="phpunit-integration-legacy-solr.xml" SOLR_VERSION="6.6.5" CORES_SETUP="single" SOLR_CORES="collection1"

- php: 7.2
env: TEST_CONFIG="phpunit-integration-legacy-solr.xml" SOLR_VERSION="6.6.5" CORES_SETUP="cloud" SOLR_CORES="collection1" SOLR_CLOUD="yes"
Contributor

Hmm, based on the logic in getTestConfigurationFile, it does not seem like we need both CORES_SETUP and SOLR_CLOUD. CORES_SETUP=cloud seems like it could be detected wherever needed.

But maybe the logic in getTestConfigurationFile is temporary (simple for now) and you intend to allow several different cloud configs, right?
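
In the init script, that suggestion could look roughly like the following: derive the cloud flag from CORES_SETUP instead of passing a separate SOLR_CLOUD variable. This is only a sketch; the helper name `is_cloud_setup` is hypothetical and the PR's actual script may be structured differently.

```shell
# Hypothetical helper: report whether the requested cores setup is Solr Cloud.
# Prints "yes" for "cloud" and "no" for anything else (e.g. shared, single, or unset).
is_cloud_setup() {
    case "$1" in
        cloud) echo "yes" ;;
        *)     echo "no" ;;
    esac
}

# Derive SOLR_CLOUD from CORES_SETUP so only one variable has to be set in .travis.yml.
SOLR_CLOUD="$(is_cloud_setup "${CORES_SETUP:-}")"
echo "SOLR_CLOUD=${SOLR_CLOUD}"
```

Keeping both variables would still make sense if several distinct cloud configurations were planned, since CORES_SETUP could then name the variant while SOLR_CLOUD stays a simple switch.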

@adamwojs adamwojs force-pushed the ezp_30463 branch 2 times, most recently from a1818cf to 7a2d376 Compare May 8, 2019 06:51
@adamwojs adamwojs changed the title [WIP] EZP-30463: As an Developer, I want to use Solr Cloud EZP-30463: As an Developer, I want to use Solr Cloud May 8, 2019
@adamwojs adamwojs requested review from alongosz and kmadejski May 8, 2019 06:51
@adamwojs
Member Author

PR has been updated with significant changes so please re-review 😉

Member

@kmadejski kmadejski left a comment

Just one small thing which we discussed before. It seems that it has to be covered in this place.

Besides, looks good to me, great job! 🙂

copy_files ${TEMPLATE_DIR} "${files[*]}"

# modify solrconfig.xml to remove section that doesn't agree with our schema
sed -i.bak '/<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">/,/<\/updateRequestProcessorChain>/d' "${TEMPLATE_DIR}/solrconfig.xml" || exit_on_error "Can't modify file '${TEMPLATE_DIR}/solrconfig.xml'"
Member

I think you have to add this line (https://github.com/ezsystems/ezplatform-solr-search-engine/blob/master/bin/generate-solr-config.sh#L123) here as well, even though the tests are passing. Without it, you'll notice a significant delay before seeing content changes (e.g. breadcrumbs in AdminUI).

Member Author

Applied in 260c729

download

if [ "$SOLR_CLOUD" = "no" ]; then
$SCRIPT_DIR/../generate-solr-config.sh \
Contributor

@andrerom andrerom May 23, 2019

@adamwojs This script is used to generate config for platform.sh, which is kind of a must. So most likely many of the changes here should be moved into generate-solr-config.s, so that the script can continue to be used with cloud config on platform.sh as well.

@vidarl might be able to provide more info, and also who to talk to there if needed.

Member Author

You might be right, but let's proceed with the platform.sh + Solr Cloud combination as a follow-up to this PR. I want to unblock QA on this PR, and platform.sh itself is quite a new topic for me 😉

composer.json Outdated
@@ -14,7 +14,7 @@
"prefer-stable": true,
"require": {
"php": "^7.1",
"ezsystems/ezpublish-kernel": "^8.0@dev",
"ezsystems/ezpublish-kernel": "dev-ezp_30463_solr_cloud as 7.5",
Contributor

Suggested change
"ezsystems/ezpublish-kernel": "dev-ezp_30463_solr_cloud as 7.5",
"ezsystems/ezpublish-kernel": "^7.5.2@dev || ^8.0@dev",

Once kernel pr is merged.

@alongosz
Member

alongosz commented Jun 5, 2019

@adamwojs If this is to work with eZ Platform 2.5 LTS you need to rebase against the latest stable branch, master is for 3.0.

@m-tyrala m-tyrala left a comment

Cloud seems to work fine in various configurations of nodes, shards, and cores. Searching can be performed in various languages (Japanese, Polish, French, German, English). Searching still works after deleting a shard replica. After deleting the leader replica, another one takes its place automatically. Documents are distributed roughly equally (distribution scales well with the number of documents).
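
One way to observe the failover behaviour described here is the Collections API CLUSTERSTATUS action, which lists each shard's replicas and marks the current leader. The snippet below is a sketch under assumptions: the base URL and the collection name `main` are placeholders, and it is not necessarily how QA verified the cluster.

```shell
# Placeholder base URL (assumption); point it at any live node of the cluster.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build a CLUSTERSTATUS request URL for one collection.
# Argument: collection name.
cluster_status_url() {
    printf '%s/admin/collections?action=CLUSTERSTATUS&collection=%s&wt=json\n' "$SOLR_URL" "$1"
}

# Print the request instead of sending it; "main" is a hypothetical collection name.
cluster_status_url "main"

# To actually send it, before and after deleting a replica:
#   curl -s "$(cluster_status_url main)"
```

Comparing the output before and after removing the leader replica should show a different replica flagged as leader for that shard.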

QA approves

Contributor

@andrerom andrerom left a comment

+1 But please push 1.7 branch based on 1.6 + composer changes and rebase this on that before merging, as @alongosz said master is 3.0 only now and we need this on 2.5.

@adamwojs adamwojs changed the base branch from master to 1.7 June 21, 2019 11:34
@adamwojs
Member Author

PR has been rebased against the new 1.7 branch (the temporary composer.json changes have also been reverted). Last successful CI build: https://travis-ci.org/ezsystems/ezplatform-solr-search-engine/builds/548644319

@adamwojs adamwojs merged commit af22427 into 1.7 Jun 21, 2019
@adamwojs adamwojs deleted the ezp_30463 branch June 21, 2019 12:50
@kmadejski
Member

Good job @adamwojs! It's cool to see this feature 🎉

5 participants