Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Override Distribution Downloader to download opensearch-min artifact from a custom url #1240

Closed
peternied opened this issue Sep 14, 2021 · 15 comments
Assignees
Labels
Build Libraries & Interfaces enhancement Enhancement or improvement to existing feature or request v1.4.0

Comments

@peternied
Copy link
Member

The opensearch-plugin-cli works well for getting the last released and release candidate builds, but lacks pull artifacts from any location for scenarios like local development and testing.

This is needed for build and release process to validate components before they are released.

@setiah
Copy link
Contributor

setiah commented Sep 14, 2021

Related to opensearch-project/opensearch-build#319

@saratvemulapalli
Copy link
Member

There are 2 different pieces to this.

  1. Integration Test framework within OpenSearch which is leveraged by all plugins via opensearch.testclusters' gradle plugin. Even though the integration test is provided with an external test cluster it still downloads the min-artifact via DistributionDownloader. The fix is to not let the framework download the artifact if an external cluster is provided.
  2. Add support to pull the artifact from an override location/ use it from the local file system. This is needed for BWC tests where the test framework orchestrates the test cluster and upgrades them during the tests. The fix is to override the url DistributionDownloader is using and pull it from a new path via the gradle plugin.

@peternied peternied removed their assignment Sep 27, 2021
@dblock dblock added the v1.2.0 Issues related to version 1.2.0 label Nov 2, 2021
@dblock dblock changed the title opensearch-plugin-cli should accept a url as a parameter to download artifacts from opensearch-plugin-cli needs to accept a url as a parameter to download artifacts from Nov 2, 2021
@saratvemulapalli
Copy link
Member

saratvemulapalli commented Nov 2, 2021

There are 2 different pieces to this.

  1. Integration Test framework within OpenSearch which is leveraged by all plugins via opensearch.testclusters' gradle plugin. Even though the integration test is provided with an external test cluster it still downloads the min-artifact via DistributionDownloader. The fix is to not let the framework download the artifact if an external cluster is provided.
  2. Add support to pull the artifact from an override location/ use it from the local file system. This is needed for BWC tests where the test framework orchestrates the test cluster and upgrades them during the tests. The fix is to override the url DistributionDownloader is using and pull it from a new path via the gradle plugin.

Adding on this, couple of different options:

Short term: Update the zip in the artifacts repository for distribution downloader to pick it up. Downsides: This is manual and needs hand holding for every release.

Long Term 1: Skip downloading the zip via distribution downloader when an external cluster is provided for integration tests. Downsides: This only works for integration tests but not BWC tests. -> about 2-3 days of work.

Long Term 2: Override distribution downloader to download the cluster zip from a new endpoint dynamically. This solution works for IntegTests, and BWC tests. -> about 4-6 days of work.

I am leaning towards Long term 1, as its simpler and it has to be done anyway in the future. As the plugins integration test framework unnecessarily depend on published zips even though it's not used.

@saratvemulapalli
Copy link
Member

saratvemulapalli commented Nov 9, 2021

We have decided to go with Long term 1 for 1.2 which should anyway be done.

Double clicking on it:
The problem is the integration tests in plugins use testClusters in the build scripts even though when they would like to use remote endpoints. This is causing the integration tests in Jenkins CI to have the published artifacts even though the test clusters need not be setup.

Example: Looking at Anomaly Detection plugin. The integTest task depends on having a cluster setup but literally doesn't use it if few java arguments like remote cluster endpoints are provided.
Ref: https://github.com/opensearch-project/anomaly-detection/blob/main/build.gradle#L196-L209.

There are 2 ways to solve this:

  • The simplest way to solve this is to have dynamically configure integTest task in plugins depending on test cluster already setup or the framework needs to do it for them.
  • Have another task defined for remote endpoint integration tests and kick them via integtest.sh

So for longterm solution 1, this fix has to be made in plugins.
And for longterm solution 2, the enhancement has to be made in OpenSearch and consumed by plugins.

@saratvemulapalli
Copy link
Member

To test the above solution I did a quick POC in AD.

Added

task integTestRemote(type: StandaloneRestIntegTestTask) {
    testClassesDirs = sourceSets.test.output.classesDirs
    classpath = sourceSets.test.runtimeClasspath
    systemProperty 'tests.security.manager', 'false'
    systemProperty 'java.io.tmpdir', opensearch_tmp_dir.absolutePath

    systemProperty "https", System.getProperty("https")
    systemProperty "user", System.getProperty("user")
    systemProperty "password", System.getProperty("password")

    // Only rest case can run with remote cluster
    if (System.getProperty("tests.rest.cluster") != null) {
        filter {
            includeTestsMatching "org.opensearch.ad.rest.*IT"
            includeTestsMatching "org.opensearch.ad.e2e.*IT"
        }
    }

    if (System.getProperty("https") == null || System.getProperty("https") == "false") {
        filter {
            excludeTestsMatching "org.opensearch.ad.rest.SecureADRestIT"
        }
    }

    if (System.getProperty("tests.rest.bwcsuite") == null) {
        filter {
            excludeTestsMatching "org.opensearch.ad.bwc.*IT"
        }
    }
}

The tests start to run on Darwin (which usually doesnt because darwin artifacts are not published yet for OpenSearch).
The tests now fail due to Jar hell, but its still progress.

% ./gradlew integTestRemote -Dtests.rest.cluster=localhost:9200 -Dtests.cluster=localhost:9200 -Dtests.clustername="docker-cluster" -Dhttps=true -Duser=admin -Dpassword=admin
=======================================
OpenSearch Build Hamster says Hello!
  Gradle Version        : 6.6.1
  OS Info               : Mac OS X 10.16 (x86_64)
  JDK Version           : 14 (Oracle JDK)
  JAVA_HOME             : /Library/Java/JavaVirtualMachines/jdk-14.0.2.jdk/Contents/Home
  Random Testing Seed   : 21321835E3218CF
  In FIPS 140 mode      : false
=======================================

> Task :integTestRemote

org.opensearch.ad.e2e.DetectionResultEvalutationIT > classMethod FAILED
    java.lang.RuntimeException: found jar hell in test classpath
        at org.opensearch.bootstrap.BootstrapForTesting.<clinit>(BootstrapForTesting.java:112)
        at org.opensearch.test.OpenSearchTestCase.<clinit>(OpenSearchTestCase.java:251)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:427)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:623)

        Caused by:
        java.lang.IllegalStateException: jar hell!
        class: com.sun.jna.AltCallingConvention
        jar1: /Users/vemsarat/.gradle/caches/modules-2/files-2.1/org.elasticsearch/jna/5.5.0/ade077cbb2618a18bfc6c335413b2b7163d97601/jna-5.5.0.jar
        jar2: /Users/vemsarat/.gradle/caches/modules-2/files-2.1/net.java.dev.jna/jna/5.5.0/e0845217c4907822403912ad6828d8e0b256208/jna-5.5.0.jar
            at org.opensearch.bootstrap.JarHell.checkClass(JarHell.java:300)
            at org.opensearch.bootstrap.JarHell.checkJarHell(JarHell.java:208)
            at org.opensearch.bootstrap.JarHell.checkJarHell(JarHell.java:99)
            at org.opensearch.bootstrap.BootstrapForTesting.<clinit>(BootstrapForTesting.java:110)
            ... 4 more


Suite: Test class org.opensearch.ad.e2e.DetectionResultEvalutationIT
  2> java.lang.RuntimeException: found jar hell in test classpath
        at org.opensearch.bootstrap.BootstrapForTesting.<clinit>(BootstrapForTesting.java:112)
        at org.opensearch.test.OpenSearchTestCase.<clinit>(OpenSearchTestCase.java:251)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:427)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:623)

        Caused by:
        java.lang.IllegalStateException: jar hell!
        class: com.sun.jna.AltCallingConvention
        jar1: /Users/vemsarat/.gradle/caches/modules-2/files-2.1/org.elasticsearch/jna/5.5.0/ade077cbb2618a18bfc6c335413b2b7163d97601/jna-5.5.0.jar
        jar2: /Users/vemsarat/.gradle/caches/modules-2/files-2.1/net.java.dev.jna/jna/5.5.0/e0845217c4907822403912ad6828d8e0b256208/jna-5.5.0.jar
            at org.opensearch.bootstrap.JarHell.checkClass(JarHell.java:300)
            at org.opensearch.bootstrap.JarHell.checkJarHell(JarHell.java:208)
            at org.opensearch.bootstrap.JarHell.checkJarHell(JarHell.java:99)
            at org.opensearch.bootstrap.BootstrapForTesting.<clinit>(BootstrapForTesting.java:110)
            ... 4 more

org.opensearch.ad.rest.SecureADRestIT > classMethod FAILED
    java.lang.NoClassDefFoundError: Could not initialize class org.opensearch.ad.ODFERestTestCase
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:427)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:623)


Suite: Test class org.opensearch.ad.rest.SecureADRestIT
  2> java.lang.NoClassDefFoundError: Could not initialize class org.opensearch.ad.ODFERestTestCase
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:427)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:623)

org.opensearch.ad.rest.HistoricalAnalysisRestApiIT > classMethod FAILED
    java.lang.NoClassDefFoundError: Could not initialize class org.opensearch.ad.AnomalyDetectorRestTestCase
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:427)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:623)


Suite: Test class org.opensearch.ad.rest.HistoricalAnalysisRestApiIT
  2> java.lang.NoClassDefFoundError: Could not initialize class org.opensearch.ad.AnomalyDetectorRestTestCase
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:427)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:623)

org.opensearch.ad.rest.AnomalyDetectorRestApiIT > classMethod FAILED
    java.lang.NoClassDefFoundError: Could not initialize class org.opensearch.ad.AnomalyDetectorRestTestCase
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:427)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:623)


Suite: Test class org.opensearch.ad.rest.AnomalyDetectorRestApiIT
  2> java.lang.NoClassDefFoundError: Could not initialize class org.opensearch.ad.AnomalyDetectorRestTestCase
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:427)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:623)
Exception in thread "Thread-4" java.lang.NoClassDefFoundError: Could not initialize class org.opensearch.test.OpenSearchTestCase
        at java.base/java.lang.Thread.run(Thread.java:832)
        Suppressed: java.lang.IllegalStateException: No context information for thread: Thread[id=19, name=Thread-4, state=RUNNABLE, group=TGRP-DetectionResultEvalutationIT]. Is this thread running under a class com.carrotsearch.randomizedtesting.RandomizedRunner runner context? Add @RunWith(class com.carrotsearch.randomizedtesting.RandomizedRunner.class) to your test class. Make sure your code accesses random contexts within @BeforeClass and @AfterClass boundary (for example, static test class initializers are not permitted to access random contexts).
                at com.carrotsearch.randomizedtesting.RandomizedContext.context(RandomizedContext.java:249)
                at com.carrotsearch.randomizedtesting.RandomizedContext.current(RandomizedContext.java:134)
                at com.carrotsearch.randomizedtesting.RandomizedRunner.augmentStackTrace(RandomizedRunner.java:1885)
                at com.carrotsearch.randomizedtesting.RunnerThreadGroup.uncaughtException(RunnerThreadGroup.java:20)
                at java.base/java.lang.Thread.dispatchUncaughtException(Thread.java:1993)

Tests with failures:
 - org.opensearch.ad.e2e.DetectionResultEvalutationIT.classMethod
 - org.opensearch.ad.rest.SecureADRestIT.classMethod
 - org.opensearch.ad.rest.HistoricalAnalysisRestApiIT.classMethod
 - org.opensearch.ad.rest.AnomalyDetectorRestApiIT.classMethod

4 tests completed, 4 failed

> Task :integTestRemote FAILED

@saratvemulapalli
Copy link
Member

saratvemulapalli commented Nov 17, 2021

There are 2 different pieces to this.

  1. Integration Test framework within OpenSearch which is leveraged by all plugins via opensearch.testclusters' gradle plugin. Even though the integration test is provided with an external test cluster it still downloads the min-artifact via DistributionDownloader. The fix is to not let the framework download the artifact if an external cluster is provided.
  2. Add support to pull the artifact from an override location/ use it from the local file system. This is needed for BWC tests where the test framework orchestrates the test cluster and upgrades them during the tests. The fix is to override the url DistributionDownloader is using and pull it from a new path via the gradle plugin.

Adding on this, couple of different options:

Short term: Update the zip in the artifacts repository for distribution downloader to pick it up. Downsides: This is manual and needs hand holding for every release.

Long Term 1: Skip downloading the zip via distribution downloader when an external cluster is provided for integration tests. Downsides: This only works for integration tests but not BWC tests. -> about 2-3 days of work.

Long Term 2: Override distribution downloader to download the cluster zip from a new endpoint dynamically. This solution works for IntegTests, and BWC tests. -> about 4-6 days of work.

I am leaning towards Long term 1, as its simpler and it has to be done anyway in the future. As the plugins integration test framework unnecessarily depend on published zips even though it's not used.

Long Term 1 is now implemented and taken care of in the plugins.

The issue is still open to take care of Long term 2, which is opensearch.testclusters gradle plugin should accept a URL to override distribution downloader ivy repository and download the min artifact from a custom place. This solution will help automating backwards-compatibility tests in our release pipeline.

@saratvemulapalli saratvemulapalli changed the title opensearch-plugin-cli needs to accept a url as a parameter to download artifacts from Override Distribution Downloader to download opensearch-min artifact from a custom url Nov 17, 2021
@Rishikesh1159
Copy link
Member

I will work on this.

@saratvemulapalli
Copy link
Member

opensearch.testclusters gradle plugin takes few arguments like: testDistribution, version, numberOfNodes etc.
We should expose another parameter testDistributionUrl and feed it to the testclusters plugin -> OpenSearchCluster.java-> OpenSearchNode.java -> DistributionDownloadPlugin.java

opensearch.testclusters plugin internally uses OpenSearchCluster.java and OpenSearchCluster.java internally uses OpenSearchNode.java
OpenSearchNode.java downloads the required distribution via DistributionDownloadPlugin.

We should override the Ivy repository url in DistributionDownloadPlugin.java with the custom url.

Ref: https://github.com/opensearch-project/anomaly-detection/blob/main/build.gradle#L302

@dblock
Copy link
Member

dblock commented Dec 3, 2021

@Rishikesh1159 are you still looking into this?

@Rishikesh1159
Copy link
Member

@dblock Yes I am working on this.

@dblock
Copy link
Member

dblock commented Mar 1, 2022

Looks like we reverted the implementation? What happened? What's the next step for getting this implemented?

@Rishikesh1159
Copy link
Member

Yes @dblock. This implementation was causing Alerting plugin build to fail. (opensearch-project/alerting#312). I moved setupDownloadService() call out from apply method for this implementation. Looks like it is causing the error of "Cannot mutate content repository descriptor 'MavenLocal(file:/usr/share/opensearch/.m2/repository/)' after repository has been used".

Before releasing PR, I tested this implementation with Anomaly Detection plugin and job scheduler plugin. Both of them worked fine as expected.

I still did not fix this or find an alternative for this implementation. I will work on this. Also we need to write a test this failing scenario.

@CEHENKLE
Copy link
Member

CEHENKLE commented Mar 4, 2022

I still did not fix this or find an alternative for this implementation. I will work on this. Also we need to write a test this failing scenario.

Yes, please do. Thank you!

/C

@Rishikesh1159
Copy link
Member

Rishikesh1159 commented Mar 8, 2022

I was able to root cause failure in Alerting plugin. In order to fix this issue #1240, in my previous PR I moved call to setupDownloadServiceRepo() in DistributionDownloadPlugin.java from apply() method to setupDistributions() method, which is called in afterEvaluate(). Here is previous PR: #2086 which is reverted.

-> This change made to setup repo at the end of configuration phase [One of gradle life cycle process] after entire root and sub projects are configured.

-> Internally in subprojects of Alerting plugin in configuration phase call to the repo is being made before it was actually setup. But the setup will happen after configuration phase ends. So, this was the reason Alerting plugin was failing.

-> Temporarily we moved back call to setupDownloadServiceRepo() in DistributionDownloadPlugin.java to apply() method as it was before. Here the repo is setup before evaluation happens. But, it does not fix this issue.

Fix:

-> The fix I found was to use Project properties to set the custom url from plugin instead of system properties. Before, I was trying to use system properties to set custom url from plugin.

-> These system properties are not available for us to use them in configuration phase, we had to wait until configuration phase ends and afterEvaluate() happens to use these system properties.

->But Project properties are different and are available in configuration phase.

-> So, the fix would be using Project properties to set custom url from plugin side.

-> So no moving of seupDownloadServiceRepo() call out of apply() in DistributionDownloadPlugin.java is necessary and just adding conditions and logic of cutom Url is enough to fix this issue.

Test case for handling Alerting plugin failure:

-> I haven't thought of a way writing test case for this failure. As, it is happening because of moving method call in gradle life cycle phase, I find it a bit complex to write a test case for this scenario. opensearch-project/alerting#312

Conclusion:
-> For now I will release PR to fix this issue, but I am planning to keep it open until I figure out a test case for handling Alerting plugin failure. Please let us know if anyone has any suggestions for testing this scenario.

@Rishikesh1159
Copy link
Member

Here is the exact line where Alerting plugin fail with my previous reverted PR #2086. There are other plugins using similar piece of code but for those this piece of code is wrapped inside an after evaluate block, which is making it not fail. So, test case mentioned in my above comment will be related to this. This was the error on Alerting plugin side: opensearch-project/alerting#312

Here is the PR #2420 where I will be working on this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Libraries & Interfaces enhancement Enhancement or improvement to existing feature or request v1.4.0
Projects
None yet
Development

No branches or pull requests

6 participants