From 9c69136599c43acab6ca5a7acf8d6ae1c8823e8b Mon Sep 17 00:00:00 2001 From: kumarks1122 Date: Tue, 30 May 2023 23:13:13 +0530 Subject: [PATCH 1/4] LR-453 | Data products readme added --- README.md | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 50 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 4ee4eb403..0dc703017 100644 --- a/README.md +++ b/README.md @@ -1 +1,50 @@ -# data-products \ No newline at end of file +# LERN data-products + +Data products is a collection of Scala scripts used to generate reports, update data in Redis, and migrate data. + +The code in this repository is licensed under MIT License unless otherwise noted. Please see the [LICENSE](https://github.com/project-sunbird/sunbird-lms-service/blob/master/LICENSE) file for details. + +## System Requirements + +### Prerequisites + +- Java 11 +- Scala 2.12 +- Spark 3.1.3 +- Latest Maven + +### Dependency libraries for data-products +- `sunbird-analytics-core` + + https://github.com/project-sunbird/sunbird-analytics-core + + Analytics job driver and analytics framework is used to trigger the job in job manager + +- `sunbird-core-dataproducts` + + https://github.com/project-sunbird/sunbird-core-dataproducts + + Batch-models module is used from this + +***Note***: The above dependency libraries has to be built from the respective release branch for the data-products. Use the +below command for building the dependencies. +``` +mvn clean install -DskipTests +``` + +## Setup of data-products + +Each data-product is an independent job which used for generating reports and data migrations. Since that each data-product having different sets of data provider dependencies. Data-provider for each job is listed in the below reference link. 
+ +[Reference Link](https://project-sunbird.atlassian.net/wiki/spaces/UM/pages/3135471624/Migration+of+Data+Products+in+Sunbird-LERN#%F0%9F%A7%AE-Data-product-list) + +The data-products can be tested locally with the testcases. + +***Note***: Use below command for running specific testcases from command line shell. + +``` +mvn -Dsuites={{classname with package path}} test + +# Example: +# mvn -Dsuites=org.sunbird.lms.exhaust.TestProgressExhaustJob test +``` From 43f1d811fdf6ff80053bcfa2db3b7664edb0abea Mon Sep 17 00:00:00 2001 From: kumarks1122 Date: Thu, 1 Jun 2023 22:08:31 +0530 Subject: [PATCH 2/4] LR-453 | Data products readme updated --- README.md | 62 +++++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 51 insertions(+), 11 deletions(-) diff --git a/README.md b/README.md index 0dc703017..297ec7c1d 100644 --- a/README.md +++ b/README.md @@ -13,35 +13,75 @@ The code in this repository is licensed under MIT License unless otherwise noted - Spark 3.1.3 - Latest Maven -### Dependency libraries for data-products -- `sunbird-analytics-core` +### Data provider dependencies +The following data providers are required to run a job in spark-submit mode. 
+- Cassandra +- Postgres +- Druid +- Redis +- Elasticsearch +- Content search API +- Org search API + +### Setup of dependency libraries for data-products +Build the dependency libraries on your local machine. +#### sunbird-analytics-core +The analytics job driver and analytics framework are used to trigger jobs in the job manager. - https://github.com/project-sunbird/sunbird-analytics-core +``` +### Steps to build ### + +# Clone the repo +git clone git@github.com:Sunbird-Obsrv/sunbird-analytics-core.git - Analytics job driver and analytics framework is used to trigger the job in job manager +# checkout to the respective release branch +git checkout release-5.1.1 -- `sunbird-core-dataproducts` +# build the project +mvn clean install -DskipTests +``` - https://github.com/project-sunbird/sunbird-core-dataproducts +#### sunbird-core-dataproducts - Batch-models module is used from this +The batch-models module of this library is used to handle job execution. -***Note***: The above dependency libraries has to be built from the respective release branch for the data-products. Use the -below command for building the dependencies. ``` +### Steps to build ### + +# Clone the repo +git clone git@github.com:Sunbird-Obsrv/sunbird-core-dataproducts.git + +# checkout to the respective release branch +git checkout release-5.1.1 + +# build the project mvn clean install -DskipTests ``` +***Note***: The above dependency libraries has to be built from the respective release branch for the data-products. + ## Setup of data-products -Each data-product is an independent job which used for generating reports and data migrations. Since that each data-product having different sets of data provider dependencies. Data-provider for each job is listed in the below reference link. +Each data-product is an independent spark job which used for generating reports and data migrations. Since that each data-product having different sets of data provider dependencies. 
 Data-provider for each job is listed in the below reference link. [Reference Link](https://project-sunbird.atlassian.net/wiki/spaces/UM/pages/3135471624/Migration+of+Data+Products+in+Sunbird-LERN#%F0%9F%A7%AE-Data-product-list) The data-products can be tested locally with the testcases. -***Note***: Use below command for running specific testcases from command line shell. +**Steps to build the project** + +``` +# Clone the repo +git clone git@github.com:Sunbird-Lern/data-products.git + +# checkout to the respective release branch +git checkout release-5.3.0 + +# build the project +mvn clean install -DskipTests +``` +**Steps to run the testcase** ``` mvn -Dsuites={{classname with package path}} test From 510b77a9e21f227ecf0d8e6e3d7ad49b7a94410d Mon Sep 17 00:00:00 2001 From: kumarks1122 Date: Thu, 1 Jun 2023 22:11:33 +0530 Subject: [PATCH 3/4] LR-453 | Data products readme updated --- README.md | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 297ec7c1d..6324cc09f 100644 --- a/README.md +++ b/README.md @@ -60,9 +60,9 @@ mvn clean install -DskipTests ***Note***: The above dependency libraries has to be built from the respective release branch for the data-products. -## Setup of data-products +## Local setup of data-products -Each data-product is an independent spark job which used for generating reports and data migrations. Since that each data-product having different sets of data provider dependencies. Data-provider for each job is listed in the below reference link. +Each data-product is an independent Spark job used for generating reports and data migrations, so each data-product has a different set of data provider dependencies. The data providers for each job are listed at the reference link below. 
 [Reference Link](https://project-sunbird.atlassian.net/wiki/spaces/UM/pages/3135471624/Migration+of+Data+Products+in+Sunbird-LERN#%F0%9F%A7%AE-Data-product-list) @@ -77,6 +77,9 @@ git clone git@github.com:Sunbird-Lern/data-products.git # checkout to the respective release branch git checkout release-5.3.0 +# change to the project directory +cd lern-data-products + # build the project mvn clean install -DskipTests ``` @@ -88,3 +91,22 @@ mvn -Dsuites={{classname with package path}} test # Example: # mvn -Dsuites=org.sunbird.lms.exhaust.TestProgressExhaustJob test ``` + +For running the data-products testcase, we are using following data sources in embedded mode +- cassandra +- postgres +- redis + +The data source schemas used in the testcases are listed below +
+https://github.com/Sunbird-Lern/data-products/blob/release-5.3.0/lern-data-products/src/main/resources/data.cql +
+https://github.com/Sunbird-Lern/data-products/blob/release-5.3.0/lern-data-products/src/test/scala/org/sunbird/core/util/EmbeddedPostgres.scala + +API requests are mocked inside the testcases with the MockWebServer library. + +## Run data-products on a server + +On a server, data-products run in spark-submit mode. The installation and execution guide is available at the link below. + +https://lern.sunbird.org/use/developer-installation/data-products \ No newline at end of file From ae349f7cfece0887e0d2f2659d8f242152dcbfa6 Mon Sep 17 00:00:00 2001 From: kumarks1122 Date: Thu, 1 Jun 2023 22:40:49 +0530 Subject: [PATCH 4/4] LR-453 | Data products readme updated --- README.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/README.md b/README.md index 6324cc09f..8e09f467c 100644 --- a/README.md +++ b/README.md @@ -92,6 +92,10 @@ mvn -Dsuites={{classname with package path}} test # mvn -Dsuites=org.sunbird.lms.exhaust.TestProgressExhaustJob test ``` +**Note**: During testcase execution, report files are generated, verified, and deleted immediately after the testcase completes. Check the file path in the testcase for manual verification. +
+We suggest running the testcases in an IDE's debug mode when troubleshooting. + For running the data-products testcase, we are using following data sources in embedded mode - cassandra - postgres