The goal of this work is to develop a tool, named learningOrchestra, to facilitate and streamline the iterative data science process of:
- Gathering data;
- Cleaning/preparing the datasets;
- Building models;
- Validating their predictions; and
- Deploying the results.
The architecture of learningOrchestra is a collection of microservices deployed in a cluster.
A dataset (in CSV format) can be loaded from a URL using the Database API microservice, which converts it to JSON and stores it in MongoDB.
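Conceptually, the conversion step turns each CSV row into one JSON document. The sketch below is a minimal local illustration of that idea, not the Database API's actual implementation:

```python
import csv
import io
import json

def csv_to_json_records(csv_text: str) -> list:
    """Turn CSV text into a list of dicts, one JSON-ready document per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]

sample = "PassengerId,Name,Survived\n1,Braund,0\n2,Cumings,1\n"
records = csv_to_json_records(sample)
print(json.dumps(records[0]))  # each row becomes one MongoDB document
```

In MongoDB, each of these dicts would be stored as a separate document in a collection named after the dataset.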
It is also possible to perform several preprocessing and analytical tasks using learningOrchestra's collection of microservices.
With learningOrchestra, you can use the Model Builder microservice to build prediction models from stored, preprocessed datasets with several classifiers simultaneously. The microservice runs on a Spark cluster, so models are trained with distributed processing. You can then compare the classifiers' results over time to tune your models and improve prediction accuracy.
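The core idea of training several classifiers on the same data and comparing their accuracy can be sketched locally with two toy stand-in classifiers (a majority-vote baseline and 1-nearest-neighbor). This is only an illustration of the comparison pattern, not the Model Builder's Spark-based code:

```python
# Toy stand-ins for real classifiers: data, names, and models are illustrative.
def majority_classifier(train, _point):
    """Predict the most common label seen in training."""
    labels = [label for _, label in train]
    return max(set(labels), key=labels.count)

def nearest_neighbor_classifier(train, point):
    """Predict the label of the closest training point (1-NN)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda item: dist(item[0], point))[1]

def accuracy(classifier, train, test):
    correct = sum(classifier(train, x) == y for x, y in test)
    return correct / len(test)

train = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
         ((1.0, 1.0), 1), ((0.9, 1.1), 1)]
test = [((0.05, 0.05), 0), ((0.95, 1.0), 1), ((1.1, 0.9), 1)]

# Fit/score every classifier on the same split and compare the results.
scores = {name: accuracy(clf, train, test)
          for name, clf in [("majority", majority_classifier),
                            ("1-nn", nearest_neighbor_classifier)]}
print(scores)
```

Model Builder applies the same "one dataset, many models, one scoreboard" pattern, but with real classifiers distributed across a Spark cluster.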
By providing their own preprocessing code, users can create highly customized prediction models for a specific dataset, further improving accuracy. With that in mind, the possibilities are endless! 🚀
To make learningOrchestra more accessible, we provide the learning_orchestra_client Python package, which exposes all of learningOrchestra's functionality through a Python API.
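As a hedged sketch, a client session for loading a dataset might look like the following. The class and method names (Context, DatabaseApi, read_file) are assumptions made for illustration and should be checked against the package documentation:

```python
# Hypothetical usage sketch of learning_orchestra_client; the names used
# here are assumed, not verified signatures from the package.
def load_titanic_dataset(cluster_ip: str, dataset_url: str) -> None:
    from learning_orchestra_client import Context, DatabaseApi  # assumed API
    Context(cluster_ip)  # point the client at the learningOrchestra cluster
    database = DatabaseApi()
    # Download the CSV from the URL and store it in MongoDB as JSON:
    database.read_file(filename="titanic_training", url=dataset_url)

# Example call (requires a running learningOrchestra cluster):
# load_titanic_dataset("192.0.2.10", "https://example.com/titanic.csv")
```

The package wraps the microservices' HTTP APIs, so the same operations are also reachable with plain HTTP requests.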
Users can also export and analyse the results with a MongoDB GUI, such as NoSQLBooster.
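Since results live in MongoDB, they can also be inspected programmatically. A minimal sketch with the pymongo driver is shown below; the database and collection names are illustrative, not the ones learningOrchestra actually uses:

```python
# Hypothetical sketch of reading stored results with pymongo; the
# "learning_orchestra" database name is an assumption for illustration.
def fetch_results(mongo_url: str, collection: str, limit: int = 5) -> list:
    from pymongo import MongoClient  # third-party driver, assumed installed
    client = MongoClient(mongo_url)
    docs = client["learning_orchestra"][collection].find().limit(limit)
    return list(docs)

# Example (requires a reachable MongoDB instance):
# rows = fetch_results("mongodb://localhost:27017", "titanic_predictions")
```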
We also built a demo of learningOrchestra using the Titanic challenge dataset (see the learning_orchestra_client usage example section).
The learningOrchestra documentation provides a more detailed guide on installation and usage, along with documentation and examples for each microservice and for the Python package.