Installation
Versatile Data Kit SDK (VDK) is the go-to tool for developing and running Data Jobs locally.
The VDK command-line tool requires Python 3.7+. If you're new to Python, we recommend Anaconda.
pip install quickstart-vdk
This will install VDK with support for some common databases and job lifecycle management operations.
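The Python 3.7+ requirement can be checked before installing. The following is a minimal sketch, not part of VDK itself: `version_ok` is a hypothetical helper name, and the script assumes the interpreter is available on PATH as `python3`.

```shell
# Hypothetical helper: succeed if the given "major.minor" version is at least 3.7.
version_ok() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 7 ]; }
}

# Assumption: the interpreter is called python3 on this machine.
if ! command -v python3 >/dev/null 2>&1; then
  echo "python3 not found on PATH" >&2
else
  current=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if version_ok "$current"; then
    echo "Python $current is new enough for VDK"
  else
    echo "Python $current is too old; VDK needs 3.7+" >&2
  fi
fi
```

If the check passes, the `pip install quickstart-vdk` command above should work with that interpreter.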
Run the built-in help to see what you can do:
vdk --help
Check out Getting Started to create your first Data Job and the Examples for the various things you can do with Versatile Data Kit.
VDK comes with a number of plugins that can be installed to change or extend its behavior.
pip install <vdk-plugin-name>
You can find a list of the plugins we have already developed in the plugins directory.
See the plugins page for more info.
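To see which VDK packages are already present in the current environment, the output of `pip list` can be filtered. This is a rough sketch under one assumption: that plugin package names start with `vdk-`, as the `<vdk-plugin-name>` placeholder suggests. The filter may therefore also match non-plugin VDK packages. `filter_vdk_plugins` is a hypothetical helper name.

```shell
# Hypothetical helper: keep only lines whose package name starts with "vdk-"
# (assumed naming prefix; not confirmed by this page).
filter_vdk_plugins() {
  grep -i '^vdk-' || true
}

# pip's freeze format prints one "name==version" entry per line.
python3 -m pip list --format=freeze 2>/dev/null | filter_vdk_plugins
```

An empty result only means that no installed package matches the assumed prefix.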
Versatile Data Kit includes the Control Service server, which enables deploying, managing, and monitoring Data Jobs.
Prerequisites
Then run:
vdk server --install
This will install the Control Service on your local machine and configure the local installation of VDK to use it.
In production, use the helm chart to install and configure it.
Prerequisites
- Helm installed
- A Kubernetes cluster
Then run:
helm repo add vdk-gitlab https://gitlab.com/api/v4/projects/28814611/packages/helm/stable
helm install my-release vdk-gitlab/pipelines-control-service
Read more about the installation in the Versatile Data Kit Control Service Chart GitHub repository.
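Before running the helm commands above, it can help to confirm that the prerequisites are actually available. The sketch below is illustrative only: `missing_tools` is a hypothetical helper, and checking for `kubectl` is an assumption, since this page only lists "Kubernetes" without naming a client tool.

```shell
# Hypothetical helper: print the name of every listed tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumption: kubectl is the client used to reach the Kubernetes cluster.
missing=$(missing_tools helm kubectl)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing >&2
else
  echo "helm and kubectl found; the chart can be installed"
fi
```

Finding the tools on PATH does not prove that a cluster is reachable; it only catches the most common setup gap early.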
Use
# Show the help to see what you can do and start experimenting
vdk --help
# For example, create a new job:
vdk create -u <URI-YOU-GOT-FROM-helm-install-OUTPUT>