vdk-control-cli: Make sample job be runnable without error #227

Merged: 2 commits, Sep 14, 2021
@@ -9,4 +9,4 @@
-- A valid query parameter looks like → {parameter}.
-- Parameters will be automatically replaced if there is a corresponding value existing in the IJobInput properties.

-SELECT count ( * ) as test_records from hello_world;
+CREATE TABLE IF NOT EXISTS hello_world (id NVARCHAR);
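
The comment in this SQL step describes `{parameter}` placeholders that are replaced when a matching value exists in the job properties. A minimal sketch of that behaviour, assuming a plain dictionary stands in for the properties; `substitute_parameters` is a hypothetical helper for illustration, not VDK's actual implementation:

```python
import re


def substitute_parameters(sql: str, properties: dict) -> str:
    """Replace {name} placeholders that have a matching key in properties.

    Hypothetical helper: placeholders without a matching property
    are left untouched.
    """
    def replace(match: re.Match) -> str:
        name = match.group(1)
        return str(properties[name]) if name in properties else match.group(0)

    return re.sub(r"\{(\w+)\}", replace, sql)


query = "SELECT count(*) FROM {table} WHERE team = '{team}'"
# Only "table" has a value, so "{team}" stays as written.
resolved = substitute_parameters(query, {"table": "hello_world"})
print(resolved)
```

Leaving unmatched placeholders in place mirrors the wording "if there is a corresponding value existing"; how VDK itself treats a missing value is not shown in this diff.
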
@@ -6,17 +6,17 @@ Versatile Data Kit feature allows you to implement automated pull ingestion and

Data Job directory can contain any files, however there are some files that are treated in a specific way:

-* SQL files (.sql) - called SQL steps - are directly executed as queries against your configured database.
-* Python files (.py) - called Python steps - are python scripts tha define run function that takes as argument the job_input object .
-* config.ini is needed in order to configure the Job. This is the only required file.
+* SQL files (.sql) - called SQL steps - are directly executed as queries against your configured database;
+* Python files (.py) - called Python steps - are Python scripts that define run function that takes as argument the job_input object;
+* config.ini is needed in order to configure the Job. This is the only file required to deploy a Data Job;
* requirements.txt is an optional file needed when your Python steps use external python libraries.

-Delete all files you do not need and replace them with your own
+Delete all files you do not need and replace them with your own.

### Data Job Code

VDK supports having many Python and/or SQL steps in a single Data Job. Steps are executed in ascending alphabetical order based on file names.
-Prefixing file names with numbers, makes it easy having meaningful names while maintaining steps execution order.
+Prefixing file names with numbers makes it easy to have meaningful file names while maintaining the steps' execution order.

Run the Data Job from a Terminal:
* Make sure you have vdk installed. See Platform documentation on how to install it.
@@ -26,8 +26,8 @@ vdk run <path to Data Job directory>

### Deploy Data Job

-When Job is ready to be deployed at Versatile Data Kit runtime(cloud) to be executed in regular manner:
-Run below command and follow its instructions (you can see its options with `vdkcli --help`)
+When a Job is ready to be deployed in a Versatile Data Kit runtime (cloud):
+Run the command below and follow its instructions (you can see its options with `vdk --help`)
```python
-vdkcli deploy
+vdk deploy
```
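
The README text in this diff says steps run in ascending alphabetical order of file name and recommends numeric prefixes. A small sketch of why zero-padded prefixes matter, assuming the ordering is a plain string sort; the file names and the `execution_order` helper are made up for illustration:

```python
def execution_order(file_names):
    """Return step files in ascending alphabetical order (plain string sort).

    Only .sql and .py files count as steps; other files are ignored.
    """
    steps = [name for name in file_names if name.endswith((".sql", ".py"))]
    return sorted(steps)


# Hypothetical job directories: same steps, with and without zero padding.
unpadded = ["2_transform.py", "10_report.sql", "1_create_table.sql", "config.ini"]
padded = ["02_transform.py", "10_report.sql", "01_create_table.sql", "config.ini"]

print(execution_order(unpadded))  # "10_..." sorts before "1_..." and "2_..."
print(execution_order(padded))    # numeric and alphabetical order agree
```

With unpadded prefixes a step numbered 10 would run first under a string sort, so a fixed-width prefix such as `01_`, `02_`, `10_` keeps the intended order.
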
@@ -1,6 +1,6 @@
; Supported format: https://docs.python.org/3/library/configparser.html#supported-ini-file-structure

-; This is the only required file in a Data Job.
+; This is the only file required to deploy a Data Job.
; Read more to understand what each option means:

; Information about the owner of the Data Job
@@ -15,7 +15,6 @@ team =
; The cron expression is evaluated in UTC time.
; If it is time for a new job run and the previous job run hasn’t finished yet,
; the cron job waits until the previous execution has finished.
-; To distribute load evenly, Platform team may override the minute you specified.
schedule_cron = 11 23 5 8 1

; Who will be contacted and on what occasion
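
The `schedule_cron` value in this hunk is a five-field cron expression. A sketch that labels each field of the sample value, assuming the usual field order (minute, hour, day of month, month, day of week); `describe_cron` is a hypothetical helper, not part of VDK:

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Map each whitespace-separated cron field to its conventional name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


fields = describe_cron("11 23 5 8 1")
print(fields)
```

Read this way, the sample schedule names 23:11 UTC, day of month 5, month 8 and day of week 1. How the two day fields combine depends on the cron implementation in use.
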