v0.7
Summary
Major features include:
VDK Template running state detection capability
Template executions are autonomous data job runs, so a plugin needs a way to tell at any time whether a template is running, for example to distinguish root data job finalization from template finalization.
For example, to send different telemetry in each case:
@hookimpl
def finalize_job(self, context: JobContext) -> None:
    template = context.core_context.state.get(ExecutionStateStoreKeys.TEMPLATE_NAME)
    if template:
        telemetry.send(phase="finalize_template", template_name=template)
    else:
        telemetry.send(phase="finalize_job", job_name=context.name)
New Logging configuration LOG_LEVEL_MODULE
Lets users temporarily override the log level per module, e.g. to increase the verbosity of certain modules while debugging or prototyping.
For example, assuming the default log level is INFO, we can enable verbose logs for two modules, "vdk.api" and "custom.module":
export LOG_LEVEL_MODULE="vdk.api=DEBUG;custom.module=DEBUG"
vdk run job-name
Or in a specific job's config.ini:
[vdk]
log_level_module=vdk.api=DEBUG;custom.module=DEBUG
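To make the value format concrete, here is a minimal sketch of how a string such as "vdk.api=DEBUG;custom.module=DEBUG" could be interpreted with the standard Python logging module. The helper names parse_log_level_module and apply_log_levels are hypothetical and only illustrate the idea; this is not necessarily how vdk-core implements the option.

```python
import logging


def parse_log_level_module(value: str) -> dict:
    """Parse 'module=LEVEL;module2=LEVEL' into a {module: LEVEL} mapping.

    Hypothetical helper for illustration; not part of the VDK API.
    """
    levels = {}
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing or doubled separators
        module, _, level = entry.partition("=")
        levels[module.strip()] = level.strip().upper()
    return levels


def apply_log_levels(levels: dict) -> None:
    """Set the level on each named logger; child loggers inherit it."""
    for module, level in levels.items():
        logging.getLogger(module).setLevel(level)


overrides = parse_log_level_module("vdk.api=DEBUG;custom.module=DEBUG")
apply_log_levels(overrides)
```

Because Python loggers are hierarchical, setting "vdk.api" to DEBUG in this sketch also affects loggers beneath it, such as "vdk.api.job_input", unless they set their own level.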
New plugin backend for Properties: from local file system
A simple plugin that lets a developer or presenter quickly store properties on the local file system.
It can be used to store secrets or configuration for a dev/demo session without requiring the entire Control Service to be installed and running.
It can also be used to test a job run locally without updating the state of the deployed job.
Example:
export PROPERTIES_DEFAULT_TYPE="fs-properties-client"
or in a specific job's config.ini:
[vdk]
properties_default_type=fs-properties-client
Properties are now stored in a local file. The file location can be further configured using FS_PROPERTIES_FILENAME and FS_PROPERTIES_DIRECTORY.
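Job code should not need to change when the properties backend is switched. The sketch below shows a step that reads and updates properties through get_all_properties and set_all_properties on the job input. The property name last_run_count and the InMemoryJobInput class are hypothetical; the class is only a stand-in so the step can be exercised without VDK installed.

```python
def run(job_input):
    # Read the current properties, bump a counter, and write them back.
    # "last_run_count" is a made-up property name for illustration.
    props = job_input.get_all_properties()
    props["last_run_count"] = props.get("last_run_count", 0) + 1
    job_input.set_all_properties(props)


class InMemoryJobInput:
    """Hypothetical stand-in that mimics only the two property methods used above."""

    def __init__(self):
        self._props = {}

    def get_all_properties(self):
        return dict(self._props)

    def set_all_properties(self, properties):
        self._props = dict(properties)


# Exercise the step twice against the stand-in.
fake = InMemoryJobInput()
run(fake)
run(fake)
```

In a real run with properties_default_type=fs-properties-client, the same step would be expected to persist the counter to the local properties file between executions.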
Cookiecutter for new plugins
Creating a new plugin skeleton is now easy. Run
cookiecutter https://github.com/tozka/cookiecutter-vdk-plugin.git
and follow the instructions.
Add the ability to cancel remaining job steps
A job (or a template) can now be canceled from any step, and all remaining steps in the job (or template) will be skipped.
For example, if a data job processes data from a source that has reported no new entries since the last run, the remaining steps can be skipped.
Example:
from vdk.api.job_input import IJobInput


def run(job_input: IJobInput):
    data = get_last_delta()  # user-defined helper that returns new entries, if any
    if not data:
        job_input.skip_remaining_steps()
Package versions
See installation instructions here.
The versions of VDK components released under VDK 0.7 are:
Main components
control-service 1.5.622899758
vdk-control-cli==1.3.626767210
vdk-core==0.3.652866366
Plugins
vdk-properties-fs==0.0.651770458
vdk-kerberos-auth==0.3.631374202
vdk-impala==0.4.651849986
What's Changed
- vdk-control-cli: Drop requirement pluggy to be 0.* by @gageorgiev in #1116
- vdk-core: Add log before query result fetch by @doks5 in #1195
- vdk-core: Fix issue with serializing Decimal values during payload check by @gageorgiev in #946
- vdk-core: add ability to cancel remaining job steps by @mrMoZ1 in #1188
- vdk-core: add new configuration log_level_module by @tozka in #1167
- vdk-core: added default values to write termination message method by @mivanov1988 in #1185
- vdk-core: avoid circular references in print results by @tozka in #1176
- vdk-core: extend classification error test by @tozka in #1180
- vdk-core: fix error classification of vdk code by @tozka in #1173
- vdk-core: fix flakey test in test checking logs output by @murphp15 in #1194
- vdk-core: template running state detection capability by @ivakoleva in #941
- vdk-csv: Updates on vdk-csv README by @duyguHsnHsn in #952
- vdk-impala: Add validation for queries that doesn't provide lineage info by @kostoww in #1175
- vdk-impala: fix error classification in impala by @tozka in #1178
- vdk-impala: fix impala template empty source view usr err by @mrMoZ1 in #1189
- vdk-impala: fixed platform error missclasified when running template by @mrMoZ1 in #944
- vdk-impala: improve vdk-impala documentation by @tozka in #948
- vdk-kerberos-auth: Pinned minikerberos in vdk-kerberos-auth plugin by @mivanov1988 in #1168
- vdk-kerberos-auth: add KerberosClient for authenticating API calls by @tozka in #879
- vdk-plugins: improve plugin project creation with cookiecutter by @tozka in #942
- vdk-properties-fs: new plugin for local FS properties storage by @ivakoleva in #1190
- vep: Jupyter Notebook Integration Goals and Requirements by @duyguHsnHsn in #1165
- vep: Jupyter Notebook Integration by @duyguHsnHsn in #1113
- versatile-data-kit: Without and with VDK image by @zverulacis in #1184
- versatile-data-kit: set automatic java formatter by @tozka in #757
- versatile-data-kit: simplify release process by @tozka in #951
- versatile-data-kit: update contact instructions by @tozka in #1172
Full Changelog: v0.6...v0.7