Releases: vmware/versatile-data-kit
Versatile Data Kit 0.2
Summary
Major features include:
- Improvements in control-service security to comply with Kubernetes best practices (jobs run as an unprivileged, non-root user);
- Plugins can now hook into the ingestion process before, during or after the ingestion execution; these hooks can also be chained;
- Added support for Kimball templates (SCD1, SCD2, Snapshot Accumulating Fact Table) for the vdk-impala plugin.
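The hook chaining mentioned above can be pictured with a small, library-free sketch. It is only an illustration of the idea: the function names, the payload shape, and the `run_ingestion` driver are hypothetical and are not the vdk-core plugin interfaces.

```python
from typing import Callable, Dict, List

# Hypothetical types for illustration only; NOT the vdk-core plugin API.
Payload = List[Dict]
Hook = Callable[[Payload], Payload]


def drop_empty_rows(payload: Payload) -> Payload:
    """Pre-ingest hook: discard rows that carry no data."""
    return [row for row in payload if row]


def add_source_column(payload: Payload) -> Payload:
    """Pre-ingest hook: tag every row with an assumed 'source' value."""
    return [{**row, "source": "demo"} for row in payload]


def run_ingestion(
    payload: Payload,
    pre_ingest: List[Hook],
    ingest: Callable[[Payload], None],
    post_ingest: List[Callable[[Payload], None]],
) -> Payload:
    # Chain the pre-ingest hooks: each one receives the previous one's output.
    for hook in pre_ingest:
        payload = hook(payload)
    # The ingest step sends the processed payload to its destination.
    ingest(payload)
    # Post-ingest hooks observe what was sent (e.g. for auditing or metrics).
    for hook in post_ingest:
        hook(payload)
    return payload


sent: Payload = []      # stands in for the ingestion target
audit: List[int] = []   # stands in for a post-ingest metric sink

result = run_ingestion(
    [{"id": 1}, {}, {"id": 2}],
    pre_ingest=[drop_empty_rows, add_source_column],
    ingest=sent.extend,
    post_ingest=[lambda p: audit.append(len(p))],
)
```

The order of the `pre_ingest` list decides the order of the chain, so the empty row is dropped before the `source` column is added.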
Package versions
See installation instructions here.
The versions of VDK components released under VDK 0.2 are:
Main components
vdk-heartbeat==0.5.476585195
vdk-core==0.1.476585195
vdk-control-cli==1.2.476585195
pipelines-control-service==1.4.476585195
Plugins
vdk-trino==0.2.476585195
vdk-test-utils==0.2.476585195
vdk-kerberos-auth==0.2.476585195
vdk-ingest-http==0.2.476585195
vdk-impala==0.2.476585195
quickstart-vdk==0.2.476585195
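To check a local environment against pins like the ones listed above, a minimal sketch could look like the following. It assumes the packages are installed as Python distributions; the `PINS` sample copies only three entries from the list, and `parse_pins` and `check_installed` are hypothetical helper names, not part of VDK.

```python
from importlib import metadata

# A few pins copied from the list above; extend as needed.
PINS = """
vdk-core==0.1.476585195
vdk-control-cli==1.2.476585195
vdk-trino==0.2.476585195
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name] = version
    return pins


def check_installed(pins):
    """Map each name to (wanted, installed, matches); installed is None if absent."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_installed(parse_pins(PINS)).items():
        print(f"{name}: wanted {wanted}, installed {installed}, match={ok}")
```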
What's Changed
- vdk-wiki: Created Life Expectancy Scenario by @alod83 in #616
- vdk-wiki: life-expectancy minor fixes to work end-to-end by @tozka in #700
- vdk-kerberos-auth: Fix keytab file in job directory by @doks5 in #721
- vdk-plugins: Update ingestion interfaces used in plugins by @doks5 in #689
- vdk-test-utils: Add test pre-ingest, ingest, post-ingest plugins by @doks5 in #679
- control-service: Allow job builder run as non-root by @doks5 in #625
- control-service: add counter to track data job watching task executions by @tpalashki in #692
- control-service: add job-base-image folder and CI step/job by @tozka in #711
- control-service: add security context to Data Job template by @mivanov1988 in #713
- control-service: add timeouts to shedlock's database operations by @tpalashki in #693
- control-service: enable logging on update cron job failure by @mivanov1988 in #674
- control-service: fix role permissions for pod/logs by @tozka in #699
- control-service: graphQL job executions filter by teamName by @mrMoZ1 in #702
- control-service: remove unnecessary directory of legacy builder by @tozka in #712
- control-service: remove unused code by @tozka in #705
- control-service: revert swagger path changes by @ivakoleva in #704
- control-service: run data job as non-root user by @tozka in #710
- control-service: vdk sdk docker repository secret by @mivanov1988 in #694
- control-service: add amazon-ecr-credential-helper to the job builder by @tpalashki in #723
- control-service: Swagger UI path changes docs update by @ivakoleva in #715
- control-service: publish Swagger UI to /data-jobs path by @ivakoleva in #677
- control-service: publish Swagger UI to /data-jobs path by @ivakoleva in #714
- control-service: redirect Swagger webjars resources by @ivakoleva in #697
- control-service: release a new version by @ivakoleva in #681
- quickstart-vdk: Run vdk-heartbeat before release by @YanaZhivkova in #665
- vdk-control-cli: Expand missing resource error message by @gageorgiev in #706
- vdk-control-cli: Make list command print all jobs on empty team param by @gageorgiev in #709
- vdk-control-cli: Parse contacts with comma "," as delimiter as well by @tozka in #719
- vdk-control-cli: clarify api token documentation by @tozka in #722
- vdk-core: add flag to enable synchronous/blocking ingestion by @tozka in #698
- vdk-core: adjust defined type for configuration values by @tozka in #684
- vdk-core: fix error message by @tozka in #686
- vdk-core: Add ingestion functional tests by @doks5 in #691
- vdk-core: Implementation of new ingestion interfaces by @doks5 in #690
- vdk-core: Introduce post-ingest-sequence env var by @doks5 in #682
- vdk-heartbeat: Successful data job run test mode by @ivakoleva in #718
- vdk-heartbeat: Fix successful run status check by @YanaZhivkova in #724
- vdk-impala: Impala docker image upgrade by @ivakoleva in #675
- vdk-impala: impala templates by @mrMoZ1 in #671
- vdk-impala: template .sql missing from python distro by @ivakoleva in #695
- vdk-impala: Improve error handling to handle view errors by @doks5 in #717
- vdk-ingest-http: additional request parameters support by @ivakoleva in #701
- vdk-ingest-http: configurable allow JSON float NaN capability by @ivakoleva in #725
- vdk-trino: trino ingest to handle type casting and missing values by @tozka in #685
- versatile-data-kit: Establish release process by @gageorgiev in #673
Full Changelog: 0.1...0.2
Versatile Data Kit 0.1
Package versions
The versions of VDK components released under VDK 0.1 are:
Main components
vdk-core==0.0.451799019
vdk-control-cli==1.1.434200173
vdk-control-service-api==1.0.6
vdk-heartbeat==0.4.434200173
quickstart-vdk==0.1.415625538
Plugins
vdk-test-utils==0.1.449393937
vdk-kerberos-auth==0.1.449393937
vdk-impala==0.1.448032169
vdk-trino==0.1.433653387
vdk-ingest-http==0.1.428971094
vdk-server==0.1.424970629
vdk-logging-json==0.1.418423702
vdk-plugin-control-cli==0.1.417315215
vdk-greenplum==0.0.415648530
vdk-postgres==0.0.415648530
vdk-sqlite==0.1.415630020
vdk-snowflake==0.2.415625538
vdk-logging-ltsv==0.1.415625538
vdk-ingest-file==0.1.415625538
vdk-csv==0.1.415625538
What's Changed
- [WIP] vdk-server: wire up the implementation with the plugin by @tpalashki in #186
- [control-service]: fix memoryToMB conversion overflow in KubernetesService by @mrMoZ1 in #462
- [draft] vdk-heartbeat: Add execution API to vdk-heartbeat by @mrMoZ1 in #116
- [vdk-core]: Post-ingestion specification by @doks5 in #492
- [vdk-plugins][vdk-server]: Remove authentication imports by @doks5 in #609
- base: initial commit by @mivanov1988 in #29
- builds: make pre-requisites explicit by @tozka in #61
- cicd/control service: set dockerhub account for default imagePullSecrets by @tozka in #550
- cicd: fix gitlab runners by @tozka in #39
- control-service: revert some logic related to false-positive emails by @tpalashki in #460
- control service cicd: add deploy demo/cicd environment scripts by @tozka in #55
- control service: rename data job api distribution by @tozka in #59
- control-service: clean up api documentation by @tozka in #489
- control-service: make smtp server configurable by @tozka in #490
- control-service,gitlab-ci: move end job to top level project by @tozka in #71
- control-service-base: upgrade JUnit 4 to JUnit 5 by @ivakoleva in #207
- control-service/vdk-cli/vdk-heartbeat: set Pipeline ID as op_id for vdk-heartbeat run by @tozka in #654
- control-service: extend PATCH deployments semantics by @tozka in #464
- control-service: Add Data Job Executions to the graphql by @kostoww in #265
- control-service: Add logging for keytab secret creation by @mrMoZ1 in #498
- control-service: Add username to executions by @doks5 in #159
- control-service: Add vdk-heartbeat image by @doks5 in #281
- control-service: Allow for additional labels for CS deployment by @gageorgiev in #318
- control-service: Allow for additional labels in deployment pod by @gageorgiev in #325
- control-service: Bump builder job version in helm by @doks5 in #473
- control-service: CI/CD improvements by @mivanov1988 in #76
- control-service: Change dep additional labels to be set using YML by @gageorgiev in #338
- control-service: Classify OOM as User Errors by @doks5 in #479
- control-service: Convert debug logs to info by @doks5 in #257
- control-service: Create Github repository on Helm install by @tpalashki in #64
- control-service: Data Job Executions GraphQL API returns null propert… by @mivanov1988 in #659
- control-service: Data Jobs API documentation changes by @ivakoleva in #77
- control-service: Document variables supported for configuring logs UR… by @mivanov1988 in #612
- control-service: Expose configuration properties by @mrMoZ1 in #60
- control-service: Failed to log the HttpTrace object fix by @ivakoleva in #102
- control-service: Fix JsonSyntaxException when cancelling execution by @mrMoZ1 in #192
- control-service: Fix authentication when pulling images by @YanaZhivkova in #456
- control-service: Format logs in JSON by @gageorgiev in #308
- control-service: GraphQL executions query IT tests by @mivanov1988 in #546
- control-service: GraphQL executions query by @mivanov1988 in #478
- control-service: Helm ingress support for networking.k8s.io/v1/Ingress by @inbobev in #88
- control-service: Include additional labels for service template by @gageorgiev in #255
- control-service: Introduce Executions in GraphQL Schema/API by @kostoww in #150
- control-service: Make fluentdconfig configurable by @gageorgiev in #74
- control-service: Properly classify requirements.txt errors by @doks5 in #437
- control-service: Release 1.2.16 by @gageorgiev in #343
- control-service: Remove logging of credentials in builder by @doks5 in #458
- control-service: Remove upper limit of the graphql query for page size by @kostoww in #232
- control-service: Remove upper limit of the graphql query for page size by @kostoww in #242
- control-service: Turn-off debug mode for builder script by @doks5 in #469
- control-service: Update Builder Job Version in helm by @doks5 in #467
- control-service: Update CONTRIBUTING.md by @YanaZhivkova in #439
- control-service: Update Helm chart by @tpalashki in #78
- control-service: Update the GraphQL model with the last execution by @tpalashki in #486
- control-service: ability to add offset to dates in logs URL by @mivanov1988 in #644
- control-service: ability to deploy a data job before all IT tests by @mivanov1988 in #539
- control-service: add IT for graphQL sort, filter by next run by @mrMoZ1 in #564
- control-service: add component readme and contributing.md by @tozka in #109
- control-service: add custom logback-spring.xml config by @mrMoZ1 in #111
- control-service: add enabled flag of deployments to the database by @tpalashki in #421
- control-service: add execution states to GraphQL schema by @tozka in #87
- control-service: add execution status fail/success to graphql schema by @mrMoZ1 in #518
- control-service: add executions logging api by @tozka in #259
- control-service: add git ssl enabled flag by @mivanov1988 in #182
- control-service: add kaniko job builder by @tozka in #152
- control-service: add kaniko job-builder to Gitlab CI by @tozka in #156
- control-service: add last execution info to the data_job table by @tpalashki in #480
- control-service: add logging when updating last execution by @tpalashki in #657
- control-service: add logsUrl to DataJobExecution model by @mivanov1988 in #608
- control-service: add namespace label to Prometheus alerts by @tpalashki in #442
- control-service: add release instructions by @tozka in #93
- control-service: add tag Properties by @tozka in #132
- control-service: add team to job deployment by @versatile-data-kit-dev in #505
- control-ser...