control-service: add executions logging api #259

Merged
merged 4 commits into from
Sep 20, 2021


antoniivanov
Collaborator

If a user installs the Control Service / Versatile Data Kit, by default they
do not have access to execution logs. This is likely to be a huge issue
for adoption. Requiring users to integrate with tools like fluentd for
logging, or to expose kubectl, can be pretty hard.

All APIs that support some kind of task execution naturally need to
expose a logs interface to users.
Examples:

  • Lambda (using AWS CloudWatch)
  • Databricks Jobs (API returns a link to the Spark UI/logs)
  • Airflow (REST API)
  • OpenFaaS (has faas-cli logs)

Initially it would be used:

  • in the vdk-server workflow: instead of having users use kubectl, let
    them use vdk execute --logs, so we can remove any mention of
    kind/kubectl.
  • in vdk-heartbeat: the root cause of a vdk-heartbeat test failure can be
    very complex. After the test finishes it cleans up after itself, which
    causes all logs of that execution to disappear, making it very hard to
    see why the data job started by the test has failed.

Currently this is added only as an experimental feature. There are some
issues that need to be decided before it can be promoted to stable (they
will be noted as TODOs in the code).

Signed-off-by: Antoni Ivanov [email protected]

@mivanov1988
Collaborator

mivanov1988 commented Sep 20, 2021

As far as I understand, this API will behave like tail -n 10, right? Do we plan to provide a streaming API that behaves like tail -f?

@antoniivanov
Collaborator Author

When/if there's a need for it. This will be experimental for some time and we'll see what feedback we get. Streaming would be a bit more complicated to implement currently, and tail -n 10 is a good start.
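
To illustrate the tail -n behaviour being discussed, here is a minimal sketch in plain Java. It is not the code from this PR: the class name `LogTail`, the method `tail`, and the sample log lines are all hypothetical, and it only shows the "keep the last n lines" semantics, not how the Control Service actually reads execution logs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper (an assumption, not part of this PR):
// returns only the last n lines of a log, like `tail -n <n>`.
final class LogTail {

    static List<String> tail(Iterable<String> lines, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Sliding window that never holds more than n lines.
        Deque<String> window = new ArrayDeque<>(n);
        for (String line : lines) {
            if (window.size() == n) {
                window.removeFirst(); // drop the oldest line
            }
            window.addLast(line);
        }
        return new ArrayList<>(window);
    }

    public static void main(String[] args) {
        // Made-up log lines, for illustration only.
        List<String> log = List.of("start", "step 1", "step 2", "failed");
        System.out.println(tail(log, 2)); // prints [step 2, failed]
    }
}
```

A tail -f style API would instead have to keep the connection open and push new lines as they appear, which is the extra complexity mentioned above.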

…c/main/java/com/vmware/taurus/datajobs/DataJobsExecutionController.java
@mivanov1988
Collaborator

Agreed. I found some simple implementations of a streaming API that could be useful - https://technicalsand.com/streaming-data-spring-boot-restful-web-service/

@antoniivanov
Collaborator Author

Cool. Thanks. I was googling around and did not find anything interesting. I see I need to work on my google skills.

@antoniivanov antoniivanov enabled auto-merge (squash) September 20, 2021 15:36
@antoniivanov antoniivanov merged commit 16cdb65 into main Sep 20, 2021
@antoniivanov antoniivanov deleted the person/aivanov/control-service branch September 20, 2021 15:46