Update pyFlink documentation #313

Merged 1 commit on Mar 7, 2022
29 changes: 18 additions & 11 deletions docs/python_guide.md
@@ -1,18 +1,23 @@
# Running Python jobs using pyflink API with the Flink Operator

This guide provides how to start a job with a python application using pyflink that is unnecessary to deploy Apache Beam.
This guide demonstrates how to start a job with a python application using pyFlink without deploying Apache Beam

## Prerequisites

Apache Flink is not officially providing any dockerfile including python, you should deploy a dockerfile to pull it into k8s pod.
Apache Flink does not provide any official docker images for pyFlink, you will need to build and host your own image. A sample docker file is provided
in `images/flink/python.Dockerfile`

1. Build a Dockerfile: please follow [DockerSetup#enableing-python](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#enabling-python) in the flink docs
Alternatively:

1. Create your own Dockerfile: please follow [DockerSetup#enableing-python](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#enabling-python) in the flink docs
2. Deploy the Dockerfile to any docker registry

## Starting a job with a python file

You can start a job with a python file by specifying the `.spec.job.pythonFile` property and changing the `.spec.image.name` to yours.
The `.spec.job.pythonFile` is transformed to `python` as an argument in the flink command.
You can start a job with a python file by specifying the `.spec.job.pyFile` property. The `.spec.job.pyFile` is transformed to `python`
as an argument in the flink command.

Make sure you update `.spec.image.name` to point to your pyFlink Docker Image and registry.

```yaml
apiVersion: flinkoperator.k8s.io/v1beta1
@@ -22,16 +27,18 @@ metadata:
spec:
  ...
  image:
    name: your_dockerfile
    name: <your_dockerfile>
  ...
  job:
    pythonFile: "examples/python/table/word_count.py"
    pyFile: "examples/python/table/word_count.py"
```

## Starting a job with one or more python files

If you wrote the application with multiple python files, speicify `.spec.job.pythonModule` and `.spec.job.pythonFiles`.
If you wrote the application with multiple python files, specify `.spec.job.pyModule` and `.spec.job.pyFiles`.
These properties are transformed to `pyModule` and `pyFiles` as arguments in the flink command, respectively.
Refer to the [pyFlink CLI Docs](https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/cli/#submitting-pyflink-jobs) for further
information.

```yaml
apiVersion: flinkoperator.k8s.io/v1beta1
@@ -41,9 +48,9 @@ metadata:
spec:
  ...
  image:
    name: your_dockerfile
    name: <your_dockerfile>
  ...
  job:
    pythonModule: "word_count"
    pythonFiles: "examples/python/table"
    pyModule: "word_count"
    pyFiles: "examples/python/table"
```
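
To make the mapping described in the guide concrete, here is a minimal sketch of how the `pyFile`, `pyModule` and `pyFiles` job properties could end up as arguments to `flink run`. The helper `flink_python_args` is hypothetical and written only for illustration; it is not the operator's code, and the operator may order or combine arguments differently. It assumes the Flink CLI long options `--python`, `--pyModule` and `--pyFiles`.

```python
# Illustrative only: a hypothetical helper, not part of the Flink Operator.
# It mirrors the mapping described in the guide (job field -> Flink CLI flag).
def flink_python_args(job):
    """Return the Flink CLI arguments implied by a job spec dictionary."""
    flag_for_field = {
        "pyFile": "--python",      # single entry-point script
        "pyModule": "--pyModule",  # module that contains the entry point
        "pyFiles": "--pyFiles",    # extra files or directories for the job
    }
    args = []
    for field, flag in flag_for_field.items():
        value = job.get(field)
        if value:  # skip fields that are absent or empty
            args += [flag, value]
    return args


single = flink_python_args({"pyFile": "examples/python/table/word_count.py"})
multi = flink_python_args(
    {"pyModule": "word_count", "pyFiles": "examples/python/table"}
)

print("flink run " + " ".join(single))
print("flink run " + " ".join(multi))
```

Under these assumptions, the single-file job corresponds to a `--python` argument, and the multi-file job to `--pyModule` together with `--pyFiles`.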