Commit
S3 file sink sample update.
Added instructions on how to test the S3 sink locally using PyFlink.
Minor update to the sample code.
Qing Liu committed Feb 7, 2023
1 parent fa4107a commit b5a5730
Showing 2 changed files with 8 additions and 3 deletions.
7 changes: 7 additions & 0 deletions pyflink-examples/StreamingFileSink/README.md
@@ -0,0 +1,7 @@
# PyFlink local testing - adding file system support for S3 buckets

To test the S3 file sink locally, add the S3 file system plugin to the PyFlink `lib` directory.

1. Download an S3 file system implementation such as S3 FS Hadoop from the Maven repository [here](https://mvnrepository.com/artifact/org.apache.flink/flink-s3-fs-hadoop). (Pick a version that matches your apache-flink version.)
2. Copy the downloaded jar file (e.g. `flink-s3-fs-hadoop-1.15.2.jar`) to the PyFlink `lib` directory; see the sketch after this list if you need to locate it.
   1. For miniconda3, the directory is at `~/miniconda3/envs/local-kda-env/lib/python3.8/site-packages/pyflink/lib/`.
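
If you are unsure where your environment's `lib` directory lives, a minimal sketch like the one below (illustrative, not part of this example) can print it:

```python
# Illustrative sketch: print the lib directory of the installed pyflink
# package, i.e. where flink-s3-fs-hadoop-<version>.jar should be copied.
import os
import pyflink

lib_dir = os.path.join(os.path.dirname(os.path.abspath(pyflink.__file__)), "lib")
print(lib_dir)
```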
4 changes: 1 addition & 3 deletions pyflink-examples/StreamingFileSink/streaming-file-sink.py
@@ -39,9 +39,7 @@
"pipeline.jars",
"file:///"
+ CURRENT_DIR
+ "/lib/flink-sql-connector-kinesis-1.15.2.jar;file:///"
+ CURRENT_DIR
+ "/plugins/flink-s3-fs-hadoop/flink-s3-fs-hadoop-1.13.2.jar",
+ "/lib/flink-sql-connector-kinesis-1.15.2.jar"
)

table_env.get_config().get_configuration().set_string(
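
After this change, only the Kinesis connector jar is passed via `pipeline.jars`; S3 support now comes from the plugin jar copied into PyFlink's `lib` directory (see the README above). A hedged sketch of a filesystem sink writing to S3 under that setup, where the schema, bucket name, and format are placeholder assumptions rather than code from this repo:

```python
# Illustrative sketch, not from this repo: with flink-s3-fs-hadoop available,
# a filesystem sink table can write directly to an s3a:// path.
from pyflink.table import EnvironmentSettings, TableEnvironment

table_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

table_env.execute_sql("""
    CREATE TABLE s3_output (
        ticker STRING,
        price DOUBLE
    ) WITH (
        'connector' = 'filesystem',
        'path' = 's3a://my-example-bucket/streaming-file-sink/',
        'format' = 'json'
    )
""")
```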
