GitHub - lkuligin/dataproc-pubsub-spark-streaming

In this tutorial, you learn how to deploy an Apache Spark streaming application on Cloud Dataproc and process messages from Cloud Pub/Sub in near real time. The system you build in this scenario generates thousands of random tweets, identifies trending hashtags over a sliding window, saves the results in Cloud Datastore, and displays them on a web page.
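
To make the "trending hashtags over a sliding window" idea concrete, here is a minimal sketch in plain Python. It is not the Spark code from this repository: the class name `SlidingHashtagCounter`, the hashtag regex, and the 60-second window are all assumptions chosen for illustration, and it assumes tweets arrive with non-decreasing timestamps.

```python
import re
from collections import Counter, deque

# Hypothetical pattern: a '#' followed by word characters.
HASHTAG_RE = re.compile(r"#\w+")


class SlidingHashtagCounter:
    """Counts hashtags seen during the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, hashtag) pairs in arrival order
        self.counts = Counter()  # hashtag -> occurrences inside the window

    def add(self, timestamp, text):
        """Record every hashtag found in one tweet (case-insensitive)."""
        for tag in HASHTAG_RE.findall(text.lower()):
            self.events.append((timestamp, tag))
            self.counts[tag] += 1

    def top(self, now, n=10):
        """Evict entries that fell out of the window, then return the top n."""
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]
        return self.counts.most_common(n)


counter = SlidingHashtagCounter(window_seconds=60)
counter.add(0, "Loving #Spark and #dataproc")
counter.add(30, "#spark streaming")
counter.add(70, "#pubsub and #spark")
print(counter.top(now=70, n=2))  # the tweet from t=0 is outside the window
```

In the tutorial's pipeline, the windowing is handled by the Spark streaming application on Cloud Dataproc; the sketch only shows the counting logic that such a window implies.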

Please refer to the related article for all the steps to follow in this tutorial: [INSERT LINK WHEN PUBLISHED]

Contents of this repository:

http_function: JavaScript code for the HTTP function deployed on Cloud Functions.
spark: Scala code for the Apache Spark streaming application.
tweet-generator: Python code for the randomized tweet generator.

