Take your first steps towards discovering, learning, and using Apache Spark 3.0.
This carefully structured course takes a live-coding approach and explains
all the core concepts needed along the way. We will begin with real-time
stream processing concepts and the Spark Structured Streaming APIs and
architecture. We will work with file
streams and the Kafka source, and integrate Spark with Kafka. Next, we will
learn about stateless and stateful streaming transformations, and then cover
windowing aggregates in Spark Structured Streaming. After that, we will cover
watermarking and state cleanup, followed by streaming joins and aggregation,
including how to handle memory problems with streaming joins. Finally, we will
learn to create arbitrary streaming sinks.
By the end of this course, you will be able to create real-time stream
processing applications using Apache Spark. All the resources for the course
are available at
https://github.com/PacktPublishing/Real-time-stream-processing-using-Apache-Spark-3-for-Python-developers