To process continuous streams of data from sources such as HDFS directories, TCP sockets, Kafka, Flume, Twitter, etc., Spark offers two approaches. Spark Streaming, in general, works on what is called a micro-batch: the stream pipeline is registered with a set of operations, and Spark polls the source after every batch duration…
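The micro-batch model can be sketched in plain Python as a toy simulation (this is not the Spark API; `poll_source`, `batch_duration`, and the word-count operation are illustrative stand-ins for a real source and a registered transformation):

```python
import time
from collections import Counter

def micro_batch_stream(poll_source, process_batch, batch_duration, num_batches):
    """Poll the source once per batch interval and apply the registered
    operation to whatever records arrived in that window (micro-batch model)."""
    results = []
    for _ in range(num_batches):
        time.sleep(batch_duration)            # wait one batch interval
        batch = poll_source()                 # drain records arrived since last poll
        results.append(process_batch(batch))  # run the registered operation
    return results

# Toy source: pre-loaded "arrivals", one chunk handed out per poll.
arrivals = [["spark", "kafka"], ["spark", "flume", "spark"]]
def poll_source():
    return arrivals.pop(0) if arrivals else []

# Registered operation: word count over each micro-batch.
word_count = lambda batch: dict(Counter(batch))

out = micro_batch_stream(poll_source, word_count, batch_duration=0.01, num_batches=2)
print(out)  # [{'spark': 1, 'kafka': 1}, {'spark': 2, 'flume': 1}]
```

Each iteration corresponds to one batch interval: records that arrived during the interval form the micro-batch (an RDD in real Spark Streaming), and the registered operations run on it before the next poll.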