Thanks for the video, Sir. I'm new to Spark and have a doubt about checkpointing. You said we use checkpoints for fault tolerance. As you described, I configured a checkpoint location, and my application is running: it queries streaming data and stores the output in a sink. Now suppose something goes wrong and the problem lasts for 30 minutes, for example a bug in my Spark query or an issue with the sink. We cannot process data for those 30 minutes, but the source keeps producing events. During this 30-minute failure, where is the streamed data stored? And given the bug in my query, how will the checkpoint work, and how do I recover those 30 minutes of data once the streaming application comes back up?
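To make the scenario in this question concrete, here is a minimal toy model in plain Python. It is not Spark code. It assumes the source is replayable (for example a Kafka topic whose retention is longer than the outage), so the events produced during the 30 minutes stay in the source and the checkpoint only records how far processing got. All names here (`ReplayableSource`, `run_once`, `offsets.json`) are hypothetical and exist only for illustration.

```python
import json
import os
import tempfile


class ReplayableSource:
    """Toy stand-in for a replayable source such as a Kafka topic: it keeps every event."""

    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)

    def read_from(self, offset):
        return self.events[offset:]


def load_offset(checkpoint_path):
    """Return the next offset to process, or 0 if no checkpoint exists yet."""
    if not os.path.exists(checkpoint_path):
        return 0
    with open(checkpoint_path) as f:
        return json.load(f)["next_offset"]


def save_offset(checkpoint_path, next_offset):
    with open(checkpoint_path, "w") as f:
        json.dump({"next_offset": next_offset}, f)


def run_once(source, sink, checkpoint_path, transform):
    """Process everything after the checkpointed offset, then advance the checkpoint."""
    start = load_offset(checkpoint_path)
    batch = source.read_from(start)
    # If transform raises (a buggy query), nothing is written and the offset is not advanced.
    results = [transform(event) for event in batch]
    sink.extend(results)
    save_offset(checkpoint_path, start + len(batch))
    return len(batch)


def buggy_query(event):
    raise ValueError("bug in the query")


def fixed_query(event):
    return event.upper()


def demo():
    checkpoint_path = os.path.join(tempfile.mkdtemp(), "offsets.json")
    source, sink = ReplayableSource(), []

    source.append("a")
    run_once(source, sink, checkpoint_path, fixed_query)  # healthy run

    # The "30-minute" outage: events keep arriving, but the query fails.
    source.append("b")
    source.append("c")
    try:
        run_once(source, sink, checkpoint_path, buggy_query)
    except ValueError:
        pass  # the application is down; the checkpoint still points at offset 1

    # After the fix, the restart resumes from the checkpointed offset.
    run_once(source, sink, checkpoint_path, fixed_query)
    return sink


if __name__ == "__main__":
    print(demo())
```

In this sketch the backlog lives in the source, not in the checkpoint. Spark Structured Streaming follows the same broad idea with a replayable source, although its real checkpoint also holds state and commit information, and some query changes may not be compatible with an existing checkpoint.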
https://www.youtube.com/watch?v=V0JfKa9pTYM