Spark Architecture in 3 minutes | Spark components | How Spark works

Science and Technology

Spark is one of the most prominent and widely used processing frameworks in the big data world. This video explains the core components and architecture of Spark with a real-world example in just 3 minutes.

Comments: 124

  • @satishchippa · 3 years ago

    Excellent way of explaining things in a most simplified manner. Looking forward to more videos on Spark.

  • @BigDataThoughts · 3 years ago

    Thanks Satish

  • @deepalirathod4929 · 4 months ago

    Finally it got cleared to me after reading here and there . thank you .

  • @apurvgolatgaonkar6722 · 1 year ago

    Ma'am you taught amazing 😍😍 very less time consuming lecture but perfect... Keep it up

  • @bitthal24 · 3 years ago

    I haven't seen such a lucid way of explaining such a complex concept. Great work!

  • @BigDataThoughts · 3 years ago

    Thanks Bitthal

  • @NIYANTAjmp · 2 years ago

    very simple and nice explanation. Thank you for posting this video

  • @theamithsingh · 1 year ago

    finally, a video that simplifies spark, amazing, keep the videos coming please!!

  • @BigDataThoughts · 1 year ago

    Thanks

  • @showbhik9700 · 2 years ago

    This is the only video on KZread which clarified my doubts. Thanks!!

  • @BigDataThoughts · 2 years ago

    Thanks showbhik

  • @mehrozalam94 · 2 years ago

    Great video, I have learned a lot. Thank you

  • @sandeepchoudhary4900 · 2 years ago

    Very nice video and it covers everything related to the Spark architecture in just 5 minutes. Keep sharing new videos.

  • @BigDataThoughts · 2 years ago

    Thanks sandeep

  • @seemanthinin448 · 3 years ago

    Simple example and easy way of explaining an important concept.. thanks!

  • @BigDataThoughts · 3 years ago

    Thanks seemanthini

  • @ayyappareddymuthikepalli4261 · 3 years ago

    Nice explanation.. Pls keep videos 🎥 like this

  • @nareshkumarbattula5847 · 3 years ago

    It's the best video I've seen so far on spark architecture..awesome..keep going..

  • @BigDataThoughts · 3 years ago

    Thanks

  • @SagarSingh-ie8tx · 1 year ago

    Example was very good for beginners

  • @rahulkoley9447 · 11 months ago

    Very Informative, with full of clarity, Thank you.

  • @BigDataThoughts · 11 months ago

    thanks

  • @vsriga82 · 3 years ago

    Lot of information presented in simple way for everyone to understand. 👍

  • @BigDataThoughts · 3 years ago

    Thanks sriganesh

  • @desiengineerashish4908 · 2 years ago

    This made it far easier to understand Spark architecture. Thank you ma'am, you are great

  • @BigDataThoughts · 2 years ago

    Thanks Ashish

  • @mohnishverma87 · 3 months ago

    Just woow, very simple explanation of a complex cluster overview.. Thanks.

  • @BigDataThoughts · 3 months ago

    Thanks

  • @adityasisodiya3000 · 1 year ago

    Amazing content ! Really appreciate the understanding and approach to explain...looking fwd to more

  • @BigDataThoughts · 1 year ago

    Thanks Aditya

  • @dancingmoveswithdhruv3649 · 2 years ago

    Very clearly explained, really appreciate all your efforts

  • @BigDataThoughts · 2 years ago

    Thanks Dhruv

  • @puneetojha1195 · 2 years ago

    This video is better than 1 hour course on spark . Thanks

  • @BigDataThoughts · 2 years ago

    Thanks puneet

  • @Azureandfabricmastery · 3 years ago

    Simple and easy to understand. Thanks. I like doodle way of explaining concepts :)

  • @BigDataThoughts · 3 years ago

    Thanks sheik

  • @trainingt9855 · 2 years ago

    Great video

  • @nahomg.4191 · 3 months ago

    I wish I could give 1000 likes. You’re an excellent teacher!

  • @BigDataThoughts · 3 months ago

    Thanks

  • @lakshmikanthavilalakumar · 3 years ago

    Beautifully explained short video 👏👏

  • @BigDataThoughts · 3 years ago

    thanks lakshmikanth

  • @athar5867 · 2 years ago

    Your videos really helped me in clearing my interviews and getting the job; now I need some support for my new job. Could you post videos on how PySpark class objects work in the backend, how Parquet/CSV reading and writing work in a distributed environment (for example, which data is read by each executor), how to paginate ordered data, etc.? If possible, please share your LinkedIn profile URL or suggest some way to connect.

  • @minaksheebagul4938 · 2 years ago

    just love the way u explained it ;) Really appreciated mam.

  • @BigDataThoughts · 2 years ago

    Thanks minakshee

  • @samk_jg · 2 years ago

    Amazing!

  • @peekagyan · 2 years ago

    Nice explanation. For 1 GB of input data in a batch job, how can we decide the cluster size, the number of nodes, or the number of executors? Could you please explain? Thanks, ma'am.

  • @rovashri566 · 17 days ago

    How did you make such a good visual explanation? Which tool did you use to draw the sketches? Please guide 🙏

  • @askdoubts6359 · 3 years ago

    Great, nice explanation

  • @krishnavardhandasari2694 · 1 year ago

    Thank you mam for excellent way of teaching Spark.

  • @BigDataThoughts · 1 year ago

    Thanks

  • @ankitachauhan6084 · 11 months ago

    the best explanation ever great work !

  • @BigDataThoughts · 11 months ago

    Thanks

  • @iamramanr4s · 11 months ago

    way of explanation is .....just amazing

  • @BigDataThoughts · 11 months ago

    Thanks

  • @sanjoydas007 · 2 years ago

    Very helpful, it would be great if you can take an example and illustrate how the data chunking happens

  • @gkethanvarma889 · 2 years ago

    Thank you so.................. much for this

  • @swapnilchilwant6867 · 1 year ago

    Thank you ma'am..👍

  • @SaurabhKumar-mc1is · 3 years ago

    Vividly explained. Thanks mam

  • @BigDataThoughts · 3 years ago

    Thanks saurabh

  • @neerajmishra6828 · 1 year ago

    Saw this video.. content looks promising... great job

  • @BigDataThoughts · 1 year ago

    Thanks

  • @sheereenhamza3700 · 3 years ago

    Thank you for such a goooood explanation :D

  • @BigDataThoughts · 3 years ago

    thanks sheereen

  • @shreyamoghe6893 · 3 years ago

    Great video! I have one question though. Is my understanding correct that each student who got a coin bag corresponds to a data partition, i.e. 1 student = 1 data partition?

  • @BigDataThoughts · 3 years ago

    1 data partition is operated on by 1 slot/task
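The partition-to-task mapping can be sketched in plain Python, without Spark itself. This is only an illustration under stated assumptions: the names `split_into_partitions`, `count_task`, and `run_job` are hypothetical, the thread pool stands in for the executor slots, and the coin values are made up to match the coin-bag analogy.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(items, num_partitions):
    """Split a list into num_partitions roughly equal, contiguous chunks."""
    size, extra = divmod(len(items), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        end = start + size + (1 if i < extra else 0)
        partitions.append(items[start:end])
        start = end
    return partitions


def count_task(partition):
    """One task: the same operation, applied to exactly one partition."""
    return sum(partition)


def run_job(items, num_partitions, slots):
    """Run one task per partition on a pool of `slots` workers, then combine."""
    partitions = split_into_partitions(items, num_partitions)
    with ThreadPoolExecutor(max_workers=slots) as pool:
        partial_sums = list(pool.map(count_task, partitions))
    return partitions, partial_sums, sum(partial_sums)


# Hypothetical input: ten coins worth 1..10, split into 4 bags, counted on 2 slots.
coins = list(range(1, 11))
partitions, partial_sums, total = run_job(coins, num_partitions=4, slots=2)
print(partitions)
print(partial_sums, total)
```

With 4 partitions and only 2 slots, two tasks run at a time and the remaining partitions wait for a free slot, so a partition maps to a task rather than permanently to a worker.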

  • @vemulasunayana904 · 9 months ago

    Excellent example 👏

  • @BigDataThoughts · 9 months ago

    Thanks

  • @tanushreenagar3116 · 1 year ago

    VERY HELPFUL BEST EXPLANATION

  • @BigDataThoughts · 1 year ago

    Thanks

  • @nishchaysharma5904 · 7 days ago

    Thank you for this video.

  • @BigDataThoughts · 4 days ago

    Thanks

  • @MrSmarthunky · 3 years ago

    Really informative, Shreya. One quick question: will stages run sequentially, depending on the use case, while tasks run in parallel?

  • @BigDataThoughts · 3 years ago

    Thanks Madhu. Stages may run sequentially or in parallel, depending on whether they depend on one another. Typically a stage has multiple tasks running in parallel, each on a different partition of the data but all performing the same set of operations that the stage contains.
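The dependency rule for stages can be sketched with Python's standard `graphlib` module (Python 3.9+). This is not Spark's scheduler, only an illustration of the idea; the stage names and the `schedule_in_waves` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical job: each stage maps to the set of stages it depends on.
stage_dependencies = {
    "read_orders": set(),
    "read_customers": set(),
    "join": {"read_orders", "read_customers"},
    "aggregate": {"join"},
}


def schedule_in_waves(dependencies):
    """Group stages into waves; stages in the same wave do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every stage whose parents have finished
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = schedule_in_waves(stage_dependencies)
print(waves)
```

The two read stages have no dependency on each other, so they land in the same wave and could run side by side, while `join` and `aggregate` each have to wait for their parent stages.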

  • @moughosh3640 · 1 year ago

    Extremely good explanation

  • @BigDataThoughts · 1 year ago

    Thanks mou

  • @upskillwithchetan · 3 years ago

    Great explanation Ma'am, please add more videos and arrange it in seq. under playlist

  • @BigDataThoughts · 3 years ago

    Thanks Chetan. yes there are more videos coming up. stay tuned

  • @toandao7113 · 9 months ago

    This was marked to know that I'm here on 3/10/2023

  • @sandeepmullangi4413 · 2 years ago

    Nice video. Really liked it. So you said one node can act as the driver. I want to know what the best practice is here. I usually submit jobs by SSHing into the master node (at least in GCP Dataproc) and then submitting the job. So should I consider my master node as the driver? Is it right to do it that way?

  • @BigDataThoughts · 2 years ago

    To give an example with YARN: when you submit a Spark job in cluster mode, the container where the Application Master runs acts as the master node (the driver runs there), and the containers where the executor processes run the tasks are called slave nodes. When the job is submitted, spark-submit first calls the Resource Manager, which in turn starts the Application Master, and from there the driver takes over.
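A YARN cluster-mode submission could be assembled as sketched below. The `spark-submit` flags used are standard ones, but the helper `build_spark_submit`, the script name `count_coins.py`, and the resource numbers are hypothetical placeholders. For comparison, with `--deploy-mode client` the driver would instead run on the machine where `spark-submit` is invoked.

```python
def build_spark_submit(app_path, num_executors, executor_cores, executor_memory):
    """Build the argument list for submitting an application to YARN in cluster mode."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",  # driver runs inside the YARN Application Master
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
        "--executor-memory", executor_memory,
        app_path,
    ]


# Hypothetical application and sizing, for illustration only.
command = build_spark_submit("count_coins.py", 4, 2, "2g")
print(" ".join(command))
```

The list form can be passed to `subprocess.run` on a machine that has Spark installed; here it is only printed.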

  • @vikastangudu712 · 1 year ago

    one of the bests

  • @BigDataThoughts · 1 year ago

    Thanks vikas

  • @nareshb5859 · 3 years ago

    Very Nice Explanation

  • @BigDataThoughts · 3 years ago

    Thanks Naresh

  • @hemanthkumarreddyedde · 2 years ago

    Superb delivery

  • @BigDataThoughts · 2 years ago

    Thanks Hemanth

  • @vedantshirodkar · 1 year ago

    Ma'am, I have one question. If Spark has to write data to a SQL database and our data is spread across multiple worker nodes, is it the driver that establishes a single connection with the SQL DB, or do the worker nodes establish multiple parallel connections?

  • @BigDataThoughts · 1 year ago

    When Spark writes data to a SQL database through the JDBC data source, the driver plans and coordinates the write, but it does not funnel the data through a single connection. Each task running on the executors opens its own connection and writes its partition of the data, so the worker nodes establish multiple parallel connections.
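A rough, Spark-free sketch of per-partition writes follows. It reflects how Spark's JDBC data source is generally understood to behave: each task opens its own database connection and writes only its partition, and the `numPartitions` option caps the number of concurrent connections. Everything here is an assumption for illustration: SQLite stands in for the SQL database, the `orders` table and the helper names are hypothetical, and the "tasks" run one after another rather than in parallel.

```python
import os
import sqlite3
import tempfile


def write_partition(db_path, rows):
    """One task: open its own connection and insert only its partition's rows."""
    conn = sqlite3.connect(db_path)
    try:
        conn.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
    return 1  # this task opened exactly one connection


def write_all(partitions):
    """Create the table once, then let every partition write independently."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        setup = sqlite3.connect(db_path)
        setup.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
        setup.commit()
        setup.close()

        # Sequential here for simplicity; on a cluster these run concurrently.
        connections = sum(write_partition(db_path, rows) for rows in partitions)

        check = sqlite3.connect(db_path)
        row_count = check.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
        check.close()
    finally:
        os.remove(db_path)
    return connections, row_count


# Hypothetical data already split into three partitions.
partitions = [[(1, 10.0), (2, 20.0)], [(3, 30.0)], [(4, 40.0), (5, 50.0)]]
connections, row_count = write_all(partitions)
print(connections, row_count)
```

Three partitions lead to three separate connections, which is why reducing the number of partitions before a write is a common way to limit the load on the target database.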

  • @vedantshirodkar · 1 year ago

    @BigDataThoughts Thank you, ma'am, for the detailed explanation.

  • @iwonazwierzynska4056 · 10 months ago

    Excellent video :)!

  • @BigDataThoughts · 10 months ago

    Thanks

  • @ThankGod143 · 1 year ago

    Extraordinary mam

  • @BigDataThoughts · 1 year ago

    Thanks harika

  • @vidyaradhakrishnan5611 · 2 years ago

    Very nice explanation,

  • @BigDataThoughts · 2 years ago

    Thanks Vidya

  • @shivratanmishra1459 · 3 years ago

    Great work ma'am

  • @BigDataThoughts · 3 years ago

    Thanks Shivratan

  • @RaviYadav-cx2pb · 1 year ago

    Amazing explanation mam 😊😊👍

  • @BigDataThoughts · 1 year ago

    Thanks ravi

  • @Ramakrishna410 · 2 years ago

    One executor with one core is assigned 2 partitions, so they will execute one by one. My question is: if there are 10 tasks, will they execute in parallel or sequentially at the partition level?

  • @BigDataThoughts · 2 years ago

    A task operates on one partition of data, and tasks do run in parallel. If you have multiple cores, you can specify how many cores an executor will use. The number of concurrent tasks an executor can run is equal to the number of cores assigned to it.
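The arithmetic behind this can be worked through in a few lines. This sketch assumes each task needs one core (Spark's default for `spark.task.cpus`) and that all tasks take about the same time; `task_waves` is a hypothetical helper, not a Spark API.

```python
import math


def task_waves(num_tasks, num_executors, cores_per_executor):
    """Return (slots, waves): concurrent task slots and the rounds needed to finish."""
    slots = num_executors * cores_per_executor  # one task per core at a time
    waves = math.ceil(num_tasks / slots)
    return slots, waves


# The case from the question: 10 tasks on one executor with one core.
print(task_waves(10, 1, 1))
# The same 10 tasks on 2 executors with 2 cores each.
print(task_waves(10, 2, 2))
```

With a single core there is only one slot, so the 10 tasks effectively run one after another; with 4 slots the same job finishes in 3 rounds, the last of which uses only 2 of the slots.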

  • @mohanmannam · 2 years ago

    good explanation...

  • @BigDataThoughts · 2 years ago

    Thanks

  • @mdatasoft1525 · 2 months ago

  • @hlearningkids · 4 months ago

    Very nice 👍

  • @BigDataThoughts · 4 months ago

    Thanks

  • @hlearningkids · 4 months ago

    @BigDataThoughts Did you also explain BigQuery in this style? One possible improvement to this video would be a slower summary. Please don't be hurt by my comment. You did really well in the video. Excellent explanation.

  • @SandyRocker · 1 year ago

    Thanks a lot mam

  • @BigDataThoughts · 1 year ago

    Thanks Sandy

  • @SandyRocker · 1 year ago

    Subscribed ✅

  • @bintangmuhammad7082 · 2 years ago

    Can you please turn on the subtitles? thank you

  • @ur8946 · 3 years ago

    Could you please explain more about partitions?

  • @BigDataThoughts · 3 years ago

    The dataset is divided into partitions, and each partition is the unit of data on which a task works. That is the input to the task.

  • @SM-mq1iq · 2 years ago

    Can you make it slower so it is easier to follow? I felt this was too fast to pick up the terms.

  • @trainingt9855 · 2 years ago

    Can you help in understanding RDD

  • @BigDataThoughts · 2 years ago

    RDDs are Resilient Distributed Datasets, and they are the lowest-level abstraction in Spark. Check this video: kzread.info/dash/bejne/dqmLqaSGdpqngsY.html

  • @RameshKumar-ng3nf · 3 months ago

    At the start of the video I was so happy seeing all the diagrams. Later I got fully confused, it felt complicated, and I didn't understand well 😢

  • @vibhad-cv4sf · 8 months ago

    Great Video....!! Appreciate your efforts!🎉 One question, Where does a cluster manager fit in in this architecture? What role does it play in comparison with driver?

  • @BigDataThoughts · 8 months ago

    The cluster manager's job is to provide resources for job execution (for example YARN or Mesos). The driver is the one that controls the overall job execution, including which executors take part in the job.

  • @vibhad-cv4sf · 8 months ago

    @BigDataThoughts ohh okay!! Thank you!!

  • @sumonmal009 · 3 months ago

    Good playlist for Spark kzread.info/head/PL1RS9FR9qIPEAtSWX3rKLVcRWoaBDqVBV

  • @BigDataThoughts · 3 months ago

    Thanks

  • @harigovindk · 2 months ago

    18/april/2024
