Advancing Spark - How to pass the Spark 3.0 accreditation!

With the announcement of Spark 3.0 comes a new certification - Accredited Developer for Apache Spark 3.0! Simon recently took the exam and is here to share some advice and study notes about how you too can become a certified Spark 3.0 Developer!
If you're interested in Spark/Databricks training, don't forget to check out our website www.advancinganalytics.co.uk/... or just get in touch to find out when our next public courses are happening!
Planning on taking the exam? Already certified? Let us know in the comments!

Comments: 101

  • @Simondoubt4446
    2 years ago

    This is great. Thank you so much for posting such helpful information!

  • @TheSQLPro
    3 years ago

    Nice simple explanation to help map out my certification journey. Thanks!

  • @purushothamchanda898
    3 years ago

    Too good, now I've got enough confidence to hit the exam. Thank you

  • @arthur95chionh
    2 years ago

    Awesome pictorial explanation of the physical architecture. The explanation of slots and how they relate to tasks was super enlightening. Thank you very much!!! :)

  • @senthilkumarpalanisamy365
    4 years ago

    Superb explanation with much clarity. I've not seen anything like this in any tutorial. Thanks for posting it. We need more from you 👌👌👏👏 I will recommend this channel to my whole office team.

  • @paxnene
    1 year ago

    Thanks! Great video! I loved Spark Architecture's explanation (4:19)

  • @fernandosouza2388
    3 years ago

    This video helped me a lot to take the exam, thank youuu!!!

  • @BangaruJeevitam
    1 month ago

    Thank you so much for the book recommendation. I would also highly recommend the same book, and making your own notes from it. It took me 3 weeks of preparation to pass the exam. Thank you so much 🙏🏻

  • @gustavorocha9774
    2 years ago

    Best explanation I've ever seen

  • @GhernieM
    3 years ago

    Thank you a lot for this video. I am taking the exam on Wednesday. Keep your fingers crossed for me! :)

  • @GhernieM

    3 years ago

    I nailed it. If you follow this advice, you will surely pass it.

  • @sivaram8513
    3 years ago

    I got a lot of information from this video, which helped me to pass the certification today. Thank you

  • @AdvancingAnalytics

    3 years ago

    Wahey! Congrats on passing!

  • @sundarkris1320

    3 years ago

    Is it $200 for one attempt?

  • @headindata

    2 years ago

    @@sundarkris1320 correct, as of today. If you do not pass the exam, you will have to pay $200 again to retake it.

  • @pankajbaghela8903

    2 years ago

    Can you help me to pass this exam?

  • @niru9048

    2 years ago

    Hi Siva. Could you please help out on the validity aspect of this certification? If I look directly at the public badges issued to a few people, they show an expiration date 2 years from the issue date. However, a few KZread videos mention that it never expires but is tied to the specific version of Spark. Could you please help out on this? I can't seem to find clarification anywhere.

  • @najju1987
    4 years ago

    Excellent

  • @sivaram8513
    3 years ago

    Thanks for the informative video. I am preparing for the Spark Scala certification and felt the Python API docs are much better than the Scala API docs, with a lot more information and examples.

  • @edwardgelberg5438
    3 years ago

    Thanks so much for this video! I read through "The Definitive Guide" and felt ok, but not super confident, I watched this (and some of your other videos) in the week leading up to the exam, and I just passed!

  • @AdvancingAnalytics

    3 years ago

    Woohoo! Congrats on passing - glad our videos helped!

  • @niru9048

    2 years ago

    @@AdvancingAnalytics Could you please help out on the validity aspect of this certification? If I look directly at the public badges issued to a few people, they show an expiration date 2 years from the issue date. However, a few KZread videos mention that it never expires but is tied to the specific version of Spark. Could you please help out on this? I can't seem to find clarification anywhere.

  • @Loutchianooo
    1 year ago

    Thank you man :D

  • @yayatisule
    2 years ago

    The exam does not require an external webcam when taken on a laptop. This video gave me some good points for exam day. Appreciate the work being done here👍🏻

  • @AdvancingAnalytics

    2 years ago

    Ah cool - it was stated on the instructions when I originally took it, guess they've relaxed as the world has gone more remote :)

  • @priyankadevi-zl9jx
    3 years ago

    This is a great video! I have a question: since this exam will only test the DataFrame API, should we go through all the PySpark functions, or are just the DataFrame and SQL functions required? Thanks!! Expecting more videos like this from you. :)

  • @zeal0502
    3 years ago

    Thanks, I didn't even notice that there is a pdf of spark doc to use in the exam!

  • @headindata

    2 years ago

    Dewei Zhai, Databricks also recently published the actual PDF version of the spark doc you see in the exam here: www.webassessor.com/zz/DATABRICKS/Python_v2.html

  • @KZoldyck1
    1 year ago

    Learning Spark with David Guetta, tomorrow is my assessment, I hope I pass 🍀

  • @KZoldyck1

    1 year ago

    Passed!

  • @ynwtint
    4 years ago

    It helped me a lot with the prep for the Spark 3.0 certification, thanks!

  • @divyadarshan8914

    3 years ago

    Any tips on practice material besides the Definitive Guide and the official docs?

  • @Hiillevii
    3 years ago

    This was a fantastic video - thank you so much for sharing this content! Subscribed!

  • @AdvancingAnalytics

    3 years ago

    Thanks for subscribing. I am glad it helped.

  • @rigoauebreturns
    3 years ago

    Hey, great content! Quick question: Did you have any questions on Spark MLlib that required understanding of the actual algorithms or.. at all? Thanks for the info!

  • @AdvancingAnalytics

    3 years ago

    Nope, there's no requirement for knowing the data science libraries, pure spark engineering!

  • @raghuram2383
    3 years ago

    Hi Simon: Are you aware of any full-length practice exams for the Databricks certification? I would like to take one of those mock exams before diving in. Thanks

  • @2ow2ow
    3 years ago

    cool, let's get it done.

  • @AdvancingAnalytics

    3 years ago

    Good luck!

  • @mamamiakool
    2 years ago

    Any inputs on the resources to help prepare for Databricks Professional Data Engineer certification? Genuinely appreciate the inputs !!

  • @micha5781
    2 years ago

    Hi! Great content on Your channel. I was wondering if You could make a certificate comparison of Associate Developer and Associate Data Engineer (not the professional DE) in terms of what materials one should add to prepare for the Associate DE exam. Cheers! Edit: Would be nice to see your thoughts about Professional DE cert as well :)

  • @AdvancingAnalytics

    2 years ago

    Good suggestion, I've not dug into the various new certifications since making this video, probably worth revisiting now there's such a range out there. I should also probably actually run through the Professional Data Engineer cert at some point too! :D Simon

  • @micha5781

    2 years ago

    @@AdvancingAnalytics That would be great. Your material is always very helpful!

  • @hger8495
    3 years ago

    Hi, great content. Gives a good idea of the difficulty level of the exam. Does the exam contain questions on streaming?

  • @madhu1987ful

    3 years ago

    No questions on streaming

  • @joyo2122

    2 years ago

    There are different levels of exam.

  • @nikhildavis3844
    3 years ago

    Thanks for this video. I had a question: which is the best certification for Spark? Which would you recommend and why?

  • @AdvancingAnalytics

    3 years ago

    Hey - the only Spark cert I'm really aware of is the Databricks Certified Associate Developer one, you've got a choice of Scala & Python, but it's generally a good overview of the tool, digs into understanding of the engine/architecture etc - academy.databricks.com/exam/databricks-certified-associate-developer

  • @nikhildavis3844

    3 years ago

    @@AdvancingAnalytics thank you

  • @nikithabramadi6434
    3 years ago

    Could you please suggest how or where to practice the format of this test, to be prepared for managing time?

  • @AdvancingAnalytics

    3 years ago

    Hola! I've not seen any practice tests, although there may be some around! As for actual practice/preparation - Databricks have a free community edition, it's a single-node public cluster, but great for practicing: databricks.com/try-databricks

  • @divyadarshan8914
    3 years ago

    Any suggestions on how to practice? Understanding the concepts is one thing, but until you have practiced on some sample questions or problem statements, it's a bit tough to get the level of confidence needed to sit the exam.

  • @AdvancingAnalytics

    3 years ago

    Hey, sorry - missed this during the break. The best way to practice is to spin up the Databricks community edition - it's a free learning environment! The Databricks docs have a ton of example notebooks that you can import & work through the code with. After that, pick up a personal project & work it through in anger. I'm definitely a "don't learn it till I try it out myself" person! Simon

  • @lucaspelicheck2457
    1 year ago

    Good morning! Could you explain better how do you define the ideal number of partitions on a shuffle setting?

  • @girirajbagdi797
    2 years ago

    How can I access the notebook shown in the demo?

  • @adrianajimenez523
    4 years ago

    Excellent! How about low-level APIs? RDDs? Are there questions about that? Thank you.

  • @AdvancingAnalytics

    4 years ago

    Can't go into actual questions, but the exam is focused on the DataFrame API so there's no need for low-level API commands. Understanding how data is stored in RDDs & how different DataFrame transformations impact RDDs behind the scenes should put you in the right place!

  • @adrianajimenez523

    4 years ago

    @@AdvancingAnalytics thank you for your time and answer :)

  • @adrianajimenez523

    4 years ago

    @@AdvancingAnalytics I want to share with you that I passed the exam!! =D thank you for all your videos about databricks. It helped me a lot to complete my learning!

  • @AdvancingAnalytics

    4 years ago

    @@adrianajimenez523 woohoo! That's great to hear, congratulations! Glad the videos helped :) Simon

  • @nva1719

    3 years ago

    Hi guys, can you please let me know if there were questions on Delta Lake? I will be taking the exam in less than 2 weeks. I was planning to take 2.4 first and then 3.0. The only difference between them, syllabus-wise, is Delta Lake.

  • @eugenemagloire9456
    3 years ago

    I've been looking for an explanation of slots for a long time. Please save me.

  • @murifedontrun3363
    2 years ago

    Thank you for making this video. I have 2 questions: 1) Will there be questions with more than 1 correct option? 2) Do they give negative marks for incorrect answers?

  • @AdvancingAnalytics

    2 years ago

    I honestly cannot recall if there are questions with multiple correct answers, hopefully someone else can help! There are no negative marks for incorrect answers.

  • @headindata

    2 years ago

    Hello Sanjeev. There is only one correct answer per question.

  • @murifedontrun3363

    2 years ago

    @@headindata Thank you sir for responding :)

  • @jrakesh143
    1 year ago

    Is the exam available only online? Are there any test centres where we can take the exam?

  • @mindfulcreativity8613
    3 years ago

    Can you clarify if a single task can run on multiple slots? Or is it that every task should be granular enough to run on a single slot.

  • @AdvancingAnalytics

    3 years ago

    Hey - a single task can only run on one slot. That means a task cannot spread across multiple workers (which makes sense as it's data held in memory). So the size of your RDD blocks / tasks affects how neatly you can utilise the available slots across your workers. Too chunky and they don't spread evenly, too small and there's an overhead of accessing each task and things slow down. It's a tricky balance :) Simon
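
    To make the "too chunky / too small" trade-off concrete, here is a toy simulation in plain Python. It is not Spark code and not Spark's actual scheduler: the function name `run_tasks`, the `overhead_seconds` parameter and all the timings are made up for illustration. The only rule taken from the reply above is that each task occupies exactly one slot until it finishes.

```python
import heapq


def run_tasks(task_seconds, num_slots, overhead_seconds=0.0):
    """Toy model: hand each task to whichever slot frees up first.

    Returns (elapsed seconds, busy seconds per slot).
    overhead_seconds is a hypothetical fixed cost paid per task.
    """
    # Min-heap of (time the slot becomes free, slot index).
    free_at = [(0.0, slot) for slot in range(num_slots)]
    heapq.heapify(free_at)
    busy = [0.0] * num_slots
    for seconds in task_seconds:
        available, slot = heapq.heappop(free_at)  # earliest free slot
        cost = seconds + overhead_seconds
        busy[slot] += cost
        heapq.heappush(free_at, (available + cost, slot))
    elapsed = max(finish for finish, _ in free_at)
    return elapsed, busy


# The same 80 seconds of work, split three different ways across 8 slots.
chunky = run_tasks([16.0] * 5, num_slots=8)   # 5 big tasks: 3 slots sit idle
even = run_tasks([10.0] * 8, num_slots=8)     # 8 tasks: one per slot
tiny = run_tasks([0.1] * 800, num_slots=8, overhead_seconds=0.05)

print("chunky:", chunky[0], "s, idle slots:", chunky[1].count(0.0))
print("even:  ", even[0], "s")
print("tiny:  ", round(tiny[0], 2), "s")
```

    In this model the five chunky tasks take 16 s because three slots never get any work, the evenly sized tasks take 10 s, and the 800 tiny tasks come out slower than the even split purely because of the assumed per-task overhead.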

  • @gksaisaketh5413
    2 years ago

    Is it suitable for a fresher who doesn't know anything about Spark? Do we need any prior experience before taking the exam?

  • @douglassoares5671
    2 years ago

    hello, do you know about this other certification? Databricks Certified Professional Data Engineer

  • @mamamiakool
    2 years ago

    Can an executor span multiple worker nodes? Let's say that during spark-submit I asked for 4 executors and 4 cores, and the cluster has 8 nodes (2 cores each): would the "logical" executor theoretically span nodes? Or will each executor be granted 2 cores only?

  • @AdvancingAnalytics

    2 years ago

    I don't believe an executor can span machines/nodes. Lots of managed Spark platforms assume a single executor per node, as there's not much benefit to splitting a node across multiple executors.
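
    A rough way to reason about the question above, as a plain-Python sketch rather than anything Spark itself runs. It assumes the request means 4 cores per executor, treats an executor as a single process that must fit entirely on one node, and ignores memory and cluster-manager specifics. The function name `placeable_executors` is hypothetical.

```python
def placeable_executors(num_nodes, cores_per_node, executor_cores, requested):
    """Count how many of the requested executors fit, one node per executor.

    Simplified model: only CPU cores are considered.
    """
    # Whole executors that fit on a single node; cores are never pooled across nodes.
    per_node = cores_per_node // executor_cores
    return min(requested, per_node * num_nodes)


# 8 nodes with 2 cores each, asking for 4 executors.
with_4_cores = placeable_executors(8, 2, executor_cores=4, requested=4)
with_2_cores = placeable_executors(8, 2, executor_cores=2, requested=4)

print(with_4_cores, with_2_cores)
```

    Under these assumptions a 4-core executor cannot be placed on any 2-core node, while 2-core executors fit one per node, so all 4 can be granted.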

  • @akshaykhule1906
    3 years ago

    Hi, which course do I need to select to get the Databricks Spark 3.0 certificate?

  • @AdvancingAnalytics

    3 years ago

    Hey - there's a specific "Associate Developer For Apache Spark 3.0" course - academy.databricks.com/exam/databricks-certified-associate-developer

  • @jonasmedaer9166
    2 years ago

    Can you use Ctrl+F or some other search functionality on the PDF provided?

  • @AdvancingAnalytics

    2 years ago

    Not at the time - had to get really good at scrolling :D - that said, pyspark docs have changed quite a bit since this video, not sure if the format for the exam has been changed to keep up to date!

  • @rakeshdey6970
    3 years ago

    Is browsing the documentation allowed? They provide a PDF, so I am wondering if that same document can be searched in a browser... Thanks for this video, lots of information

  • @AdvancingAnalytics

    3 years ago

    I don't recall there being a search mechanism, everything is embedded in the testing program. Better to just be familiar with the docs and good at scrolling! :)

  • @veraclmartins

    3 years ago

    @@AdvancingAnalytics Hi! Just a quick question... How is the PDF version of the documentation organized? Is it divided by modules and each module with their classes, methods and attributes... or...? I don't know, any tips? :)

  • @fernandosouza2388

    3 years ago

    @@veraclmartins Did you get an answer?

  • @joyo2122
    2 years ago

    So if I do a join, then a filter, then a group, is that one job where I have to shuffle?

  • @AdvancingAnalytics

    2 years ago

    Hey! You will have one Spark job, but that job will have multiple stages. Each boundary between stages means there is a shuffle. So a join/filter/group transformation could have two shuffles, one if the join is wide, one if the group is wide. You would have one job and three stages in this case. Hope that makes sense!
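
    The "two shuffles, three stages" counting can be sketched in plain Python. This is a simplified mental model, not what Spark's planner actually does: the `WIDE` set and the `split_into_stages` helper are made up for illustration, a broadcast join would avoid its shuffle entirely, and a real sort-merge join reads its two inputs in separate upstream stages.

```python
# Transformations assumed to need a shuffle in this toy model.
WIDE = {"join", "groupBy", "orderBy", "distinct", "repartition"}


def split_into_stages(transformations):
    """Start a new stage every time a wide transformation is reached."""
    stages = [[]]
    for name in transformations:
        if name in WIDE:
            stages.append([])  # shuffle boundary: later work goes in a new stage
        stages[-1].append(name)
    return stages


stages = split_into_stages(["read", "join", "filter", "groupBy"])
shuffles = len(stages) - 1

for number, stage in enumerate(stages):
    print("stage", number, stage)
print("shuffles:", shuffles)
```

    The narrow filter stays in the same stage as the join, so the job ends up with three stages separated by two shuffles, matching the reply above.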

  • @spandans2049
    2 years ago

    This applies only to the developer associate exam, correct? Could you please share details for the developer professional one?

  • @AdvancingAnalytics

    2 years ago

    Ooh - I hadn't even seen the "Certified Professional Data Engineer" course was introduced! I haven't taken the exam, if/when I do, I'll make a video! Simon

  • @arunkay7686
    3 years ago

    Is the laptop internal camera acceptable for this exam?

  • @AdvancingAnalytics

    3 years ago

    They list an external camera as a requirement - they ask you to set it up so they can see you from the side, including your screen. Internal laptop cam might disqualify you, not sure!

  • @headindata

    2 years ago

    As far as I know the internal camera is acceptable.

  • @Texas2Nellai
    3 years ago

    Please share the exam code number for Spark 3.0

  • @AdvancingAnalytics

    3 years ago

    I don't believe it has a code - it's a certification backed by Databricks not Microsoft. I had a skim of the website, my purchase, the exam certificate etc and they all refer to it as "Databricks Certified Associate Developer for Apache Spark 3.0" - no code to be found! academy.databricks.com/exam/databricks-certified-associate-developer

  • @abdullahsiddique7787
    3 years ago

    What if I know Scala Spark and not PySpark? Does the exam allow for this?

  • @AdvancingAnalytics

    3 years ago

    Yep, there are two different flavours of the exam, one for Scala and one for Pyspark. From what we've heard, the Scala one is slightly harder as the documentation is a little harder to navigate, but if you're familiar with the Scala docs it'll be fine! Simon

  • @abdullahsiddique7787

    3 years ago

    @@AdvancingAnalytics thanks much Simon

  • @antonyraj4037
    3 years ago

    How much time is needed to prepare for this certification?

  • @AdvancingAnalytics

    3 years ago

    Depends on your level of spark experience! If you've been using spark most days for a year or so, you'll get by with a day or two of refreshing & cramming. If you're new to spark, it could take a couple of weeks of research, learning & revising. It's very hard to say!

  • @headindata

    2 years ago

    I would argue that some of the architecture questions in the exam are quite tricky, even if you have been working with Spark for a while. So, differing from Simon, I would say that you need at least a week of review, even if you have been using Spark for a while.

  • @sabkabaap7206
    4 years ago

    First

  • @madhu1987ful
    3 years ago

    If there are 8 cores available in total in worker nodes and spark default shuffle partitions is 200, what happens? How does 200 make sense when only 8 slots are available? Pls explain. Thanks

  • @AdvancingAnalytics

    3 years ago

    The 200 tasks are allocated across the workers, the slots will chunk through the tasks (so each of the 8 slots will likely process 25 tasks). So you generally want the default partitions to be a clean multiple of the number of cores as a rule of thumb. But yeah, it's likely that the 200 default isn't right for that size cluster. The modern spark engine (Spark 3.0 / Databricks runtime 7+) uses a few techniques to override the default during query execution and actually pick an appropriate number of shuffle partitions :)
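
    The arithmetic behind "each of the 8 slots will likely process 25 tasks" can be written out in plain Python. This is only a back-of-the-envelope sketch that assumes all tasks take about the same time; the helper names `waves` and `idle_slots_in_last_wave` are hypothetical. The 200 comes from the `spark.sql.shuffle.partitions` default mentioned in the question.

```python
import math


def waves(num_partitions, num_slots):
    """Rounds of tasks needed if every slot takes one task per round."""
    return math.ceil(num_partitions / num_slots)


def idle_slots_in_last_wave(num_partitions, num_slots):
    """Slots left with nothing to do in the final round."""
    remainder = num_partitions % num_slots
    return 0 if remainder == 0 else num_slots - remainder


# 200 shuffle partitions on 8 slots: a clean multiple, no idle slots.
print(waves(200, 8), idle_slots_in_last_wave(200, 8))

# 200 shuffle partitions on 12 slots: the last round leaves slots idle.
print(waves(200, 12), idle_slots_in_last_wave(200, 12))
```

    With 8 slots the 200 partitions divide into 25 full rounds. With 12 slots it takes 17 rounds and the last one only uses 8 of the 12 slots, which is why a partition count that is a clean multiple of the core count is a reasonable rule of thumb.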

  • @babuganesh2000
    3 years ago

    When you lean back, the audio quality gets worse.