Big Data Analysis with Scala and Spark

7 years ago

RDDs: Transformation and Actions

7 years ago

RDDs, Spark's Distributed Collection

7 years ago

Transformations and Actions on Pair RDDs

7 years ago

Data-Parallel to Distributed Data-Parallel

7 years ago

Shuffling: What it is and why it's important

7 years ago

Structured vs Unstructured Data

7 years ago

Datasets

Comments

@user-xl9fs3tp8y 11 months ago

Really helpful !!!

@bres6486 1 year ago

At most one key-value pair per id per node (not one key-value pair per node, as far as I understand) after using reduceByKey().
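
The point above can be sketched in plain Scala, without Spark. This is only a simulation under stated assumptions: partitions are modeled as nested `Seq`s, and the names `MapSideCombineSketch`, `combineWithinPartition`, and `reduceByKeySim` are hypothetical helpers, not Spark APIs. It shows that combining inside each partition leaves at most one pair per key per partition before the results are merged.

```scala
// Plain-Scala simulation of a map-side combine; not Spark's implementation.
// All names here are hypothetical and chosen for illustration only.
object MapSideCombineSketch {

  // Combine values that share a key within a single partition.
  // The result holds at most one pair per key for that partition.
  def combineWithinPartition[K, V](partition: Seq[(K, V)])(f: (V, V) => V): Map[K, V] =
    partition.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(f) }

  // Merge the per-partition results into one final value per key,
  // standing in for what happens after the shuffle.
  def reduceByKeySim[K, V](partitions: Seq[Seq[(K, V)]])(f: (V, V) => V): Map[K, V] =
    partitions
      .map(p => combineWithinPartition(p)(f))
      .flatMap(_.toSeq)
      .groupBy(_._1)
      .map { case (k, kvs) => k -> kvs.map(_._2).reduce(f) }

  def main(args: Array[String]): Unit = {
    // Two simulated partitions ("nodes") holding (id, amount) pairs.
    val partitions = Seq(
      Seq((1, 10), (1, 5), (2, 3)),
      Seq((1, 2), (3, 7), (3, 1))
    )
    // Each partition is reduced to one pair per id before anything is merged.
    partitions.map(p => combineWithinPartition(p)(_ + _)).foreach(println)
    println(reduceByKeySim(partitions)(_ + _))
  }
}
```

In this sketch id 1 still appears once in each partition after the local combine, which is why the bound is one pair per id per node rather than one pair per node.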

@vikastangudu712 1 year ago

you are awesome.

@mateusznowakowski6805 1 year ago

Great video

@bigdataenthusiast 1 year ago

simply great

@ddoshi39 1 year ago

Thank you so much

@damianoderin4874 2 years ago

Awesome course. Thanks a lot!

@rydmerlin 2 years ago

How can I combine queries to multiple data sources and get one result?

@rydmerlin 2 years ago

Why does it flicker so much?

@balanceresume2802 2 years ago

🤩😍🥰 heather miller

@Manapoker1 2 years ago

Thank you for this video, it helps a lot! <3

@WaterWheel360 2 years ago

commenting for the KZread algorithm

@ashwinichandran8839 3 years ago

Wonderful explanation.... waiting for many videos from you on different technologies like HIVE and PySpark

@ManikantGoutamReal 3 years ago

This is a god-level video. Thanks a lot.

@user-ep2vw2ss5y 3 years ago

My only regret is that I can't understand English.

@souravbanerjee5744 3 years ago

Can you share the link to the Scala course that is often referred to in this series?

@nageshbs8945 3 years ago

We can't say databases are structured; many NoSQL databases do not support schemas.

@Mryajivramuk 3 years ago

You are a very impressive mentor. Please do a full series on Spark and Scala, and be a part of our journey.

@madhu1987ful 3 years ago

Coalesce is a wide transformation? Can you please explain in detail? Thanks.

@andys7596 3 years ago

There are so many videos on other channels, but this one, after so many years, still has the best-value content. Thank you!

@LivenLove 3 years ago

What are the deciding factors for the number of partitions?

@LivenLove 3 years ago

The only channel where I don't increase the playback speed.

@avsbharadwaj8190 3 years ago

Why is there no mapper-side optimisation for the groupByKey operation?

@underlecht 3 years ago

Hello, at 1:30, for the "fastest" calculation you apply shuffling in line 3 and only then measure the duration. Why don't you include the shuffling in the duration? Data preparation also takes time. Unless you mean "shuffle once and for all", but in reality it is hard to imagine grouping by only one column in your calculations. Thanks.

@narendernegi7493 3 years ago

Amazing.

@gothamsudheer4751 3 years ago

Your teaching skills are excellent. You know how to teach. Thank you so much.

@oguzhan2393 3 years ago

Finally, I found good videos about Spark and Scala, and she is using crystal-clear English.

@yangmingwang160 3 years ago

You make the best videos among the Spark tutorials on KZread, thank you!

@aspait 4 years ago

We can use pre-partitioning (like hash and range) on a map RDD; how can I use it with a DataFrame?

@aneksingh4496 4 years ago

Please keep posting new videos on Spark and Scala. Your videos are awesome 👍

@DatNguyen-ry1vr 4 years ago

Gold!!

@aneksingh4496 4 years ago

Absolutely great. Please add some more videos on real-time Spark use cases. Thanks.

@pratikkawalgikar4839 4 years ago

The concept is now clear to me after searching all over the net for the last 3 months. Thanks a lot. Your videos are very simple to understand. Please upload more on Spark, as I have finished watching all your videos and they are simply superb.