Optimizing Apache Spark SQL Joins: Spark Summit East talk by Vida Ha

Comments: 10

@IndianDashCamAdventures 1 month ago
Watching in 2024
@indrajareddy4078 4 years ago
This is a gold mine! Thanks Vida Ha!!
@gounna1795 7 years ago
Thank you!
@hongxuanchen 6 years ago
It's very useful. Thank you!
@mohanp7819 6 years ago
Nice talk. Thanks.
@thinkingaloud1833 3 years ago
It's surprising to me that you need to manually tweak a theta join for it to work. You wouldn't expect that from a relational database, right? So basically, range filtering is slow on Spark even if we filter on a simple integer partition key?
@epschronos 5 years ago
Good talk, but one comment: the last thing you can say about RDDs is that they are deprecated...
@RahulSharma-datasaur 3 years ago
Agreed. They are never going to go away. The base structure will always be an RDD.
@shubhashis1000 5 years ago
Very sorry to say this, but I guess most people already know all the topics discussed... I got nothing extra from this. And by the way, that is an SM join...
@rajarshidutta8 5 years ago
Agreed!! I did not get any special insight from this talk. Any Google search, blog, or article would give a much better explanation of these problems!!