Spark Basics | Partitions

Science and Technology

Spark is a distributed computing system that is used within Foundry to run data transformations at scale. This series covers the core Spark concepts you need to know for working with data in Foundry.
In this video we introduce partitions, discuss the importance of partition sizing, demonstrate how to find the count and size of partitions for a dataset in Foundry, and describe methods for changing the number of partitions in a Spark DataFrame.
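The partition-sizing idea in the description can be sketched in a few lines of plain Python. This is only an illustration, not the method shown in the video: the helper names `suggest_partition_count` and `choose_method` are hypothetical, and the 128 MiB target is an assumed rule of thumb, not a value taken from the video or from Foundry. The PySpark calls mentioned in the comments (`df.rdd.getNumPartitions()`, `df.repartition(n)`, `df.coalesce(n)`) are the standard DataFrame methods for inspecting and changing the partition count.

```python
MIB = 1024 * 1024

# Assumption: ~128 MiB per partition is used here only as an illustrative target.
DEFAULT_TARGET_BYTES = 128 * MIB


def suggest_partition_count(total_bytes: int,
                            target_partition_bytes: int = DEFAULT_TARGET_BYTES) -> int:
    """Return how many partitions keep each one at or below the target size."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if target_partition_bytes <= 0:
        raise ValueError("target_partition_bytes must be positive")
    # Integer ceiling division; an empty dataset still gets one partition.
    return max(1, (total_bytes + target_partition_bytes - 1) // target_partition_bytes)


def choose_method(current_partitions: int, desired_partitions: int) -> str:
    """Pick a DataFrame method name for moving to the desired partition count.

    coalesce(n) merges existing partitions without a full shuffle, so it can
    only reduce the count; repartition(n) performs a full shuffle and can
    increase the count.
    """
    if desired_partitions < current_partitions:
        return "coalesce"
    if desired_partitions > current_partitions:
        return "repartition"
    return "none"


# With a real PySpark DataFrame `df`, this would be applied roughly as:
#   current = df.rdd.getNumPartitions()
#   desired = suggest_partition_count(dataset_size_in_bytes)
#   df = df.coalesce(desired) if desired < current else df.repartition(desired)
ten_gib = 10 * 1024 * MIB
print(suggest_partition_count(ten_gib))
print(choose_method(200, suggest_partition_count(ten_gib)))
```

Reducing the count with `coalesce` avoids a shuffle but can leave partitions uneven; `repartition` costs a shuffle and produces more evenly sized partitions, so the cheaper option is not always the better one.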

Comments: 9

  • @curiousMe1000, 1 year ago

    Please keep this series going. Your Spark tutorials are very useful! They make me love your product more and more.

  • @mactech816, 1 year ago

    Hi Team, I found this video really informative. I'd be really grateful if you could cover more data partitioning concepts and methods, along with some advanced best practices for working with Spark. I'm new to Spark and want to learn it thoroughly. Thanks.

  • @ENNAJIHamza, 5 months ago

    This video gave me ideas about my recurring driver OOM problems. The cause: many partitions that are too small.

  • @MinecraftGamer0990, 1 year ago

    Great video! More Hadoop videos, please!

  • @thousandsunny100, 1 year ago

    Can we get more detail on the repartition methods?

  • @BishalKarki-pe8hs, 1 month ago

    Need more videos.

  • @devaharshaveerla3100, 7 months ago

    I have a requirement to include a space in the partition. But when I write data to S3 in Parquet format with a space in the partition, it fails. Can I please have a solution?

  • @gardnmi, 1 year ago

    Use Delta Lake 2.0 and the OPTIMIZE command, and never worry about the headache of managing partition sizes or counts again.

  • @adib4361, 1 year ago

    The video quality is quite good, but I'd appreciate it if the videos were more beginner-friendly. 😀
