Understanding Delta Lake - The Heart of the Data Lakehouse

Science & Technology

The Data Lakehouse is taking the world by storm as the new data warehouse platform! In this video, I demonstrate how Delta Lake provides the core functionality of the Data Lakehouse and demystify this powerful technology.
Join my Patreon Community and Watch this Video without Ads!
www.patreon.com/bePatron?u=63...
Databricks Notebook and Data Files at:
github.com/bcafferky/shared/b...
Uploading Files to Databricks Video
• Master Databricks and ...
See my Pre Data Lakehouse training series at:
• Master Databricks and ...

Comments: 17

  • @mainakdey3893
    @mainakdey3893 a month ago

    At last somebody is clearing the confusion. Good job, Bryan!

  • @amarnadhgunakala2901
    @amarnadhgunakala2901 a year ago

    Thank you Brother, this helps people.

  • @stylish37
    @stylish37 10 months ago

    Top stuff Bryan! Thanks a lot for this playlist

  • @BryanCafferky
    @BryanCafferky 10 months ago

    YW

  • @gatorpika
    @gatorpika a year ago

    Great explanation! Thanks!

  • @BryanCafferky
    @BryanCafferky a year ago

    You're welcome!

  • @rahulberry5341
    @rahulberry5341 a year ago

    Thanks for the nice explanation

  • @BryanCafferky
    @BryanCafferky a year ago

    YW

  • @parisaayazi8886
    @parisaayazi8886 2 months ago

    Thanks, Bryan! I'm wondering how it's possible to create a CSV table using the CREATE TABLE command, which allows us to write SQL queries against it, but we can't use saveAsTable with format('csv') to achieve the same result.

  • @BryanCafferky
    @BryanCafferky 2 months ago

    Originally, Spark could not create updatable tables. Instead, it could only create a schema for a flat file like a CSV. The schema describes the data in the file so SQL SELECT statements can be used on it. You can't update the table, though, and it is not a managed table, meaning that if you drop the table for the CSV file, the file remains. Updatable tables (supporting CRUD and ACID) were added with Delta tables.
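
    A minimal PySpark sketch of that difference, assuming a Databricks/Delta Lake environment (table names and paths here are hypothetical, not from the video):

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()

        # External table over a CSV file: Spark only stores the schema and location.
        # SELECT works, but UPDATE/DELETE do not, and DROP TABLE leaves the file behind.
        spark.sql("""
            CREATE TABLE IF NOT EXISTS sales_csv (id INT, amount DOUBLE)
            USING CSV
            OPTIONS (header 'true')
            LOCATION '/mnt/raw/sales/'
        """)

        # Managed Delta table: the data is rewritten in Delta format,
        # so CRUD statements and ACID transactions are supported.
        spark.read.table("sales_csv") \
            .write.format("delta") \
            .saveAsTable("sales_delta")

        spark.sql("UPDATE sales_delta SET amount = amount * 1.1 WHERE id = 1")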

  • @parisaayazi8886
    @parisaayazi8886 2 months ago

    @BryanCafferky Thanks a lot.

  • @panzabamboo1901
    @panzabamboo1901 a year ago

    Hi Bryan, would you be able to elaborate more on the file types? I'm currently supporting ETL jobs running on Databricks and still using trial and error to figure out the file types and how to load them.

  • @BryanCafferky
    @BryanCafferky a year ago

    Hi Panza, assuming you mean source file types to be read, most file types are supported via Spark, e.g. CSV, JSON, SQL databases (via JDBC), Parquet, Delta, and Avro. Are you looking for a specific type?
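
    A short sketch of reading those formats in PySpark (paths, connection details, and table names are hypothetical; JDBC needs the matching driver on the cluster, and Avro support is built into Databricks):

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()

        df_csv     = spark.read.option("header", "true").csv("/mnt/raw/events.csv")
        df_json    = spark.read.json("/mnt/raw/events.json")
        df_parquet = spark.read.parquet("/mnt/raw/events.parquet")
        df_avro    = spark.read.format("avro").load("/mnt/raw/events.avro")
        df_delta   = spark.read.format("delta").load("/mnt/delta/events")

        # SQL databases are read over JDBC.
        df_jdbc = (spark.read.format("jdbc")
                   .option("url", "jdbc:sqlserver://myserver:1433;databaseName=sales")
                   .option("dbtable", "dbo.orders")
                   .option("user", "etl_user")
                   .option("password", "****")
                   .load())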

  • @gautamgovinda5140
    @gautamgovinda5140 2 months ago

    Cool👍

  • @user-cj2wt4mi5b
    @user-cj2wt4mi5b 6 months ago

    Thanks, this is a great video and well explained.

  • @BryanCafferky
    @BryanCafferky 5 months ago

    Thanks. In my experience, it is important to keep the original data you loaded into a DW because of 1) troubleshooting issues, 2) recovery if some part of the data fails to load (you reload from the copy), and 3) auditability (you can show what you loaded). It's especially critical if you cannot go back at a later date and retrieve that data again from the source.

  • @sajeershahul8361
    @sajeershahul8361 11 months ago

    How can I not subscribe? 👌🏽
