How does BigQuery store data?
Science & Technology
Blog post → goo.gle/3iClDVr
Partitioning docs → goo.gle/3xXF4yE
Clustering docs → goo.gle/3kGaRQR
How does BigQuery’s internal storage work? In this episode of BigQuery Spotlight, we share how BigQuery stores data so you can make informed decisions about optimizing your BigQuery storage. We’ll also cover partitioning and clustering, and how clustering allows for efficient lookups.
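To make the ideas in the description concrete, here is a minimal toy sketch in Python. It is not BigQuery's implementation or API: the table, the column names (order_date, customer_id, amount) and the helper functions are all hypothetical, and plain in-memory lists stand in for storage files. It only illustrates the general pattern: store values column by column, split rows into partitions by date, and keep each partition sorted by a clustering column so a lookup reads a narrow slice instead of scanning everything.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Toy rows: (order_date, customer_id, amount). All names are hypothetical.
ROWS = [
    ("2021-07-01", "c3", 10.0),
    ("2021-07-01", "c1", 25.0),
    ("2021-07-02", "c2", 5.0),
    ("2021-07-02", "c1", 40.0),
    ("2021-07-02", "c3", 15.0),
]


def build_table(rows):
    """Partition rows by order_date, sort each partition by customer_id
    (the stand-in for clustering), and store each partition column by column."""
    by_date = defaultdict(list)
    for row in rows:
        by_date[row[0]].append(row)

    table = {}
    for date, part in by_date.items():
        part.sort(key=lambda r: r[1])  # clustering: keep similar keys adjacent
        table[date] = {
            "customer_id": [r[1] for r in part],  # one list per column
            "amount": [r[2] for r in part],
        }
    return table


def total_for(table, date, customer):
    """Sum `amount` for one customer on one date."""
    part = table.get(date)  # partition pruning: other dates are never touched
    if part is None:
        return 0.0
    ids = part["customer_id"]
    # Because the partition is sorted by customer_id, a binary search finds
    # the matching range without scanning the whole column.
    lo = bisect_left(ids, customer)
    hi = bisect_right(ids, customer)
    return sum(part["amount"][lo:hi])  # only the needed column slice is read


table = build_table(ROWS)
print(total_for(table, "2021-07-02", "c1"))
```

In this sketch the date filter narrows the work to one partition, and the sort order narrows it further to a contiguous range inside that partition, which is the intuition behind combining partitioning with clustering.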
Timestamps:
0:00 - Intro
0:27 - Overview
0:50 - Columnar Storage
1:46 - Capacitor File Format
2:35 - Colossus Distributed File System
4:30 - Storage Optimization
5:00 - Partitioning
6:23 - Clustering
7:12 - Partitioning + Clustering
Storage optimization docs → goo.gle/2V0q7gj
Watch more episodes of BigQuery Spotlight → goo.gle/BQSpotlight
Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech
#BigQuerySpotlight
Comments: 16
Oh man. Finally a clear explanation of clustering. I've been studying for my Data Engineer cert for the last couple of months and this is the best explanation of clustering that I've read. Thanks!
Thank you for explaining how this operates with the tables and the Capacitor file format.
The video may contain an error at 1:35 - it says "OLTP" where it should probably say "OLAP" for columnar storage.
@alexyt75029
1 year ago
Exactly.
Awesome explanation
Thanks for protecting my Data 🙏👍☺️
Awesome 👍👍
thanks!
Did anyone else notice that she switched the places of the paintings in the background?
Ferrari = BigQuery in terms of Speed🔥
Does a Capacitor store a whole table in a columnar layout or does one capacitor store one column?
👍
Partitioning is a precondition for clustering...
@johanpicard3741
2 years ago
It used to be the case when clustering was added to BQ, but that changed a while ago! They can now be used independently. Users often choose to partition before thinking about clustering anyway, though :)
I am a bit disappointed. I was expecting an explanation of the streams, and of why we can't delete or update a row that was added/updated/deleted in the last 30-90 mins.
@leighajarett221
3 years ago
Sorry to hear you're disappointed - covering streaming sounds like a great idea for a future video! As for deleting or updating a row that was added / updated / deleted recently, I will pass that feedback along to our team.