How we optimize Power Query and Power BI Dataflows

Ғылым және технология

Let's go into Adam's mind as he optimizes Power Query to make some Power BI Dataflows run faster to avoid resource usage. The result is INSANE AMAZING!
Power Query M function reference
docs.microsoft.com/powerquery...
📢 Become a member: guyinacu.be/membership
*******************
Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
🎓 Guy in a Cube courses: guyinacu.be/courses
*******************
LET'S CONNECT!
*******************
-- / guyinacube
-- / awsaxton
-- / patrickdba
-- / guyinacube
-- / guyinacube
-- guyinacube.com
**Gear**
🛠 Check out my Tools page - guyinacube.com/tools/
#PowerBI #PowerQuery #GuyInACube

Пікірлер: 78

  • @GuyInACube
    @GuyInACube2 жыл бұрын

    One thing not mentioned in the video is to be aware and careful about SQL Injection.

  • @Plackyu

    @Plackyu

    2 жыл бұрын

    Would love to see a video on SQL injection + Power BI and how to mitigate potential issues

  • @kkkkkkkkkkkkkkkk-k

    @kkkkkkkkkkkkkkkk-k

    2 жыл бұрын

    Can Best Practice Analyzer catch SQL injection?

  • @ItsNotAboutTheCell
    @ItsNotAboutTheCell2 жыл бұрын

    M-ind bending! Love the Advanced Editor and some neat tricks in here too for those performance folks! Keep up the PQ videos! One happy CAT!😻😻😻

  • @daryllynch998
    @daryllynch9982 жыл бұрын

    Hi Adam, thanks for the Video. I use this approach all the time. One thing that I don slightly different is to use the List.Buffer function because sometimes, I want PQ to read the List only once to get the full Set. I find it sometimes help performance. One suggestion for each time is consider Folding Query scenario. In this situation the following statement works: #"Filter Rows" = Table.SelectRows( Source, each List.Contains( CurrencyList, [Source Column] ) ) This leads to folder query in most data sources. I also feel this can be more effective T-SQL because the filter is applied to the main table without the need to the Left or Inner Join.

  • @JCMCodaste
    @JCMCodaste2 жыл бұрын

    This is a great video! thank you, Adam. Good to see that you guys are playing more with dataflows now! I will take it as a good sign of their "health" within the Power BI roadmap! (at my own risk ;) It would be great to have more videos on how to optimize dataflows deployment when linking them across different workspaces (also where the latter are engaged in a deployment pipeline!) ..choosing between the two different connectors in those specific situations is still making me scratch my head, to be honest. Happy 2022!

  • @michelleleroux1728
    @michelleleroux17282 жыл бұрын

    Learning to optimise is my goal for 2022! Thanks, Adam.

  • @pabeader1941
    @pabeader19412 жыл бұрын

    Added this to my BI Goodies playlist. Good job and good timing for me. Have a project where this is going to come in handy. Not so much for the performance gain as an easy way to get excel into PQ.

  • @tkadosh
    @tkadosh2 жыл бұрын

    Really amazing and efficent way ... Congratulations

  • @louism.4980
    @louism.4980Ай бұрын

    Very insightful, thank you!

  • @antoniogarrido3058
    @antoniogarrido30582 жыл бұрын

    Ingenuos solution.. 👍 Really powerful dataflow trick!

  • @klaustrampedach7312
    @klaustrampedach73122 жыл бұрын

    Great video! 😊 The trick also works with a non-native query. Might come in handy 😎

  • @dbszepesi
    @dbszepesi2 жыл бұрын

    Nice, more of this please!

  • @romualdulcyfer
    @romualdulcyfer2 жыл бұрын

    That's impressive !! I have some mergers in my PQ (eg. at data cleansing) which I need to somehow optimize after your video.

  • @michalmolka
    @michalmolka2 жыл бұрын

    Really great video!

  • @bijilabs7599
    @bijilabs75992 жыл бұрын

    Wow😍 You guys are awesome!

  • @trevorc7734
    @trevorc7734 Жыл бұрын

    Brilliant!!

  • @manawaajirioghene6231
    @manawaajirioghene6231 Жыл бұрын

    This is such a great video

  • @cirilolhoycuenco6261
    @cirilolhoycuenco62612 жыл бұрын

    This is so cool!

  • @danmeiss
    @danmeiss2 жыл бұрын

    Adam's version of "unplugged". See the thought process behind problem solving...

  • @Anthony_Lecoq
    @Anthony_Lecoq2 жыл бұрын

    Great stuff Adam ;)

  • @jls14
    @jls142 жыл бұрын

    Excellent video 👍. If advanced editor is scary, you can also do the text combine as a new step too. That way you don't have to do so much text editing. But SQL injection is pretty cool. Fantastic explanation as always 👍.

  • @billcoleman2009
    @billcoleman20092 жыл бұрын

    Nice. You could probably hook straight to the table, use selectcolumns() & filter steps and let query folding do its thing to keep it clean for incremental refresh etc - that should work, right?

  • @matthieumamet3497
    @matthieumamet34972 жыл бұрын

    Great video ❤

  • @blank3786
    @blank37862 жыл бұрын

    You guys are amazing

  • @ekaterinazamosha4149
    @ekaterinazamosha41492 жыл бұрын

    wow !!!!! Thank you!

  • @TheAalouis
    @TheAalouis2 жыл бұрын

    Great Job, explaining the refresh protocol columns. I am having an pretty expensive relative join in my dataflow script (find keyword in a text field and join that). Maybe this is something for your goal 2022! :)

  • @remyschrader9286
    @remyschrader928610 ай бұрын

    In Advanced Editor: ‘PromotionKeysAsList = Promotion[PromotionKey], CurrencyKeysAsList = Currency[CurrencyKey],’ Values from a column in a table return as a list from ‘NeedValuesAsListType = TableName[ColumnName]’

  • @wmfexcel
    @wmfexcel2 жыл бұрын

    WOW! It makes a huge difference! May I know if there is any performance boost if we applied the Filters instead of doing it by Merging Queries?

  • @kebincui
    @kebincuiАй бұрын

    Super ! 👍👍❤❤

  • @MrSparkefrostie
    @MrSparkefrostie11 ай бұрын

    For anyone running into issues where your list converts to a table, this should help if your query is SQL, will need some tweaks if you use a differrent source. I am guessing it changes to table as its a live source and not sitting with load disabled Text.Combine(Sql.Database("Server", "Database", [Query = "SELECT CAST([NeededColumn] as VARCHAR) AS NeededColumn FROM [dbo].[NeededTable] WHERE Conditions = Conditions "])[NeededColumn], ", ")

  • @GuyInACube

    @GuyInACube

    11 ай бұрын

    Appreicate the call out on that. Will need to look at that more 👊

  • @MrSparkefrostie

    @MrSparkefrostie

    11 ай бұрын

    @@GuyInACube All good, this issue was driving me...bananas....

  • @ReadySteadyExcel
    @ReadySteadyExcel2 жыл бұрын

    Awesome thumbnail!!

  • @sidharh100
    @sidharh1002 жыл бұрын

    Hey Adam, I was just wondering, if you could possibly make a video regarding Significance Testing in Power Bi?

  • @coolblue5929
    @coolblue59292 жыл бұрын

    I’ve been doing this for a long time, initially in datasets (and excel) and then in dataflows. I wrote custom functions to convert from tables to SQL clauses (IN or VALUES) and a custom, multi-replace function to hack the SQL text. I also use parameters of the form :from and :to, for example, to fold back range filters into the native SQL also using the multi-replace function. Things change quickly with monthly updates but, the biggest problem with this approach has been finding the right pattern to satisfy the formula fire-wall. It would be great to compliment this video with a structured commentary on this, along with careful explanation of the potential problems with SQL injection.

  • @coolblue5929

    @coolblue5929

    2 жыл бұрын

    I would also suggest that, if doing such a trivial merge breaks M then M is seriously broken.

  • @sirojiddinsobirov5508
    @sirojiddinsobirov55082 жыл бұрын

    Insane Amazing!!!!!

  • @kasmirasmarzo
    @kasmirasmarzo2 жыл бұрын

    So useful. I’m doing merges everywhere and because I’m not a ‘real data scientist’ (I’m in regulation!) I have an itty bitty laptop. I’m constantly running out of memory.

  • @noahhadro8213
    @noahhadro82132 жыл бұрын

    Awesome video. How would I do this if I wanted to filter that table but was not using a native query. so I connect to a table in sql and then I want to filter the table based upon a column IN a list of items from an excel spreadsheet?

  • @olivierfiliatrault6027
    @olivierfiliatrault60272 жыл бұрын

    Amazing! One question: will the resulting query fold?

  • @LifeatAustralia
    @LifeatAustralia Жыл бұрын

    Amazing Video Guy in a Cube 🙂 Would you please tell me how to manage merge queries when merged two tables through two different API calls. Is there any way to optimize API calls?

  • @fluffigverbimmelt
    @fluffigverbimmelt2 жыл бұрын

    Nice, I might be able to use this on a db where DBA only let's us access via NOLOCK hint (yeah, don't get me started on that)

  • @Kierslee
    @Kierslee5 ай бұрын

    Adam, where can I find the details regarding the time and resources to run the refresh (cpu, memory, etc.)?

  • @mikemagill68
    @mikemagill685 ай бұрын

    Q. Awesome video. This is exactly the challenge I’m facing currently but m pulling in a SQL table with c. 54. million rows and then need to eliminate rows based on an external table with c. 10,000 rows. I totally get what you’ve done but can you inject a SQL ‘WHERE’ command with a comma separated list of 10,000 values? Is there an upper limit?

  • @Korallis1
    @Korallis12 жыл бұрын

    Ok so this is all great and everything but what about when you have to merge in order to get the data you require for creating relationships? Is it better to just create a SQL query to provide the exact data your after? I have such a complex model structure and as such I have a lot of merges but the source is almost always sql

  • @visheshjjain
    @visheshjjain2 жыл бұрын

    Instead of filtering, If you want a column from your an external file in your fact table, is there any other way, apart from merge, to do it in PQ? Thank you!

  • @emilymorrison4263
    @emilymorrison42632 жыл бұрын

    How do you approach this if the merge is on two or more fields?

  • @WojennyMlotek
    @WojennyMlotek2 жыл бұрын

    Yep, more optimazation please

  • @vikramanmohan121
    @vikramanmohan1212 жыл бұрын

    I have done this for multiple SQL queries where one SQL statement executed using a parameter and becomes an input for the next. One thing I learnt will make things easier is to store the quietly in a variable and pass it to source and you can paste your SQL query from your code editor directly on to the variable in Power Query

  • @9zQx86LT
    @9zQx86LT2 жыл бұрын

    Can this work with cosmos db?

  • @jenniferpavey1
    @jenniferpavey12 жыл бұрын

    This!! Omg

  • @pawewrona9749
    @pawewrona97492 жыл бұрын

    Was laughing really hard when I heard that Adam and Patrick had a "race competition"

  • @Narses3
    @Narses32 жыл бұрын

    Nice video concept to explain the mashup containers, also a nice showcase of other people Frankensteining queries/dataflows potentially over time from different people . Not sure if the exact problem you solved for the customer was as simple as this problem, but if it was then the team that manages dataflows/powerbi really should have seen this a mile off , but I guess not everyone knows everything to do with powerbi. I appreciate things are obvious in hindsight but I would be really surprised if a problem like this really did make its way to your desk.

  • @davidlopez-fe2lb

    @davidlopez-fe2lb

    2 жыл бұрын

    To your point Jonathan, I feel like the "architect" team should've have guidelines on how to build things, so monsters like this don't get created. We just brought on Power BI Desktop + Service onto our org and we've set no guidelines on how to build, so we'll see plenty of monsters like this. It's just a classic org problem of migrating processes over to the new platform asap, "we don't care how it gets done as long as its done" scenario. Luckily I watch Guy in a Cube weekly, and so far have minimized the monsters I've built.

  • @ItsNotAboutTheCell

    @ItsNotAboutTheCell

    2 жыл бұрын

    Can confirm, with the constant release of new features, it's often difficult to keep up with each nuance of the product, especially for those who aren't in Power BI for 40 hours a week (as simple or as complex as the topic may be). I always appreciate the opportunity to see the product with "fresh eyes" though and see some light bulb moments when people learn something new that can help them in the future.

  • @biexbr
    @biexbr2 жыл бұрын

    Mannn, I do this a Lot! And I mean a lot lot.

  • @matthiask4602
    @matthiask46022 жыл бұрын

    This almost like in Inception. Just with a list.

  • @DanielWeikert
    @DanielWeikert2 жыл бұрын

    Where can I see which operation is "expensive" and should be avoided? Is there a reference doc for that? Thanks

  • @Narses3

    @Narses3

    2 жыл бұрын

    Dataflow = slow, Understand what each step is doing, and what isn't folding to the source, An inner merge (filter) isn't folding to the source = all the data is being queried before then being filtered by the mashup engine There is no silver bullet , but if you understand what each step does (and isnt doing) then you can optimize, as with a lot of things it comes with experience.

  • @bbangel90
    @bbangel90 Жыл бұрын

    I have a Power App Dataflows which transforms and load data to a Dataverse table . This operation now takes around 50 minutes to complete, and I am speculating that the issue is not in the calculation in Power Query, but rather in the loading operation to the Dataverse table. How can I go about optimizing/improving the data loading performance?

  • @fsanfo
    @fsanfo11 ай бұрын

    I have this query that takes 20m to run in my Oracle PL/SQL client. And I discovered that PQuery Online evaluates querys for 10 minutes only. So what do you recommend me? Retrieve the data by chunks or something else? I'm Pro User using Power BI service.

  • @gravestoner2488
    @gravestoner2488 Жыл бұрын

    All my data comes in excel sheets, i have 1 sheet per month for 2 different data fact sheets. Is there a good way to combine these automatically without merging?

  • @MrSparkefrostie
    @MrSparkefrostie11 ай бұрын

    What I am looking for is to apply predicates against an upstream dataflow, maybe it does this already, time will tell

  • @rakeshverma-fi2ju
    @rakeshverma-fi2ju2 жыл бұрын

    Hey everyone, We are refreshing a dataset against a dataflow that is using an incremental refresh and our C drive of gateway servers is hitting 0 MB space( from 400GBs). Is there anything we are doing wrong…

  • @mohammedameen9477
    @mohammedameen94772 жыл бұрын

    Awesome Video, How can we solve this when we have two tables which we want to merge and the data is coming from a flat file like csv or txt. I would want to perform left outer join on these two table based on some condition like IF (Table1[Column] = Table2[Column], Table2[Column], 0) Suppose if we have only one column from Table2 which we want to bring in Table1. We can convert that column in a list and use List.Select and write a condition but when we want to bring two columns or more how will do that. Can you help me with this. Thanks :)

  • @olemew

    @olemew

    2 жыл бұрын

    csv file is not sql based, you cant do that

  • @asjones987
    @asjones9872 жыл бұрын

    Nice example, but curious on the syntax in the where clause. What were the “&” doing around PromotionList and CurrencList?

  • @dipeshbhundia1290

    @dipeshbhundia1290

    2 жыл бұрын

    Concatenation to access the variables

  • @pabeader1941

    @pabeader1941

    2 жыл бұрын

    They are there to 'build' the actual string that is being sent to the data source. & is the concatenate operator. It's used to combine strings together.

  • @matthiask4602
    @matthiask46022 жыл бұрын

    finally a GIAC M-Code video. #avoidthewait

  • @ItsNotAboutTheCell

    @ItsNotAboutTheCell

    2 жыл бұрын

    More M!!!

  • @Baldur1005
    @Baldur10052 жыл бұрын

    I know this against Roche's Maxim, but refresh time-outs are not worth it. I do "Group Bys" and Merges in DAX and if I want to to enrich table I use relationship or lookup function, doing those in DAX hurts performance by little margin and increase model size, but time-outs are much much worse.

  • @Phoenixspin
    @Phoenixspin Жыл бұрын

    The thumbnail makes me think of monkey brains. (Indiana Jones reference)

  • @Paul-pv8mo
    @Paul-pv8mo2 жыл бұрын

    🌈 P r o m o S M!!!

  • @PaulBailey3
    @PaulBailey32 жыл бұрын

    That's 🍌s

Келесі