Check out our weekly series: "5mins of Postgres" (published every week), where pganalyze founder and CEO Lukas Fittl walks through interesting Postgres content from the previous 7 days. We also publish Postgres presentations, our conference talks, product walk-throughs, and tutorials.
About pganalyze:
pganalyze provides deep, actionable insights into Postgres. Specializing in PostgreSQL database monitoring and optimization, it gives automatic insights into Postgres query plans, helps improve query performance with its Index Advisor, and lets you perform query drill-down analysis, observe per-query statistics, and conduct trend analysis. It integrates with self-managed Postgres servers as well as Database-as-a-Service providers like Amazon RDS.
DBAs and developers use pganalyze because it surfaces opportunities to optimize Postgres performance and gives recommendations that help speed up queries. It comes with helpful security features like PII filtering and SSO integration.
Comments
This is top-notch Postgres content. Thanks for presenting it!
Great info! Is this still included in the Postgres 17 release?
Thanks, great content as always
Hello. I guess 25% was recommended as a starting point because it's better than the default value. And of course there may not be enough memory left for other processes. As a result, 45-50% is the upper limit.
thanks for sharing
You're very welcome! Thanks for your interest!
What is a CTE?
CTE = Common Table Expression, or "WITH [name] AS ( SELECT ...) SELECT" in a query. See www.postgresql.org/docs/current/queries-with.html
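To make the syntax concrete, here's a minimal CTE example (the `orders` table and column names are made up for illustration):

```sql
-- A CTE names a subquery that the outer query can then reference like a table
WITH recent_orders AS (
    SELECT customer_id, count(*) AS order_count
    FROM orders
    WHERE created_at > now() - interval '30 days'
    GROUP BY customer_id
)
SELECT customer_id
FROM recent_orders
WHERE order_count > 5;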
Damn, that's a great explanation in 5 mins. You rocked it Lukas
That's a booby trap if I've ever seen one. Luckily the fix seems pretty easy!
What are the demerits of setting random_page_cost as low as 1.1? If there are some queries on my DB that benefit from lowering random_page_cost, are there other queries that would be hampered by this?
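One low-risk way to evaluate this question for a specific workload: the setting can be changed per-session first, so you can compare plans before touching the server-wide default (table and column names here are placeholders):

```sql
-- Lower the cost estimate for random I/O in this session only
SET random_page_cost = 1.1;
EXPLAIN ANALYZE SELECT * FROM my_table WHERE some_indexed_col = 42;

-- Revert this session to the server default and compare the plan
RESET random_page_cost;
EXPLAIN ANALYZE SELECT * FROM my_table WHERE some_indexed_col = 42;
```

Queries that could regress are typically large sequential scans that the planner now mistakenly converts to index scans, so comparing EXPLAIN ANALYZE output on both settings is the safest check.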
@pganalyze, can the same prepared transaction be utilized across multiple connections in postgres?
Interesting! Thank You!
Would prepared transactions be of no benefit in OLTP workloads, assuming that each client will only be executing a single block of transactions as opposed to multiple blocks of transactions?
You did a great thing there. Thanks for the great content, as always!
Currently spending the next 6 weekends purging data from a large, complex 3NF database with UUID FKs between all of the tables. On an extremely large instance (192GB RAM, 48 vCPUs) I'm getting <31 deletes/sec with all the ugly UUID FK lookups.
That sounds like a lot of work! I assume you have an index covering the foreign key lookup? (since you commented on an episode where we talk about indexing UUIDs, I assume you're running into the exact problem with indexes on UUIDs being bloated?)
@LukasFittl Correct, and all those complex index lookups for all the referencing tables are slowing down deletes, even though we've handled (rolled off and purged) all the referencing rows. All referencing tables have been fully vacuumed as well after cleanup, so they should be fresh. Still super slow.
@awksedgreep Makes sense. If you haven't done so already, it might be worth trying a REINDEX CONCURRENTLY on the affected indexes that support the foreign key. Even after a regular vacuum, there would be lots of inefficient index structure left behind that a reindex can fix up.
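For reference, a sketch of what that looks like (the index name is hypothetical; REINDEX ... CONCURRENTLY requires Postgres 12 or newer):

```sql
-- Rebuild a bloated index without blocking writes on the table
REINDEX INDEX CONCURRENTLY orders_customer_id_idx;

-- Check the index size before and after to confirm the bloat reduction
SELECT pg_size_pretty(pg_relation_size('orders_customer_id_idx'));
```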
Never heard about that +0 solution. Genius! Thanks for sharing it.
Indeed.
Woo hoo. Nice job to everyone involved on this as well, including you Lukas for helping raise awareness of this and connect everything together. In Active Record Rails apps, I’ve seen problematic IN clauses with a large amount of values, showing up in PGSS. My feeling is this is going to help a lot when this becomes available.
Yeah, I'm excited about what this will bring for the typical Rails app - these overly long IN lists are very much a Rails pattern that should see a noticeable performance benefit here with 17.
very effective 5 minutes -- thanks for the great vid!
I am very glad to have found this channel! Got some performance concerns when working with a correlated subquery and this was very enlightening! Thanks
Happy to hear you found it useful!
Awesome!
Nice boost. Vacuum can never be fast enough ❤
Thanks for doing these Lukas! I really appreciate your summaries of the features/functionality in commit logs
Thanks Wulf! It's an excellent excuse to actually test the new functionality :)
This is a gem!
Great video, very well explained!
Very interesting! Curious: does the unpartitioned table have an index on the partition-by column? If not, is the join by index better than a partition-wise join?
Thanks, Lukas! Very interesting overview, I was not aware of many details. BTW there will be a talk on this topic by Dilip at PGConf this year. Looking forward to it.
Thanks Lukas ❤
Nice examples Lukas showing the SQL to set up ranges using the new functions.
Thanks Andy!
Thanks Lukas! For this kind of excellent, but densely filled blog post, on high value planner optimizations with example queries that benefit, I appreciate having this overview of the items to start to digest them. It's easier to slot the quick audio into my day, then I can go back into the nitty gritty details of the post and try the examples out. Thanks to David, Melanie, and all the contributors for these improvements!
Thanks Andy, happy to hear you found it useful!
Congratulations on 100th episode Lukas!
Erwin, the Stack Overflow legend. On almost every post about relational databases you're going to find his answer.
Very useful, as usual. Ty!
Very good summary. Thank you.
Thank you for the excellent post and the detailed examples! Definitely had a hard time editing this down (started with almost 20 minutes of walking through your post) :)
Loved it, very informative 🎉
Wow, pg_input_is_valid fills a long-missing gap :) Thanks!
Just to add complexity, you've also now got Postgres in a container... on Kubernetes
thanks for sharing, keep up the good work 🔥
It often works even better if you group the data on item ID and put a chunk of time (say a day or a week) of that item's data together. Works especially well for physical measurements, as they tend to change more gradually. This is basically what TimescaleDB does, reaching over 90% compression if configured properly.
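A rough sketch of that grouping idea in plain Postgres, assuming a hypothetical `measurements` table (TimescaleDB automates this; here the table is simply rewritten in the desired order):

```sql
-- Rewrite the table so each item's rows for a given day sit physically together,
-- which helps compression and sequential access patterns
CREATE TABLE measurements_sorted AS
SELECT *
FROM measurements
ORDER BY item_id, date_trunc('day', recorded_at);
```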
very informative, thanks for sharing
Awesome! Thanks for the video!
We also have a huge table with 6 billion rows, totaling 450GB. We found that hash partitioning on the id column into 128 partitions of ~3.5GB each immediately showed decreased response times and way faster index creation. Simple select queries are way faster.
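For readers who want to try this, a minimal sketch of hash partitioning (the `events` table and its columns are made up; one CREATE TABLE per remainder, 0 through 127, is needed):

```sql
-- Parent table, hash-partitioned on id
CREATE TABLE events (
    id      bigint NOT NULL,
    payload jsonb
) PARTITION BY HASH (id);

-- One child table per hash slot; repeat for REMAINDER 0..127
CREATE TABLE events_p0 PARTITION OF events
    FOR VALUES WITH (MODULUS 128, REMAINDER 0);
CREATE TABLE events_p1 PARTITION OF events
    FOR VALUES WITH (MODULUS 128, REMAINDER 1);
```

Generating the 128 child-table statements with a small script (or psql's \gexec) avoids writing them by hand.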
Simply brilliant!
Incredible video. Great explanation and example
Thanks Lukas!
Brilliant. Thanks Lukas!
This seems problematic to me. What if you need OpenSSL in FIPS mode for security purposes, but want md5 for non-security-related work? Seems like the FIPS capability needs to be more tunable.
From the FIPS-related conversations I've personally been part of (unrelated to Peter's work on this in Postgres), the requirement I've heard is that it actually has to be removed/turned off completely to fulfill the requirements. That's been in the context of operating a managed service that needed to be FIPS-compliant. I'm sure there can be nuances to this - maybe this could be addressed by having an explicit "pg_md5" module that one could load if needed. All that said, in my experience md5 isn't a particularly good hash function for non-security purposes either - I personally wish we had something like xxh3 hash built-in with Postgres.
Excellent content as always!
Upload the entire series on Postgres, from beginner to advanced level
Neat!
You don't touch existing PK or UNIQUE indexes (which sounds logical), but how about supporting indexes for FKs? Seems like low-hanging fruit to me.
Good question! There is no reason indexes supporting deletes on foreign keys couldn't be considered in the model (you could consider them a scan, just indirectly running), but so far we haven't integrated it yet. Definitely on the roadmap though, and something we'll likely support in the near future.
Detecting them seems relatively obvious, if they are registered in the schema/catalogs. @LukasFittl
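Indeed, foreign keys are registered in the catalogs, so the referencing columns can be enumerated and indexed. A sketch (the `line_items`/`order_id` names are hypothetical):

```sql
-- Foreign keys live in pg_constraint with contype = 'f',
-- so candidate referencing tables can be listed from the catalogs
SELECT conrelid::regclass AS referencing_table,
       conname            AS fk_name
FROM pg_constraint
WHERE contype = 'f';

-- An index on the referencing column makes FK checks on DELETE fast
CREATE INDEX CONCURRENTLY line_items_order_id_idx
    ON line_items (order_id);
```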
Hi Lukas, nice video. I'm the original author of Aurora's QPM, and I'd be happy to tell you more about it. Many people think of QPM as you described it - as a way to force the planner to use a known-good plan or to avoid a known-bad plan, but it's actually a fully cost-based mechanism. Multiple plans may be approved, and the planner will choose the minimum cost plan among them for the current bind variables and literals. If you think about prepared statements with bind variables, it's immediately apparent that a single optimal plan may not be enough, but the same is true with literals. The ability to evolve the set of known-good plans enables plans to just get better and better with time as the planner discovers new and better plans, especially across major version upgrades.
Thanks for providing the additional context! And yes, makes a lot of sense to allow better costed plans vs just a single fixed plan.