Stop using COUNT(id) to count rows

Science and Technology

📚 Learn more about PlanetScale at planetscale.com/youtube.
------------------
00:00 Intro
01:04 Origins of the myth
02:00 What does COUNT(*) mean?
02:30 COUNT(*) example
04:00 Primary and secondary indexes in MySQL
05:20 COUNT(id) example
------------------
💬 Follow PlanetScale on social media
• Twitter: / planetscaledata
• Discord: / discord
• TikTok: / planetscale
• Twitch: / planetscale
• LinkedIn: / planetscale

Comments: 609

  • @DoubleM55
    10 months ago

    I had a strong feeling that something like this was going on in the DB engine, but the only "argument" I had was pretty weak: "It would be very stupid if the DB engine pulled all data from all columns just to pass it to the count() function and return a single int", and it would be very easy to implement such an optimization. So I was using count(*) while simply trusting the DB to do the right thing. Thanks to this great video, I now have confirmation and also know exactly how the DB decides how to count rows in the optimal way. Great video :)

  • @PlanetScale

    10 months ago

    Nice to have your gut feeling proven right! Glad you liked it.

  • @QWACHU

    10 months ago

    This may have been true in the distant past, but many years ago count(*) was optimized in DB engines so that it does not read whole columns.

  • @SergeDuka

    10 months ago

    I suggest using ‘count(1)’ to specifically show that no data is used. It eliminates any confusion.

  • @timucinbahsi445

    8 months ago

    I think this is one of the most important skills in programming. Knowing you are not the only smart person around :) If it's so obvious to us, developers of the engine might have thought of it as well. Modesty W

  • @ErazerPT
    10 months ago

    The most important take from the video is not about count(), but generically that (if speed is critical) you should always review the execution plan. Easier to do while at the prototyping phase, but can still be accomplished in production, just needs more testing and QA.

  • @meenstreek

    10 months ago

    The problem is the execution plan the database will use while in development with very little data will be different to the execution plan in production when the query is now traversing millions of rows. So you just have to make educated guesses and use experience to know when a query will work or not and then iterate.

  • @wasd3108

    10 months ago

    I don't think you got the main point. It's about shutting down your uncle when everyone is around.

  • @christianbarnay2499

    9 months ago

    If speed is critical you should always write your queries in the most simple and easy to understand way. Optimizing the execution plan is one of the main purposes of the DB engine. Just let it do its job without interfering. When performance degrades, the first response is to update statistics to let the engine have an accurate view of the data. If the engine still fails at finding the fastest path, rewrite your query in a better way. And ultimately give it directions as a last-resort solution.

  • @ErazerPT

    9 months ago

    @christianbarnay2499 "If the engine still fails at finding the fastest path, rewrite your query in a better way". That's what I said. "Review the execution plan", not "optimize the execution plan". If you can see that the DB engine can't make heads or tails of what you want, THEN you need to think about what/where to change so that it can. And waiting for degradation is a bad idea, because it can take long enough that when you NEED to make changes you might find out you CAN'T make them anymore (without breaking stuff). Also, most people writing queries are software developers with not that much DB background. You can't expect them to write "excellent queries" or design "excellent schema"... As a "dumb software developer" I count my blessings when I have really good DB people around so I can show them my stuff and they can keep me from shooting my own feet :D

  • @christianbarnay2499

    9 months ago

    @ErazerPT Databases are so central to software that I can't label someone a software developer if they can't write decent SQL requests. And SQL is so easy to understand it only takes a couple of hours to get the basic knowledge that will fulfill 90% of your needs.

  • @dimasshidqiparikesit1338
    10 months ago

    This is actually incredibly educational. Thanks, planetscale!

  • @teej_dv
    10 months ago

    What about counting in OurSQL?

  • @aarondfrancis

    10 months ago

    I can always count on you! YouSQL

  • @AmitMerchant

    10 months ago

    😆 @aarondfrancis

  • @dmsalomon

    10 months ago

    Chill comrade, we don't need to expropriate the database engine for the proletariat

  • @ejaz787

    10 months ago

    in soviet Russia you don't count(*) , * counts you

  • @XmasApple

    10 months ago

    Did you mean YQL used in YDB?

  • @binaryfire
    10 months ago

    "You'll have won the argument, which is what the holidays are all about." 🤣

  • @TravisTennies
    10 months ago

    Just subscribed. Hard to find people who actually talk real facts these days.

  • @lostcarpark
    10 months ago

    Thanks for this. I've had this argument so many times. I think some early SQL engines did look at all columns for count(*), but I believe pretty much all of them have this optimization at this point.

  • @user-ps7zt3vm9q

    10 months ago

    I believe that is the key point. Some early SQL engines did not have that optimization; now they have it. Drawing a clever-sounding conclusion from that now is not exactly right. So one should just do it automatically and not spend time on it: always use `count(1)` and don't worry about the performance or search for confirmation. That solution works everywhere.

  • @Metruzanca
    10 months ago

    This is the type of content I didn't know I wanted. More please.

  • @anothermouth7077
    10 months ago

    Two thumbs up to you. My TL was at times bugging me about this in some reviews, even when I showed him the SQL docs saying that most optimisation is already done by MySQL, so there's no need to over-engineer the solution.

  • @kentlarsson1263
    7 months ago

    Great stuff! Love that you mix in a bit of fun with the content, it's what got me to subscribe!

  • @bazcuda
    10 months ago

    "todos" also means "everything" in Spanish. So, "Select * from todos" means "select everything from everything". Terribly inefficient but it'll be great fun writing the code to sort out what we want from the results 😜 - a grizzled vet.

  • @CraftDownloads

    10 months ago

    I get that, also works in portuguese 😂

  • @GuilhermeMG
    10 months ago

    With count(1) you get the performance boost without all the confusion

  • @PlanetScale

    10 months ago

    Yup! Counting with a constant is totally a viable option. See 06:09.

  • @CottidaeSEA

    10 months ago

    It's what I always do, because I feel it is the most expressive.

  • @fano72

    10 months ago

    I also like to do that. No data is needed from the rows to count them.

  • @TehKarmalizer

    10 months ago

    This is what I've been doing for ages. Using count(*) at least has been slower in some older or obscure database engines. And I've worked with several.

  • @elkhayder
    10 months ago

    Great video. Short, helpful, and straight to the point.

  • @mutatedllama
    10 months ago

    What a great video. Earned a subscribe. Looking forward to more!

  • @aaronmeder
    3 months ago

    Love it! Thanks guys for sharing

  • @spacemanmat
    8 months ago

    Always nice to see how the optimiser is working under the covers. I’ve seen a few cases where the original programmer had done something dumb but the optimiser picked up the issue and optimised it away. Still makes me uneasy relying on it though.

  • @pukkimi
    10 months ago

    I might call myself a somewhat senior programmer. Sometimes query optimizers failed to use a clustered index, or any other index, of a table when running count(*). This happened at least on Oracle 8, and the workaround was to use count([indexed column]) where [indexed column] = something. Count(*) caused a full table scan in some cases, at least if the table contained LOBs. So there might really be a reason why some grey beards warn against count(*). When in doubt, check the execution plan.

  • @TheGreatAtario

    10 months ago

    I think the real takeaway here is that Oracle sucks

  • @pukkimi

    10 months ago

    @TheGreatAtario Sir, you are most certainly correct on Oracle, but there are the same kinds of stupid behaviors in almost every database as far as I know. Not this fault, but many more and different ones. The real takeaway is to always check the execution plan :)

  • @alexU42k

    9 months ago

    It is always a challenge to figure out if something was done on purpose or by lack of knowledge or any other reason

  • @DmitriyYankin

    9 months ago

    Why should count use an index?

  • @alexU42k

    9 months ago

    @DmitriyYankin It is up to the query optimizer whether to use an index or not, but in general it picks the cheaper solution (fewer I/O operations).
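The "check the execution plan" advice in this thread can be tried out with a small sketch. It uses Python's built-in `sqlite3` module purely as a runnable stand-in, so the plan text is SQLite's and will not match MySQL's `EXPLAIN` output; the `todos` table, its columns, and the index name are hypothetical. The point is only the habit: ask the engine which access path it chose instead of guessing.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; MySQL's EXPLAIN output differs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE todos ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, user_id INTEGER NOT NULL)"
)
# Hypothetical narrow secondary index the optimizer may prefer for counting.
conn.execute("CREATE INDEX idx_todos_user_id ON todos (user_id)")
conn.executemany(
    "INSERT INTO todos (title, user_id) VALUES (?, ?)",
    [(f"todo {i}", i % 10) for i in range(1000)],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


star_plan = plan("SELECT COUNT(*) FROM todos")
id_plan = plan("SELECT COUNT(id) FROM todos")
star_count = conn.execute("SELECT COUNT(*) FROM todos").fetchone()[0]
id_count = conn.execute("SELECT COUNT(id) FROM todos").fetchone()[0]

# Print the plans rather than assuming them; they vary by engine and version.
print("COUNT(*)  plan:", star_plan)
print("COUNT(id) plan:", id_plan)
print("counts:", star_count, id_count)
```

Both queries return the same number here because `id` can never be NULL; whether they share a plan is exactly what the printed output is for.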

  • @artemisamberdrive583
    10 months ago

    for specific purposes (e.g. extremely large table, statistics, etc) i set up a count-table with a single row and column, holding the information of the count of rows of the "parent table". this requires setting up triggers on the parent table insert+delete procedures to increment+decrement the value of the count-table. keep in mind that this setup slows down the process of writing data, but data is usually read many times in contrast to written.

  • @GuruEvi

    10 months ago

    That seems extremely 'hacky' and you end up doing the same thing an auto_increment lock does (with more hoops) and if your system gets busy or needs to scale you basically lose any concurrency. Also makes your application a whole lot less portable and you could make a mistake (eg. most examples I've seen online, do not take into account that an INSERT or DELETE can target multiple rows, but the trigger only gets called once, so now you need looping logic or you have a bug). Not sure if you actually "need" a count if you're working with that many records, but most database engines can provide an estimate or perhaps, you may be able to use a different database system altogether that is better optimized for providing statistical information.

  • @SXsoft99

    10 months ago

    yes and at the same time you need to keep on separate columns all the conditions combinations for filtering, also good luck finding programmers that actually use things like triggers since it hides the application business logic in the database

  • @christianbarnay2499

    9 months ago

    I don't know for MySQL but most DB engines already do that on their own. They have a "table information" table that contains all the metadata of each table, including the row count. select count(anything not nullable) without a where clause will automatically trigger a lookup in that table to get the current count of rows of the selected table.
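The trigger-maintained counter described at the top of this thread might look roughly like the sketch below. It is written against SQLite through Python's `sqlite3` module so it can be run as-is; the table names `todos` and `todos_count` and the trigger names are made up for illustration. SQLite triggers fire once per affected row, so multi-row statements stay in sync here, but that is an engine-specific property and the concurrency and portability concerns raised in the replies still apply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE todos (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Single-row, single-column table holding the cached row count.
    CREATE TABLE todos_count (n INTEGER NOT NULL);
    INSERT INTO todos_count (n) VALUES (0);

    -- SQLite triggers are row-level: each inserted row bumps the counter.
    CREATE TRIGGER todos_after_insert AFTER INSERT ON todos
    BEGIN
        UPDATE todos_count SET n = n + 1;
    END;

    -- Each deleted row decrements it again.
    CREATE TRIGGER todos_after_delete AFTER DELETE ON todos
    BEGIN
        UPDATE todos_count SET n = n - 1;
    END;
    """
)

conn.executemany("INSERT INTO todos (title) VALUES (?)", [("a",), ("b",), ("c",)])
conn.execute("INSERT INTO todos (title) VALUES ('d'), ('e')")  # multi-row insert
conn.execute("DELETE FROM todos WHERE title IN ('a', 'b')")    # multi-row delete

cached = conn.execute("SELECT n FROM todos_count").fetchone()[0]
actual = conn.execute("SELECT COUNT(*) FROM todos").fetchone()[0]
print(cached, actual)
```

Every write to `todos` now also updates the one counter row, which is the write-side cost and contention point the replies warn about.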

  • @rahulxcr
    10 months ago

    Great explanation. Thanks that's very helpful.

  • @user-yb6rd1fm5e
    9 months ago

    A couple of corrections: The COUNT() instruction can receive as parameter an EXPRESSION or a wildcard, that is to say, you could write: COUNT(*) or COUNT(1), COUNT(pepito), COUNT(id) or COUNT(99999) which will give you the same, the inside is considered a wildcard, the "*" is used as a wildcard by convention , but we could use any character because by definition, it doesn't use information about any particular column (and including for the COUNT the rows that contains NULL values in any column). In the case that you comment the COUNT(id) and the COUNT(*) bring the same result because the "id" is declared as if it was a wildcard so the behavior is the same and the server takes the license to optimize the process as you have explained in the video. But, if you really wanted to count the values of a field, the correct way would be to specify COUNT(ALL id) and this expression does have a difference with respect to the COUNT(id), and it is because it will only consider for the count the NON NULL values inside that field In the case of the example of the video COUNT(id) and COUNT(ALL id) should return the same result, since the "id" field, being a primary key, would never be empty, but the difference would be that you would force the server to use the index of the primary key to execute the COUNT(ALL id). Finally, while it is true that the server often saves us from ourselves, it is not exactly true that it always makes the best decisions, as a DBA with over 10 years of experience I have found myself in several situations where after checking the execution plan I realize that the server is taking a not so optimal index for the instruction that has been requested and you have to address it to use the correct index for some instruction, this is seen quite often in big data querys.

  • @myonlynick

    9 months ago

    i partially disagree with your second paragraph. I Put it to the test. I Downloaded the open source 'world database' for mysql. I ran 2 queries. select count(*) from country; which gives a reply: 239. The second query is: select count(IndepYear) from country; which gives a reply: 192. Indepyear is not a primary key and has several NULL values. IF you are wondering: select count(ALL IndepYear); returns the value 192 as well. Hence, in mysql 'ALL' is optional.

  • @LeeKao

    9 months ago

    I felt my brain growing as I was reading your comment

  • @DmitriyYankin

    9 months ago

    Why is anyone liking this absolutely wrong answer? count() depends on particular columns. count(id) will count only the NON NULL ones. COUNT(id) and COUNT(ALL id) are absolutely the same expression, as count(id) is implicitly ALL.

  • @user-yb6rd1fm5e

    9 months ago

    @DmitriyYankin read the damn documentation before you say something. Also, obviously some databases work a bit differently from what I said; if you use another SQL database, read your own documentation 🙄.

  • @DmitriyYankin

    9 months ago

    @user-yb6rd1fm5e yep, read it yourself. dev mysql com: "COUNT(expr) Returns a count of the number of non-NULL values of expr in the rows retrieved by a SELECT statement." ... "COUNT(*) is somewhat different in that it returns a count of the number of rows retrieved, whether or not they contain NULL values." ...
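The NULL semantics quoted from the MySQL documentation can be reproduced in a few lines. This sketch runs against SQLite via Python's `sqlite3` module, which follows the same standard rule for `COUNT(expr)`; the tiny `country` table is an invented stand-in for the sample "world" database mentioned earlier in the thread, not its real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature of a country table: indep_year is nullable.
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, indep_year INTEGER)")
conn.executemany(
    "INSERT INTO country (code, indep_year) VALUES (?, ?)",
    [("AAA", 1900), ("BBB", None), ("CCC", 1950), ("DDD", None)],
)

star, const, pk, nullable = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(code), COUNT(indep_year) FROM country"
).fetchone()

# COUNT(*) and COUNT(1) count rows; COUNT(column) skips rows where it is NULL.
print(star, const, pk, nullable)  # 4 4 4 2
```

So `COUNT(id)` only matches `COUNT(*)` when the column cannot be NULL, which is the case for a primary key but not for an arbitrary column.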

  • @lakhanpurohit6969
    10 months ago

    Thanks for sharing ur knowledge 😊

  • @adamtak3128
    10 months ago

    Please make more educational SQL content. This was fantastic.

  • @Wangaruro
    10 months ago

    I actually learned something new today! Thanks!

  • @airbornesnail
    10 months ago

    "You'll have won the argument, which is what the holidays are all about" - best sentence I've ever heard. :D

  • @diegocardenas4522
    5 months ago

    Best ad ever, keep them coming

  • @x364
    9 months ago

    Very nice! Thank you!

  • @medilies
    10 months ago

    Maaan, I didn't know this channel is posting such content :0 I liked it, subscribed and activated notifications.

  • @SeraphPatrick
    10 months ago

    Great video, thanks!

  • @Austin-ft8pn
    10 months ago

    I love content like this, I'm going to have to check this out for myself!

  • @ragsChannel
    10 months ago

    A nice gotcha indeed! One question : this "optimization" -- is it applicable ONLY to MySQL or is also the case with say, Postgresql ??

  • @romanstingler435

    10 months ago

    in postgres the star is not necessarily the fastest, it also depends if an index is used or just a scan is used, due to the amount of data in the table, and if auto vacuum was successful recently.

  • @Keelyn1984

    9 months ago

    It also applies to Oracle. Count(*) counts the rows, not the data. But keep in mind that inline views or subselects still have to fetch data for the SQL to even work. Using a constant instead of * (most often 1) is also a common, viable alternative.

  • @adamzaczek6342
    10 months ago

    Holy count, I just found an awesome channel to subscribe to. Love the humor at the end!

  • @aarondfrancis

    10 months ago

    Holy count 😂

  • @HishamElsayad
    9 months ago

    Thanks so much for providing this great information.

  • @wadecodez
    10 months ago

    Your thanksgiving conversations sound interesting

  • @ahmad-murery
    10 months ago

    I knew about count(*) but I didn't know how the optimizer decides which index to use. One learns new things every day. Thanks Aaron!

  • @tomasma4896
    9 months ago

    Cool, I have been using MySQL for years but never heard of this one. More videos like this, please :)

  • @jhonatanwen
    10 months ago

    Incredible video!

  • @justinwduff
    10 months ago

    Very interesting, thank you!

  • @jannickbreunis
    10 months ago

    “When you’re arguing with your family…” haha nice one.

  • @xavier.xiques
    10 months ago

    Good video, thanks

  • @adriancs6455
    7 months ago

    thanks for this info

  • @JuanLuisEcheverria
    10 months ago

    Hey dude, your way of explaining this topic is very good!! Congrats

  • @PlanetScale

    10 months ago

    Hey, thanks!

  • @djenning90
    10 months ago

    I love your style!

  • @aarondfrancis

    10 months ago

    Thank you!

  • @mariomario4676
    7 months ago

    thanks for the educational content

  • @Zach2825
    10 months ago

    Wow, thank you!

  • @alexcoding99
    10 months ago

    very informative!

  • @nicolasguillenc
    10 months ago

    you are amazing at explaining things man

  • @aarondfrancis

    10 months ago

    Thank you!

  • @quintennn
    5 months ago

    "Tell your family on thanksgiving" It would take me about 5 winters to explain this to my family.

  • @devhaua
    8 months ago

    One of the best videos I have ever watched on SQL, tq

  • @PlanetScale

    8 months ago

    Thank you! Love hearing that

  • @cangurcan99
    9 months ago

    Wow, using mysql for decades and never knew this. Thanks man.

  • @PaulSebastianM
    10 months ago

    Primary keys are clustered, aligned to disk clusters physically, so counting them means traversing the disk to gather the count, and if the order on disk is not adjacent, then the index is fragmented, so counting can take a lot of time. Non-primary keys or indexes are not clustered, meaning they don't need to follow the disk's physical alignment, so they are most often stored off-table in a much more compact data structure. Even when that structure gets fragmented, the data is still going to be close together, because all it holds is index records, not every row in the table, which is what a clustered index follows.

  • @aarondfrancis

    10 months ago

    Correct!

  • @jvapr27

    10 months ago

    Very cool to know. FYI: I think for some databases this is not true though. Clustering can be set differently than primary keys. For db2, for example primary indexes are not by default clustered. For databases like snowflake, they do not index the primary key. Each DB may be different. Still very cool. Thanks!

  • @PaulSebastianM

    10 months ago

    @jvapr27 Correct, only a single index can be clustered, because that is the one that orders the records of the table physically. But that index doesn't have to be the primary key, though some DBs might enforce that.

  • @debasishraychawdhuri

    10 months ago

    count(*) and count (id) are semantically equivalent, the database should not behave differently for those queries.

  • @PaulSebastianM

    10 months ago

    @debasishraychawdhuri well, no. One is explicit, the other is implicit. Implicit means suggested but not expressly stated. Thus they are completely different. That is logical deduction.

  • @CodeKujo
    10 months ago

    The advice to avoid count(*) predates mysql. It may have even been true in mysql at some point or some obscure schema. As you said, count(*) relies on the optimizer to do the right thing. I use count(0) myself, but I wouldn't be surprised if sometimes you have to be more specific to get the right query plan. I think the biggest lesson in this video is not to rely on advice, but to check the plan and know that counting an index can be faster--which is great advice!

  • @ivanskyttejrgensen7464

    9 months ago

    When I got a job that involved Oracle 7 in 1999 the DBA told me to use count(*) because the old hack with count(1) wasn't needed anymore. So it was presumably true at some point in the 90s.

  • @davidlean8674

    9 months ago

    I was teaching performance tuning on behalf of a database vendor in 1989. COUNT(*) was the recommended approach for SQL Server (both Microsoft & Sybase), Oracle, DB2, & Ingres. So yes, it predates MySQL. The syntax alternative was to specify a column name, but that was only if you wanted to find the count of non-null fields in that column or expression. Count(constant) was never necessary on any platform I've used. Yet lots of people suggested it. Most had minimal clue about DB internals or query optimisations.

  • @ABaumstumpf

    9 months ago

    @ivanskyttejrgensen7464 I have used it on Oracle 6, so even back then the advice to use count(1) was already outdated. Likely it was something for pre-ANSI SQL.

  • @christianbarnay2499

    9 months ago

    But all queries rely on the optimizer. The optimizer is the core of the querying engine. As long as you don't mess with the execution plan by forcing a path through hints, the optimizer will always kick in and do its job.

  • @BlairdBlaird

    8 months ago

    @christianbarnay2499 a big difference is that count(*) semantics are defined by the SQL standard itself, so its optimisation is a lot more likely than count(constant) being recognised as equivalent... to count(*).

  • @TheyCalledMeT
    10 months ago

    which also means it needs a secondary non null index to function the way you indicated. redo the trial without such an index to see what it does

  • @YOUdudex
    10 months ago

    Interesting, thanks ✌️

  • @hieungo770
    10 months ago

    I love how you explain things, is there any course from you that teach from the ground up

  • @PlanetScale

    10 months ago

    Check out our MySQL for Developers course: planetscale.com/learn/courses/mysql-for-developers/introduction/course-introduction

  • @greatestuff
    9 months ago

    Great video

  • @hasenhirn1965
    9 months ago

    You never finish learning. Great explanation 👍

  • @martymoo
    10 months ago

    I didn't know this. Thanks! Has it always been this way?

  • @michelprovencher8518
    10 months ago

    Like the EXISTS function where the "SELECT * FROM ..." is just a predicate for the function to work and only the WHERE clause is meaningful

  • @CirTap
    10 months ago

    Thank you for saving everyone's Thanksgiving! 😂

  • @Neakas
    8 months ago

    This is also true in MS-SQL. It will use the narrowest non-clustered index on a table. If there is none, the table's clustered index has to be scanned, which is slow. But I think in MS-SQL you can use sys.sysindexes to look up the row count even faster.

  • @ZanarkandStarplayer
    10 months ago

    Now I'm ready for the holidays 💪🏽

  • @PlanetScale

    10 months ago

    Go get em!

  • @harrytsang1501
    10 months ago

    Yes, very often the problem you are trying to solve is more generic, and can be expressed in more generic terms. Using language features and trusting that smarter people have put more effort in optimizing the language itself is often more optimised than what we can come up with ourselves. One classic C trick was to not use multiply/divide and instead add or subtract bitshifted values. However, in modern systems that no longer take dozens of clock cycles to multiply, the compiler knows better and will just replace your whole expression with a multiply.

  • @Demonslay335

    10 months ago

    The compiler may even optimize your division into multiplication or bit shifts, and all other kinds of fun wizardry. It really is more important to have readable code in many cases nowadays.

  • @CottidaeSEA

    10 months ago

    @Demonslay335 In some cases it doesn't do what you want it to do, but only then is it worth looking into optimization. We should always be aware of the performance impact our code has, but there's no need to go crazy about optimization before you even have any performance information to work with.

  • @hwstar9416

    6 months ago

    usually compiler isn't as smart as you think. It can do simple optimizations like the one you mentioned, but anything slightly more complex it fails at.

  • @harrytsang1501

    6 months ago

    @hwstar9416 That is because you are using more dynamic languages. With stricter rules for memory safety and typing, newer languages like Rust and Zig are doing wonders. It doesn't save you from algorithm problems with a Big O of n cubed, though.

  • @hwstar9416

    6 months ago

    @harrytsang1501 I don't use dynamically typed langs, I use C/C++. People often overestimate how much optimizing the compiler does; it's not as impressive as you think.

  • @sadhakbj
    8 months ago

    Love this video. Love the way he teaches.

  • @PlanetScale

    8 months ago

    ❤️ thank you so much

  • @SR-ti6jj
    10 months ago

    Does this mean adding a non-null secondary index will improve count performance on tables that don't already have one?

  • @parkamark

    10 months ago

    You could create a secondary index on the same column(s) as the primary index. That would then speed up counting, unless the DB engine is doing some clever stuff under the hood that means you don't necessarily have to do that. But given what he's said in this video, creating a secondary index on the same column(s) as the primary is certainly a good workaround. Maybe someone who knows more than I do could clarify this point.

  • @parkamark

    10 months ago

    To answer your question directly, having ANY index on a table is way better than none at all, both in the case of searching and counting. So having a simple non-null non-unique index would be the minimal requirement for fast counting. It also matters if the columns are fixed or variable length datatypes, eg. int is fixed length, varchar is variable. If all columns are fixed, then the database can also do a fast count without any indices because it knows that the length of each row is fixed, and it knows the full size of the entire table, thus what the number of rows must be.

  • @dennisdashkevich
    10 months ago

    Nice video, thank you! And what's the time complexity of this operation in MySQL? Is it linear? How would you go about counting records in a table with millions of rows?

  • @IARRCSim

    10 months ago

    O(n) but not all O(n) algorithms take the same time in practice. The constant factor of n is probably several times different.

  • @nm6x
    10 months ago

    This video should have more views and likes ❤

  • @rayonsi
    9 months ago

    Great video about the performance of counting. I am curious, what is the name of your DB client?

  • @PlanetScale

    9 months ago

    TablePlus!

  • @GhiveciuMarian
    6 months ago

    This makes sense when you count all rows in the table, but throw in any WHERE clause or other filtering and the advantage might evaporate. The hardest thing I found was displaying counts for filtered results. In this example, you might want to show how many todos are done out of the total. This specific table does not have a 'done' field, but if it had one and it was not part of an index, the count would result in a table scan.
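The "how many todos are done out of the total" case from this comment can be sketched as a single query that returns both numbers. The example again uses SQLite through Python's `sqlite3` module as a stand-in for MySQL; the `done` column and the index on it are assumptions taken from the comment, not part of the table shown in the video. Whether the engine can answer this from an index alone depends on the engine and its plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE todos ("
    "id INTEGER PRIMARY KEY, title TEXT NOT NULL, done INTEGER NOT NULL DEFAULT 0)"
)
# Hypothetical index on the filter column so a filtered count need not read full rows.
conn.execute("CREATE INDEX idx_todos_done ON todos (done)")
conn.executemany(
    "INSERT INTO todos (title, done) VALUES (?, ?)",
    [(f"todo {i}", 1 if i % 4 == 0 else 0) for i in range(100)],
)

# COUNT(expr) ignores NULLs, so the CASE yields NULL for rows that are not done.
done, total = conn.execute(
    "SELECT COUNT(CASE WHEN done = 1 THEN 1 END), COUNT(*) FROM todos"
).fetchone()
print(f"{done} of {total} todos done")  # 25 of 100 todos done
```

This relies on the same non-NULL counting rule discussed elsewhere in the comments, used deliberately rather than by accident.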

  • @odysseus655
    10 months ago

    I stopped using count(*) decades ago when I ran into a problem with our database engine at the time where there was some catalog corruption and this was erroring out with "column not found error". Lately I've been using count(1). Likely not a great reason to continue not using it (and I'm expecting the execution plan to be the same in any case).

  • @therealcomment5622
    6 months ago

    I can't wait to go to the next thanksgiving.

  • @Peter-Ja
    9 months ago

    Looking forward to the next argument with my family about the performance of the SQL count operation. Everyone will be so excited

  • @PlanetScale

    9 months ago

    Hopefully you win! "Happy Thanksgiving, y'all don't know anything!" - Peter, probably

  • @andrewkamoha4666
    10 months ago

    6:15 "Over Thanksgiving, when you're arguing with your family about whether count(*) or count(id) is faster" I'm sure grandma will love to talk about that ...

  • @aarondfrancis

    10 months ago

    My dad was actually a DBA, so there's a non-zero chance it will come up

  • @nskeip
    10 months ago

    I should create a database engine where COUNT(*) is "multiply number of columns by the number of rows", COUNT(/) is "divide rows by columns", COUNT(+) is ... (you get the idea)

  • @Ceelbc
    10 months ago

    With a proper SQL implementation, this should not matter; the compiler should handle this.

  • @Ceelbc

    9 months ago

    @lawrencechiasson975 *which is part of the compiler.

  • @violin245
    6 months ago

    How is someone talking about SQL this charming

  • @smeedee
    8 months ago

    You could also use the „rows“ from the explain query. In some use cases this is already good enough ^^

  • @adamtretera273
    9 months ago

    Fire video ❤

  • @PlanetScale

    9 months ago

    Thanks 🔥

  • @mortona42yt
    10 months ago

    I knew the db optimizes the query, but didn't know details like this. I would be surprised if after 20+ years of development, it would interpret count(*) as "load everything from the table and count it". What about when you don't have a secondary index, or doing a join query? Probably still counts the returned rows, or some optimization with joined indexes?

  • @victornogueira2346
    10 months ago

    more content about SQL, please!

  • @StEvUgnIn
    @StEvUgnIn 10 months ago

    Well said

  • @wmafendi
    @wmafendi 10 months ago

    new thing for me. TQ

  • @Im_Ninooo
    @Im_Ninooo 10 months ago

    I've been using COUNT(1) for a while now. I wonder if the behavior is the same in other databases such as CockroachDB

  • @jeffmccloud905
    @jeffmccloud905 10 months ago

    actually, WAY back in the day in Oracle (1990s), it was recommended to use SELECT COUNT(1) and not COUNT(*) because it actually did make a difference. but they fixed that. but some grizzled old devs kept that convention.

  • @edbutler3

    @edbutler3

    10 months ago

    Yeah, I've been using count(1) on Oracle for decades. I've suspected for a while that recent versions have a smart enough query optimizer to do the right thing with count(*), but I haven't taken the time to verify. And for MS SQL Server, the "culture" has always been to use count(*), so I've assumed it's ok there.

  • @PanduPoluan

    @PanduPoluan

    10 months ago

    I always use COUNT(1). Because COUNT(*) depends totally on the engine's optimisation.

  • @djnormus
    @djnormus 3 months ago

    Hello, what is your SQL software that you use in this video? Thank you

  • @MohamedAmer-hn1tv
    @MohamedAmer-hn1tv 10 months ago

    Great video! It was incredibly well presented. By any chance, would it be possible to remove the credit card requirement for creating a free DB? Thanks a bunch!

  • @airjuri
    @airjuri 10 months ago

    You should also create indexes for the columns that usually appear in your WHERE clauses ;)

  • @mintx1720
    @mintx1720 9 months ago

    "Trust the compiler" is actually a learned skill.

  • @GrantGryczan

    @GrantGryczan

    8 months ago

    After seeing this channel's other video about how the YEAR function doesn't use indexes (where the compiler theoretically could totally make the optimization to use indexes), I do not trust the compiler... (new to SQL)

  • @rumisbadforyou9670
    @rumisbadforyou9670 10 months ago

    I never got into using SQL databases, but having written a few on-disk and over-the-network data structures, I'd expect the count(*) to be smart enough to use some cached "total_length" value, especially considering that a lot of effort went into writing query optimizers. I guess, people would think that because of lack of experience of writing a data store yourself?

  • @sohn7767

    @sohn7767

    10 months ago

    would be easy enough on a simple table query, in fact count() and indexes probably do do that. However whenever you call a function or a view or whatever script with at least a little logic, the final length is unknown

  • @brdrnda3805

    @brdrnda3805

    10 months ago

    I never wrote a data store, but worked with relational databases for almost 25 years. Still, assuming COUNT(*) wouldn't be fast and it would process the whole record sounds utterly absurd to me. (And, honestly, I never heard that myth)

  • @Neptun084
    @Neptun084 9 months ago

    OK, thanks. I'll introduce this to my entire family, especially my great-grandparents.

  • @paulthomas2577
    @paulthomas2577 10 months ago

    This varies by database server. This is true for MySQL, but SELECT count(*) is much slower in SQL Server. In SQL Server, the way I learned to do it was SELECT count(1).

  • @rosieroti4063
    @rosieroti4063 9 months ago

    great info. However, I'd also like some more insights into count(*) in case we have a where clause in the query. Since count(*) uses the smallest secondary non null key, will it be "slower" if I'm counting rows where a column value is null (or perhaps some other where clause combination which might include nulls) ?

  • @imacomputer1234

    @imacomputer1234

    9 months ago

    It won't be slower, but it will only count rows where that column isn't null!
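
A minimal sketch of the NULL question raised in this thread. It uses Python's built-in `sqlite3` module rather than MySQL (an assumption, made only so the example runs anywhere), and the `people` table and `nickname` column are hypothetical. In standard SQL, `COUNT(*)` counts every row that passes the WHERE clause, NULLs included, whichever index the engine chooses to scan; only `COUNT(column)` skips NULL values.

```python
import sqlite3

# In-memory database; table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, nickname TEXT)")
conn.executemany(
    "INSERT INTO people (nickname) VALUES (?)",
    [("ann",), (None,), ("bob",), (None,)],
)

# COUNT(*) counts rows, so NULL nicknames are included.
total_rows = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]

# COUNT(column) counts only rows where that column is NOT NULL.
non_null = conn.execute("SELECT COUNT(nickname) FROM people").fetchone()[0]

# COUNT(*) with a WHERE clause counts the rows that match, here the NULL ones.
null_rows = conn.execute(
    "SELECT COUNT(*) FROM people WHERE nickname IS NULL"
).fetchone()[0]

print(total_rows, non_null, null_rows)  # 4 2 2
```

Whether the filtered query is slower depends on whether an index covers the filtered column, which is what the execution plan would show; the result itself does not change.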

  • @Sweenus987
    @Sweenus987 10 months ago

    Just curious, what if you copied the id field as a secondary key so whenever id gets a value this copy would also get a copy, just for this purpose?

  • @ShivaprasadBhat-dm3kn
    @ShivaprasadBhat-dm3kn 9 months ago

    Great video! What is the tool you are using for query?

  • @PlanetScale

    @PlanetScale

    9 months ago

    TablePlus!

  • @ShivaprasadBhat-dm3kn

    @ShivaprasadBhat-dm3kn

    9 months ago

    @PlanetScale thank you! installing now 😁

  • @TheDragShot
    @TheDragShot 10 months ago

    - *Newbies:* Using COUNT(*)
    - *Dinosaurs:* Using COUNT(ID_COLUMN)
    - *Me and my two neurons:* Using COUNT(1)

  • @ms0624371
    @ms0624371 10 months ago

    Thanks for your video! I have a question: does SQL Server work the same way?

  • @PlanetScale

    @PlanetScale

    9 months ago

    I'm not sure!

  • @Petoj87
    @Petoj87 9 months ago

    Would be interesting to know whether the same is true for other SQL engines like SQLite, Postgres, and SQL Server.

  • @DanelonNicolas
    @DanelonNicolas 10 months ago

    OK, I'll subscribe! But promise a video comparing this in SQLite and PostgreSQL too, haha.

  • @dealloc
    @dealloc 9 months ago

    Even in SQLite, COUNT(*) takes fewer opcodes than COUNT(1) or COUNT(id), as it will just read a value that is already stored instead of having to aggregate.
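
One way to check the SQLite opcode claim yourself, sketched with Python's built-in `sqlite3` module. The `people` table and the `opcodes` helper are hypothetical, and the exact bytecode differs between SQLite versions, so treat the printed listing as illustrative. The sketch only shows which opcodes the planner emits for each query, not how expensive each one is at run time.

```python
import sqlite3

# In-memory database; table and column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("ann",), ("bob",), (None,)],
)

def opcodes(sql):
    """Return the bytecode opcode names SQLite plans for a statement."""
    # Each EXPLAIN row is (addr, opcode, p1, p2, p3, p4, p5, comment).
    return [row[1] for row in conn.execute("EXPLAIN " + sql)]

star_ops = opcodes("SELECT COUNT(*) FROM people")
name_ops = opcodes("SELECT COUNT(name) FROM people")

# Compare the two programs side by side.
print(len(star_ops), star_ops)
print(len(name_ops), name_ops)
```

On typical builds, the unfiltered `COUNT(*)` program contains a dedicated `Count` opcode, while `COUNT(name)` loops over the rows and feeds each one to an aggregate step.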

  • @jameslucas5590
    @jameslucas5590 10 months ago

    I'm a vet and will use *. I don't care if it bothers anyone, because I want to just move on. However, I Love this video and the knowledge you share.

  • @LV4EVR
    @LV4EVR 9 months ago

    Thanksgiving comment totally, 100% EARNED the like. Cheers.

  • @PlanetScale

    @PlanetScale

    9 months ago

    Haha thank you! 🫡
