DIY Programming Language #1: The Shunting Yard Algorithm
Science and Technology
In this video I demonstrate how the Shunting Yard Algorithm works, and produce a simple implementation to solve basic mathematical expressions like a calculator would.
Source: github.com/OneLoneCoder/Javid...
Patreon: / javidx9
KZread: / javidx9
/ javidx9extra
Discord: / discord
Twitter: / javidx9
Twitch: / javidx9
GitHub: www.github.com/onelonecoder
Homepage: www.onelonecoder.com
Comments: 265
So are you BODMAS, or PEDMAS, or something else?
@samuelhulme8347
25 days ago
I’m BIDMAS, taught that by a UK school.
@irishbruse
25 days ago
BIMDAS im from Ireland
@stephenkamenar
25 days ago
i've only ever heard PEMDAS
@TheHackysack
25 days ago
PEMDAS
@mantictac
25 days ago
BEDMAS (Ontario, Canada)
i've been writing code for almost 30 years and im still learning new things. thank you so much
@irishbruse
25 days ago
Ah a man of culture nice to see you here
@javidx9
25 days ago
Im in the same boat, fancied trying to understand how the languages work for a bit, rather than just using them lol
@stephenelliott7071
25 days ago
@@javidx9 great idea for a series! 🙂
@markuszeller_official
24 days ago
Same to me! Still hungry for coding entertainment and algos.
@jhfoleiss
19 days ago
It's so awesome to see one of my favorite YouTubers watch videos by another one of my favorite YouTubers!
Dude, I started following your stuff when I was a junior dev. Almost 10 years have passed, nice to see you still make great content.
During the first 8 minutes I was thinking: You know he is a programmer when he does not need a computer to program! Thumbs up!
In school, we were taught "Please Excuse My Dear Aunt Sally" as the acronym for the order of operations. Also, another word for value is operand. For instance, in 4 + 5, the "+" is the operator and "4" and "5" are the operands.
@TheSelfUnemployed
20 days ago
kids now learn PEMDAS as Please Excuse My Dope Ass Swag
@YourMom-rg5jk
6 days ago
@@TheSelfUnemployed I really missed out
Just got into the Pixel Game Engine and happy to see you still making content! Thanks for everything you do for the community, you are appreciated!
@javidx9
20 days ago
Thanks buddy!
Just as a callout to people who learn alternatives such as PEMDAS and BIDMAS - the order of multiplication and division doesn't matter - they have the same precedence and are just done from left to right together. The same with addition and subtraction. You can swap M/D and A/S if you feel more comfortable that way, but these are within the scope of this video the only pairs that are interchangeable as of right now!
@BlastinRope
25 days ago
To add more context, division is just multiplication with a number between -1 and 1, just like subtraction is just addition with a number that is less than 0.
@CJBurkey
25 days ago
@@BlastinRope (division is multiplication by the reciprocal, I think that's what you're saying)
@user-io4sr7vg1v
23 days ago
This is good but those rules are kind of arbitrary. It would be better to have an order of precedence list that can be 'modified' as per the grammar/language requirements, otherwise you get this monolithic structure which isn't easy to extend.
@williamsquires3070
23 days ago
Exactly, and I would lump exponentiation in with other functions (or their equivalent symbology, such as |x| to mean abs(x), or x! to mean factorial(x)), so that e^x-2 is exp(e, x)-2, but e^(x-2) would be exp(e, x-2); this makes sense in terms of computer programming where these are often in a library, like in C/C++. So the order of precedence becomes: parentheses, functions, multiplication/division in order from left to right, then finally addition/subtraction in order from left to right.
@scottpageusmc
18 days ago
@@BlastinRope, wow! I'm 46 and have been working with numbers in programming for decades. Thank you for teaching this dog new tricks!
I never knew that it was called the shunting yard algorithm. I still remember that creating a calculator was one of the first programs I had to write while studying 35 years ago.
My mind was blown when I combined Recursive Descent statement parsing and Pratt expression parsing constructed with compile time curried functions. Great analysis of this expression parsing algorithm!
Familiar algorithm, well explained -- good job! One thing you might have added is why it's called the "shunting yard algorithm", because it can be explained by analogy to shunting railway wagons in a yard in such a way as to build a train in the necessary order. In that analogy, only one stack is generally used (though another is needed for evaluating the solution, though in fact it can be the same stack with a bit of cunning).
Custom programming languages are an interesting topic that plagued my mind for years! Great to see you covering this
programming language development on this channel, I'm all for it!!
We called this postfix notation when I was learning compiler design - there's also a prefix way to do it. I believe we used an operator stack and a literal stack, but essentially the same process.
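For readers who have not met postfix (RPN) before, here is a minimal sketch of how such a string can be solved with a single stack. It is not the code from the video: the function name `evalRpn` is made up for illustration, and it assumes single-digit operands, no whitespace, and only the binary operators `+ - * /`.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Evaluate a postfix (RPN) string such as "124+*3/".
// Assumptions: single-digit operands, binary + - * / only, no whitespace.
double evalRpn(const std::string& rpn)
{
    std::vector<double> stack; // the "solve" stack of intermediate values

    for (char c : rpn)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            stack.push_back(c - '0'); // literals go straight onto the stack
            continue;
        }

        // Anything else is treated as a binary operator needing two values
        if (stack.size() < 2)
            throw std::runtime_error("operator is missing an operand");

        double rhs = stack.back(); stack.pop_back();
        double lhs = stack.back(); stack.pop_back();

        switch (c)
        {
        case '+': stack.push_back(lhs + rhs); break;
        case '-': stack.push_back(lhs - rhs); break;
        case '*': stack.push_back(lhs * rhs); break;
        case '/': stack.push_back(lhs / rhs); break;
        default:  throw std::runtime_error("unknown symbol");
        }
    }

    if (stack.size() != 1)
        throw std::runtime_error("malformed expression");
    return stack.back();
}
```

Calling `evalRpn("124+*3/")` solves the postfix form of `1*(2+4)/3`. Note that the right-hand operand is popped first, which matters for `-` and `/`.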
25:50 The answer is 2 but not for the reason given: multiplication and division have the same priority so they are evaluated left-to-right, so: 1x(2+4)/3 = 1x6/3 = 6/3 = 2. The same for addition and subtraction.
@AliceErishech
24 days ago
I wonder if his implementation might have some errors somewhere due to prioritizing those differently.
@javidx9
24 days ago
Not yet, but it could. I've not mentioned "associativity" once in the video, and for good reason, it does introduce complexity to what is otherwise a very approachable algorithm. However we can't escape associativity should we need more interesting operators, which potentially disrupt the natural left to right bias of the solution stack. Interestingly, negation unary operator is contrary to this, but because we know it's the unary operator it's actually handled differently anyway during solving, but that might not always be the case moving forward.
@CharnelMouse
24 days ago
@@AliceErishech You can see an error appearing at 15:22, where 1 + 2*4 - 3 + 5/6 * 2 + 6 - 1 - 4 + 7 is effectively parsed as 1 + 2*4 - (3 + 5/6 * 2 + 6) - 1 - (4 + 7). More simply, 1 - 2 + 3 would be computed to be -4 rather than 2, since it's parsed as 1 - (2 + 3).
@NickFegley
24 days ago
@@javidx9 Doesn't it already fail for 2 - 1 + 3 which should evaluate to (2 - 1) + 3 = 4, but instead evaluates to 2 - (1 + 3) = -2 because + has a higher precedence than -? Or am I not understanding something (a distinct and likely possibility)?
@MrChannel42
20 days ago
Powers are an interesting exception here though as they're evaluated right to left (or at least that's what I was taught - wikipedia and google's built in maths evaluation function seem to agree). So 2^2^3 = 2^(2^3) = 2^8 = 256, whereas if you go left to right you would get 64.
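The associativity points raised in this thread can be sketched as follows. This is one possible way to handle them, not the implementation from the video: the `OpInfo` struct, `lookupOp` and `toRpn` are hypothetical names, operands are assumed to be single digits, and unary operators are left out. `+`/`-` share one precedence, `*`/`/` share another, and `^` is marked right-associative.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical operator description: precedence plus associativity.
struct OpInfo
{
    int  precedence;
    bool rightAssociative;
};

// Returns true and fills `info` if `c` is a known binary operator.
bool lookupOp(char c, OpInfo& info)
{
    switch (c)
    {
    case '+': case '-': info = {1, false}; return true; // same level
    case '*': case '/': info = {2, false}; return true; // same level
    case '^':           info = {3, true};  return true; // right to left
    default:            return false;
    }
}

// Convert an infix string of single-digit operands into RPN.
std::string toRpn(const std::string& expr)
{
    std::string       output;
    std::vector<char> holding; // the holding (operator) stack

    for (char c : expr)
    {
        OpInfo incoming{};
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            output += c;
        }
        else if (c == '(')
        {
            holding.push_back(c);
        }
        else if (c == ')')
        {
            while (!holding.empty() && holding.back() != '(')
            {
                output += holding.back();
                holding.pop_back();
            }
            if (holding.empty())
                throw std::runtime_error("unbalanced ')'");
            holding.pop_back(); // discard the '('
        }
        else if (lookupOp(c, incoming))
        {
            // Flush operators that bind tighter, or equally tight when the
            // incoming operator is left-associative.
            OpInfo top{};
            while (!holding.empty() && lookupOp(holding.back(), top) &&
                   (top.precedence > incoming.precedence ||
                    (top.precedence == incoming.precedence &&
                     !incoming.rightAssociative)))
            {
                output += holding.back();
                holding.pop_back();
            }
            holding.push_back(c);
        }
        else
        {
            throw std::runtime_error("unexpected character");
        }
    }

    while (!holding.empty())
    {
        if (holding.back() == '(')
            throw std::runtime_error("unbalanced '('");
        output += holding.back();
        holding.pop_back();
    }
    return output;
}
```

With equal precedence and left associativity, `1-2+3` becomes `12-3+`, i.e. (1 - 2) + 3, while `2^2^3` becomes `223^^`, i.e. 2^(2^3).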
I love how eloquent you are explaining alien stuff like this.
I've been programming since 1986, and probably have forgotten more languages than I remember now, but you somehow are able to teach this old dog something new in every video. I'm 46 now, and it is rare that I run across many people with the same background as I have, and you are in my top at least 3 favorite YouTubers of all time! You're clear, concise, well structured, and brilliant. Thank you for doing what I wish I could do! Outstanding all around!
I teach this algorithm in an introductory data structures course. It's a fun algorithm that makes students "get" the importance of using appropriate data structures to solve problems (in this case, the stack). Great explanation, btw!
I love your lovely placed remarks, like the "system("pause")" one ❤
The master has returned. Always good to see amazing content with value from you and the community. Keep it up javid, and congratulations on X10 growing health:)
The best ever use case for kitchen table. Thanks!!!
This was a fantastic educational video! Thank you for your time with this one. Very excited to see what comes next in the series!
Shoutout to the Stack Overflow answer that gave a fully functional recursive parser in Java! Really helped me in high school!
Where I'm from we learned "PEMDAS". A phrase we use is "please excuse my dear Aunt Sally"
Great vid! I never really knew where to start when it came to evaluating expressions.
It has been a while since I last watched one of your videos, it feels like meeting with an old friend after a long time. Now I miss programming c++
This is gonna be a real treat! Looking forward to your upcoming videos.
Always, Aaalways I come back to Javid channel I learn something new! Thank you so much for all!
Thank you so much for this video, I've been really looking into how to create a programming language and this has helped me tremendously. Please do a whole series on how to lex, parse, compile/solve/interpret a language, really looking forward to this
Dang I've missed these videos. You have a real gift for making the complex simple. Thank you!
I hope this series gets into tokenization and other related topics. A while ago I tried to make an assembly parser, and only got so far, lol.
A neat detail with the arity check could be to track, whenever either a constant gets pushed to the output stack or an operator gets flushed onto it, you'd be able to track the amount of "available" values there are lower in the stack -- thus the error could be localized better too.
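One way to read the suggestion above is as a running counter of how many values would be available on the solve stack. The sketch below applies that idea to a finished RPN string rather than during the shunting pass itself; `firstArityError` is a hypothetical helper, and it assumes single-digit operands and binary operators only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the index of the first operator that does not have two values
// available, or -1 if every operator is satisfied.
// Assumptions: single-digit operands, every non-digit is a binary operator.
int firstArityError(const std::string& rpn)
{
    int available = 0; // values that would be on the solve stack right now

    for (std::size_t i = 0; i < rpn.size(); ++i)
    {
        if (std::isdigit(static_cast<unsigned char>(rpn[i])))
        {
            ++available; // a literal adds one value
        }
        else
        {
            if (available < 2)
                return static_cast<int>(i); // error localised to this token
            available -= 1; // consumes two values, produces one
        }
    }
    return -1;
}
```

Because the counter is updated token by token, the reported index points at the offending operator rather than at the end of the expression.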
I once read about the shunting algorithm but couldn’t work out what was going on. Being able to visualise it with bits of paper is such a good way of explaining it - I get it now - thank you. Brilliant video 👍🙂
Nice to see you again and thanks for the video. I've been studying splines not so long ago and was trying to make an extension for the unity editor to change values on the fly and immediately see the result in runtime. But the order of operations became something I couldn't overcome. My solution was very close to what you showed though. Thank you very much!
I've been looking into this for a while now, thanks so much for this video
Love your videos! Going to implement this right now myself.
Thank you Javid for always amazing videos.
Thank you for sharing. Great explanation!
nice explanation. i had heard about this algorithm but hadn't looked into it yet
I learned RPN when tinkering with the Jupiter Ace with its inbuilt FORTH programming language back in the eighties! A doomed computer if ever there was one!
Incredible explanation
As a self taught software engineer, with deep interests in computer science, if you want to build up a computer from nand gates, write an assembler, a compiler and an OS for that hardware, I highly suggest you to take the courses nand2tetris and nand2tetris2.
@bossgd100
24 days ago
++
Love the way you explained the algorithm. Although i learnt C, i would like to learn C++ to follow along. Great job.
This is awesome!
thank you so much, I've long been wondering how parsing mathematical expression works
I love your videos, but there is one thing getting me stuck from time to time - naming conventions. Could you please describe why do they differ so much? Once you create a variable using camelCase (e.g. stkHolding), then you create some other variable using snake_case (new_op), and also the operator struct is called sOperator (afaik structs are usually called with UpperCamelCase). Also, is there any reason of shortening variable/constant names? imo "stkHolding" is not as user-friendly as "stackHolding", especially if you watch the video in parts :)
Amazing explanation as always, thanks a lot. But a side note: without considering left-to-right evaluation, unpredictable results will occur for `1+1-1` and `1-1+1`. Also waiting for your approach to multi-digit numbers.
Great video! I used Recursive Descent rather than the Shunting Yard Algorithm to parse my programming language, as Recursive Descent was a bit simpler for me to implement in C and wrap my head around, since it just uses the function call stack :)
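For comparison with the Shunting Yard approach, a recursive descent evaluator can be sketched like this. It is an illustration only, written in C++ rather than the commenter's C, and the names `Parser` and `evalRecursive` are made up. It assumes single-digit operands, no whitespace and no unary operators.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Grammar assumed for this sketch:
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := digit | '(' expr ')'
class Parser
{
public:
    explicit Parser(const std::string& text) : src(text) {}

    double parse()
    {
        double value = expr();
        if (pos != src.size())
            throw std::runtime_error("trailing input");
        return value;
    }

private:
    std::string src;
    std::size_t pos = 0;

    char peek() const { return pos < src.size() ? src[pos] : '\0'; }

    // Lowest precedence level; the loop gives left-to-right evaluation
    double expr()
    {
        double value = term();
        while (peek() == '+' || peek() == '-')
        {
            char op = src[pos++];
            double rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // Higher precedence level for * and /
    double term()
    {
        double value = factor();
        while (peek() == '*' || peek() == '/')
        {
            char op = src[pos++];
            double rhs = factor();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // A single digit or a parenthesised sub-expression
    double factor()
    {
        char c = peek();
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            ++pos;
            return c - '0';
        }
        if (c == '(')
        {
            ++pos;
            double value = expr();
            if (peek() != ')')
                throw std::runtime_error("expected ')'");
            ++pos;
            return value;
        }
        throw std::runtime_error("expected digit or '('");
    }
};

double evalRecursive(const std::string& text)
{
    return Parser(text).parse();
}
```

Here precedence is encoded in which function calls which, and the function call stack plays the role of the holding stack.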
New series to look forward to.
woo! Looking forward to this series
I love your channel and glad to see you again haven't seen in a while.... one thing though when I was in school (a long time ago now lol) I was taught the multiplication would only be done first if it was in a bracket. So 1+2*4-3 is actually the same as 3*4-3. I guess I was taught wrong and so for 30-40 years I would have just been giving the wrong answer if I was ever asked. The weird part is that hasn't really hurt me when making computer programs. I suppose this is why when I write code, I use the brackets a lot more than necessary.
Best KZread channel and it's not even close
great explanation!
Absolutely fascinating. As per usual @javidx9, I love the combination of your cool-as-a-cucumber delivery and exceptionally clear explanations. Your videos always distract from other side-projects I might be involved in, but it's always such a pleasure, I couldn't possibly complain :)
Wow. So great, thanks!
The real problem is the division sign if used as an infix operator instead of as a horizontal line with one expression above and one below. In school I learned that a slash was to be considered a horizontal line slanted and everything on the left to be considered above the line, everything on the right to be considered below the line, while the ÷ sign is to be considered like a multiplication, so they would be handled differently.
Sick dude. I implemented a "Countdown solver", you know, like TV show. It could solve the numbers game in Countdown. Implemented as an expression tree. I didn't use this shunting algorithm, but it would have worked great.
When I learned to do this, I don't think I was told what it was called. Our assignment was to make a Roman numeral calculator, so we had an additional layer to tokenize Roman numerals and calculate their value. It was fun, the teacher didn't require 'correct' Roman numeral notation (i.e. VIIII counted as 9 rather than IX for 9), but I made it work both ways.
We each had to create a compiler for honors project in Computer Science in the 90s. Tokens, lexical analysis, etc. In retrospect it was the hardest thing I ever did. I somehow scored 90%.
It would be much more fun to start solving what can be solved before you finish writing the stack, thus preventing it from growing larger than necessary. It also saves you the trouble of double-checking whether a symbol is an operator or not.
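The idea of solving as you go can be sketched with two stacks, one for operators and one for values, so that no RPN output is ever built. This is an assumption about what the comment has in mind, not code from the video; `evalInfix` is a hypothetical name, operands are single digits, and unary operators are not handled.

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Evaluate an infix string directly, applying operators as soon as the
// Shunting Yard rules would flush them to the output.
double evalInfix(const std::string& expr)
{
    std::vector<double> values; // solved values so far
    std::vector<char>   ops;    // holding stack of operators and '('

    auto precedence = [](char op) { return (op == '*' || op == '/') ? 2 : 1; };

    // Pop one operator and apply it to the top two values
    auto applyTop = [&]()
    {
        if (values.size() < 2)
            throw std::runtime_error("operator is missing an operand");
        char op = ops.back(); ops.pop_back();
        double rhs = values.back(); values.pop_back();
        double lhs = values.back(); values.pop_back();
        switch (op)
        {
        case '+': values.push_back(lhs + rhs); break;
        case '-': values.push_back(lhs - rhs); break;
        case '*': values.push_back(lhs * rhs); break;
        default:  values.push_back(lhs / rhs); break; // only '/' remains
        }
    };

    for (char c : expr)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            values.push_back(c - '0');
        }
        else if (c == '(')
        {
            ops.push_back(c);
        }
        else if (c == ')')
        {
            while (!ops.empty() && ops.back() != '(')
                applyTop();
            if (ops.empty())
                throw std::runtime_error("unbalanced ')'");
            ops.pop_back(); // discard the '('
        }
        else if (c == '+' || c == '-' || c == '*' || c == '/')
        {
            // >= gives left-to-right evaluation for equal precedence
            while (!ops.empty() && ops.back() != '(' &&
                   precedence(ops.back()) >= precedence(c))
                applyTop();
            ops.push_back(c);
        }
        else
        {
            throw std::runtime_error("unexpected character");
        }
    }

    while (!ops.empty())
    {
        if (ops.back() == '(')
            throw std::runtime_error("unbalanced '('");
        applyTop();
    }
    if (values.size() != 1)
        throw std::runtime_error("malformed expression");
    return values.back();
}
```

The trade-off is that the expression is solved once and thrown away; keeping the RPN form, as the video does, lets the same expression be re-evaluated or inspected later.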
Great lesson.
I'd love to see an algorithm for converting expressions into Polish Notation (instead of RPN) as well :) Thanks for such an informative video! 😊
(@1:59) don’t forget exponentiation! 😌
Had to write a parser for expressions that included logical operators. Much the same, of course the logical operators also had a '!' which was also unary. Then we created unary 'conversion' operators that would take logical and return '1' or '0' for true or false and similar conversion for returning 't' or 'f' for 0 and ==0. Writing this sort of 'mini language' was actually kind of fun. Of course the users had trouble because they just never bothered to read my documentation. lol
We learn it as "Please Excuse My Dear Aunt Sally" or PEMDAS.
I actually wanted to write my own C parser so this is very helpful :)
We also learned BODMAS in India! Its quite funny because bodmas in bengali means 'brat' 😂
I'm also from the UK and ive never referred to subtraction as take. I'd normally pronounce it minus. Also, on BODMAS. I learned that, but my younger sister was taught PEMDAS. At first I was confused because M and D had swapped place. Turns out they have the same precedence so it doesn't matter.
Great Vid. Thank you.
great video!
Amazing, I'll implement it with golang which is my main object of study nowadays
Amazing!
When I was an intern, many years ago now, I had to write a basic scientific calculator. This is what I used. Was fun. I wrote it in Java. Brings back memories. 😂
There is a problem with PEDMAS/PEMDAS. While explicit multiplication and division do have the same precedence, implicit multiplication must always be computed first. This is the source of the trolling on facebook with order of operations. Those are specific expressions that exploit the corner case of implicit multiplication. Pemdas only: 8/2(2+2) = 8/2(4) = 4(4) = 16 Pemdas with implicit multiplication acknowledged 8/2(2+2) = 8/2(4) = 8/8 = 1 (this is the correct answer)
Maybe I misunderstood the code but at 10:00 if '+' has higher precedence than '-' then wouldn't 1 - 2 + 3 be evaluated as 1 - (2 + 3)? That would be unusual, IME. Plus and minus normally have the same precedence and would be evaluated left-to-right.
@vitoswat
20 days ago
Same with multiplication and division
Pedmas > bodmas, math < maths
In India, we use BODMAS. I was confused when I first heard of PEMDAS when reading an American textbook!
quite the coincidence that you upload this a couple of days after I start looking into ebnf and making a programming language haha
That was great fun!!!
I am wondering: for the unary operators you used the pass counter variable. Why not change the type of the initial value of the last symbol variable to unknown? That way your code would have worked from the beginning without the pass counter variable.
bro's full metalhead mode
@SoDamnMetal
24 days ago
The best kind of mode
to alleviate issue of the unary negate and unary plus having the same symbol as addition and subtraction, i like how some languages have moved to ` for negate. easy to carry to written notation too, just superscript your dash. looks clear and is "backwards compatible" with external brains who don't do it :)
@Cmanorange
25 days ago
(i wasn't able to find a nice way to differentiate unary plus other than to ignore it lol. it's the no-op of math)
Since there is #1, i assume this is going to be a series to make a programming language, is it?
Great video - thanks. I hope you will extend this further. I'm interested in the treatment of functions e.g. sqrt() - probably easy, but what about functions with more than one argument e.g. atan2(x,y) ?
I was used to making classic nested parsers and a tree-walking interpreter for that
Thank you teacher
Hi there mate, I really enjoyed your video about math for game devs. I wanted to ask if you have considered doing a series on it. I would really appreciate that, you have a gift for teaching.
@javidx9
9 days ago
Thanks! I've no plans specifically as it's just regular maths! Instead I try to approach it from the gamedev angle by exploring those topics and the maths required to do those things at that time.
I am secretly hoping this adventure is the start of a 6502 compiler for the NES emulator project.
Perfect!!!
you could technically do (0-n) to do negative numbers, but it's probably better just to implement negative numbers in the input
One of the great things about C++: thanks to templates, vector and list are part of the standard library.
rather than setting the params to 1 and implementing a special case in the solver, can we just up the precedence and push a '0'? And rather than tracking whether the number of passes is 0, can we not just initialise previous symbol to "Unknown"? We also need to be careful if we're differentiating + and - in precedence - if you do so, 1 - 2 + 3 would first do 2 + 3, then 1 - 5, which is incorrect - I hope we fix this next episode by either giving them equal precedence or checking if we're making an addition before a subtraction
Uh oh, here be dragons!
@YourMom-rg5jk
19 days ago
based
PEMDAS: Parentheses, Exponents, Multiplication / Division (left to right), Addition / Subtraction (left to right)
Would interpreting every subtraction as “+0-“ work, so you don’t have to keep track of the previous character? I'm interested to know if this would be faster, since code would only run when it sees a ‘-‘ instead of every character
Thanks for your excellent approach
For me in the US I learned it PEMDAS (Please Excuse My Dear Aunt Sally) who I hear is a lovely woman, why she needs to be excused is unbeknownst to me. Hope all is well...
@skilz8098
25 days ago
That's because all women fart too!
In C, addition and subtraction are at the same precedence and are executed left to right. Same for mul and div. In Smalltalk all binary operators are done left to right. I.e. not all languages use the same rules. Order of ops for add and sub DOES matter. See for example 10-5+2. (10-5)+2 is different to 10-(5+2).
@tolkienfan1972
21 days ago
A fun aside, in floating point, addition isn't associative. Some compilers have flags to enable optimizers that assume associativity. This breaks some algorithms. Relatedly, some algorithms add many numbers in parallel, and this can result in non determinism as numbers are added in a different order each run.
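The floating-point aside is easy to demonstrate. The two helper functions below are hypothetical names chosen for this sketch; the only difference between them is where the parentheses go. The values suggested in the note assume IEEE 754 doubles and no unsafe math optimisation flags.

```cpp
// Same three inputs, summed with different grouping.
double sumLeftToRight(double a, double b, double c)
{
    return (a + b) + c;
}

double sumRightToLeft(double a, double b, double c)
{
    return a + (b + c);
}
```

With `a = 1e20`, `b = -1e20` and `c = 1.0`, the left-to-right version cancels the large values first and returns 1.0, while the right-to-left version first adds 1.0 to -1e20, where it is lost to rounding, and returns 0.0. An expression parser that regroups additions can therefore change results, even though the maths says it should not.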
35:55 Why not just initialize it with an open bracket? The initial value only matters in this one situation, the type shouldn't matter for the rest of the implementation, so just make it easier for this case?
@mrseanbob
25 days ago
Or, make an actual symbol type for the beginning of an expression. Remove the ambiguity and make it more readable.
@kplays_6000
23 days ago
We could also just initialise to Unknown
@__christopher__
18 days ago
@@mrseanbob Actually everything that can be at the start of an expression also can follow an opening parenthesis, with the exact same meaning. Therefore an opening parenthesis makes perfect sense here, even if other cases are added where the previous token matters. Indeed, you could simply put the total expression into a pair of parentheses, and remove all special handling of either beginning or end (if the expression is valid, the extra final parenthesis will cause the hold stack to be completely flushed, just as otherwise at the end of the expression).
@mrseanbob
18 days ago
@@__christopher__ yes it does work, but a parenthesis is not the same thing as the beginning of the expression. There might be some future case where you're interested in applying logic only at the beginning, or only on a real parenthesis. I don't see any reason not to distinguish between the two.
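The options discussed in this thread can be sketched as a small pre-pass that decides whether each `-` is unary or binary from the previous symbol alone. This is not the video's code: the `Prev` enum, the dedicated `Start` value, the function name `markUnaryMinus` and the `u` marker for unary minus are all assumptions made for illustration. Operands are single digits.

```cpp
#include <cctype>
#include <string>

// What the previous symbol was. `Start` is a dedicated "beginning of
// expression" value, so no pass counter is needed.
enum class Prev { Start, Number, Operator, OpenParen, CloseParen };

// Rewrites every unary minus as 'u' so later stages can tell it apart
// from binary '-'. All other characters are copied through unchanged.
std::string markUnaryMinus(const std::string& expr)
{
    std::string out;
    Prev prev = Prev::Start;

    for (char c : expr)
    {
        if (std::isdigit(static_cast<unsigned char>(c)))
        {
            out += c;
            prev = Prev::Number;
        }
        else if (c == '(')
        {
            out += c;
            prev = Prev::OpenParen;
        }
        else if (c == ')')
        {
            out += c;
            prev = Prev::CloseParen;
        }
        else
        {
            // A '-' is unary whenever an operand is expected next, i.e.
            // after Start, '(' or another operator.
            bool expectsOperand =
                prev != Prev::Number && prev != Prev::CloseParen;
            out += (c == '-' && expectsOperand) ? 'u' : c;
            prev = Prev::Operator;
        }
    }
    return out;
}
```

In this sketch `Start` and `OpenParen` behave identically, which is the point made above about initialising to an open bracket; keeping them as separate values simply leaves room for logic that only applies at the true beginning of the input.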
Nice video as always! Thanks!! 33:24 This is why I hate auto (and every other programming language feature of this sort): the compiler knows the type, the editor knows the type, heck, even the writer knows the type most of the time. The only one who doesn't know the type is the reader. All because the writer wanted to save something very small (like pressing ctrl . or similar in the IDE or using 10 characters worth of display real-estate). I really don't see the point..
@rulojuka
24 days ago
Sorry for not being complete: I think you specifically did it to aid in the explanation, avoiding complexity until it was necessary, as a great teacher would do! I was talking about auto's use in production code, not here!
Can you already tell what kind of programming language? I've written a stack machine myself once, would love to see a video about register machines like Lua uses