Computing the Euclidean Algorithm in raw ARM Assembly
Ғылым және технология
In this video, we translate C++ into equivalent ARM assembly code to compute the GCD of two numbers (also known as the Euclidean Algorithm). We visualize our completed GCD calculator on a raspberry pi.
---
Timestamp:
00:00 Intro
00:44 C++ Equivalent
02:26 Base Case in ASM
04:36 Writing Comparison
05:26 Recursive Case
07:29 Manual Creation of a Modulus
09:10 Calling the function
10:11 Compiling and Running Code
---
Links Mentioned in Video:
github.com/LaurieWired/Assemb...
---
laurieWIRED Twitter:
/ lauriewired
laurieWIRED Website:
lauriewired.com
laurieWIRED Github:
github.com/LaurieWired
laurieWIRED HN:
news.ycombinator.com/user?id=...
laurieWIRED Reddit:
/ lauriewired
Пікірлер: 467
As someone who subscribes to a ton of tech-related channels, this is the first time KZread recommended me to this channel.
@HadesTimer
3 ай бұрын
Same
@Alberto_Cavalcante
2 ай бұрын
Same
@steadystreamofserotonin5500
2 ай бұрын
Came for the thumbnail, left for the content
@luisdanielmesa
2 ай бұрын
Yeah, me too... The algorithm is getting worse and worse.
@pluto9000
2 ай бұрын
Same
This is a great introduction and walk through for newcomers to assembly, but beware that the implementation has a quite significant flaw that will only reveal itself when called from other functions that have several live registers and happen to use r4 and store a value in it when it makes the call to this gcd function. The calling function will expect that the gcd function preserves the r4 register, as it is not a scratch register like r0-r3 are. This will manifest in miscalculations, unpredictable behavior and likely instability/crashes in the application using this function. An easy fix would be to push the r4 register along with lr to the stack, and pop it again (i.e. push {r4, lr} - pop {r4, pc}), but this of course increases the stack usage even more. Another approach would be to use r12 which is also a scratch register. It would also be a good idea to make the implementation non-recursive as it can overflow the stack if it recurses too many times. For this algorithm this is quite easy as you can do tail call optimization by simply removing the push, replacing the pop by "bx lr" and doing a "b gcd" instead of "bl gcd" (which means the lr does not get overwritten for each iteration and retains the original return address that was written when the function was called the first time). This means the stack won't grow for each iteration.
How has Google not recommended this channel to me before! 6502 machine coder here, loving this.
@sno_crash
2 ай бұрын
Because you'd probably get a stack overflow if you tried using this approach 🙂
@OneAndOnlyMe
2 ай бұрын
@@sno_crashWhy would a recommendation algorithm cause an overflow?
@sno_crash
2 ай бұрын
@@OneAndOnlyMe The algorithm wont, but the use of this Euclidian function on a 6502 might. It was a joke; that the Google Algorithm was protecting you from such a demise, by not recommending it to you previously (as you had rightfully asked).
@OneAndOnlyMe
2 ай бұрын
@@sno_crash LOL - I meant more that I the KZread algorithm hadn't recommended Laurie's channel.
@speed999-uj5kr
Ай бұрын
Liar
Easily the greatest common divisor in the history of all common divisors
@alicewyan
3 ай бұрын
Who knew it'd be seven?
I have been watching tech and programming channels for many years. This is the first time KZread has recommended you. Glad to see it!
So it's been a few years since I was last writing in ARM asm (32-bit MCUs), so I might be a bit rusty... but I thought I saw a few tricks you'd missed, opened a notepad and downloaded an arm instruction-set reference and had a go myself. What I noticed: 1: this is an ideal opportunity for tail recursion, simplifying the recursion to effectively a loop and removing the need to unwind the calls from the stack at the end 2: move can set flags, removing the need to compare to determine zero both on entry and within the loop 3: once you don't need to push/pop, bx lr can be used to return and can even be used with condition codes ;; arm version gcd: movs r1, r1 ;; mov can set flags? easiest way to get the z flag set for a zero input? bxeq lr ;; conditional return. nice. assumes A32 instruction set tho... .loop: sdiv r2, r0, r1 ;; I find it surprising that ARM doesn't have some way to return mul r2, r2, r1 ;; the remainder from a division... the bits are right there at the sub r2, r0, r2 ;; end of the process, but then they get lost? and I have to recompute them? smh. mov r0, r1 ;; but even so, the ARM version is still smaller and neater than the x86, movs r1, r2 ;; which has to dance around the fixed register usage of its DIV instruction bne loop ;; mov setting flags means that we can conditionally loop - also could set flags on the sub bx lr ;; tail recursion means no push lr/pop pc, and we can use the direct return 4: all this is very academic since most of the time will be in the div operation anyway... ;; x86 version for comparison gcd: ; expects a in edi, b in esi mov eax, edi test esi, esi jz .out ;; no conditional return .loop: xor edx, edx ;; top half of dividand must be zeroed div esi ;; div takes edx:eax as dividand, returns quotient in eax and remainder in edx mov eax, esi mov esi, edx test edx, edx ;; have to use dedicated instruction for setting flags jnz .loop ;; but at least these test/branch combo's get turned into a single µop when executed .out: ret ; return is in eax ;;; the iterative logic as C int gcd(int a, int b) { if(!b) return a; int c; while(c = a % b) { a = b; b = c; } return b; }
@steffen5121
3 ай бұрын
When ARM asm ASMR?
@xydez
3 ай бұрын
Yeah thats what i thought aswell, i gulped when she said recursive function.
@michaelbauers8800
2 ай бұрын
@@xydez I wonder what the default stack size on the PI is?
Underrated channel. I cannot wait to get fat enough in my CS courses to understand this
@TheBrainn
3 ай бұрын
I cant wait to get fat enough either
@turkviktor
3 ай бұрын
You will probably learn nothing meaningful from the CS courses.
@paulhbartley8030
3 ай бұрын
@@turkviktor really?
@poopsmith890
3 ай бұрын
@@paulhbartley8030 No not really, CS courses can teach you plenty of useful things and expose you to very passionate people. That doesn't mean they're absolutely necessary, but that guy isn't giving anything beyond an insipid sensationalist analysis
@yogxoth1959
3 ай бұрын
@@turkviktor Bullshit. It's obvious you haven't been to university.
I finally watch Lain and now the KZread algorithm is rewarding me
@ferocious_r
3 ай бұрын
With what?
@Summanis
3 ай бұрын
@@ferocious_r more Lain
@horvathpeter8130
3 ай бұрын
Should it run too in assembly!
great video - also your comfortable commentary as you do the typing/transposing etc. was liquid smooth and really makes this video of a higher quality.
@808v1
3 ай бұрын
oh, sub'd.
Serial Experiments Lain! And the Wired. Wow!!! 😳 Awesome references. Just stumbled on this channel, so I wanted to acknowledge that. 😊
@Johno3998
3 ай бұрын
same, what good editing/references
praise the algorithm gods... how am i only *just* learning about this channel?
@Karuda_
2 ай бұрын
Getting the same exact feeling right now
This is so good! Your attention to detail is insane! Congrats
I don't know why I found this channel only now ??!! This looks like to be an amazing channel, will go and watch other videos 😍😍 I really enjoy it :)
New subscriber here, and just noticed that you have a playlist of ARM Assembly tutorial. I'll definitely be binge watching that.
I'm instantly a fan! Notepad++, Windows XP, talking about assembly language, and why it's great to scaffold in higher-level languages is awesome. I'm looking forward to exploring more of your channel in the future! Thank you for this
@SteveWaltersY
3 ай бұрын
And the Solaris? border around her facing camera. Nice touch.
@lordlyamiga
3 ай бұрын
it's a XP theme no actual Xp cuz they are using modern CMD and also edge and explorer logo are from win 10
@rogerwinright2290
3 ай бұрын
@@lordlyamiga I had figured as much! I saw another cool game developer on here using Linux Mint with an XP theme too. Super cool!
@TheFern2
3 ай бұрын
the XP theme is a nice touch my favorite windows of all time, my first laptop was on XP much simpler times before metro ui
@deanwilliams433
3 ай бұрын
@@TheFern2don't get the nostalgia for XP, it was a security nightmare. Windows 7 was leaps and bounds better.
I was looking for something to snooze to and thought this was ASMR Assembly.
@esra_erimez
3 ай бұрын
🤣
@jack_2612
3 ай бұрын
I was just checking the comments half way into the video and you, sir, just made me notice that it's not ASMR Assembly indeed lol
@jack_2612
3 ай бұрын
also, don't know anything about coding, but YT is doing YT things with its algorithm
Haven't done any coding since 1985-86. But somehow your video popped up in my feed... I did enjoy the video!
The quality of the video and the content is top notch. Here before 1 M subscribers
Thanks for your amazing content dear Laurie. 🎉
Loved this! Really great quality all around.
I haven’t touched assembly since my C64 days, this channel is a great motivation to play with the rpi. And since I am from the 80s I love the vibes of this channel, it has some Max Headroom/beakman’s/matrix vibes, really well produced!
@ProBloggerWorld
3 ай бұрын
For me it is the opposite: I haven’t stopped touching C64 assembly, still active in the demo scene. 😅
Thanks for this Laurie. Already waiting for the next one.
I'm so used to using Java/Kotlin and having a hard time translating that experience to assembly. I finally get how function signatures map to registers. Really eager to dig into the back catalog now!
KZread algorithm brought me here. This channel is pure gold. Subscribed.
Thanks for teaching me that roundabout way to calculate a % b. It's simple and clear, but I never would have thought about it.
Glad you finally had a video blow up so much. Great channel
Incredible production quality!
What was this! Great tutorial and also Burnout in the end? Amazing.
This has been a very interesting look into modern-day assembly, thanks! But I enjoyed the aesthetics of the video production almost as much! The hacker cave, coding in "new x" files in Notepad++, and on Windows XP? Nevermind the full-screen capture with the camera feed in a separate window. Feels right like the late 2000s.
ARM ASM, Windows XP theme!! This channel just got recommended to me! Nice video!! Subscribed!!
This so cool! I just had to implement the euclidian algorithm as part of a compilers course assignment!!
Blessed by the algorithm today. I never thought about how recursive functions are implemented in assembly
Thank you dear YT overmind part of the YT algorithm for bringing us all together here! Amazing channel !
I haven't done assembly language coding since the 80's in my Microprocessor Design (6800, along with some machine coding) and Assembly Programming (8086) classes in college. I thoroughly enjoyed this video.
Thanks for this! Love your videos' aesthetic, but even moreso I appreciate an expert breaking programming tutorials into individual functions and algorithms instead of programming complex apps to teach many ideas all at once.
@xydez
3 ай бұрын
Not to break your bubble but I would hardly call this expertise as that would be insulting to actual experts. Still, it was definitely a good produced video, I like the aesthetic.
Ther quality is this channels is so good it’s suspicious. Also +1 for lain
wtf, awesome! why haven't I discovered your channel sooner?
I last coded in Arm for an Acorn Archimedes about 35 years ago. Thanks for the memory jolt.
This is such a fun setup!
Hi Laurie, you’re amazing and unique, please don’t give up
@iro4201
3 ай бұрын
Don't give up is the worst kind of encouragement as well as my comment advice.
My professor has actually given us as homework to compute the Euclidean Algorithm however, he told us we had to do it in x86. Still helped me out understanding how to do it tho!
Very informative glad I found this channel
just in case, the modulo implementation works out if you use integer division not "normal" division (2//5)*5=0, but (2/5)*5=2
Why did KZread take this long to recommend me this channel. Instantly subscribed
I don't know, this just clicks every key, this atmosphere is perfect
I like the recursive version for its readability, but it's worth noting that optimizing compilers will (typically) use a tail recursion optimization to convert the generated assembly into a loop. This is faster and can't blow the stack. I understand this is a (cool) exercise. Just thought someone might be interested.
@hemmper
3 ай бұрын
yup, but her example didn't use stack for recursion
@tolkienfan1972
3 ай бұрын
@@hemmperincorrect. It in fact uses "branch with link" along with "push lr" and "pop pc" to implement the recursion. The arguments are not on the stack, which saves some stack space, but each recursion level is indeed storing the link register on the stack.
Oh my god this is the most unique coding tutorial ever!
love ur XP theme :D brings back memories
@lashlarue59
2 ай бұрын
Thank you! I was going to ask is she doing all of this under Windows XP!
Amazing instruction, so fun to learn!
I can watch you play all day and listen non-stop. No AI model can beat you!
@RameshS-tb4co
Ай бұрын
You're cute.
Really unique content, informative and enagaging
such a fun format
looking for KZreadr making tutorial about arm assembly, never found, and tonight KZread recommended this channel, okay KZread, okay
Before I phrase my note: I stumbled over your video by chance - as often on KZread - and immediately wanted to know about how (simple or complex?) this algorithm looks in assembly language. I highly appreciate what you do and I understood everything you explained. When I reached around 8:00 in the video I thought - am I incorrect about what parameter sits in what register? Until that position I assumed "a" to be in r0 and "b" in r1. And meanwhile I am absolutely certain about it. Then you said: 1. 7:47 the "sdiv" command: all good 2. 7:54 "remember b is in our r0 register" - eh, no?!? 3. 8:15 the "mul" command: b is in r1: correct 4. 8:34 the "sub" command: you say "a is in r1", what is not correct, but you correctly wrote "r0" I confess: if I would not have had a question mark in my face two times during that passage, I would not have repeated it several times. And now with the repetition I will sure remember it forever and I am also certain about having fully understood your implementation. Possibly you can add some textual remarks into the video at the two mentioned positions. Thank you for your video!
OMG. Last time I wrote assembly was like 20 years ago! I love this woman!
This video style gives me a lot of nostalgia
I like the channel theme and content 👏
ah. you've unlocked good memories. gracias.
Took me a while to realize the whispering at the start reminds me of 'AIR - How Does It Make You Feel? ' Had to re play that at least 5 times, and then walked around for a bit, before I knew what it reminded me of.
Everybody who comments "the algorithm brought me here" needs to immediately like and subscribe. *Thats* how we can boost channels like this
I like your expression and head movements.👍
Assembly has changed a helluva lot since I last wrestled with it. Showing my age but I last wrote assembly code for 16bit 68000 series processors as an engineering student!
I really love the Lain dp! and of course the content!
WOoooooOW I love your CRT's !!!!!
I don't know what the heckin flip I'm watching but I freakin know it's epic dude.
your content is amazing!
I love videos like this!!!
I don't have the patience for coding but I geniunely enjoyed this
How I miss 6502... This is a welcome bit of nostalgia!
this is where it's at. you got a new sub
Well done!
Great post!
Only 27k subscribers? Deserves way more
Just as a clarification: argument passing in asm is standardized. The requirements for the caller and callee are defined by the "arm thumb procedure call standard" (ATPCS). If you follow this standard you can also call c functions from asm code
Really cool project 😊😊
legendary tutorial
The style of this video is fucking great. Love the way this is presented, I have zero interest in this topic but I'm fascinated by the vintage computer graphics.
The Burnout 3 part at the end was the cherry on top for me
I didn't know you were on KZread only followed you on X. Good to see you here 🤩
I'm not even a coding person but you do make it interesting
The Angel of Arm returns 🙏💘📿
Great channel.
Interesting, nicely done. I don't know how old you are, but I suspect I last wrote ARM assembler before you where born. Wrote code on the Acorn Archimedes in 1990ish. Personally I can't write assembler any better than GCC can generate it, I sometimes use gcc -S to have a peek under the hood but have not had any need to write ARM assembler in many years. Nice to see it done though, thanks.
The comments are 90% simp 😂
While passing arguments to a function in registers is the fastest method, the downside is maintaining the code. Many (probably older) C compilers let you add the keyword "register" to the parameter, the compiler would (hopefully) pass it in a register. That being said, if a new parameter were to be added to the function (passed into the next register), and the function is using purely register manipulation, then all registers numbers in the function need to be manually updated. In the older days, many compilers passed all arguments on the stack, with it making easy to add new local variables also on to the stack. In regards to this video, it would be interesting to see a C++ to ARM video where the assembly language function has local variables...
This is maybe the best video I've ever seen. Present day. Present time. Hahahahaha.
Great Video!
This would be super helpful for Javascript programmers looking to learn to program.
This is a really unique format! Well explained too. Just curious, is that Windows XP ribbon an overlay or is everything *really* being emulated?
@gplastic
3 ай бұрын
it looks like it's running overtop windows 11
I love the XP rice 😂
So, newish to assembly (at least haven't written any in two decades), but as the gcd function is a tail call recursion algorithm, this could, in theory, be done without the regular adds to the stack (avoid stack overflows if GCD is called from an already deep stack). Would this be easy to implement?
@JiffyJames85
3 ай бұрын
Would it end up being something like: ``` gcd: push {lr} gcd_sub: // Base case ... mov r1, r4 bl gcd_sub ``` and that way only the initial call to the routine saves the return address.
@sagemitchell8646
3 ай бұрын
I was thinking about the tail call elimination opportunity too. Wouldn't we want that last line to be `b gcd_sub`(notice `b` instead of `bl`)? My understanding is `bl` would change the link register before jumping to `gcd_sub` whereas `b` would just jump. Maybe it wouldn't matter, though. The address from which `gcd` is called is already on the stack and we pop that value back into the `pc`register to return there. I suppose it may not hurt to write to the link register between those steps.
@JiffyJames85
3 ай бұрын
@@sagemitchell8646 that sounds right. Most of the stuff I did was with the 6502, and I wasn't sure if an arm has extra side effects. So, I made the same assumption as you and left the bl.
@robiniddon7582
3 ай бұрын
A plain b for branch would work. The link register is used to provide the subroutine return address which hasn't changed in your hypothetical iterative implementation. Also the iterative function doesn't need to push the LR cos it never calls anyone else and can return with mov PC,LR or whatever the mnemonic for that is.
@robiniddon7582
3 ай бұрын
BTW, I agree, excellent video. 👍
It's a great video, but can you use the same color palette as for C code for Assembler? Thanks.
great video it almost seems comprehensible! What is this asm used for on this device?
a woman with assembly language this can be dangerous! I'm glad to see that!
@xydez
3 ай бұрын
Who would care whether it's woman? To quote linus, i don't care what plumbing you're born with.
OMG the Vintage vibe
Really cool video :) I just want to suggest having Notepad in dark mode too as I just got flashbanged by it at 1 am 😂
+1 subscriber. This is awesome!
Is recursion in c++ different than recursion in assembly? I think if you had a variable in the original c++ function then there would be another instance of the variable for every calling of the gcd functions, correct? The assembly version would not have this, and kinda works more like a regular loop. Thinking of it idk how you would do something like this in assembly if you really needed to keep track of some different data for each instance of the function.
Good video. Subbed.
Copland OS theme? G3 iMacs?! ARM?!? Instant subscribe!