Failure is Always an Option - Dylan Beattie - NDC Copenhagen 2022

Science & Technology

Software runs the world. We use software to manage our calendars, talk to our friends, run our businesses - and, as our societies inevitably try to replace people and paperwork with apps and algorithms, we find ourselves facing some vital questions about the reliability of that software. If you take the time to actually read the terms and conditions, you’ll find that just about every system we rely on comes with no warranties and no safeguards - you use it at your own risk, and if it doesn’t work, that’s your problem.
But there’s more to building reliable systems than just writing good code. Reliability isn’t just about software engineering, it’s about systems engineering; about taking a holistic view of services that includes software, hardware, networks, and people.
Join Dylan Beattie for an insightful look at the history of systems engineering, at some of the strategies and design patterns that we can use to build reliability into our systems, and at what happens when the software that runs the world has a bad day.
Check out more of our featured speakers and talks at
www.ndcconferences.com
ndccopenhagen.com/

Comments: 112

  • @-parrrate
    1 year ago

    failure is not an Option, it's a Result::Err

  • @TheMCMaster

    1 year ago

    rust lore

  • @asandax6

    1 year ago

    or {panic}

  • @grumblycurmudgeon

    1 year ago

    STDOUT

  • @clooskey

    1 year ago

    Failure Oriented Programming 😅

  • @grumblycurmudgeon

    1 year ago

    @clooskey Did someone say FOP? I see you worked in Angular2 for a time as well!

  • @tharfagreinir
    1 year ago

    Listening to Dylan Beattie is always a treat. He knows how to pick interesting subjects and also how to present them in an engaging and entertaining fashion.

  • @f.d.3289

    7 months ago

    That's nearly word for word what I was about to say :)

  • @MichaelButlerC

    5 months ago

    I only just found out about him somehow and have been going on a binge, every talk is interesting

  • @rexbaumeister7377
    11 months ago

    I love listening to Dylan, and sometimes I get so engrossed in whatever historical subject he's delving into that I forget he's giving a software talk.

  • @solarlaura3355
    8 months ago

    Feynman wasn't the first person to know about the O-ring failures on the Challenger. Three Morton-Thiokol engineers knew the night before the disaster that they would fail. They tried to stop the launch but were stopped by NASA managers because "Reagan had demanded a launch". The three engineers were fired for "whistleblowing" because they would not shut up. The whole story is told in a 1989 IEEE Spectrum magazine article about the problems the three engineers experienced. The booster O-ring specifications required that the ambient temperature be at least 48F for 24 hours before the launch, and NASA had this information. Political pressure led to a launch after days of freezing temperatures. That's like knowingly buying tires rated for 100mph and then driving at 150mph. The design was rated for 48F or higher and NASA bureaucrats chose to launch at 32F. The result was a certain failure.

  • @edgeeffect
    1 year ago

    Just gotta mention that Ed. Lorenz's Royal McBee LGP-30 (34:38) was programmed by.... Margaret Hamilton (05:15) - before she went to work for NASA.

  • @DylanBeattie

    1 year ago

    Wow, I had no idea... today I learned. Thanks! More info about it here: www.wired.com/story/these-hidden-women-helped-invent-chaos-theory/

  • @doublepinger
    1 year ago

    I would not have expected RICHARD FRICKEN FEYNMAN to identify that the rubber seals had failed, and the "root cause" was increasingly lax risk assessment over time. I can't imagine what they were thinking letting him, the last person to be a yes-man, conduct an assessment.

  • @TheGreatAtario

    1 year ago

    Probably that they actually wanted the problem addressed.

  • @davidpriestley1650

    1 year ago

    Technically he didn't: whistleblower (and astronaut) Sally Ride pointed him in that direction by providing NASA documentation on the O-rings and their resilience against temperature. She is the true hero of the Rogers Commission, which looked into the Challenger disaster.

  • @stuartanderws5705

    1 year ago

    @davidpriestley1650 R. Feynman just put a bent-over bit of seal in a glass of ice water with a pipe clamp holding it in place. But he did it at the hearing, where no one could tell him to be quiet.

  • @gerdd6692

    3 months ago

    I expect that he was all of these: not involved; independent (his job did not depend on NASA); convinced by the evidence placed in front of him; vocal; highly reputed (unassailable); publicly known. So they couldn't shut him up, and they knew the problem wasn't going to go away now with him pursuing the matter.

  • @magdaleneabiuso65
    1 year ago

    28:00 In Australia, we had a similarly catastrophic implementation of a similar system for welfare payments that wrongly identified many people for welfare fraud and demanded repayments that people didn't owe and couldn't repay. Quite a few people lost their lives.

  • @zuighemdanmaar752

    6 months ago

    Same over here in the Netherlands. The cabinet fell and didn't recover. Never blindly trust computers on high-stakes actions unless the system is designed for it. Hint: almost all systems are not designed for high-stakes actions.

  • @sfdntk

    5 months ago

    @zuighemdanmaar752 I wish we could say our cabinet fell and couldn't recover after the utterly inhumane Robodebt debacle, but Australians really love voting against their own interests, and conservatives really love screwing working people over.

  • @mindasb
    1 year ago

    If you try hard enough, failure is not only an option - it's a requirement.

  • @grumblycurmudgeon

    1 year ago

    On behalf of all the smoking craters that never were, upstaged instead by spacecraft successfully landing, I must respectfully counter, "nuh-uh!"

  • @cornoc

    8 months ago

    @grumblycurmudgeon If you think there's a field where failure has never occurred, either your definition of failure or your domain of analysis is too narrow.

  • @ari_archer

    8 months ago

    I'm confused what you mean by this, lol. Like, what I can think of is that `(2 + 1) == 4` should always fail, meaning it's a requirement, and if it doesn't fail, something's very wrong. Is that what you meant?

  • @cornoc

    8 months ago

    @ari_archer If you're talking about the OP, the meaning I interpreted was "if you try hard enough in any domain, you will necessarily fail many times on the way to success".

  • @filker0
    1 year ago

    "Failure is not an option" (it's included as part of the base package).

    That aside, there is a book by Nathaniel S. Borenstein called "Programming as if People Mattered" that you may find worth a read. I don't know if it's still in print, though.

    I work in the aerospace industry as a software engineer. We endeavor to design systems as though failure is an inevitability, and try to make sure that all failure conditions that can be anticipated are handled, and that any unanticipated failures cause conditions that will do the least damage. For flight instrumentation, it is better to provide no information than wrong information, so the mitigation is often to shut down the subsystem and alert the crew.

    When I started working in this domain I was given the task of designing and implementing software within the platform to deal with loss-of-cooling failures (fan failure, ventilation blockage, the plane sitting on the tarmac in the sun for 2 hours, things like that). I had come from a different part of the computer industry where graceful degradation was the goal when handling failures; I designed a health monitor that would lower the CPU and memory clock speed and/or turn off the clocks to non-critical components to reduce power dissipation but keep this safety-critical subsystem operating. My design was unacceptable because there was no guarantee that the subsystem would produce the same data for the displays the pilot relies on when running at full rate (no failures) as it would in an overheat condition. The correct solution, I learned, was to stop reporting anything and put the entire subsystem into as close to a reset state as possible until the conditions were within tolerances.

    There were 3 copies of this same subsystem on the airframe, each in a different physical location but receiving the same stimulus (data from sensors and discrete signals), and the display subsystem used voting to select the source of the data it displayed to the crew. Stale or improperly smoothed data would not match between the different copies, and this could result in the wrong data getting displayed. When one of the modules went off-line, the display would source-select between the two other modules. If they disagreed, the display would highlight the measurement to alert the crew that alternative readings should be consulted.

    Failure tolerance is a design constraint that is tuned up or down depending on the criticality of the system, whether it is a mission-critical or safety-critical system, and what the consequences of any failure might be.
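The source selection described in the comment above can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual avionics logic: the function name `select_reading`, the `tolerance` parameter, the use of `None` for a module that has gone silent, and the choice to flag a lone surviving reading are all assumptions made for the example.

```python
import statistics
from typing import Optional, Sequence, Tuple


def select_reading(
    readings: Sequence[Optional[float]],
    tolerance: float = 0.5,  # hypothetical agreement threshold
) -> Tuple[Optional[float], bool]:
    """Pick a value to display from redundant modules.

    A module that has shut itself down reports None (fail-silent).
    Returns (value, flagged); flagged=True means the crew should be
    alerted to consult alternative readings.
    """
    live = [r for r in readings if r is not None]
    if not live:
        # Every module is off-line: show nothing rather than wrong data.
        return None, True

    # Find the largest group of live readings that agree with one candidate.
    best: list = []
    for candidate in live:
        group = [r for r in live if abs(r - candidate) <= tolerance]
        if len(group) > len(best):
            best = group

    if len(best) >= 2:
        # At least two modules agree: display their median, no alert.
        return statistics.median(best), False

    # No two modules agree (or only one is live): display but highlight.
    return live[0], True


if __name__ == "__main__":
    print(select_reading([100.0, 100.2, 100.1]))  # all three agree
    print(select_reading([100.0, None, 140.0]))   # one off-line, two disagree
```

The point of the sketch is the design choice the comment describes: a module under stress contributes nothing instead of degraded data, so disagreement between the remaining sources stays detectable.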

  • @DylanBeattie

    1 year ago

    That's really interesting... thanks for the comment; I'll see if I can track down a copy of the book! Also, by a random coincidence, Nathaniel Borenstein just popped up in the research I'm doing for my next talk - he's mentioned in the Wikipedia article about MIME email formats, quoted talking about the problems of defining MIME version 1.0 without sufficiently specifying how that would then lead to 1.1 or 2.0. Interesting stuff. :)

  • @Sepen77
    1 year ago

    Dylan Beattie tech talks never fail to impress! Yet another fine one!

  • @nikfp
    1 year ago

    This was one of the best presentations I've seen in a while. Thought provoking, engaging, and entertaining. It's already sparked several great discussions and I just finished watching it.

  • @tlrkendl

    1 year ago

    His talks are all great, the Code as Art one and the whole thing about quines is very cool

  • @nikfp

    1 year ago

    @tlrkendl Seen it, loved it! Next time someone asks me to do FizzBuzz I'll try to bust out Dylan's version :P

  • @jarosawmalinowski8130
    1 year ago

    In the modern world, every company "thinks" that they know better than their users. Every product is made to force you to do it the "right" way.

  • @Duiker36

    10 months ago

    Yeah. I'd say it's the fault of the person who coined the phrase "opinionated software".

  • @thepaulcraft957
    1 year ago

    As a German the Berlin airport example really hurt

  • @amanda.collaud

    1 year ago

    Yes :D It's really embarrassing that it's a laughing stock internationally too.

  • @thepaulcraft957

    1 year ago

    @amanda.collaud Luckily I don't live in Berlin 😅...

  • @amanda.collaud

    1 year ago

    @thepaulcraft957 I do *sob*

  • @tamberp
    1 year ago

    "Failure is not an option - it is mandatory. The option is whether or not to let failure be the last thing you do" (70MMEM) 😉

  • @stuartanderws5705
    1 year ago

    Post Office. Even when the people in charge knew the computer system was wrong, they still sent people to prison rather than admit it failed. That is true EVIL. (This comment was made before I heard the rest of the segment. I still believe it is true.)

  • @casperes0912
    1 year ago

    Copenhagen... So close to me, yet I have not been able to go yet :( - Want to see Beattie live so bad

  • @TheJacklwilliams
    1 year ago

    Failure is always an option, has been my saying for years. I’ve experienced it, along with huge success. What’s required is to learn, iterate, change, adapt. When in many cases, your psyche is screaming “I’m done, I’m not doing this shit again”.

  • @rc6431
    1 year ago

    Loved the section about Kenya!

  • @BloodyClash

    1 year ago

    Me too. Shows that you have to be clever to overcome certain obstacles.

  • @uchennaofoma4624
    4 months ago

    Loved this talk

  • @springford9511
    4 months ago

    Thanks Dylan, another engaging talk. SMS verification code timeout as discussed at 32m 0s in video. I had this exact thing last week. Phone SIM dead (network problem), in one of the provider's high street shops on their WiFi, using WiFi calling to call the support number. The SMS message was delayed by a FEW SECONDS when on WiFi and the authentication timed out. Timeout was about ten or fifteen seconds. Luckily I didn't hang up immediately on one occasion and it fell through to a manual process. I very nearly didn't discover this because a few times I hung up on getting the "Ha ha Too Late" audio message.

  • @kylekinnear8878
    1 year ago

    Super awesome talk. It really did teach me to think of failure in layers. It applies more and more the bigger you scale.

  • @ignatiusezeani6816
    9 months ago

    It's so great listening to Dylan. Thanks so much!

  • @PecPur
    10 months ago

    WOW I use MPESA for everything and I didn't know how it started, I've always been curious about it. Thanks.

  • @LanceBryantGrigg
    7 months ago

    This guy was awesome to listen to.

  • @amanda.collaud
    1 year ago

    Hahaha the Brandenburg Airport is a really big shame for us Germans, we always laugh about it. It's terrible that the whole world knows about it now :( What a shame ^^

  • @lebeinderbadewanne
    2 months ago

    Great talk. Thank you. At the print shop in the Principality of Liechtenstein where I did my apprenticeship, they had two Linotype machines standing around. I was told that one of them should still work, but I've never seen them in operation. And I don't know if they still have them.

  • @ShaneDavisDFTBA
    11 months ago

    19:30 I found the chart confusing for a minute because it suggested that the number of random defects is highest at the start of the lifecycle, rather than just being additive.

  • @pkorobase
    8 months ago

    Great talk. I shall search for this NDC soon 😃

  • @7th_CAV_Trooper
    5 months ago

    "try it and see what happens" is always a good idea when combining human lives and aviation

  • @ScatterlingOfA
    1 year ago

    excellent!

  • @DevToolsMadeSimple
    1 year ago

    "..land a man on the moon and back before the end of the decade." 😂😂😂 What a user story indeed

  • @Kobay350
    5 months ago

    You should listen to Allan McDonald talk about the Challenger failure. They knew it was a problem and he refused to sign off on the launch, which forced his superior in the company to sign off. The engineers didn't have a yolo attitude. NASA changed the question from "is it a good idea" to "prove it will fail."

  • @samiraperi467
    1 year ago

    I do actually wear a seatbelt for driving. It keeps my ass in place better during direction changes.

  • @grumblycurmudgeon
    1 year ago

    Jesus... next time you invent a language, PLEASE call it "Full-Stack"?

  • @traveller23e

    1 year ago

    It's a language where the first thing that happens is all the variables are loaded into the stack. It has the huge performance benefit that you know the stack size right at the beginning so the OS doesn't have to use any guesswork to figure it out or anything.

  • @SamTheEnglishTeacher

    1 year ago

    Wonder how hard it would be to create a language called "10x"... It could be just like C but all numerical data types implicitly become the natural log of the number declared, to varying precision depending on the type

  • @grumblycurmudgeon

    1 year ago

    I was just commenting on how much I hate the phrase "full-stack developer." Dylan mentioned in the talk his creation of Rockstar was the impetus behind LinkedIn no longer looking for "Rockstar developers". I'm just saying should the flight ever take him again, please: let's neutralize another overused and frankly destructive descriptor, and get double mileage outta the gag. ...but really just an instruction queue could be considered a stack... maybe something that fully evaluates a current set of instructions before allowing for the injection of more (the "full-stack must be used up before accepting additional input")? I could see there being an actual value in low-level microcontroller-oversight, or manufacturing...

  • @SamTheEnglishTeacher

    1 year ago

    @grumblycurmudgeon we're aware and also riffing

  • @logiciananimal

    1 year ago

    @traveller23e Isn't that sort of Forth?

  • @ralfrolfen5504
    1 year ago

    46:38 Very important sentence

  • @TanigaDanae
    1 year ago

    I still fight with the issues at 31:55 ... hate the text message verification systems.

  • @mikelward

    10 months ago

    Have you tried enabling WiFi calling?

  • @TanigaDanae

    10 months ago

    @mikelward I found out this exists (a few months ago) but haven't enabled it.

  • @StephenGillie
    1 year ago

    Failure is when you quit. Not meeting success isn't failure, and in the rapid prototyping mindset it doesn't make sense to quit and be emotional after not meeting success on the first try. Movies where "scientists have a big mission and something goes wrong" so they scrap the whole project - these are as far from reality as movies where the unpowered main character survives a 30 story fall onto pavement. You might die of old age on the road to success, but so long as you don't give up you won't fail.

  • @LeifNelandDk
    6 months ago

    In Berlin they started to wonder if it would be quicker to move the city nearer to the airport.

  • @velo1337
    9 months ago

    what can fail will fail.

  • @ICountFrom0
    8 months ago

    I'm part of a town where most people here, can't use cell phone verification. More and more things just won't work for this town.

  • @kahnfatman
    8 months ago

    A very honorable mention: Tegel Airport was kept operational for 14 years before BER could take its place.

  • @BloodyClash
    1 year ago

    Having "Dylan Beattie" in the title is like "Made in Germany" in the 80s

  • @JamesOfKS
    10 months ago

    visual studio beater? don't recognize that one :)

  • @mikelward
    10 months ago

    32:00 WiFi calling supports SMS messages, too.

  • @springford9511

    4 months ago

    Yes. But they are slower to arrive on my provider. 2FA recently failed for me because of this. I just commented with all the details.

  • @Erhannis
    1 year ago

    For some reason the applause at the end sounds like a clap of thunder

  • @kennichdendenn
    1 year ago

    I'm living in a modern house in a city centre. A modern building made from concrete... I don't know why, but either it's 5G, E or straight up nothing at all.

  • @BenTrem42
    8 months ago

    *_FMECA_* ... different failures, different modes. Some things probably won't fail, some things likely will. Some things are easy to diagnose and fix, other things ... not so much. Criticality matters! _cheers_

  • @ozok17
    1 year ago

    "It's about a bunch of astronauts who go into space. They're on their way to the moon. The ship blows up. It looks like they're all going to die and then they don't die and they get them back and everyone goes, 'yay, America is amazing'." looks like that's about 46 words [1], which is close enough to 20. well, same order of magnitude, or thereabouts. certainly close enough for a figure of speech. [1] counting each contraction as a single word, and omitting gap words like "um". if we instead expand each of the 3 instances of contraction, and include the (single) gap word ("uh", after "space"), then it comes to 50 exactly, which is still only 2.5x twenty, so still close enough in my book. but yes, i did check. i guess i'm "that guy".

  • @willsterjohnson
    1 year ago

    Justification of the try/catch block

  • @kristianTV1974
    1 year ago

    This is the basis of radiation hardened chips - voting circuits allow for alpha radiation flips.

  • @MikeInPlano
    1 year ago

    People fear air travel because the probability of death in event of a crash is very high, whereas most car accidents result only in damage to the vehicle and minor injuries.

  • @Nereosis16

    5 months ago

    The chance of dying in a car is still way higher than being killed in a plane.

  • @shinkathe
    1 year ago

    Functional programmers get confused, when someone says failure is always an option. Of course it is...

  • @nigh7swimming
    1 year ago

    Failure is not an option, it's an undesired outcome of an activity.

  • @todayontheinternet7790
    1 year ago

    As interesting as this is, I like how he sort of just slides it in there that he is Kenyan.

  • @ayasekaru
    5 months ago

    18:10 "80.000 flights landed safely" is not news the same way "4.000 people die in car crashes every day" because we suck at numbers! If 4.000 people got into one massive car pile up the horror would be plain to everyone to see, but because each individual instance is maybe 1 or 2 folks it's suddenly not worth thinking about.

  • @joachimdietl6737
    8 months ago

    They forgot the anti fire protection at the airport in Berlin! Germany is crashing!

  • @gabetower
    6 months ago

    lawl, moon-covid

  • @fg786
    1 year ago

    5:51 Sounds somewhat off considering they sent 3 people to the moon.

  • @emjizone
    1 year ago

    4:03 I see, it wasn't an option. It was a feature !

  • @alexgorodecky1661
    1 year ago

    No sense

  • @d7ffab979
    8 months ago

    Bro I liked ur talk until you made up the roots of chaos theory. Chaos theory was invented by Poincaré when he tried to win a competition set by another mathematician to prove the stability of celestial mechanics.

  • @tombyrer1808
    1 year ago

    Just had to go political? (13:50)

  • @donnan190

    1 year ago

    We live in political times. Being political is not bad, even in spaces where politics aren't that relevant.
