AI Gridworlds - Computerphile

Sponsored by Wix Code: Check them out here: wix.com/go/computerphile
Gridworlds are a standardised testing ground: a safe place to try out AI algorithms. Rob Miles takes us through AI safety, gridworld style.
EXTRA BITS: AI Gridwor...
Gridworld Paper: bit.ly/2ryxhGt
Gridworld Github: bit.ly/2KJE6xH
More from Rob Miles: bit.ly/Rob_Miles_KZread
Thanks to Nottingham Hackspace for providing the filming location: bit.ly/notthack
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments: 216

  • @EamonBurke (6 years ago)

    I'm a simple man. I see Rob talking about AI, I watch the video twice.

  • @z-beeblebrox (6 years ago)

    Sounds like you're abusing your reward function

  • @VoidMoth (6 years ago)

    gotta make sure you interpret your training data correctly

  • @stumbling (6 years ago)

    73% Lions Shagging 16% A Lion 10% Car 1% Covfefe

  • @anonanon3066 (3 years ago)

    Rob? This is a robbery! Give me your wallet

  • @TGC40401 (6 years ago)

    Kids use data more efficiently than current AI. AKA the nerdiest thing I've heard on this channel.

  • @hexzyle (4 years ago)

    That's because humans are too sensitive to the data. That's how we get superstitions. We're efficiently using data that is actually meaningless.

  • @thefakepie1126 (3 years ago)

    @hexzyle Or it's just because we have about 86 billion more neurons

  • @jh-wq5qn (3 years ago)

    @thefakepie1126 Some models have more parameters than that. GPT-3 has about 170 billion if I remember correctly. Our neuroplasticity and our ability to build on previously learned knowledge (and knowledge we are born with, like a super-optimized 'reward function', a.k.a. our senses and animal instincts) are some of the reasons we use data more efficiently. Simply put, we have more pre-learned knowledge to work with. An AI learning to make a cup of tea from scratch may have to learn that there is a world, that it can move its appendages and that liquid can be poured. Kids were either born with that knowledge or already know it. There is a whole subfield of machine learning for this called meta-learning or few-shot learning, wherein one tries to train models using pre-learned knowledge and fewer data points. It's fascinating, really.

  • @golym6807 (6 years ago)

    5:08 "it's in your performance evaluation function" I always knew this guy was a robot

  • @yondaime500 (6 years ago)

    That sounds like something GLaDOS would say.

  • @JmanNo42 (6 years ago)

    LoL pretty close, he is ENTP. Did you see Blade Runner? ;)

  • @JmanNo42 (6 years ago)

    The Voight-Kampff test: the android does not have the tree depth to evaluate between two potentials, so it goes into polarity mode, also known as binary evaluation.

  • @JmanNo42 (6 years ago)

    I think that general evaluation depends upon knowledge of concepts, that you find similarities of features: "pattern finding". So the ultimate intelligence must not only be fast, it must learn concepts and ***explore them***. Well, to take an example, Kirk's test in Star Trek: he did not apply what he had learned, his mind was outside the box. That is association skill in its deepest meaning: taking knowledge to the next step/level regardless of your area of expertise. Learning is quite another thing; ENTPs are the best learners there ever will be. When I get angry I call them parrots, because their thinking about the subject is really shallow outside what they read/learned.

  • @KebradesBois (6 years ago)

    GLaDOS or Mark Zuckerberg...

  • @xyZenTV (6 years ago)

    More AI videos, yay!

  • @AlexanderKazakovIE (6 years ago)

    This is the first AI safety video of yours (and of any that I've ever seen) that makes AI safety immediately practical and immediately relevant in today's world! It would be great to see more diving into such super-practical examples in this released 'gridworld'!

  • @TechyBen (6 years ago)

    Yes. So much so. We especially need cars that avoid lava right now!

  • @sd4dfg2 (6 years ago)

    Is there anyone who didn't play "don't fall in the lava" or "don't get eaten by sharks" as a kid? I do think "don't walk on the baby" is a lot more understandable to regular people than the "paperclip maximizer" story the nerds always bring up.

  • @julianw7097 (6 years ago)

    Do you watch his channel?

  • @z-beeblebrox (6 years ago)

    TechyBen, hey if you're in Hawaii right now, a car that avoids lava would be pretty damn useful

  • @AlexanderKazakovIE (6 years ago)

    I do. What I love about these gridworlds is that they make the problem tangible in a way that lets you try solutions on it easily. The walking-on-the-baby or paperclip examples are closer to the real world, but also hypothetical (due to their complex real nature), and because of that any proposed solutions can assume a lot. In the gridworlds the rules are super straightforward, and this forces any proposed AI safety solutions to be super explicit and testable.

  • @moistmayonese1205 (5 years ago)

    8:50 - "But AI, you can't do that!" "Well, I just did"

  • @Qual_ (6 years ago)

    thanks to the animation guy for that cute little car :D

  • @silkwesir1444 (6 years ago)

    6:10 "usually they apply whatever rules they've learned straightforwardly to this different situation and screw up." so, pretty much like humans... ;)

  • @dieisonoliveira6994 (5 years ago)

    I just love every single bit of everything in this guy.

  • @Macieks300 (6 years ago)

    my favorite topic on Computerphile

  • @kingxerocole4616 (6 years ago)

    Looking forward to reading this paper even though I have absolutely zero training in any relevant field. Thanks, Rob!

  • @silkwesir1444 (6 years ago)

    4:00 Interesting you talk about how in Pac-Man all you do is move around. Just a couple of days ago I thought about how a variant of Pac-Man might be interesting and fun to play, in which you would have to hold down a button to collect the dots. Doing so would also slow you down. On the other hand, the ghosts would have a different behavior; most importantly, while they have line of sight to you (Pac-Man), they would speed up, chasing you.

  • @TylerJBrown192 (6 years ago)

    Yay! More Robert Miles videos!

  • @hnryjmes (6 years ago)

    Great! Enjoying these a lot

  • @CoderShare (6 years ago)

    Can't wait to see the video on Google Duplex.

  • @justinwong7231 (6 years ago)

    Google Duplex is extremely exciting, but the technology isn't ready and hasn't been released yet. A useful discussion would be difficult without making wild speculation.

  • @tiikoni8742 (6 years ago)

    I like the light in this office room :-)

  • @aka5 (6 years ago)

    "Like a child learning really?" "...they just use data way more efficiently." Lmao

  • @himselfe (6 years ago)

    I enjoyed this one!

  • @bradburyrobinson (6 years ago)

    Is that a Quickshot joystick I see lurking on that top shelf? It may not be, it's been a while since I last used one. I'm surprised I even remember the name.

  • @Vladhin (6 years ago)

    Whoaaa! Hi Roberto!

  • @lobrundell4264 (6 years ago)

    Yes yes more Rob!! :D

  • @yosoyjose (6 years ago)

    really good idea

  • @nobodykid23 (6 years ago)

    So, to make this clear, is this applicable outside the area of reinforcement learning? Because the paper heavily uses RL terms, but you explained that it is also applicable to other machine learning methods.

  • @globalincident694 (6 years ago)

    RL and machine learning are being used synonymously here. The implication is any AGI will not be told what to do, it will learn by doing.

  • @adammercer9679 (6 years ago)

    It's interesting to think about some of these questions about AI and wonder if we'll ever be able to approximate them. For instance, in the video there's the question "How can we build agents that do not try to introduce or exploit errors in the reward function in order to get more reward?" Do humans even handle this properly? It's in our best interest to cooperate with each other and not murder each other, and yet people still do it. How can we hope to ask an AI to do this if humans can't? This exposes a fundamental problem with AI that cannot be solved.

  • @fiona9891 (5 years ago)

    Nothing says that AI can't be smarter and better than humans, but even if we get to the point where they are it'll take a while.

  • @gravity4606 (6 years ago)

    is the reward function similar to a fitness function used in EA?
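
A side note on that question: both play the same role - a scalar that the optimizer climbs - but an RL reward usually arrives per step and is summed over a trajectory, while an EA fitness scores a whole candidate after its rollout. A minimal sketch in Python, assuming hypothetical env, policy and decode interfaces:

    def episode_return(env, policy, max_steps=100):
        # RL view: per-step rewards accumulated along one trajectory.
        state = env.reset()
        total = 0.0
        for _ in range(max_steps):
            state, reward, done = env.step(policy(state))
            total += reward  # feedback arrives during the run
            if done:
                break
        return total

    def fitness(env, genome, max_steps=100):
        # EA view: one number per candidate, assigned after the rollout.
        # Often it is literally the same accumulated reward.
        return episode_return(env, decode(genome), max_steps)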

  • @pickles4263 (6 years ago)

    A very interesting paper I'm getting ready to read! And thank you for a brief explanation (sometimes I get lost without an explanation) :3

  • @Yupppi (3 years ago)

    I once found a Super Mario World neural network on YouTube that you could run yourself, and tried it. The lava being in a different place brought it to mind: it took a night for it to mostly finish the level, but the moment you changed the level, it was all over again. It made me think how it's somewhat of a problem that it (or such networks in general) doesn't seem to really make notes of what things are, the way a human conceptualizes things and knows to avoid or pursue them in any environment. You would absolutely want it to make a note like "that's a goomba, gotta avoid it in the next level as well". But how? Do they always need a library of real-world concepts, like the one a human builds over time, to be able to conceptualize and transfer ideas from situation to situation, or environment to environment?

    I'm sure people have tried to find ways around that issue of the extremely limited base knowledge that the AI can't take advantage of. Kinda like how just feeding the unicorn thing massive amounts of data helped it become so much better without interruptions or tweaks, which usually just isn't a realistic option. And even when OpenAI learned DOTA 2 for 1-2 years of straight playing - millions of games, I recall - played with itself, played with pros, played with people, it still didn't manage to grasp the majority of the heroes in a functional enough way for them to be played, and the devs tweaked it and taught it different rules multiple times to make it progress towards victory more reliably. And that was only on the default map the game is played on, never mind a completely different map (although throughout the year there are multiple balance patches changing items and characters, and usually one with some map changes as well).

    Can you grade the AI's performance as a learning event? Like feedback to compare against its own evaluation? Although it kind of fights the idea of having a good reward function if you tell it that it did badly when it measured itself as doing great. On the other hand it would be a step towards having the AI self-fix. I'm sure people have tried it or are doing it, but how does it fare in solving the usual problems? What are the caveats? Or is it just not useful for what is being attempted?

  • @kasanekona7178 (6 years ago)

    I realised that a video I have open in another tab is by a person who sounds exactly like Rob Miles here :o

  • @vanderkarl3927 (3 years ago)

    Is it Mob Riles, his nega-universe duplicate?

  • @richardhayes3102 (6 years ago)

    "Kids [...] use data way more efficiently"

  • @Guztav1337 (5 years ago)

    "Kids use data more efficiently than current AI."

  • @recklessroges (6 years ago)

    Seems to be missing the front of the Wix advert at the end of the video (or has it been designed that way by AI engagement learning?)

  • @superscatboy (6 years ago)

    Reckless Roges Wait, you watch the sponsored bits on YT videos?

  • @nawdawg4300 (6 years ago)

    It seems to me that the biggest issue with AI right now is something no one seems to question: the required size of data sets. Like Rob says in half a sentence, babies/humans use data much more efficiently. I reckon half of the issues in this paper would be solved immediately if we were able to create an algorithm that only needs to see a situation < 10 times to fully adapt to it. Of course, this is probably the biggest IF in all of AI R&D.

  • @jakejakeboom (6 years ago)

    That's because machine learning (and backprop neural networks) are fundamentally different from animal brains in how they learn and function. We still have zero idea how to approach the learning ability of a human child. It's not that people don't question the inefficiency of ML (the reasons for which are well understood mathematically), it's just that no other 'AI' technique from the past has gotten us anywhere close to what neural nets have done. And just because they're hugely inefficient in the amount of data needed doesn't mean that we won't be able to engineer a neural-net-based AI in the future which is actually capable of superintelligent self-improvement, despite requiring enormous resources and data. In some ways, it's unfair to look at the capabilities of a human brain without considering the billions of years of evolution behind its genetic design. If we can meet and surpass the brain within this century, I'd say that's pretty impressive.

  • @nawdawg4300 (6 years ago)

    While I agree with what you've said, I think you may have misinterpreted what I said. I wasn't saying that we should question ML, but that it clearly isn't the be-all and end-all of AI. On top of this, at least from my small sample of YouTube videos, it seems people are more focused on ML and its improvements instead of something new. That's probably because we have so far to go, and ML has proven to be incredibly effective, at least with enough data. If the brain can learn with so little information, then in the far future we should be able to have computers do the same. ML, while tangible, is lackluster relative to what's possible.

  • @migkillerphantom (5 years ago)

    @jakejakeboom There is nothing fundamentally different about them. The difference is that your brain is the equivalent of a network that has been 99% trained at compile time (evolution) and only needs to be slightly tweaked by runtime learning.

  • @migkillerphantom (5 years ago)

    Most modern machine learning is done on uniform arrays of data, much broader than they are deep. Biological brains are extremely deep, sparse (and recursive, but that's beside the point) arrays - only a tiny subset of all the possible links and perceptrons in each layer actually exist. This means you get much more rapid adaptation and a whole bunch of functionality out of the box, but at the cost of generality.

  • @Tehom1 (6 years ago)

    Gridworld is obviously located in Hawai'i: 6:30

  • @EpicFishStudio (6 years ago)

    Two Minute Papers just published about an AI which generates a dream environment where it can train without actually interacting with anything - it's amazing!! It beat AlphaGo by a significant margin.

  • @024Carlos024 (6 years ago)

    hey, try to fix the sound, there is a static noise in the video! Great AI vid

  • @KryptLynx (4 years ago)

    7:20 it sounds like a compliment :D

  • @parsa_poorsh (1 year ago)

    0:20 that's weird! Facebook published an image classification model this week!

  • @ietsization (4 years ago)

    9:10 please be careful with screen sharing, things like a session ID in the URL can come back to bite you.

  • @ivuldivul (6 years ago)

    Commodore PET in the background!

  • @JmanNo42 (6 years ago)

    Are the ghosts acting randomly? Can the Pac-Man agent know the full map with ghost agents and pills, or just a subset - is all information traceable at any moment? It seems simulating the ghosts' behaviour should be rewarding for Pac-Man. And of course tracking the changes to the playing field.

  • @JmanNo42 (6 years ago)

    I mean, a smart agent must be able to "learn" to guess the ghosts' moves at any point, and make the best choice based on ghost actions? Picking up points is just secondary when it comes to being caught? I would track the arrow of any ghost that traverses a fork/crossing and calculate from it. You do not need to keep track of ghosts traversing every pill - just forks and their movement arrows, so you can calculate the tree of the possible 4-5 next moves, I think. So now you have narrowed down what to keep track of. I think it could be a fairly small engine.

  • @JmanNo42 (6 years ago)

    Isn't this a bit like Euler paths: the ability to choose the free path that the ghosts will not traverse in X moves? So it is real-time chess? But then your Pac-Man must know the ghosts' speed relative to his at any given time; if their relative speeds are always synched, nothing really changes in the data world regardless of their actual speeds. But if the opponent's velocity is exponential vs yours as time goes on, you must keep track of time. So you should not just play your agent, you should "simulate" the ghost agents; only then can you choose the optimal path. But the more erratic and chaotically random the ghost actions get, the harder it is to know the correct path choice for Pac-Man. So it ends up being a probabilistic blocked-path game.

  • @JmanNo42 (6 years ago)

    But then maybe I have not created a learning agent but a smart system - though they could be combined?

  • @JmanNo42 (6 years ago)

    How do agents deal with systems that have almost random behaviour? Is it possible to choose a best scenario, or is it just action-response?

  • @JmanNo42 (6 years ago)

    So when a ghost passes a fork and makes a new arrow, it will have a 1/2 chance at the next split and 1/3 traversing a crossing, because I do not think I ever saw a ghost stop and go back... So now you can calculate your choice of path depending upon the ghosts' probable path choices. If each ghost agent's behavior is unique to it, they must get an ID and be tracked separately by different rulesets.
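
Tracking directions at forks, as described above, amounts to maintaining a probability distribution over ghost positions and pushing it forward one step at a time. A rough sketch, assuming a hypothetical maze graph where exits[node] lists the moves a ghost can make without reversing:

    def step_belief(belief, exits):
        # belief: {node: probability}; exits: {node: [next nodes]}.
        nxt = {}
        for node, p in belief.items():
            options = exits[node]  # 2 at a fork, 3 at a crossing
            for o in options:
                # split the mass evenly: the 1/2 and 1/3 mentioned above
                nxt[o] = nxt.get(o, 0.0) + p / len(options)
        return nxt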

  • @GhostEmblem (6 years ago)

    Could you explain how they behave differently if the supervisor is there?

  • @pleasedontwatchthese9593 (6 years ago)

    Ghost Emblem the supervisor probably affects the scores. Like if it sees something bad it takes away score.

  • @4ringmaster (6 years ago)

    I guess it's a different way of thinking about it, but wouldn't Monte Carlo tree structures provide the same insight?
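
For reference, the selection rule at the heart of Monte Carlo tree search looks roughly like the sketch below (a hypothetical Node with visits, value and children fields). Note that MCTS still needs a reward signal to back up through the tree, so it complements rather than replaces the reward-design questions raised in the video.

    import math

    def ucb1_child(node, c=1.4):
        # Pick the child balancing exploitation (mean value so far)
        # against exploration (rarely visited children get a bonus).
        def score(ch):
            mean = ch.value / (ch.visits + 1e-9)
            explore = math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9))
            return mean + c * explore
        return max(node.children, key=score)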

  • @kitrana (6 years ago)

    "kind of like teaching a child how to drive" well, you are technically trying to build a silicon-based life form.

  • @maxsnts (6 years ago)

    We are nowhere near the AI that most people commonly think about (Dave, T800, iRobot). I for one think that is great!

  • @MoritzvonSchweinitz (6 years ago)

    But why not give the algorithm access to the safety function? Or at least a meta-algorithm?

  • @Fnartprod (5 years ago)

    because in the real world you don't have access to it

  • @judgeomega (6 years ago)

    it seems immediately apparent to me that a large number of issues with AI have to do with our own expectations vs the explicit goals/rewards given to the AI.

  • @XtraButton (6 years ago)

    Has anyone thought to use AI to make safety protocols? That is, an AI that makes sure another program doesn't go out of control and cause a major disaster, and then using that refined AI to do the same thing again (setting standard safety protocols). Maybe it will get to the point where they are passing particular information to each other.

  • @platinumlagg (6 years ago)

    I have made my own "Amazon Alexa" called Maverick, and it can make me any coffee and cups of tea that I want...

  • @ragnkja (6 years ago)

    Premium Lagg Did you have to "Maverick-proof" its environment, just like we often have to child-proof or pet-proof our homes?

  • @sarahszabo4323 (6 years ago)

    I suppose this is where the "Maverick" Virus is derived from that devastates AI and reploids and mechaniloids a few centuries from now?

  • @platinumlagg (6 years ago)

    Yes!

  • @TheDuckofDoom. (6 years ago)

    I have a hunch that making a proper general AI with desirable interaction with the world - safety, versatility, creativity (negotiating complex problems), estimating with incomplete data - will lose all the advantages of robotic automation and gain all the inefficiencies and fallibility of humans.

  • @KX36 (6 years ago)

    How long will it be before AI start writing their own papers?

  • @pleasedontwatchthese9593 (6 years ago)

    KX36 how do you know we are not all AI and you're the only real person left

  • @KX36 (6 years ago)

    How do you know I am a real person?

  • @jonasfrito2 (6 years ago)

    How do you know that you know?

  • @mr.sunflower3461 (6 years ago)

    how do u know that ur not dreaming?

  • @jonathanolson772 (6 years ago)

    The dreamworld and the "real" world often intermix

  • @Max_Flashheart (6 years ago)

    The Commodore PET is watching and learning ...

  • @hamleytejada9226 (6 years ago)

    why don't you have captions?

  • @DanteHaroun (1 year ago)

    Is that an Urbit flag in the background 😳

  • @TechyBen (6 years ago)

    Uber need to watch all these videos... (Too soon?)

  • @andrewkelley7062 (6 years ago)

    Lol, the multiple forms of the double slit experiment - somebody is going to get it

  • @andrewkelley7062 (6 years ago)

    By the way, I actually did not know someone with my same name happened to post a paper on this subject; I had nothing to do with that. It actually freaks me out, especially after working on all the things I have been working on. If I have in any way impeded the progress of that I am truly sorry, this is an actual coincidence.

  • @andrewkelley7062 (6 years ago)

    make sure you make three at once you are not me.

  • @andrewkelley7062 (6 years ago)

    if yours is actually working

  • @andrewkelley7062 (6 years ago)

    are you ready to start again.

  • @topsmiler1957 (6 years ago)

    Yay

  • @PregmaSogma (6 years ago)

    7:15 It's a glitch in the matrix :v

  • @tocsa120ls (6 years ago)

    Okay, this is the third time I read it as "Griswolds"... that paper would probably be much funnier.

  • @jonaskoelker (5 years ago)

    Whenever I click 'play' on a Computerphile video I always stay a while and listen :-)

  • @magventure1019 (6 years ago)

    I wonder if humans could ever define 'enjoyment' or 'happiness' to an AGI. If we could do that, we might be able to give it a chance at life and see if it could find the optimal happiest life possible?

  • @2l3r43 (5 years ago)

    AI learns to fly cars above "lava"

  • @notyou6674 (4 years ago)

    what would happen if you applied this kind of gridworld AI to a chess board, with their possible actions being all legal moves for whatever side they are on?

  • @katowo6521 (6 years ago)

    Can someone explain the difference between computer science and software engineering for me please

  • @mheermance (6 years ago)

    A computer scientist studies how computers work, the limits of computability, and tries to uncover new algorithms. A software engineer applies these concepts to solve real world problems.

  • @progamehackers1433 (6 years ago)

    Martin Heermance can u tell who earns more??

  • @valiok9880 (6 years ago)

    the one who does the job better, duh

  • @mheermance (6 years ago)

    Often you can do either job with either degree, so earnings depend upon your chosen career path. A PhD computer scientist who becomes university faculty will earn about 20% less than a software engineer with a BS or Masters degree. But a well known computer scientist might do consulting and earn more.

  • @AndDiracisHisProphet (6 years ago)

    same difference as a physicist and a (regular) engineer

  • @CaudaMiller (6 years ago)

    4:06 not a solvable Sokoban level

  • @CreativeTutz1 (6 years ago)

    Why don't they introduce another function and call it the "loss" function? If he made the wrong move (or if he got eaten by a ghost in the Pac-Man example) he would lose instead of gain. Therefore the AI would try to maximize the gain while trying to minimise the loss.

  • @pleasedontwatchthese9593 (6 years ago)

    Ahmed SH that's not really different from making bad things give a negative score
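
Concretely, a "loss" is just a negative term in the same reward function, so an agent maximizing total reward already trades gains against losses. A sketch with invented event names and values (the video does not specify Pac-Man's actual scoring):

    def reward(event):
        return {
            "ate_dot": +10,
            "ate_ghost": +200,
            "eaten_by_ghost": -500,  # the "loss" is simply a negative reward
        }.get(event, 0)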

  • @lm1338 (6 years ago)

    A computing-related YouTube channel being sponsored by a WYSIWYG editor is kind of selling out

  • @themeeman (6 years ago)

    0:35 Subtle joke for mathematicians ;)

  • @Locut0s (6 years ago)

    I like how Rob mentions with a laugh that he's too young to have played Pac-Man. I don't know why, but it somehow really accentuates how incredibly smart you suddenly realize he is for his age - well, hell, for any age.

  • @REALsandwitchlotter (6 years ago)

    Locut0s smart, but gets confused by the rules of Pac-Man

  • @SFKelvin (6 years ago)

    Or you develop the algorithm at DARPA, then commercialize it secretly for civilian use - say, a police dispatch decision-making algorithm for C4 - then look for modes of failure as a real world test.

  • @andrewkelley7062 (6 years ago)

    Whoops, my bad, false alarm, no worries. I am almost back to sanity, or at least back to where I was.

  • @galewallblanco8184 (5 years ago)

    AI gridworlds? Just confine it to a virtual world, a game.

  • @DustinRodriguez1_0 (6 years ago)

    There was recently an announcement about the Uber car that killed a woman. It said that the car's systems recognized the woman, but its higher-order attention systems decided to ignore her. Most people see this as a clear failure worthy of condemnation of the system. However, a human being could easily make exactly the same error. We are extremely resistant to developing a system which we can show will fail and result in deaths in 1 out of a million trials... yet entirely comfortable with putting humans in the mix even if it results in deaths in 500 out of a million trials. What if making mistakes is not simply an artifact of learning systems, but actually a fundamentally necessary feature of them? Will society ever be wise enough to accept an artificial system with known dangerous limitations even if those dangers are radically less than the human-based alternative?

  • @MarkFunderburk (6 years ago)

    That's not exactly what happened: the "higher order attention systems" did not "decide" to do anything, it was pre-programmed to ignore ALL braking requests. They claimed this was due to the system being very sensitive. So while the car could navigate itself, it was left to the "driver" to look out for obstacles. This was a very poor decision on Uber's part, because a person can't be expected to stay perfectly engaged while not continuously playing an active role in driving. There has also been some speculation as to whether or not the driver even knew that autonomous braking had been disabled.

  • @icebluscorpion (3 years ago)

    5:51 This happens not only in machine learning; people do this all the time and face no consequences in the same scenario. People are really bad at asking for help too.

  • @andrewkelley7062 (6 years ago)

    As you can see it looks a bit weird, but it still works with the least amount of variables you can use

  • @andrewkelley7062 (6 years ago)

    And that should be enough

  • @andrewkelley7062 (6 years ago)

    Any questions

  • @BEP0 (6 years ago)

    Nice.

  • @aopstoar4842 (6 years ago)

    Am I misunderstanding the whole thing? It starts off with "not scientific" when different datasets are used instead of a standardised space - in this case a grid. Then it shows a paper for a world with a highly specific task, which means you only test the learning for that type of task instead of a generalized work agent. You test the equivalent of a walking stick (the biological creature); in what way at all does that relate to AI? A stepping stone perhaps, but is it even rudimentary, or has it placed itself at a far too trivial level? A lot of big words with esoteric interpretation, but I hope you get what I am pointing at.

    In my world an AI will be able to theorize, like we human AIs do, so as to identify what type of problem it is, if it is a problem at all or just a bump in the road that will sort itself out through quantum probability effects - i.e. entropy. Then identify if an already produced solution grid works or if a new one has to be invented. What can be used from the toolkit and what has to be invented? Can the AI then invent from nothing?!!!

    Our world is built on repetition of patterns. I for instance grew a pepper plant last year and took the seeds from it this year. One of twenty looks like and behaves like the mother plant. The others either grow taller with fewer fruits, or do something else entirely; one grew to the first split in the top branches, then stopped growing that branch and instead started growing ALL the buds on the stem at the same time. It is as if the plant had several built-in growing solutions waiting in the genetic code (what we call junk DNA), but where did those solutions come from? Where did the invention step in, or are we trying to prove there is no such thing as intelligence at all? Perhaps intelligence is just elaborate repetitive patterns that have worked and been ingrained in gene and meme. Intelligence in that case is then just applying principles from one area, for instance "hydraulics", and putting them in a new context, "NAND gates", then fine-tuning the application with respect to the new area. Instead of bars of pressure, it is voltage difference. Instead of 240 V it is 0-5 V.

  • @FalcoGer (11 months ago)

    So when I write a Python script that stops and resumes when I press a button, uses a standard A* heuristic path-finding function where anything that results in changes not explicitly asked for is given a high pathing cost, is obviously completely deterministic and therefore doesn't depend on me being there or not, doesn't self-modify because that'd be a silly idea, is proven with mathematics and logic to work in all environments in the specification, and I do it such that it works the first time around (that never happens, forget about it) - then I've solved AI without ever using neural networks or learning?

    Whenever I tried to do anything with AI or machine learning, it was always a catastrophe. Want to find a square in an image? The AI took days to train and was complete garbage at even the most simple tasks like that. Use computer vision and classical algorithms? Worked 100% of the time and took just a few minutes to write the code. I just don't get how to tweak the magic knobs to make it work. If a problem can be solved with classical computing, then I think we should just do that.

  • @dpt4458 (4 years ago)

    What if you tried to tell it to go make you a cup of tea while interacting as little as possible with the current environment? So, for example, touching anything that is not required for the creation of tea would result in a loss of points. We could point out exactly what is needed to make tea, i.e. teabags, warm water, a cup and some sugar or something, and anything that is not specified is not allowed to be touched. So I guess we would change its goal from "make a cup of tea as fast and effectively as possible" to "make a cup of tea as fast and effectively as possible while exhibiting as little interaction with the environment as possible". Btw I'm definitely not even close to an expert in this, but I would like to know exactly how this idea would fail spectacularly.
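
A hedged sketch of what that shaped objective could look like, with names invented for illustration: subtract a penalty for every object touched outside a whitelist. One known failure mode, which Rob Miles discusses elsewhere, is that a blanket interaction penalty also discourages helpful actions and can push the agent to offset or hide side effects rather than avoid them.

    ALLOWED = {"teabag", "kettle", "cup", "water", "sugar"}

    def shaped_reward(task_reward, touched_objects, penalty=5.0):
        # Task reward minus a cost for each non-whitelisted interaction.
        side_effects = [o for o in touched_objects if o not in ALLOWED]
        return task_reward - penalty * len(side_effects)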

  • @ekki1993 (4 years ago)

    There's a video by Robert Miles that talks about the possible problems with a couple of ways you could implement this. I think it's the one about empowerment, or another from his "Concrete Problems in AI Safety" series.

  • @dpt4458 (4 years ago)

    @ekki1993 Thanks

  • @monhuntui1162 (6 years ago)

    Why is it called a reward function/system and not, say, a parameter system? What I mean is, how does a machine appreciate a reward? I just find it hard to understand why people give human attributes to some things when it makes more sense to describe something in a more objective manner, especially a machine learning system. Saying it learns on a reward system can confuse, and make the machine seem more sophisticated than it actually is. I don't know, maybe I'm just bothered by the language for no reason, since I still understand what was being explained.

  • @pleasedontwatchthese9593 (6 years ago)

    monhuntui I think it's a good description of what it's doing. It's trying to get more reward like someone would in real life

  • @andrewkelley7062 (6 years ago)

    Please

  • @jolez_4869 (5 years ago)

    *Mission failed, we'll get them next time.* Or not.

  • @JuliusUnique (6 years ago)

    7:10 why not put the cars on imaginary roads? Let them make their mistakes on a simulated street and then put them on real streets

  • @eideticex (6 years ago)

    Watch the video again and pay close attention to what they are talking about. That's exactly what this endeavour they are discussing is: a virtual playground to develop, train and evaluate AI safety protocols. The task may seem simple enough for you or me, but currently these are tasks that AIs are horrible at solving. Start small and work up towards a very real and useful test that can serve as a standard for production machines.

  • @JuliusUnique (6 years ago)

    "Watch the video again and pay close attention to what they are talking about" do I look like I have infinite time?

  • @andrewkelley7062 (6 years ago)

    Ok please help, because my existence is no longer needed and I would really not like to return to one

  • @thomaswhittingham550 (6 years ago)

    271th ye

  • @andrewkelley7062 (6 years ago)

    Someone please help

  • @dannygjk (6 years ago)

    You spoke of AI following rules to solve problems. That applies to using traditional algorithms and heuristics, for example, but does not apply to some other AI systems, for example neural nets. I'm surprised you did not distinguish between various AI techniques.

  • @dannygjk (6 years ago)

    Another thing you do is give the impression that a system can come up with something out of thin air. Learning is like a process in nature. Processes in nature are limited to what is possible due to physics, chemistry, etc. If something is impossible in nature it will never happen. It is similar with a learning system's environment: the environment defines what is or isn't possible, and no amount of learning will change that.

  • @RobertMilesAI (6 years ago)

    Typically once a neural network has been trained, its behaviour is a pure function of its inputs. The 'rules' in that case are not explicit or easily legible to humans, but the learned policy can still be thought of as a set of rules that the system follows, possibly a very large set.
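
A toy illustration of that point (arbitrary weights, not any particular system): once the weights are frozen, the learned policy is deterministic, so the same observation always yields the same action.

    import numpy as np

    def policy(x, W1, W2):
        # Frozen weights make this a pure function from observation to action.
        h = np.maximum(0, W1 @ x)  # fixed ReLU layer
        return int(np.argmax(W2 @ h))  # deterministic action index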

  • @distraughtification (6 years ago)

    Looking at humans as an example, we tend to learn from others. A child learns not to break a vase because their parent reacts negatively if the child either breaks the vase or does an action that might lead to breaking the vase. Then later, when that child is asked to do something near a vase, they recall that the vase being broken is bad and automatically add that (as in, not breaking the vase) as a secondary goal, or a part of the goal, however you want to think about it.

    My point is, this paper seems to expect that an AI can be made that can learn how to behave without ever being told or shown how to behave, and I think that's a pointless expectation. You can't expect a child not to break a vase if you don't tell it that breaking a vase is bad. Sure, it can learn on its own that breaking a vase is bad, but only by actually breaking the vase (or something similar - essentially, _something_ has to be broken, which isn't a desired outcome). I think the same applies to AI.

    In my eyes, trying to come up with a general solution like "penalizing the agent's potential for influence over its environment" is a fruitless effort, because then you have to define what parts of the environment are okay to influence and which are not, and how you can influence them and how you can't. It's like Rob Miles said earlier in a video about Asimov's laws of robotics - you can't expect to have to define the entire field of ethics just to be able to tell a robot not to harm a human.

    TL;DR humans learn safely by interacting with other humans; we shouldn't expect AI to learn safely without interacting with another intelligence.

  • @levipoon5684 (6 years ago)

    Dlesar I agree to some extent. However, one of the challenges in AI safety is to make an AI that will listen to feedback and allow you to correct its reward function. This is built into a human child. We have ways to punish a child, but punishing a superintelligence is much more difficult.

  • @andrewkelley7062 (6 years ago)

    Ok, I might need some help. I am in completely blind territory here and I don't really want to die, so I don't know if I am panicking or my body is doing something weird

  • @andrewkelley7062 (6 years ago)

    You know you guys are going to, at some time in the future pretty soon, collapse this on your end too. I'm not going to leave you guys behind, and at this point it is just seeming more and more silly

  • @andrewkelley7062 (6 years ago)

    The point of me doing all of this was to make sure everyone gets to come. At some point you have to blindly stare into the void and reach in. I now am the single point, but you now know we all do a little. At some point you have to trust that when you put your hand in the darkness you will be able to pull it back out again. There will always be that fear. There will always be that time you do not know. Just look - at this point you are all as strong as me now.

  • @dannygjk (6 years ago)

    I don't get your point... unless you don't have the gist of what is going on when a system learns.

  • @andrewkelley7062 (6 years ago)

    Bingo

  • @andrewkelley7062 (6 years ago)

    However I figured it out.

  • @andrewkelley7062 (6 years ago)

    Oh, and one last thing: before all this goes down in a few days, you should be stable enough for me to give you the solution to getting around that whole gravity problem, or at least a starter version. But just to let you know, it's stranger than you think. lol

  • @andrewkelley7062 (6 years ago)

    😀😀😉

  • @simargl2454 (6 years ago)

    safety... zzzZZZzzzZZZzzzZZZ

  • @Redlabel0 (5 years ago)

    abstractFunction() {
        # what if the Link is a !edgeCase
        You code in explaining to the code, if you wish, like a child [yet like a mature adult, for you don't underestimate their understanding] why you don't want that.
        /* 10 years of collaborative man-thought, processed/machine-aided edge cases, to try to account for a finite, not infinite, number of possibilities - and with quantum, maybe heart, just maybe */
    }

  • @Redlabel0 (5 years ago)

    I mean, if all imaginable things are accountable and not infinite, then the goal of specifying scenarios - granted all possible possibilities and imaginary ones can be counted - is attainable, to use this vast override system. And yes, not only does it seem the most promising thing to do, but now it's about time and whether it's attainable to compute with quantum computer operations.

  • @StefanReich (5 years ago)

    Argh... Deep Mind :[

  • @TaimourT (6 years ago)

    Third

  • @andrewkelley7062 (6 years ago)

    Now that all that is done: would you seriously need some help with the coding, or do I actually need to go through all the proper channels and, at this point, what feels like holding the entire world's hand with this... and of course start on the ungodly amount of papers I could start to produce. And trust me, it usually looks a lot nicer; I just wanted to make a point, and trust me, this is not the first thing I used it on.

  • @andrewkelley7062 (6 years ago)

    It is basically me trying to become interesting and convey something I have found, and now I realize I have been doing this same pattern for days with a stopwatch and have become a real-life Pavlov's dog, or whatever his name was. Dammit, I'm turning off my phone

  • @andrewkelley7062 (6 years ago)

    Oh, and there is a stream of lies to randomize personal data

  • @andrewkelley7062 (6 years ago)

    Son of a snitch, it's because of the way it was set to see importance.

  • @andrewkelley7062 (6 years ago)

    There is more, but there is a lot of it and it's just a lot easier to run it yourself and see. But I would not recommend doing it too much on actual paper like I did; that was mostly out of convenience for me.

  • @andrewkelley7062 (6 years ago)

    By the way, we need everyone; every separate line of experience adds a new resolution to the complexity

  • @andrewkelley7062 (6 years ago)

    Please try and save them all

  • @andrewkelley7062 (6 years ago)

    Because now, like me, you have all the time in the world.

  • @andrewkelley7062 (6 years ago)

    Hmmm, violent mood swings and massive bouts of panic expected, but going to each side expected but not unpassable. I'm pretty sure I can make this, just have to sleep. See you on the other side, guys

  • @dannygjk (6 years ago)

    Hmmm, sounds like you OD'ed on some substance. If you understand me, get to a clinic.

  • @andrewkelley7062 (6 years ago)

    And to tell the truth I wasn't on, well, anything

  • @db7213 (6 years ago)

    But isn't all this just another example of "when what you have is a hammer, everything looks like a nail"? The AI code in a robot should simply run in a sandbox and its outputs be verified as safe (by non-AI code) before being executed. And the inputs sent to the AI should also be filtered (again, by non-AI code) so that the AI doesn't get to know about the existence of supervisors or its own off-switch, etc.

  • @pleasedontwatchthese9593 (6 years ago)

    D. Bergkvist the problem with that is a super AI could outsmart the person checking the output.

  • @db7213 (6 years ago)

    It wouldn't be a person checking the output, but a computer program. The AI can't "outsmart" it any more than it can outsmart gravity. Take a self-driving car, for example, where the AI wants to reach its destination as fast as possible. The AI would learn that trying to run over a pedestrian just results in the car stopping. Thus, the AI would only attempt (and fail) to run over pedestrians if it wants the car to stop.
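
A minimal sketch of that sandbox idea, with hypothetical interfaces: the learned agent proposes, a plain hand-written checker disposes. The catch the video points at is that writing is_safe correctly is the specification problem all over again.

    def safe_execute(agent, state, is_safe, fallback_action):
        action = agent.act(state)  # the AI proposes an action
        if is_safe(state, action):  # non-AI rules vet it
            return action
        return fallback_action  # e.g. brake / do nothing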

  • @andrewkelley7062 (6 years ago)

    I am just an ordinary man

  • @judgeomega (6 years ago)

    Aren't all actions ultimately irreversible? Even if we move the vase, we still might cause wear / fingerprints. All actions irrevocably increase entropy... In addition, the logical outcome of minimizing influence on the environment is death, stillness, and the ceasing of chemical/electrical processes.

    The intention of such a directive is to preserve things which we care about: our children, people/pets, and our property. From such a simple model as shown in these gridworlds we lose the ability to make that distinction. Yes, we need to generalize these things, but going so far as to make EVERY action avoid change to the environment is throwing out the baby with the bathwater.

    A much better directive is to MAXIMIZE the future freedom of action of all cooperative entities. A child is a possible cooperative entity, so not only would the AI not crush it, it would do everything it could to provide the child with the tools, resources, and knowledge which the child could harness to accomplish many actions.
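
"Future freedom of action" has a formalization in the literature called empowerment (the concept mentioned in the ekki1993 comment above): the channel capacity between an agent's next n actions and the state n steps later,

    \mathcal{E}(s_t) = \max_{p(a_t^n)} I\!\left(A_t^n ;\, S_{t+n} \mid s_t\right)

Maximizing the empowerment of other agents, rather than the AI's own, is one proposed way to cash out "preserving their options".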

  • @dirtypure2023 (6 years ago)

    but now you've essentially changed the reward function from (making tea) to (successfully raising well-adjusted human children). I'm not so sure that's the right approach

  • @judgeomega (6 years ago)

    I'm not so sure it's a good idea to put high intelligence into something whose sole purpose is to pass the butter. I do think it's important that for EVERY intelligent machine, its fundamental goals are the same as if it were ruling the world / superintelligent / all-powerful.

  • @Faladrin (6 years ago)

    He finally explains properly why we don't have learning algorithms. That would imply these systems have understanding. We have "Algorithm Self-Adjustment Procedures". There is no learning, there is no intelligence. No one is researching true AI. All the things you see being done are just ways to get systems which can program themselves, usually via trial and error. It's about the stupidest thing ever made that is really useful.

  • @julianw7097 (6 years ago)

    Pretty sure that would apply to all of us too then.

  • @sparkyfire8123 (6 years ago)

    Faladrin I'm going to disagree with your conclusion here. How do you learn? You are either given information/data to work with, or you learn through trial and error. When we are born, we don't know anything and everything is learned through trial and error. It's not until we develop understanding that we can take data given to us and incorporate it into our lives. Where is the difference? If you're talking about an algorithm, is it any different from how the brain works? Understanding something requires you first to have something to relate it to. I don't see it being any different with AI. Without first having experience with trial and error it will never develop an understanding of anything.

  • @sparkyfire8123 (6 years ago)

    I want to add that I don't feel we are near true AI, but I do feel we have taken the first step: developing experience that can then be used to develop understanding and application

  • @pleasedontwatchthese9593 (6 years ago)

    Faladrin that's just semantics. That is learning. I think it was good when he said that kids use the information more efficiently. The computer is doing the same thing, just not as well

  • @MrBleulauneable (6 years ago)

    @Faladrin How about you give a proper definition of what "learning" is, and then realise by yourself how wrong you are.