AI Draws New Pokemon using Simple Math
WATCH PART 1: • AI Draws New Pokemon u...
Pokemon Dataset: www.kaggle.com/brilja/pokemon...
jabrils.com/pokeblend
REMEMBER TO ALWAYS FEED YOUR CURIOSITY
Comments: 540
I feel like this is just two pokemon pics laid on top of each other with one having very low opacity.
@spinnis
4 years ago
That's the problem he hinted at at the end. That's essentially what the AI is learning to do. It's doing well, but it's doing well on the wrong task.
@khomikoow5994
4 years ago
@@spinnis I noticed that the first time he showed them. Maybe people who know photoshop can see it pretty easily.
@ITR
4 years ago
Overfitting, lol. Guess he needs to introduce novelty somehow.
@masternobody1896
4 years ago
Finally! I was waiting.
@cameron4814
4 years ago
I'm no expert, but I think you'd probably get better results with something like StyleGAN. I think these results, which look like an overlay/opacity trick, have something to do with the fact that the autoencoder is trying to learn the images based on a pixel-for-pixel loss function. If you score the output with something slightly more abstract, like the feature vector from VGG, instead of scoring it pixelwise, then maybe it gets better results? I'd recommend this paper: "Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?" Also check out the "stylegan-encoder" repos on GitHub by "maxisawesome" and/or "puzer". I'd speculate that any GAN architecture has some higher-level abstraction built in (higher than the autoencoder), because the discriminator is learning the abstract features of the images and is teaching the generator based on that. Great work Jabrils!!!
Great job! I've got some advice for improving this:
- Increase your dataset by more than 100x with small random translations and horizontal flips. Do it each batch or epoch if you don't have the memory to pre-compute it.
- Once you start introducing more data, don't use the 4000+ dimensional vector, use a regular autoencoder.
- Blending should be done by linear interpolation of the feature vector, the bottle-neck of the autoencoder.
- Use a bigger model but add dropout to compensate for that.
You should get drastically improved results. Maybe I'll give it a try real quick to see ;)
@digital_down
4 years ago
CodeParade ....and the master comes to give advice 🙌🙌🙌🙌🙌
@Jabrils
4 years ago
Thanks for dropping in, CP! That is an interesting solution. For part 3 I decided to go another route, which does include ditching the encoded vector approach. If I can find some time towards the end of this, I'd love to give your approach a try. Or if you want to give it a try, as you've hinted, I'd love to take a gander at your results 😍 - Jabril
@edeneden97
4 years ago
I agree that the middle vector should be a feature vector and not a pokemon vector (so bring back the first part of the NN). But adding random translations will just train the network to use some of the neurons in the feature vector for representing those translations, so it's probably not needed. Also, if all the pokemon are looking to the same side, then horizontal flipping would just use another neuron for no reason.
@jason_shepherd
4 years ago
I wonder if a one-color background might help with the mess.
@wermaus
4 years ago
I was thinking the same thing, but I didn't know if it was really the right direction or not. I'm happy to find out that I know a little more than I thought I did. Mostly the first 2 points were where my head was at.
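A minimal sketch of the augmentation CodeParade suggests at the top of this thread: a random horizontal flip plus a small random translation, generated fresh for each batch. It assumes a sprite is a plain list of pixel rows; `hflip`, `translate`, and `augment` are hypothetical helper names, not anything from the video's code. Whether flips actually help on this dataset is disputed in the replies, so treat the flip as optional.

```python
import random

def hflip(img):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in img]

def translate(img, dx, dy, fill=0):
    """Shift an image by (dx, dy) pixels, padding the exposed area with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out

def augment(img, rng, max_shift=2):
    """Return one randomly flipped and randomly shifted copy of `img`."""
    if rng.random() < 0.5:
        img = hflip(img)
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return translate(img, dx, dy)

rng = random.Random(0)          # fixed seed so the sketch is repeatable
sprite = [[0, 1, 2],
          [3, 4, 5],
          [6, 7, 8]]            # toy 3x3 stand-in for a real sprite

# Build a fresh batch of variants instead of storing them all up front.
batch = [augment(sprite, rng, max_shift=1) for _ in range(4)]
```

Calling `augment` inside the training loop, rather than pre-computing every variant, is the "do it each batch or epoch" option from the comment.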
"Aloha versions .. whatever that means." Ahahahahah
@kaicordes4074
3 years ago
I know, right?
@Pancake3000
3 years ago
Lol yea
@archerbrown8733
3 years ago
Imagine not knowing what alola is lmao uncultured
@elliot_rat
3 years ago
@@archerbrown8733 imagine not knowing what a joke is
@archerbrown8733
3 years ago
@@elliot_rat bud i was adding to the joke
Eyy Jabrils, love the video and... Please don't let us wait another month for the next episode 😅
@wondercoll3719
4 years ago
@T0M yup, it went by real fast
@arathorne8448
4 years ago
Michał, what are you doing here?
@Pigcogames
4 years ago
I feel like we all need a periodic fix of Jabrils content
@nathansnively1537
3 years ago
It's been a year now
2:32 did u really just use google to search for google..
@khomikoow5994
4 years ago
i do the same lol
@JamieAubrey
4 years ago
We've all done it
@darkinferno4687
4 years ago
Reminds me of the _floor is made out of floor_ meme
@nickrameau938
4 years ago
@@JamieAubrey No! I've never done it.
@JamieAubrey
4 years ago
@@nickrameau938 One day, you will accidentally do it
I'd honestly try with more variation. First of all, having reflections in there seems like a good idea, though have reflections for *all* of them. Second, what you *could* do is take out the background color (make it transparent) and replace it either with a random (at-training-time!) flat color or even with random noise, artificially inflating the training set by a HUGE factor while keeping the relevant features consistent. You could even use that to make it do transparent sprites: anything that's background ought to be put in a transparency channel. Third, you could use the multiple facial expressions, but make sure every Pokémon has an equal chance of coming up, basically by first sampling from Pokémon number (or something like that), and then from expression (if applicable for the Pokémon).
@yarood
4 years ago
I think shifting the Pokémon inside the frame could also help (not sure if there is a lot to gain once the background is transparent), but this again could add a lot of "consistent variability" to your data.
@hecko-yes
4 years ago
maybe also add random hue shifts since poke men can be pretty much any color (like with codeparade's fursona generator)
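A rough sketch of the two ideas in this thread: swapping a transparent background for a random flat color at training time, and randomly shifting the sprite's hue. It assumes sprites are stored as rows of `(r, g, b, a)` tuples with alpha 0 marking background; the function names are made up for illustration. Only the standard-library `colorsys` module is used.

```python
import colorsys
import random

def shift_hue(rgba_img, amount):
    """Rotate the hue of every pixel by `amount` (a fraction of a full turn)."""
    out = []
    for row in rgba_img:
        new_row = []
        for r, g, b, a in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            r2, g2, b2 = colorsys.hsv_to_rgb((h + amount) % 1.0, s, v)
            new_row.append((round(r2 * 255), round(g2 * 255), round(b2 * 255), a))
        out.append(new_row)
    return out

def fill_background(rgba_img, color):
    """Replace fully transparent pixels with a flat RGB color and drop alpha."""
    return [[color if a == 0 else (r, g, b) for (r, g, b, a) in row]
            for row in rgba_img]

def random_variant(rgba_img, rng):
    """One training-time variant: hue-shifted sprite on a random flat background."""
    background = (rng.randrange(256), rng.randrange(256), rng.randrange(256))
    shifted = shift_hue(rgba_img, rng.random())
    return fill_background(shifted, background)

# Toy 1x2 "sprite": one opaque red pixel and one transparent background pixel.
sprite = [[(255, 0, 0, 255), (0, 0, 0, 0)]]
variant = random_variant(sprite, random.Random(0))
```

Because the background is re-rolled on every call, the only thing that stays constant across variants is the sprite's shape, which is the "consistent variability" the replies are after.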
These pokemon look like half opacity overlaps w hella noise lmao
My dude, I think you've been hax.
This man said aloha versions.. 🤣🤣🤣
Flipping images is a great way to increase your dataset size. Andrew Ng discusses this in his machine learning course. However, exact duplicates are bad. Make sure to remove those. Thanks for a great video. I love your content 😁
@nikolaselic4529
4 years ago
Flipping images is alright in some use cases, but this ain't one.
@Jabrils
4 years ago
Nikola is correct - Jabrils
@mattgoodman2687
4 years ago
Nikola Selic I’m curious. Why is that?
The fact that at the Horsea+Charizard combo it made a Horsea with angry eyes is pretty dope.
It looks like those images are just overlapped with transparency. Is that normal?
@adriandeveraaa
4 years ago
For the blending, yes. As for the learning algorithm that outputs them, no. It's not as clean because it hasn't learned how to draw various pokemon from different perspectives. It would look more like random pixels if the perspectives were random (e.g. full-body pokemon, portrait, not a front-facing profile image, etc.).
@misterkid
4 years ago
@Making Tech Friendly I think you spent more time commenting this exact comment in multiple places than any actual contribution
3:51 Jabrils: “there was only about 150” PNG 151-251: “am I a joke to you ?”
There is a website where you can get Pokémon fusions so you can have a potential data set of MUCH more
@trollsometimes9789
4 years ago
Fusions aren't like this tho
@danzackblack5829
4 years ago
plus the fusions actually look good not this tho
@jdavis.
4 years ago
I think what @Lokiop is suggesting is that you could feed in all the original sprites plus all the fusions and you'd have a HUGE library of source images
To answer that question at the end: It's like a 50% transparency of the two. But here's why I think that's important, and why I think you're on to something: It looks like the neural network has, perhaps, memorized a compressed version of each pokémon, instead of memorizing a collection of features in an arrangement that make up a pokémon. But how to fix this?… An alteration to the architecture? An alteration to the dataset? Both? How to do either?… What about adding more layers to the neural network, but making the layers fairly small, with a relatively smooth gradient of change in the number of nodes per layer? I'm thinking a neural network somewhere in the ballpark of this:
[1st layer node number = (width/2)*(height/2)],
[2nd layer node number = (width of 1st node layer/2)*(height of 1st node layer/2)],
[3rd layer node number = (width of 2nd node layer/4)*(height of 2nd node layer/4)],
[4th layer node number = (width of 3rd node layer/2)*(height of 3rd node layer/2)],
[same number of nodes as previous layer],
[same number of nodes as previous layer],
[same number of nodes as previous layer]
Something like that. Wouldn't it be nice if I could communicate this idea a bit more elegantly? _sigh_ The idea is to choose an architecture that makes it very difficult for memorization, but easy for abstraction. Hopefully while maintaining a visually appealing result… But no promises!
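One way to read the layer recipe above is as a short function that shrinks a virtual width and height by the listed divisors and then repeats the last size. This is only an interpretation of the comment; the 64x64 input size, the integer division, and the name `layer_sizes` are assumptions.

```python
def layer_sizes(width, height, divisors=(2, 2, 4, 2), repeats=3):
    """Node counts per layer: divide width and height by each divisor in turn,
    then repeat the final size `repeats` more times."""
    sizes = []
    w, h = width, height
    for d in divisors:
        w, h = max(1, w // d), max(1, h // d)
        sizes.append(w * h)
    sizes.extend([sizes[-1]] * repeats)
    return sizes

# Assumed 64x64 sprites: layers of 32*32, 16*16, 4*4, 2*2 nodes,
# followed by three more layers of the same size as the last.
print(layer_sizes(64, 64))
```

Read this way, the schedule narrows very quickly for small sprites, so the divisors would likely need tuning to the real image size.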
We are seeing sqrt(2). Or more specifically the vector is only sqrt(2)/2 ≈ 0.7 long
@Jabrils
4 years ago
Lol yes. You've solved it lol
@NinjarioPicmin
4 years ago
well yeah but what does that mean? how will the vector we get from [0,sqrt(2)/2,sqrt(2)/2] look compared to the [0,0.5,0.5]
@Henrix1998
4 years ago
@@NinjarioPicmin it will follow a circle which goes through [0,1] and [1,0]
@NinjarioPicmin
4 years ago
@@Henrix1998 yeah, sorry for not being clear. I was wondering how that would affect the pokemon we were getting. I can't really imagine how that would have a positive influence on the images we got.
@Jabrils
4 years ago
@@NinjarioPicmin Part 3 coming soon :D - Jabril
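The arithmetic behind this thread can be checked in a few lines. The sketch below compares the length of the linear blend `[0.5, 0.5]` with weights taken from the quarter circle through `[1, 0]` and `[0, 1]`. `blend_weights` is a hypothetical helper, and nothing here shows whether unit-length weights actually give better images; that question is left open in the replies.

```python
import math

def blend_weights(t):
    """Blend weights for two Pokémon, with t in [0, 1].

    t = 0 gives (1, 0) and t = 1 gives (0, 1). Every value in between lies
    on the unit circle, so the weight vector always has length 1.
    """
    angle = t * math.pi / 2
    return math.cos(angle), math.sin(angle)

def length(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))

linear = (0.5, 0.5)
circular = blend_weights(0.5)    # both weights equal sqrt(2)/2

print(length(linear))            # about 0.707, i.e. sqrt(2)/2
print(length(circular))          # about 1.0
```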
I may be wrong, but isn't flipping the image one of the recommended ways to expand a dataset?
"Aloha" pokemon forms? Are you trying to trigger pokemon fans? IT'S ALOLA FORMS
@amesstarline5482
4 years ago
This is Pokeblend.
@SoullessCD
4 years ago
Actually it's alolan
@stephen8602
4 years ago
Still agree with the main point though. The sheer fact that he doesn't know what an Alolan form is makes this video seem like it's clickbait for the Pokemon community, since he doesn't even play the series. Casuals fishing for views is more like it.
@Y0y0Jester
4 years ago
WTF is that shit
@eliserss9179
4 years ago
@@stephen8602 No? He just wanted to mix some pokemon together. Sure it's a good idea to use pokemon since it'll give better views than something like digimon but.. Pokemon is far easier than other options, and it's nothing evil to try to get more views. That's literally what youtube is about. Getting views.
Really enjoy the videos you make. Great style and fun projects. Thanks for the content and here is to many more years of fun!
Love the content. Even if projects don't work out, I'll still watch!
I love knowing that you pose with your computer just so you can voice over it - makes it more enjoyable to watch
I've only seen like two of your videos, and you are a motivation and inspiration to me. I thank you. New subscriber, brotha.
I love the storytelling on your videos
Just wanted to say that you were the tipping point in my inspiration to become an AI architect. When I saw the AI drone video, it clicked. I now hold digital badges with IBM, have clients, and have more job offers than I can deal with. Thank you for being you. May God Bless You.
Man, I love the "Aloha" Raichu
idk how I found your channel by watching database videos, but I'm glad I did. You're funny af
I have another way of fusing Pokémon : just hire me, I draw you any Pokémon combination :>
Try training your neural network with contours of the characters as well, and differentiating between their colors/textures and the outlines when generating the new Pôkōmöns(TM). That way it won't seem like they're standing in front of each other; instead it merges them together.
Can’t wait to see how your SQUIRTMANDER turns out in 2 more parts. Would also love to see a setup video on your PC build that allows you to work with huge datasets
There are many augmentation techniques, like horizontal flipping, vertical flipping, zoom, shearing, rotation, height shift, width shift, etc., available in the TensorFlow package to increase the size of the dataset. You can try using them, Jabrils.
Complete props to Jabrils on this! For those comments about how long this video took to get out...I'm sure this took a TON of effort! Recognize what it took to make this video. He went through building the network, sourcing the data, MANUALLY CLEANING THOUSANDS OF IMAGES (how is this man still sane?), doing all this in an organized and analytical way, oh, and building out and deploying a website using the model. Let's not forget filming and editing the video itself. Jabrils, you're insane! Keep up the great effort!
16:01 Is the network just averaging images when the dataset is this small? I wonder what would happen if you did something similar to early deepfakes and trained it to turn distorted images into the originals. In any case, I’m loving this series!
@Jabrils
4 years ago
😉😉😉
Such an awesome and well thought out process method.
That Charmander-Porygon blend looks like it has seen some shit lol
The lesson learned at the 12 minute mark is oddly very fitting to me considering how small the files for the original Pokemon Red and Blue are, and yet how iconic they remain to this day! Loved the video!
*"If ur reading this, u've been hax!"*
@osmedia7239
4 years ago
*hax
@noahsduck4109
4 years ago
It auto corrected and I didn't notice lmao
@osmedia7239
4 years ago
@@noahsduck4109 😂
Mystery Dungeon: Explorers of Time is my OG game! All of the expression images are of all the Pokemon in the same pose. Works wonders. I know more about Pokemon than about this whole AI redraw business, but this seems like a cool experiment.
Data Cleanup is 90% of any machine learning project.
The idea of the autoencoder is to extract the important features of the input and then create the output from these features. For example, when you input Pikachu, the important features are things like the color, shape... Then from those features it tries to make a pokemon that has those features. So to combine two pokemon you need to combine their features.
1. Feed both pokemon through the encoder to get their features.
2. Combine their features.
3. Use the combined features as input to the decoder to get a combined pokemon as output.
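The three steps above can be sketched with toy stand-ins for the trained networks. Here `encode` and `decode` are deliberately trivial placeholders (they summarize and repaint the two halves of a 4-pixel "image"), assumed only so that the example runs; a real encoder and decoder would be learned.

```python
def encode(pixels):
    """Toy stand-in for a trained encoder: mean brightness of each half."""
    half = len(pixels) // 2
    left, right = pixels[:half], pixels[half:]
    return [sum(left) / len(left), sum(right) / len(right)]

def decode(features, size):
    """Toy stand-in for a trained decoder: paint each half with its feature."""
    half = size // 2
    return [features[0]] * half + [features[1]] * (size - half)

def lerp(a, b, t):
    """Linear interpolation between two feature vectors."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def blend(img_a, img_b, t=0.5):
    z_a = encode(img_a)                 # step 1: features of both inputs
    z_b = encode(img_b)
    z = lerp(z_a, z_b, t)               # step 2: combine the features
    return decode(z, len(img_a))        # step 3: decode the combined features

img_a = [1.0, 1.0, 0.0, 0.0]            # made-up 4-pixel "sprites"
img_b = [0.0, 0.0, 1.0, 0.5]
blended = blend(img_a, img_b, t=0.5)
print(blended)
```

The key difference from blending pixels directly is that the mixing happens between `encode` and `decode`, so the quality of the result depends on what the learned features represent.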
13:13 *Nigel Thornberry*
Here's an idea. For a larger data set, take the 3d models from one of the 3d pokemon games and have a program automatically create images by taking many pictures of each 3d model at different angles. I also think that you need to have something more complex than just having 0.6 and 0.4 for the blending. You might need 2 different programs. One that figures out the shape, and one the color. So then you tell it to use one pokemon's shape and the other's color. This is just an example. You might need to get more complicated.
Love your videos
Excellent Project! More importantly, rocking the Retro Rock & Rye FTW!!!!
About training data: have you tried converting the images to grayscale, so the NN works with the right shapes (like for vector graphics, maybe), and only afterwards coloring them from a palette? Also, about cleaning duplicates from the data: try using software for that, something like a duplicate image finder. It works faster than doing it only by hand (it still needs some assistance).
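For the duplicate-cleaning part, a minimal sketch using only the standard library: hash each image's raw pixel bytes and keep the first image per hash. It assumes images are already decoded into flat sequences of 0-255 values, and the file names are invented. It only catches exact duplicates; resized or recompressed copies would still need a manual pass or a perceptual hash, which is not shown.

```python
import hashlib

def image_digest(pixels):
    """SHA-256 of an image given as a flat sequence of 0-255 values."""
    return hashlib.sha256(bytes(pixels)).hexdigest()

def drop_exact_duplicates(images):
    """Split `images` (name -> pixels) into kept images and duplicate pairs."""
    first_seen = {}      # digest -> name of the first image with that digest
    kept = {}
    duplicates = []      # (duplicate name, name of the image it copies)
    for name, pixels in images.items():
        digest = image_digest(pixels)
        if digest in first_seen:
            duplicates.append((name, first_seen[digest]))
        else:
            first_seen[digest] = name
            kept[name] = pixels
    return kept, duplicates

# Hypothetical dataset: "pikachu_copy.png" is byte-for-byte identical to "pikachu.png".
images = {
    "pikachu.png": [255, 220, 0, 255, 220, 0],
    "pikachu_copy.png": [255, 220, 0, 255, 220, 0],
    "charmander.png": [240, 128, 48, 240, 128, 48],
}
kept, duplicates = drop_exact_duplicates(images)
print(duplicates)
```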
I think what this lacks is a separation between silhouette and texture. If you could reliably detect the boundaries of the pokemon, you could reliably morph between the two shapes, and also use those silhouettes as a mask for pattern detection, dealing with contamination from the background colors.
Hello jabrils I want to learn to program but I don’t know what laptop to get. I am kind of on a budget and was wondering what you would recommend for a laptop I could get for programming. Btw I love your videos and you have taught me a lot.
I honestly can't believe it took you until the end of the video to notice that it's just overlaying the 2 images with a lowered opacity. I've been screaming at my screen this whole time waiting for you to acknowledge this. I was thinking it was looking for outlines and trying to transition between the lines of 2 images.
You can make a separate program that cleans the blended images (makes more bold outlines and colors) and makes them seem more pokemon-ish
The paper "Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer" talks exactly about what you're trying to achieve. Maybe it could be helpful.
@Jabrils
4 years ago
Thanks for sharing this, will give it a look! - Jabril
Great work, Jabril! But to me, the result looks just like changing the opacity of two images merged together. And you interpret the result as good when the two initial pokemon have a common shape, so merging them yields a similar shape. But actually, the autoencoder doesn't create new unique shapes. What if you try to train the autoencoder to create shapes by feeding it only the contour of the pokemon, or greyscale images? And afterwards, for example, you add color with another autoencoder? Waiting for the next part ;)
If you're trying to draw a pokemon in the style of another, you could maybe check if neural style transfer could be applicable/give you the results you're looking for. Instead of a vector, you combine the "style" of one with the "structure" of the other.
Would you be able to write code that rates the "pokemon-ness" of the final products and then sorts them? And then discards the ones that don't look like a pokemon (based on machine learning having seen many pokemon) The problem I think with your method is that instead of mixing two or three pokemon it mixes too many with every attempt at a new image...
What if you removed the background for each image and ran the product through a denoiser? Or did stuff like eyes and body separately and put them together?
Jabrils's machine learning autoencoder : "Would you look at that, what a mess" Google's machine learning autoencoder : Pokemon game : New Pokemon found! element : Google
whoa, all this hard work just to learn it was the opacity factor that is changing the picture. tho i guess u learn a lot from it
Can you put the link in the description for the album auto encoder? It looked really cool! Btw great video as always
Yo i'm a big fan of Gambino and when i saw 0:52 i was shocked to know that you know him too and those albums mashup of his. And i also like Kanye and Tyler
the title "pokemon super mystery dungeon" should have a dataset with 720-ish pokemon, and should have a consistent art style. this is a super cool idea!
12:54 That one has seen some shit
You should try the same technique they use for vectorizing words in GPT2. Take the original NN and save the values of the bottleneck layer neurons to get a vector representation of each pokemon. This way, similar pokemon will have vectors closer to each other, and that will make interpolation and blending much better.
You could try to vectorize the images of your last dataset and let the ai play around with mixing up individual vectors from svg files (so they actually form distinguishable shapes). That might produce cleaner linework. The blended image look is fine if it's just about getting them together somehow, but it isn't really any kind of artstyle. Great idea tho. I really like it.
Yoooo the DK 64 Crystal Caves music comin in clean at the 2:30 mark noice
Maybe you could increase the efficacy of the data? Making heat maps of distinguished features? Or color coded cell maps of like the anatomy of the creature?
@9:34 can someone please tell me where this song is from...
What happens if you just apply random noise to the original dataset to expand your training size? Does it cause output error?
I have an idea for a fighting game where every time you open the game, different AI encoders make whole new images for fighters, sounds, stages, a new game icon, etc.
@Jabrils
4 years ago
That's actually an interesting idea. Maybe even newly generated textures - Jabril
IDEA: What if, instead of blending the images, you get a graphic designer to split each image into parts like "ear1, ear2, nose, mouth, headshape, headcolor"... and then the frankenpoke could choose 1 from each list or from predetermined lists.
Ty for existing.
Hey jabrils... ive been experimenting with python AI... im trying to make an ai for a 2d platformer that has multiple players... and i need some help figuring out how to separate the player (agent)(player 1(me)) from everything and everyone else... im using python-opencv to process my images...
Keep up the hard work brother I believe in you!!
8:52 Can anyone tell me what that sound clip is from?? I feel like I've heard it so many times, but I can't quite place it...
@Jabrils
3 years ago
Star Fox 64 announcer before you start a level
@JediMediator
3 years ago
@@Jabrils I loved that game! Thanks!
An alternative idea this project could branch into is having the AI generate earlier-generation game sprites based on later ones. Obviously, the earlier you go, the less of a dataset you'd have to train on. But I believe you can train the AI to limit the resolution and color variation.
What is the music at 7:15? I recognize it, should know it and it's driving me crazy not knowing what it is
@beautyholic5592
3 years ago
It's "Yoshi's Story" a Super Smash Bros. Melee soundtrack
Dunno if you already tried but you can increase the dataset by flipping images and also consider transfer learning
@Jabrils how did you connect 4 screens to a single PC?
"What the fuck is this." - Jabrils 2019
Hey! This is an awesome project! First, I've been wanting to learn to play with RNNs for a while, but I have little to no experience with computer programming. If I wanted to take the first steps, where would you suggest I start?
I was going to point you towards the TCG for art assets, but it's good to see that you thought of that. It makes sense that the information is too drastically different for you to work with, though.
This would be a more complex AI and project, but here are some ideas I had. Not sure if they have merit, but I think you're better positioned to evaluate them:
?>What if you had a set of data that was more complex, where [1,1,...] is Pikachu with a neutral expression, [1,2,...] is Pikachu with an angry expression, [2,1,...] is Charmander with a neutral expression, etc...
?>>Would the AI be able to understand that [x,1-5,...] are all the same pokemon and use that to better identify 1 (outline, body shape and color, etc) and 2, so that 1.5 would be closer to the middle of Pikachu and Charmander's outlines and shapes and colors?
And, I ask this knowing very little about how this works or how difficult this is to design and define:
?2>Is it possible that your AI can more strategically and/or specifically encode things before decoding them? I remember you mentioning vectors and shapes and that it was complex, but if it was able to (try to) identify eyes, mouths, etc... it seems like it could make better outputs in theory.
Last, I feel like you have to have seen this website: pokemon.alexonsager.net/ I imagine this is a very "dumb" design on the back end that has hand-separated assets... but could this kind of design work hand-in-hand with an AI to bridge some of the gaps you're trying to get across?
@ZovcDrafts
4 years ago
Oh! One more thing! Isn't it possible to "grade" the outputs of a RNN to help it do a better job in the future?
*_Nintendo Wants to know your location_*
so are you going to adjust a threshold of how much the ai can reference an image's likeness with another for the new image?
been waiting for a long time
You should try decreasing the latent space. And instead of assigning a "bit" to each pokémon, you would perform a weighted addition on the generated vectors from the first NN.
@ the ending, you talkin abt using root2/2 to keep the norm of the vector the same?
Interesting music in the background at the start - Diddy's Kong Quest... Crows' nest.
Probably using the smallest dataset and adding noise and slight linear transformations to the images in order to increase your dataset is a good idea. That way the actual representation will be susceptible to those variations.
I think the issue is that you are computing cost by the pixel difference. In this case it makes sense that the neural net will just overlay the images on top of each other. With your current setup, you could make an optimization algorithm which, given 2 pokemon, progressively produces a pixel representation of a pokemon that minimizes the distance between the hybrid's encoded feature vector and the 2 provided feature vectors.
you could probably get some even better results when somehow using the networks to learn a palette. Like not learning the exact color RGB values, but learning how to pick from a fixed, predefined color palette for each pixel while learning :-)
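The palette idea can be sketched as a preprocessing step: snap every pixel to the nearest entry of a fixed palette and store the entry's index, so a network would predict a palette index per pixel rather than raw RGB values. The four-color palette and the helper names below are invented for illustration; the training side is not shown.

```python
def nearest_index(color, palette):
    """Index of the palette entry closest to `color` (squared RGB distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((c - p) ** 2 for c, p in zip(color, palette[i])),
    )

def to_indices(img, palette):
    """Convert rows of RGB tuples into rows of palette indices."""
    return [[nearest_index(px, palette) for px in row] for row in img]

def from_indices(indexed, palette):
    """Convert rows of palette indices back into RGB tuples."""
    return [[palette[i] for i in row] for row in indexed]

# Hypothetical 4-color palette: black, white, a yellow, and an orange.
PALETTE = [(0, 0, 0), (255, 255, 255), (248, 208, 48), (240, 128, 48)]

img = [[(250, 210, 40), (10, 5, 0)],
       [(230, 120, 60), (255, 250, 250)]]

indexed = to_indices(img, PALETTE)
restored = from_indices(indexed, PALETTE)
print(indexed)
```

A side effect of working with indices is that blended outputs can never contain the muddy in-between colors that averaging RGB values produces; every pixel is forced onto a palette color.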
Imagine someone logs onto his computer and they find 5000 pictures of Pokémon 😂😂😂
Oh i got baited I got excited and thought this was the real video. rip. see you tomorrow i guess.
Here we go again!!!
scale wise, you put more effort into this than game freak did with their "improved animations and models"
what if you removed the backgrounds of the 500 pics? would it lead to better blends?
So dead ass this 4 episode series is gonna end in January at this rate👀👀
2 questions: - Did you try forcing a smaller bottleneck in the AutoEncoder? - Is your encoder 100% linear? (I think you should merge the latent space, not the input)
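The linearity question matters because of a simple property, shown below with made-up numbers: if a decoder is purely linear, then decoding a blend of two codes gives exactly the same pixels as cross-fading the two decoded images. That would look like the "opacity overlay" many comments describe. The weights `W` and the codes are hypothetical, and nothing here claims the video's network is linear.

```python
import math

def decode(z, weights):
    """Linear decoder: each output pixel is a weighted sum of the code z."""
    return [sum(w * x for w, x in zip(row, z)) for row in weights]

def mix(a, b, t):
    """Blend two vectors: (1 - t) * a + t * b."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

# Hypothetical decoder weights: 4 output pixels, 2 latent dimensions.
W = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0],
     [0.25, 0.75]]

z_a = [1.0, 0.0]   # code standing in for one sprite
z_b = [0.0, 1.0]   # code standing in for another

t = 0.5
from_blended_code = decode(mix(z_a, z_b, t), W)
from_blended_pixels = mix(decode(z_a, W), decode(z_b, W), t)

# For a linear decoder the two routes agree pixel for pixel.
same = all(math.isclose(p, q) for p, q in zip(from_blended_code, from_blended_pixels))
print(same)
```

Nonlinear layers break this equality, which is one reason merging in the latent space of a deeper, nonlinear autoencoder can produce something other than a cross-fade.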
This video is really and the best part of the video is when
MYSTERY DUNGEON SAVES THE DAY... im so excited about this cause i grew up on Rescue team red/blue
Yo. Two things. 1: Just found your channel & it's dope 2: *Newb question* Is it really true that in some applications of AI we cannot really decipher how the program came to its conclusion?
you can separate the process into color and form: one AI that gives the image its form, and one that takes the color and puts it in the correct spots, maybe?
Is there a way to speed everything up when the AI is learning and then slow it down?
Hey Jabrills, What would happen if all your data were in black and white? Would that help in the end result?
If you combine the vectors, it shouldn't be 0.5 and 0.5, because then the length of the vector isn't 1 anymore. Each weight should be sin(0.25pi radians); this way the length of the vector stays at 1 and the quality of the blend will be more like the originals. This is because you are working with the unit circle.