Web Scraping Instagram with Selenium
Science & Technology
** IMPORTANT NOTES **
⭐Struggling with some of the commands?⭐
Watch my live webinar about Automating Instagram Comments:
• Video
It's much more detailed, with a slower pace and room for questions!
The ⭐new⭐ and improved article is now available on my blog:
www.mariyasha.com/post/web-sc...
Also, please refer to an ⭐ UPDATED VERSION ⭐ of the code on Github with wider functionality and error fixes:
github.com/MariyaSha/Webscrap...
** VIDEO DESCRIPTION **
Struggling with scraping dynamic language websites? No need to worry! I got you covered
with this super simple web scraping with Selenium tutorial!
In this video, we'll create a database of cat photos, which we'll extract from Instagram by searching for "#cat".
We'll work closely with the Developer Tools, target specific elements and discuss the best selectors to use depending on the situation.
We'll also talk about the common errors when scraping React and the best way to tackle them.
Starter Notebook:
github.com/MariyaSha/Webscrap...
Complete Notebook:
github.com/MariyaSha/Webscrap...
UPDATED ENHANCED Notebook (most recent!):
github.com/MariyaSha/Webscrap...
Tutorial on Medium:
/ web-scraping-instagram...
**********************************************
Timestamps:
**********************************************
00:00 - Introduction
01:08 - Download Chrome Driver
02:21 - Set up Chrome Driver
03:07 - Open web page with Selenium
03:41 - Log in to Instagram with Selenium
09:39 - Dismiss pop up messages
12:19 - Search for a keyword
15:49 - Scroll down along the page
16:48 - Select all the images
18:25 - Create a directory on your computer
20:10 - Save images inside the directory
22:42 - Thanks for watching!
**********************************************
Check out my GitHub:
github.com/MariyaSha
Connect on Linkedin:
/ mariyasha888
Follow on Instagram:
/ mariyasha888
#webscraping #selenium #instagram #bot #createbot #webscrape
Comments: 692
What happens if the post is not an image? IG lets you upload videos.
@PythonSimplified
3 years ago
Hi Johan 😀 You'll need to tackle this with a conditional statement, where videos are saved with the ".mp4" extension and images with the ".png" extension mentioned at the end of the video. Let me know if you were able to figure it out! 😊 If not, I can film a quick tutorial showing how to do it 😉
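The conditional described in the reply above could be sketched roughly as follows. This is only an illustration, not the code from the video: the helper names `pick_extension` and `build_filename` are hypothetical, and it assumes the media URL itself ends in a recognizable video extension, which may not hold for every Instagram link.

```python
import os
from urllib.parse import urlparse

# Assumption: these suffixes are enough to recognise a video URL.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".webm"}


def pick_extension(media_url):
    """Return ".mp4" for video-looking URLs and ".png" for everything else."""
    # urlparse(...).path drops the query string, so "clip.mp4?x=1" still matches.
    suffix = os.path.splitext(urlparse(media_url).path)[1].lower()
    return ".mp4" if suffix in VIDEO_EXTENSIONS else ".png"


def build_filename(folder, keyword, index, media_url):
    """Build a save path such as <folder>/cat3.mp4 or <folder>/cat3.png."""
    return os.path.join(folder, f"{keyword}{index}{pick_extension(media_url)}")
```

The resulting path can then be handed to whatever download call the notebook already uses.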
@TheJohanHalim
3 years ago
@@PythonSimplified please do! Also, how many images should I expect in my folder? I get a TypeError on the last for loop: "TypeError: cannot use a string pattern on a bytes-like object". I'm guessing it's because of the video format.
@paulameneses2306
3 years ago
@@PythonSimplified Hi!! Congrats for the perfect content!! I've spent my day studying for a master data science project and you've been helping me a lot :) I had the same problem today with profiles with some specific photo types, videos and reels.. I couldn't save an image and I got the same error mentioned above.. Could you help us please? Thanks!! ;)
@PythonSimplified
3 years ago
@@paulameneses2306 thank you so much dear! 😁 Sure, I'll look into it over the weekend and adjust the code to include a conditional statement for videos 😉 we'll be in touch!
@PythonSimplified
3 years ago
@@TheJohanHalim Yes, you are absolutely correct Johan!😊 You get this type of error when trying to save a collection of images (video) as a single .png or .jpeg image; it's due to an incorrect format. The number of images you should expect differs from one computer to another, depending on the size/scale of the display. The code in this tutorial gets you the number of images that results from a single scroll event. And as Instagram uses a dynamic language, the more you scroll, the more images are loaded to the page. If you'd like to include several scroll events, check out my community post, where I include additional resources, a detailed article and code examples on how you can expand this bot: kzread.infoUgwVQazZhNNqwdghhdh4AaABCQ I'll get back to you after the weekend with a solution to your video question 😉
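As a rough illustration of the "several scroll events" idea mentioned above, here is a minimal sketch. It is not the code from the community post: `collect_image_urls` is a hypothetical helper, the default pause is a guess, and `driver` is assumed to be an already logged-in Selenium WebDriver. The string "tag name" is the value of Selenium's `By.TAG_NAME`, used directly so the snippet has no import beyond the standard library.

```python
import time


def collect_image_urls(driver, scrolls=3, pause=2.0):
    """Scroll the page several times, collecting unique <img> src values."""
    seen = []
    for _ in range(scrolls):
        # Jump to the bottom so the page loads the next batch of posts.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the new images time to load
        for element in driver.find_elements("tag name", "img"):
            src = element.get_attribute("src")
            if src and src not in seen:
                seen.append(src)
    return seen
```

Collecting after every scroll, rather than once at the end, is a deliberate choice here: pages that unload off-screen content would otherwise lose the earlier images.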
This content is really great. Thank you for sharing it. Years ago I used to do web scraping back when there was a lot less JS and interactivity but haven't done it in a long time. This video got me back into it. Keep it up!
I love your style of knowledge sharing. You made it simple enough to understand by someone like me who is just beginning to learn python. Thanks!
Me saw the thumbnail and click it Me (after 10 mins) : ooh! It's a programming tutorial
@PythonSimplified
3 years ago
hahahaha indeed! 🤣
@robinferizi9073
3 years ago
lol 😂
Thank you so much for explaining and showing every basic step in detail! Lots of beginners like me get stuck on setup steps that can seem obvious to experienced developers. For example, thank you for explaining and showing all the Chrome driver download and setup steps. Even some big websites pass quickly over these basic setup steps. I was stuck, but thanks to you I made it! Thanks again!!!
I never imagined that python learning could have this much glamour.
@akilthangavel8500
3 years ago
hahaha
@AgriculturaDigital
3 years ago
Very good comment
@johnames6430
3 years ago
it was either fake Gamer Girl, OnlyFans, or this. But there is a lot of competition in those other areas so she went with this.
@johnames6430
3 years ago
by the way the whole thing of her in the right side of the screen is planned out, she has her hair there to hide that she's wearing something hoping to appear nude so people will click and it will go viral or something 😂 Wish her the best of luck though! No hate! 😘
@timothyo718
2 years ago
@@johnames6430 What a hater.
Thank you for this tutorial! I am currently learning Python on DataCamp and haven't learned or seen any real-world applications. I am definitely going to try this out and add what I learned from this video to my skillset!
Thanks again, Mariya. Always the right lessons at the right time.
@PythonSimplified
3 years ago
My pleasure! have fun! 😀
GENIUS!!!!! I really liked your video, you were able to solve the concerns I had and no one else could solve. THE BEST!!!
Solid tutorial! You're great at teaching!! Thanks
@PythonSimplified
3 years ago
Thank you Adam, glad you liked it! 😀
You explain each step very clearly, thanks for your effort
Mariya, I love you. Thanks, you gave me what I was looking for since 3 days
Compliment from a fellow girl coder, this video was super informative and entertaining and you are obviously bright and talented!
You are amazing in every way! Thank you for this useful tutorial.
i started knowing why i am actually watching your video after the introduction :)
OMG you're the best ! I was hitting my head on the wall. In reality you showed it in a way simple way. Thank you
learned some good web scraping practices here like waiting for elements to be clickable, clearing the input boxes, etc. thanks!
OMG!!! your channel is perfect , thanks for this class !!!!
@PythonSimplified
3 years ago
Thank you so much Matheus!! Glad you liked it! 😁
@thegrowthhackert4248
3 years ago
This channel has super easy tutorials on how to do it: kzread.info/dron/YvGiDV1JfJTpphxtKd7r_A.htmlfeatured
Incredibly great explanations 🔥🔥. loved the video.
Perfect intro to Selenium! Very nice video! Thanks again Mariya!
@PythonSimplified
3 years ago
Thank you Chiranjeeb! I told you you gonna like Selenium! 😁
@chiranjeebroychowdhury7759
3 years ago
@@PythonSimplified Yep! You were correct!
wow thats amazing Mariyasha, i like your way of teaching, its very helpful for me. Both are in same boat upcoming future data scientist
@PythonSimplified
3 years ago
Thank you so much Vikram, I'm glad I could help! 😄
This tutorial is so sweet like you. Thank you so much Mariya ❤️
@PythonSimplified
3 years ago
Thank you so much Saqib! 😀 I have a new Selenium tutorial premiering in 35 minutes: kzread.info/dash/bejne/hoyYya-klpzNgJM.html We're expanding the Linkedin messaging bot to seem much more human than it should be, I highly recommend to check it out! 😁
Great tutorial! as always. Entertaining, useful, and a pretty teacher, as well .-) Keep up the good work...
Thank you very much, this tutorial was very helpful and very very easy to understand, Cheers!! 🚀
Crystal clear and good pedagogy !
Great tutorial, I always learn something new, thanks for sharing
this and arjancodes are by far my favorite channels!
What I mostly learned is a very good workflow to get info and use it. tnx :D
thank you soo much you just saved me! im gonna rock that interview
@PythonSimplified
3 years ago
That's awesome to hear Arthur! Good luck on your interview! 😀
Waao.Very helpful 🙏🙏❤️More videos like these please 😊
Great tutorial! You are amazing!!
Wish I had known about this tool earlier, sounds very useful
i don't know why i am watching it instead of listening to music but the way she teaches is real fun!!
Perfect tutorial!! Thanks a lot!!
you are amazing, best python tutor ever :)
Very useful content! Thanks 🙏🙏🙏
I thought I didn't know English but now I think I do. Incredible articulation!!!
Great explanation mariya👍🏻
I'd like to say that I loved it you're amazing and please keep on it, I'm happy too because English isn't my mother language and I understood you very well 😊.
@PythonSimplified
3 years ago
Thank you Vinicius! I'm so happy to hear that! 😁😁😁 English isn't my first language either, so I'm always trying to use simple words whenever possible (the complicated words are also much harder to pronounce, I sound very Russian when I do this hahahaha) Thank again and Merry Christmas!! 😊
@suiciniv76
3 years ago
@@PythonSimplified no problem, where are you from? Could you please make a video explaining how to understand boxplot charts? Merry Christmas 🙂
Thanks a lot for all these gifts from you.
@PythonSimplified
3 years ago
You're welcome, enjoy! 😀
This is the coolest thing I have learnt today
@PythonSimplified
3 years ago
Awesome! I'm glad I could help! 😁
simply amazing, big thanks. love you
@PythonSimplified
3 years ago
Thank you so much my friend! 😁
Amazing video!!
loved your style of teaching and Accent.
Cool very cool, You earned My Subscribe. Keep up the good work!
@PythonSimplified
3 years ago
Thank you so much Eyosiyas, welcome aboard! 😁
Thanks, señorita! Very helpful.
You are the most intelligent and beautiful teacher of all.
I like your video! Thanks!
Really great explanation. 👏
Thank you girl for this excellent content! U get more one subscriber👋
@PythonSimplified
3 years ago
Thank you so much Bruno, welcome aboard! 😀
Nice video and content, I was actually able to do it! : )
This is amazing. Thank you very much.
I particularly like the background music. Great tutorial!
you're amazing... hats off
Life Saved ! Thank you so much :)
gurl this saved my life
Great channel, great tutorial. New sub.
@PythonSimplified
3 years ago
Thank you so much! 😀
I love you, you and your code
Very good. Congratulations!
@PythonSimplified
3 years ago
Thank you David! 😁
This is such a great video!
@PythonSimplified
3 years ago
Thank you so much Roshan, glad you liked it! :D
@roshanshetty5661
3 years ago
@@PythonSimplified Would it possible for you to make a video on image processing using the images that we scrapped in this video?
@PythonSimplified
3 years ago
@@roshanshetty5661 Did you check out my video on image processing with Pillow? kzread.info/dash/bejne/gId81cOAYsSah7g.html You can use the same principles and apply them to the scraped images if you're looking for general processing and simple transformations. If you're looking to classify images with Artificial Intelligence, the video I've sent you above is not gonna help 🤣 I'm working on some more Machine Learning projects, where image classification will be a very important part (that's why we need this cats/dogs database in the first place)... In the meanwhile, you can check out my Flower Image Classifier on Github: github.com/MariyaSha/FlowerImageClassifier This might give you a good example of the pre-processing we do to get the data ready for training. Either way, I hope it helps! 😁
studying with you is a excitement
you are the best Maria Sha .LOL
You are Awesome!! Greetings from Cuba
@PythonSimplified
3 years ago
Wow, thank you so much Yosdany!! Greetings from Canada! 😀😀😀
Great work!
@PythonSimplified
3 years ago
Thank you Eduardo! 😃
Thanks, Ma'am for this... Helps too much
@PythonSimplified
3 years ago
Too much is my favourite quantity! 😊 Thank you, V!
This gender is always so organized.!!! A good session it was with so much clarity.
@PythonSimplified
3 years ago
Thank you, I'm glad you found it helpful! 😃
Excellent presentation of the material, keep it up!
This channel is underrated, Change my mind!
Very good tutorial. In the for loop, you could use enumerate to avoid the counter assignment:
for counter, image in enumerate(images):
    save_as = os.path.join(path, keyword[1:] + str(counter) + '.jpg')
    wget.download(image, save_as)
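For anyone who wants to try the enumerate suggestion without downloading anything, here is a small self-contained sketch. The helper name `build_save_paths` is hypothetical, and it only builds the (url, path) pairs; the actual download call is left to the notebook.

```python
import os


def build_save_paths(image_urls, folder, keyword):
    """Pair each URL with a numbered file path, with no manual counter."""
    tag = keyword.lstrip("#")  # "#cat" -> "cat", like keyword[1:] in the comment
    return [
        (url, os.path.join(folder, f"{tag}{counter}.jpg"))
        for counter, url in enumerate(image_urls)
    ]
```

Each pair can then be passed to `wget.download(url, save_as)`, as in the comment above.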
You're a life saver! Thanks
You are amazing)) really grateful for double enter tip
@PythonSimplified
3 years ago
Hi Anna, there's actually a better way than double enter! Check out my community post where I included the improved code, it's better to concatenate the url to search for your term 😉 And thank you so much! 😃
@puhozavrik
3 years ago
@@PythonSimplified woohoo, thank you so much
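The "concatenate the url" approach mentioned in the reply above might look something like this. Note the assumptions: the community post itself is not reproduced here, `build_tag_url` is a hypothetical helper, and the `/explore/tags/<tag>/` pattern is assumed from how Instagram hashtag pages are commonly addressed and may change.

```python
from urllib.parse import quote


def build_tag_url(keyword, base="https://www.instagram.com/explore/tags/"):
    """Turn a keyword such as "#cat" into a hashtag page URL."""
    tag = keyword.strip().lstrip("#")
    # quote() percent-encodes characters that are not safe in a URL path.
    return f"{base}{quote(tag)}/"
```

With a live Selenium session, `driver.get(build_tag_url("#cat"))` would then replace typing into the search box and pressing ENTER twice.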
Great content! Very clear and useful. Btw you don't need to add the local path of the webdriver as long as you have it in your Environment PATH. It looks over there by default. Also, by the end of the video you can get rid of the counter variable if you use enumerate.
@PythonSimplified
3 years ago
Wow, thank you Yaniv!! This is fantastic - we can save it there once and never worry about it again!! 👏👏👏 I'll just sit down in shame and be impressed with your super-efficient coding skills 😂😂 By the way!! I'm really happy to see Israeli folks joining the party, and with such rare tips too!! Thank you so much Yaniv, you nailed it! 😀
@piriwo
3 years ago
@@PythonSimplified Haha, I really didn't expect a reply in Hebrew! But truly, thanks for the video, it helped me understand a lot of things. The PATH thing was just something small. Keep it up!
@PythonSimplified
3 years ago
@@piriwo Thank you very much, will do! :)
@Minerbush
3 years ago
I thought I was the only Israeli here 😅
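The PATH tip from earlier in this thread can be checked before launching the browser. A minimal sketch, assuming the driver executable is named "chromedriver" (the helper name `find_on_path` is hypothetical):

```python
import shutil


def find_on_path(executable="chromedriver"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)
```

If this returns a path, Selenium should be able to locate the driver without an explicit location argument, as the comment suggests; if it returns None, the explicit path from the video is still needed.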
You're amazing!
*Lol , I Almost Forgot I came here to Learn Python! haha Stunning Looks!*
@PythonSimplified
3 years ago
Thank you! 😆 I may have went a bit overboard on this video XD
@shaharrefaelshoshany9442
3 years ago
@@PythonSimplified Super hot super smart :))
@robertmainville4881
3 years ago
@@PythonSimplified Well, I think you were overdressed...
Sending an " Amazing" from Brazil here. Amazing.
@PythonSimplified
3 years ago
Thank you so much Marcos!! Greetings from Canada! 😁
that is so cool. thanks
Nice webscraping methods Great work
@PythonSimplified
3 years ago
Thank you! 😁 It's also available on Medium with a few improvements: medium.com/analytics-vidhya/web-scraping-instagram-with-selenium-b6b1f27b885 I'm also currently working on a website, where I'll post even a more updated version of the Medium article, where we'll be able to scrape the full-size images rather than thumbnails, and tackle more issues with the ENTER button 😀 Stay tuned!
@gaddamshanthsri125
3 years ago
😀ok
Hello, I really like your work. I just started doing some web scraping and your tutorial was a great help to me. You are organized, and your explanation is perfectly clear and easy to follow. There's just one thing I noticed and already worked around, but I want to know if there is another way. The problem is that the search box doesn't appear if the Chrome window is large, so we only get the sidebar with the search icon, which needs to be pressed to open the search box. I tried to figure out how to do that but couldn't, so I just set a window size (driver.set_window_size(740,500)) to make sure the search box appears automatically. If you know how to fix it the normal way, that would be nice. Thank you
You explained Selenium very clear. Can you also explain in a video on how to prevent to be detected as a bot? I read many post on stack overflow but Selenium still got detected as a bot, even on the first page load.
Very nice tutorial, I must say. It's easy to follow, and thanks for all your hard work showing all of us. Just a question: could you use a mixture of Selenium and BeautifulSoup to automate the login and parse the HTML? Just a thought, because it all works but the page load takes forever.
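One way to picture the mix asked about above: let Selenium handle the login and page load, then hand `driver.page_source` (a real WebDriver attribute holding the rendered HTML) to a parser. The sketch below swaps in Python's built-in `html.parser` for BeautifulSoup so that it runs without extra installs; the class and function names are hypothetical, and BeautifulSoup could be fed the same HTML string instead.

```python
from html.parser import HTMLParser


class ImageSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML string."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_sources(html):
    """Parse an HTML string and return its image URLs in document order."""
    parser = ImageSrcParser()
    parser.feed(html)
    return parser.sources
```

With a live session this would be called as `extract_image_sources(driver.page_source)`. Parsing this way does not make the page load any faster; it only replaces the element lookups that happen after loading.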
You are so glamourous and after that the way you teach.
i want to be your student 😆 you are the best teacher i have ever seen
With your videos I've deployed my first Flask app
nice job :D
Great video. It'd be cool to see one on Insta scraping using GET requests instead of Selenium, it's much faster. There's a good article on Diggernaut about it. Anyway, thanks, keep em coming!
@PythonSimplified
3 years ago
Challenge accepted!! 😎 Get requests would be the next module I'll cover in the scraping lessons! Thank you Chris! :)
@chris_burrows
3 years ago
@@PythonSimplified Love to see how you go with it! I've been stuck on it for the last 12 hours :'( I'm getting different static HTML returned from requests.get() than shown in the Chrome Dev Tools. Great channel btw, looking forward to more content!
@chris_burrows
3 years ago
also, you should make a Discord, it's a great way to consolidate a community and seems like you're building one quickly.
@PythonSimplified
3 years ago
@@chris_burrows Thank you for suggesting! I'll check it out :) I'll start advertising properly sometime in the near future. For now I take it easy, trying to focus on improving my filming/editing abilities before I go down that road :D
" Very nice :D ^_^ " !!! thank you so much for this tutorial. So far it is just what I needed to get started! :D
Since you call the method of an object on the same line could it be possible that you remove the variable like for instance in case with the log_in variable?
Thank you for the notebook!
Hey, is it possible to look up unknown words in one of the online translators with Selenium? For example, I want a word to be translated from EN to DE using any dictionary.
tq this video helped me a lot
Thank you for the video. Very clear and straightforward. I'm having trouble with the searchbox.send_keys(Keys.ENTER) command. I tried pressing ENTER twice but it still doesn't work. Any suggestions?
@PythonSimplified
3 years ago
Thank you so much Luisgui 😁 Try the code I've just posted on my Github, you can solve it with time.sleep(seconds), it's in the "search keywords" section: github.com/MariyaSha/WebscrapingInstagram/blob/main/WebscrapingInstagram_completeNotebook.ipynb I actually just finished working on a Medium article about this where I explain everything in detail, I'm just waiting for my publication to approve it and then I'll send you a link 😉
You are Awesome!
@PythonSimplified
3 years ago
Thank you! :)
Hello, thank you so much, this helps me a lot. While following this code I hit one error: instead of ENTER, you can use RETURN and it will work properly. This solved my error.
Hi, thank you! Is there a way to do this in real time, for example to get someone's followers and keep updating for each new follower or something? Because I want to do something like that by showing followers on the screen in real time
hi, thanks it really helps, but do you have any video on how to get all the post information?
You're amazing Girl.
I'm not able to send an ENTER event in the search box. I tried sending Keys.ENTER multiple times, but it doesn't execute correctly. Is there an alternative?
I need help. When I tried to run the program, I got an error:
SessionNotCreatedException: session not created: This version of ChromeDriver only supports Chrome version 110
Current browser version is 114.0.5735.111 with binary path C:\Program Files\Google\Chrome\Application\chrome.exe
What should I do?
Just a question about how Instagram would or wouldn't detect that it is being scraped, based on the user and/or proxy address. When the page is loaded and you can see all the posts without having clicked on any of them yet, does that mean the images have already been requested once in order to be displayed? After we collect the image links and request them again with Selenium, will that count as a second request? Will Instagram see that each image has been requested twice by the same user every time?
The project is very good. Congratulations. I'm still getting started in Python and I would like to know how to copy the list of followers for a given profile. Would that be possible?