Advanced Web Scraping Tutorial! (w/ Python Beautiful Soup Library)

Get started w/ Bright Data + $15 free credit using this link!
brdta.com/keithgalli
In this video, we're diving into advanced web scraping techniques with Python. If you haven't seen my overview of the Beautiful Soup library, check it out first for some foundational knowledge. Web scraping is a highly valuable skill, especially for freelance work. This tutorial will take you through sophisticated scraping methods, using Walmart as an example.
Before we start, a big thank you to our sponsor, Bright Data. They offer proxy tools that make advanced web scraping much easier by helping you get around restrictions that websites set. Check out their dataset marketplace for quick access to ready-made data.
In this video, we'll cover:
- Setting up and understanding the HTML structure of a web page
- Extracting data using Beautiful Soup and handling dynamic content
- Implementing headers to avoid detection
- Parsing JSON data for efficient scraping
- Using proxies with Bright Data to bypass IP blocking
- Error handling and retries in scraping
- Storing scraped data and handling multiple search queries
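To make a few of the topics above concrete, here is a minimal sketch (not the code from the video) of sending browser-like headers and pulling product data out of JSON embedded in a page. The header values, the `__NEXT_DATA__` script id, the JSON layout, and the helper names `fetch_html` and `extract_product` are all assumptions for illustration; the real page structure has to be inspected in the browser first.

```python
import json

from bs4 import BeautifulSoup

# Assumed header values; a real browser sends more fields than these.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/124.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}


def fetch_html(url, proxies=None):
    """Download a page with browser-like headers (hypothetical helper).

    `proxies` is an optional dict such as {"http": ..., "https": ...}.
    """
    import requests  # imported lazily so the parsing demo below runs offline

    response = requests.get(url, headers=HEADERS, proxies=proxies, timeout=10)
    response.raise_for_status()
    return response.text


# Stand-in for a downloaded page; the structure is invented for this example.
SAMPLE_HTML = """
<html><body>
<h1 id="main-title">Example Widget</h1>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"product": {"name": "Example Widget", "price": 19.99}}}}
</script>
</body></html>
"""


def extract_product(html):
    """Return the product dict from the embedded JSON, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    script = soup.find("script", id="__NEXT_DATA__")
    if script is None or not script.string:
        return None
    data = json.loads(script.string)
    return data["props"]["pageProps"]["product"]


product = extract_product(SAMPLE_HTML)
print(product["name"], product["price"])
```

Reading the embedded JSON is often less brittle than selecting individual HTML elements, because one parsed dictionary exposes many fields at once.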
If you need help getting started with web scraping, check out my original tutorial on Beautiful Soup:
• Comprehensive Python B...
Helpful Links:
GitHub Repository with Code Examples: github.com/KeithGalli/advance...
Video Timeline!
0:00 - Intro & Overview
1:30 - Identifying HTML Structure for Scraping (from Walmart)
4:26 - Writing Python BeautifulSoup Code to Extract Info from Walmart.com
6:10 - Handling Dynamic Content
8:00 - Implementing Modified Request Headers to Avoid Detection (look more human when scraping)
9:30 - Parsing Complicated JSON Data (Using LLMs to help)
15:28 - Extending our Code to Collect Info on Many Products (Automating Search)
24:45 - Improving our Code (avoiding duplicates, multiple search terms, using a queue, etc.)
27:20 - Setting Up Proxies with Bright Data (Get around IP Address blocks)
36:35 - Error Handling and Retries
39:36 - Automating actions on pages with Selenium
41:42 - Conclusion & Next Steps
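The queue, duplicate-avoidance, and retry steps in the timeline can be sketched roughly as follows. This is an illustrative outline under assumptions, not the video's code: `fetch_with_retries`, `crawl`, and the in-memory `PAGES` data are hypothetical names, and the fetch function is passed in so the example runs without a network connection.

```python
import time
from collections import deque


def fetch_with_retries(fetch, url, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying with exponential backoff on any exception."""
    for attempt in range(max_retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries, let the caller decide what to do
            sleep(base_delay * 2 ** attempt)


def crawl(start_urls, fetch, extract_links, max_pages=100, sleep=time.sleep):
    """Breadth-first crawl that uses a queue and never visits a URL twice."""
    queue = deque(start_urls)
    seen = set(start_urls)
    results = {}
    while queue and len(results) < max_pages:
        url = queue.popleft()
        try:
            page = fetch_with_retries(fetch, url, sleep=sleep)
        except Exception:
            continue  # skip pages that still fail after all retries
        results[url] = page
        for link in extract_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return results


# Fake "site": each page is just the list of links it contains.
PAGES = {"a": ["b", "c"], "b": ["a", "c"], "c": []}
calls = {"count": 0}


def flaky_fetch(url):
    """Fail on the very first call to exercise the retry path."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated temporary failure")
    return PAGES[url]


# sleep is replaced with a no-op so the demo finishes instantly.
results = crawl(["a"], flaky_fetch, lambda page: page, sleep=lambda seconds: None)
print(list(results))
```

In a real scraper, `fetch` would wrap an HTTP request (optionally through a proxy) and `extract_links` would pull product URLs out of the parsed page.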
I hope you find this tutorial useful. If you did, please give it a thumbs up and subscribe to the channel for more tutorials. Let me know in the comments how you plan to use these web scraping techniques in your projects. Enjoy scraping!
-------------------------
Follow me on social media!
Instagram | / keithgalli
Twitter | / keithgalli
TikTok | / keithgalli
-------------------------
Practice your Python Pandas data science skills with problems on StrataScratch!
stratascratch.com/?via=keith
Join the Python Army to get access to perks!
YouTube - / @keithgalli
Patreon - / keithgalli
*I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.

Comments: 38

  • @FIBONACCIVEGA
    1 month ago

    I'm sure I've told you before on Instagram or here, but I was waiting for this video. Great content!! Answering your question: I am working on a personal waste project to search and extract information on recoverable materials. I learned Python by watching your videos, and I look forward to every time you release a new one. Buen dia, chao!!

  • @fzrbigman
    1 month ago

    You're my hero Keith, you made my data science journey in uni so much easier.

  • @KeithGalli

    1 month ago

    Happy that I could play a small part in your data science journey!

  • @ben_tyler5
    1 month ago

    i don't know why but i am always super confident and eager to learn more stuff when watching ur videos😊

  • @shehumahmed
    1 month ago

    Another awesome piece of content just dropped. You're a legend. Thank you

  • @KeithGalli

    1 month ago

    🙌🙌

  • @leomiao5959
    23 days ago

    Thanks Keith, this is great

  • @sonic763
    1 month ago

    I’m learning so I can automate my job and be more effective. Thanks for sharing.

  • @meeFaizul
    1 month ago

    Keith, you're a natural teacher! Your tutorials are incredibly clear and easy to follow. You've inspired me to dive into this field. Please keep up the amazing work! Lots of love from Pakistan 🇵🇰 ❤.

  • @KeithGalli

    1 month ago

    Thank you for the kind words! Glad that you like the tutorials!!

  • @jonpounds1922
    1 month ago

    Before this I knew nothing. Now I am expert. Thanks Keith Galli!

  • @KeithGalli

    1 month ago

    My man!

  • @garyphan-lo4vi
    1 month ago

    KEITHHH THE LEGEND 🎉🫶👊🏻

  • @KeithGalli

    1 month ago

    MY MAN!

  • @diegoescobedo1716
    1 month ago

    great stuff

  • @KeithGalli

    1 month ago

    thanks bro. I appreciate you checking out this video on 100% your own accord and having no influence from seb!

  • @sebastianalvarez1537
    1 month ago

    Such a beast

  • @KeithGalli

    1 month ago

    mi perro 😎

  • @MrTaken-tl4bw
    1 month ago

    OK, this is cool and all, but can you scrape a React or any other JavaScript front-end website without using Selenium?

  • @WheresTheLambSAAAAAAAUCE
    1 month ago

    Babe wake up, new Keith Galli video just dropped

  • @MarjorieRoseMasilang
    1 month ago

    💗

  • @user-bd5bx3lb4i
    1 month ago

    ERROR: JSException or error 503. How do you deal with them?

  • @ahmedbadal3795
    1 month ago

    Please make a video on how to get a job with web scraping. I just asked myself: why learn web scraping, and what would you do with it to make money?

  • @KeithGalli

    1 month ago

    My recommendation would be to create an account on Upwork and check out some of the job postings people are asking for that require web scraping. Look through those postings and ask yourself if you would be able to solve the problem yourself with web scraping. I show some examples of these projects at 0:25.

  • @divyv20
    1 month ago

    Hey Keith, very good video. I can do better editing on your videos, which can help you get more engagement. Please let me know what you think.

  • @wiz8058
    1 month ago

    🔥🔥🔥🔥🔥💪💪💪

  • @KeithGalli

    1 month ago

    😎

  • @netbin
    1 month ago

    Is scraping even relevant nowadays?

  • @Mobilemaniaplays

    1 month ago

    Yaaa

  • @netbin

    1 month ago

    @Mobilemaniaplays For example?

  • @smasherlol351
    7 days ago

    What if they ban your MAC address? I mean, basically a hardware ban.

  • @joeys7519
    9 days ago

    Starting an app that tracks when high demand products come back in stock

  • @japhethmutuku8508

    8 days ago

    good luck with it

  • @moe9062
    1 month ago

    You accidentally showed the username and password at 33:48

  • @KeithGalli

    1 month ago

    Thanks for catching this! I deleted my zone information so the username/password no longer work! Thanks!

  • @moe9062

    1 month ago

    The least I could do after all your help, Keith!