The Biggest Issues I've Faced Web Scraping (and how to fix them)

Try out Bright Data and get $15 credit for your projects! brdta.com/fknight
0:00 Problems I face web scraping
1:03 Web Scraping Basics Overview
4:38 Handling Complex Web Technologies
6:24 Script Optimization + Error Handling + Adaptive Algorithms
8:23 AI-Driven Proxy Management, Anonymity, and Intelligent Rate Limiting
10:23 How to Handle Extracted Data
12:22 Ethical AI and Legal Compliance
14:15 Thanks for Watching!
If you're a developer, sign up to my free newsletter Dev Notes 👉 www.devnotesdaily.com/
If you're a student, check out my Notion template Studious: notionstudent.com
Don't know why you'd want to follow me on other socials. I don't even post. But here you go.
🐱‍🚀 GitHub: github.com/forrestknight
🐦 Twitter: / forrestpknight
💼 LinkedIn: / forrestpknight
📸 Instagram: / forrestpknight

Comments: 51

  • @delsix1222
    @delsix1222 2 months ago

    interesting timing to see this video, literally the day after I completed my first full-stack application which literally revolves around web-scraping :D

  • @flipygmd

    @flipygmd

    2 months ago

    You're the next Mark Zuckerberg

  • @Noumaan_Ahamed

    @Noumaan_Ahamed

    2 months ago

    How do you web scrape a secure website?

  • @dalar2
    @dalar2 2 months ago

    I used to web scrape all the time, but stupid JS frameworks' obfuscated CSS class names have made it very difficult.
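One common workaround for obfuscated class names is to key on attributes that tend to survive rebuilds, such as `data-*` or `aria-*` attributes, instead of the class. The sketch below uses only Python's standard library; the `data-testid="price"` attribute and the sample markup are assumptions made up for illustration, not taken from any real site.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class AttrTextExtractor(HTMLParser):
    """Collect the text of every element whose attribute `attr` equals `value`."""

    def __init__(self, attr, value):
        super().__init__()
        self.attr = attr
        self.value = value
        self._depth = 0      # nesting depth inside the current match (0 = outside)
        self._buffer = []    # text fragments of the current match
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif dict(attrs).get(self.attr) == self.value:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract_by_attr(html, attr, value):
    """Return the text of all elements matching attr=value, ignoring class names."""
    parser = AttrTextExtractor(attr, value)
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    # Hypothetical markup with build-generated class names.
    sample = ('<div class="css-1x2y3z">'
              '<span class="a9f" data-testid="price">$19.99</span><br>'
              '<span class="b7c" data-testid="price"><b>$5</b>.00</span>'
              '</div>')
    print(extract_by_attr(sample, "data-testid", "price"))
```

This only helps when the site exposes such attributes at all; when it does not, anchoring on visible text or document structure is the usual fallback.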

  • @panz__
    @panz__ 17 days ago

    In my opinion, having developed multiple web scraping applications, half of the time is spent not coding but trying to reverse engineer the web application. Simple ones are just a matter of looking at the requests in dev tools and making the API calls manually, while the most complicated ones involve backtracing how content is loaded on the page to find the JS code responsible for it. Basically it's 70% reverse engineering and 30% coding, if you do things the smart way.
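The simple case described above (spot the JSON request in dev tools, then call it directly) might look like the sketch below. The endpoint URL, the `page` and `limit` query parameters, and the `items`, `id`, and `title` response fields are all hypothetical; a real site will use its own names.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_api_request(base_url, page, page_size=20):
    """Build a request for one page of a (hypothetical) paginated JSON endpoint."""
    query = urlencode({"page": page, "limit": page_size})
    return Request(
        f"{base_url}?{query}",
        headers={"Accept": "application/json"},
    )


def parse_items(payload):
    """Keep only the fields we care about from the JSON response body."""
    data = json.loads(payload)
    return [{"id": item["id"], "title": item["title"]}
            for item in data.get("items", [])]


def fetch_page(base_url, page, opener=urlopen):
    """Fetch and parse one page; `opener` is injectable so it can be stubbed."""
    with opener(build_api_request(base_url, page), timeout=10) as response:
        return parse_items(response.read())


if __name__ == "__main__":
    # Offline demonstration with a canned response body.
    body = '{"items": [{"id": 1, "title": "First", "extra": true}]}'
    print(build_api_request("https://example.com/api/products", 2).full_url)
    print(parse_items(body))
```

Keeping the request builder and the parser separate from the network call makes both easy to check against a response copied from dev tools before any live request is sent.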

  • @sumukh007
    @sumukh007 2 months ago

    The JD bottle in the background 😉

  • @doublesushi5990
    @doublesushi5990 2 months ago

    such a chill vid

  • @EduardoEscarez
    @EduardoEscarez 2 months ago

    AFAIK the button highlighting is a feature based on video subtitles, including those generated automatically, but it is still somewhat random. I didn't catch those because I was already subscribed and had liked the video a moment before you said it.

  • @v1d300

    @v1d300

    2 months ago

    I don't think it's a video subtitles feature. It just happens randomly in my experience. The thumbs up button shakes and subscribe highlights. Didn't happen for me on this video though :(

  • @danielabraham3022
    @danielabraham3022 2 months ago

    To be honest, I subscribed because the button lit up. Also, I love your content.

  • @xlafxx
    @xlafxx 2 months ago

    I remember starting to watch your videos when I was entering my computer science BA, and now that I'm 28 with one semester left to graduate, you're still uploading good content that's unique. Never get tired of your vids, keep it up brother. I'm also concerned about the job market. Can you make a vid about new grad CS students? For example, it seems almost every job wants front end or something, and my school never taught any of it.

  • @mrrobot-mn6re

    @mrrobot-mn6re

    2 months ago

    You want to get a job from what your school taught you? You are in for a ride, brother. Tech is about your own research and self-learning, every fucking day. I pity people who majored in CS because they heard about a programmer earning 6 figs.

  • @Hshjshshjsj72727

    @Hshjshshjsj72727

    8 days ago

    Unless you went to an Ivy League school and want to be a quant, you've got to do front end: JS, React, and SQL are key for the majority. School is dumb unless it's Ivy League, except for the piece of paper.

  • @redbill5197
    @redbill5197 2 months ago

    Thank you for the amazing video! Much appreciated as a young web developer. By the way, none of the buttons lit up or did any animations... I am a subscriber, so I don't know if that's why. Peace!!!

  • @beaconxy

    @beaconxy

    1 month ago

    It actually didn't.

  • @Cryogenics12
    @Cryogenics12 2 months ago

    Hi Forrest. I was wondering how you feel about AI and the future of software engineering now. With ChatGPT out for over a year, have your views changed much? Maybe a good topic for another vid.

  • @yafethtb
    @yafethtb 2 months ago

    Yeah. Scraping a dynamic website really makes me want to scream like Linus Torvalds to NVIDIA. And I also hate CloudFlare 😂

  • @brianmorin5547
    @brianmorin5547 1 month ago

    Is there a reason/advantage to using Bright Data's "scraping browser" product instead of integrating their proxy and IP rotation services into a script I'm running on my own server?

  • @ramelox
    @ramelox 2 months ago

    When I see a Bright Data sponsorship, I instantly stop watching. Paying Bright Data is not a web scraping skill.

  • @zeddscarlxrd4331

    @zeddscarlxrd4331

    2 months ago

    Do you know how to bypass Cloudflare or CAPTCHAs without Bright Data?

  • @ZacMagee

    @ZacMagee

    2 months ago

    Some people 😂 That's like saying. "Oh well, these stupid people who drive cars, why would they do that when we still have horses?"

  • @vasyavasin7364

    @vasyavasin7364

    1 month ago

    @ZacMagee why should I pay for it if I can do it for free? 😂

  • @vasyavasin7364

    @vasyavasin7364

    1 month ago

    @zeddscarlxrd4331 You can easily find out how to bypass Cloudflare.

  • @Ohiostategenerationx

    @Ohiostategenerationx

    1 month ago

    @vasyavasin7364 do you still not need to scrape a bunch of proxies to use?

  • @tomasemilio
    @tomasemilio 2 months ago

    Boom. Thanks

  • @xdcountry
    @xdcountry 2 months ago

    This guy gets it. I've been there. I can't wait to make this all an easy-ass Python plugin

  • @V4rrow
    @V4rrow 2 months ago

    dude is literally Gilfoyle from Silicon Valley (love your vids)

  • @theparten

    @theparten

    2 months ago

    I wasn't looking for a web scraping video, but his face drew my attention. I was like, wait, this is Gilfoyle, right? 😂❤...

  • @FFl1s

    @FFl1s

    2 months ago

    Fr

  • @olasunkanmioyetunji9254
    @olasunkanmioyetunji9254 1 month ago

    Can you recommend a course for learning web scraping? One that teaches the tools and techniques you mentioned, along with other concepts.

  • @phethindabamkhwanazi3546
    @phethindabamkhwanazi3546 2 months ago

    Hey, man do you have another channel where you teach live?????

  • @phethindabamkhwanazi3546

    @phethindabamkhwanazi3546

    2 months ago

    If you have one, please provide the link so I can start learning more.

  • @carsonjamesiv2512
    @carsonjamesiv2512 2 months ago

    GOOD VIDEO🎉👍

  • @v1d300
    @v1d300 2 months ago

    I am working on a project that relies heavily on scraping, so I have been doing a lot of research, and it's really hard to find anything good that is not sponsored by Bright Data. I get it: their marketing team has done a great job tapping a perfect niche of creators who provide valuable information. But it also means almost every good resource ends up being about using Bright Data, and that's not something I want to pay for when starting a hobby project. Anyway, this is a great video either way. I learned about a lot of things I hadn't considered in my planning, like ETL (that's a new rabbit hole I need to dive into) or adaptive content extraction to account for layout changes. I was just assuming I would set up reporting to notify me when I start getting no content, and then fix it. So thank you for that. Do you set up Redis or something so that some requests are served from a cache of recently requested data rather than scraping again or accessing the DB? Is that necessary? And at what point should a webhook be set up, and for what purpose exactly? Thank you
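Two of the ideas raised in this comment can be sketched briefly: serving repeat requests from a cache of recent results, and alerting when a scrape comes back empty. This is a minimal sketch, not a recommendation from the video. The in-memory dictionary stands in for Redis, and `TTLCache`, `check_extraction`, and the `alert` callback are hypothetical names introduced only for illustration.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after `ttl_seconds`.

    A stand-in for an external cache such as Redis; the clock is injectable
    so expiry can be exercised without waiting.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        """Return a fresh cached value, or call fetch(key) and cache the result."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch(key)
        self._store[key] = (now, value)
        return value


def check_extraction(url, records, alert):
    """Call `alert` when a scrape yields nothing, a likely sign of a layout change."""
    if not records:
        alert(f"no records extracted from {url}; selectors may be stale")
        return False
    return True


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=300)
    scrape = lambda url: [f"record from {url}"]  # placeholder for a real scraper
    rows = cache.get_or_fetch("https://example.com/list", scrape)
    check_extraction("https://example.com/list", rows, alert=print)
```

Whether a cache is worth adding depends on how often the same page is requested within the expiry window; for a hobby project that scrapes each page once per run, the empty-result check alone may be enough.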

  • @johnknox4293
    @johnknox4293 2 months ago

    interesting....thanks man

  • @javancheongyujing2531
    @javancheongyujing2531 2 months ago

    Is web scraping under data science or software engineering structure?

  • @dmytro-skh
    @dmytro-skh 1 month ago

    This video is what I need. But whoa, the screens with code change so fast... I'm too old at 35 to hit the pause button that quickly 😅 Do you have some links with those hacks?

  • @realshiiiiiit8349
    @realshiiiiiit8349 2 months ago

    Damn this guy is cool

  • @JoaquimDornelles95
    @JoaquimDornelles95 2 months ago

    My fucking hero

  • @einekleineente1

    @einekleineente1

    2 months ago

    are there vids of that ???

  • @francishubertovasquez2139
    @francishubertovasquez2139 2 months ago

    Speaking of Females, if Hitler's fuhrer have Magog carrier of motorized machine monsters then the Northern Magog have ice snow predominant in their place near Arctic circle, and ice surface can better conduct gases and science elements and compounds interaction which can attract those science things from everywhere, who between them is stronger except for the Super Magog Dark Matter? Will they suffice at full force during the final battle end times?

  • @storymode9085
    @storymode9085 2 months ago

    wow... i got a long way to go

  • @VishalJangid1
    @VishalJangid1 2 months ago

    hopefully brightdata ain't a snitch 🫠

  • @botobeni
    @botobeni 1 month ago

    12:30 nuh uh 🗿🗿

  • @user-ut4so1vy3b
    @user-ut4so1vy3b 2 months ago

    Your mustache looks like a hedgehog 😂

  • @YouStillNeedToSleep
    @YouStillNeedToSleep 1 month ago

    Examples. Are you a Leo? he he

  • @abe_is_live
    @abe_is_live 2 months ago

    stop web scraping