
GPT-4 Vision Browsing Part 2: Following links with Puppeteer

In today's video I continue my GPT-4 Browsing project and make it follow links on pages.
GitHub: github.com/unc...
Support: buymeacoffee.c...
Consultations: www.buymeacoff...
Memberships: www.buymeacoff...
00:00 Recap
02:41 Finding all clickable elements
31:06 Making GPT-4 Vision read link texts
47:03 Migrating to JavaScript
1:01:58 Clicking links by link text in Puppeteer
1:08:42 Making it conversational
1:17:40 It works! (almost)
1:20:00 It really works!
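
The "Clicking links by link text in Puppeteer" step could be sketched roughly as below. This is a minimal sketch, not the code from the video or its repository: the helper names `normalize`, `findLinkIndex` and `clickLinkByText` are hypothetical, the `a, button` selector is an assumption about what counts as clickable, and `page` is assumed to be a Puppeteer `Page` that has already loaded a URL.

```javascript
// Hypothetical helpers for illustration; not taken from the video's repository.

// Collapse whitespace and lowercase so "  Read   More " matches "read more".
function normalize(text) {
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}

// Return the index of the best candidate for `wanted`:
// an exact match first, then the first text containing it, else -1.
function findLinkIndex(linkTexts, wanted) {
  const target = normalize(wanted);
  if (target === "") return -1;
  const normalized = linkTexts.map(normalize);
  const exact = normalized.indexOf(target);
  if (exact !== -1) return exact;
  return normalized.findIndex((t) => t.includes(target));
}

// `page` is assumed to be a Puppeteer Page. The selector is an assumption.
// Resolves to true if a matching element was clicked, false otherwise.
async function clickLinkByText(page, wanted) {
  const handles = await page.$$("a, button");
  const texts = await Promise.all(
    handles.map((h) => h.evaluate((el) => el.textContent || ""))
  );
  const index = findLinkIndex(texts, wanted);
  if (index === -1) return false;
  await handles[index].click();
  return true;
}
```

Keeping the text matching in a pure function makes it easy to check without a browser. A click may start a navigation, so a caller would probably also want to wait for the new page before taking the next screenshot.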

Comments: 35

  • @marcoaerlic2576
    2 months ago

    Thanks for making these videos. They are a lot of fun and very informative.

  • @unconv
    2 months ago

    Good to hear! Thanks for watching :)

  • @billybofh2363
    8 months ago

    Haven't finished the video yet - but wanted to mention that I've sometimes had GPT behave a little oddly when I specifically use the word 'crawler' in a prompt. It's a bit like it goes into "I shouldn't really be doing this.. mnnghhhhhh!" mode. But telling it that it's a super-duper-wonderful positive thing to help a visually impaired user navigate the web works more reliably. Not sure if it's just random chance - but seemed to work at the time.

  • @unconv
    8 months ago

    "You are a crawler... or else!"

  • @st-hf2ik
    8 months ago

    Following up on the hour google meet I purchased - when can we do it? @@unconv

  • @unconv
    8 months ago

    @st-hf2ik Check your email :)

  • @npizza3973
    8 months ago

    I think you could use another vision LLM with open-source code and not have these kinds of problems.

  • @DJcatamount
    6 months ago

    This oddly works, and errors have reduced a lot. Guess GPTs vary in their helping tendency if the person is disabled?

  • @carstenli
    8 months ago

    That was fun to watch. Can't wait for the next episode. 👍

  • @MindForeverVoyaging
    8 months ago

    Great video. Appreciate that you just kept going, trying this and that, as you fought with visionGPT. It is very informative. Keep going ...

  • @ex3aliber
    8 months ago

    Amazing! Always fun to see real-time problem solving and how to go about it! Just saw the web crawler used by Jason AI in his latest video... been following both of you since April... thanks to both!! 🍻🍻

  • @unconv
    8 months ago

    Awesome!

  • @arjungoalset8442
    8 months ago

    @@unconv can you please make part 3 too :)

  • @CaleyHamiltonProjects
    7 months ago

    @@unconv we def need the part 3 with the input field functionality

  • @Sulayman.786
    8 months ago

    Nice, what I was looking for! Thanks.

  • @techfren
    7 months ago

    Subscribed! thank you so much for the great content!

  • @toapyandfriends
    4 months ago

    Yeah, I'm having some code written for me with Selenium. If you can give me an idea of how this is better, that would be great. These are kind of long tutorials, so a little synopsis would help.

  • @digitalcivilulydighed
    8 months ago

    You just keep on trucking :-) another good one!

  • @RahulGupta-uk1gc
    7 months ago

    Subscribed. Thank you so much for this. Bravo!

  • @user-ep3pm2tw1e
    5 months ago

    Hey man - amazing video. How would you go about deploying this? AWS? Vercel?

  • @m1kecr1s1s
    8 months ago

    Awesome stuff!

  • @jayakrishnanp5988
    8 months ago

    It was amazing

  • @RyanCourtnage
    2 months ago

    One of the issues I ran into with a similar project sending website screenshots to gpt-4o had to do with long web pages. They would generate long skinny images, which were unreadable by the AI. As I understand it, screenshots are resized on OpenAI's end to fit into a square (e.g. 1024x1024), maintaining the original image's aspect ratio. This results in a lot of the text being unreadable (too small). I've tried splitting these long images into part_1, part_2, etc., but it obviously results in some images getting split in nonsensical areas, which also causes problems. Would love to hear your thoughts on this.

  • @unconv
    2 months ago

    In my video "5 Use Cases for GPT-4 Vision API" I scrape an Amazon search results page by splitting the screenshot into parts, but I also "overlap" the parts so that no product gets cut in half. Depending on the specific website you're scraping something like this might work.
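
The overlapping split described in this reply could be computed along these lines. This is a minimal sketch and not the code from the referenced video: the function name `overlappingSlices` and the pixel values in the usage note are hypothetical, and it only calculates the crop ranges, leaving the actual cropping to whatever screenshot or image tool is in use.

```javascript
// Hypothetical helper for illustration; not the code from the referenced video.
// Computes vertical crop ranges that cover a tall screenshot, where each slice
// starts `overlap` pixels before the previous slice ends.
function overlappingSlices(totalHeight, sliceHeight, overlap) {
  if (sliceHeight <= 0 || overlap < 0 || overlap >= sliceHeight) {
    throw new RangeError("need sliceHeight > overlap >= 0");
  }
  const step = sliceHeight - overlap; // distance between consecutive slice tops
  const slices = [];
  for (let top = 0; top < totalHeight; top += step) {
    // The last slice is shortened so it never runs past the bottom of the page.
    const height = Math.min(sliceHeight, totalHeight - top);
    slices.push({ top, height });
    if (top + height >= totalHeight) break; // bottom reached, stop early
  }
  return slices;
}
```

For a 2500 px tall page, `overlappingSlices(2500, 1000, 200)` yields slices starting at 0, 800 and 1600, with the last one 900 px tall. If the overlap is at least as tall as the tallest item on the page (a product card, say), every item appears whole in at least one slice. Each range could then be used as the `clip` region of Puppeteer's `page.screenshot`, though that pairing is an assumption here rather than something the reply states.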

  • @RyanCourtnage
    2 months ago

    Will check it out! 👍

  • @neon_Nomad
    7 months ago

    Useful

  • @gaboguit
    8 months ago

    Great video! I am now a fan. I have pulled the code and the link clicking failed each time. I suspect the Node version, maybe. What Node version are you running?

  • @unconv
    8 months ago

    I have Node 19.6.0

  • @actorjohanmatsfredkarlsson2293
    7 months ago

    Do you intend to push this to the gpt4v-browsing repo? Would be appreciated.

  • @actorjohanmatsfredkarlsson2293
    7 months ago

    Ah, found it, never mind :-D

  • @actorjohanmatsfredkarlsson2293
    7 months ago

    Great video. Really interesting.