Claude 3 Just Released - “Outperforms GPT-4 And Gemini in Every Category!”

There is a brand new version of Claude: Claude 3 was just released by Anthropic, and it's a pretty big upgrade from Claude 2.
Claude 3 beats GPT-4 and Gemini in the top benchmark tests.
Claude 3 comes in three different models: Haiku, Sonnet, and Opus.
All models of Claude 3 have vision capabilities.
The best model, Opus, requires a paid subscription to Claude 3 Pro.
In this video, I'll test its vision capabilities, writing ability, image-to-code capabilities, and coding capabilities.
You can read the full blog post here: www.anthropic.com/news/claude...

Comments: 44

  • @JOHN.Z999
    @JOHN.Z999 3 months ago

    I believe that the launch of GPT-5 will take place next week, but it would be amazing if it happened this week. That way, in addition to celebrating the one-year anniversary of GPT-4, we would have the chance to constantly talk about GPT-5. I hope that GPT-5 will exhibit reasoning far superior to all currently available models. With this, OpenAI would quickly silence critics and envious voices.

  • @EDashMan
    @EDashMan 3 months ago

    Love your benchmark and comparison tests: simple, effective, and not too long. I've seen a bunch of AI vids released around the same time as the Claude model, but as soon as I saw yours I had to click first. You reckon you could do more coding examples?

  • @jd_real1
    @jd_real1 3 months ago

    I'm impressed with it. I asked Claude how to fix my car and the response matched GPT-4, and they were right. I also uploaded a picture of a mole and asked if it was skin cancer. Claude said that it didn't display markings of cancer but that I need to ask my doctor. GPT-4 straight up told me nothing, said it violated its TOS, and said to only go to the doctor. I also went to the doctor and he said it wasn't cancer. I'll probably switch

  • @tomgreen8246
    @tomgreen8246 3 months ago

    Been playing with it at work today... it's exceptional. Surprised and impressed. Wish it had web browsing though

  • @GamerEngineer1345
    @GamerEngineer1345 3 months ago

    Can't wait for Perplexity to add Claude 3 to the group of models that can be used in Copilot mode. It's gonna be epic

  • @totempow

    @totempow

    3 months ago

    It's in Poe as of now. File upload is a little odd though. Not taking pictures. wah wahhhh.

  • @timooothy1234

    @timooothy1234

    3 months ago

    @@totempow As far as I know, it accepts .docx (Microsoft Word document) files

  • @JaddOnTheTrackakaJOTT

    @JaddOnTheTrackakaJOTT

    3 months ago

    what if i told you they already did on their web browser

  • @totempow

    @totempow

    3 months ago

    @@JaddOnTheTrackakaJOTT I'd be a little happier.

  • @GamerEngineer1345

    @GamerEngineer1345

    3 months ago

    @@JaddOnTheTrackakaJOTT Just saw it, but it's limited to 5 queries per day

  • @micbab-vg2mu
    @micbab-vg2mu 3 months ago

    I am surprised how good it is:)

  • @mohamedyasser840
    @mohamedyasser840 3 months ago

    good job man

  • @dhruvgupta4170
    @dhruvgupta4170 3 months ago

    Claude is not available in Canada.

  • @whellockroad

    @whellockroad

    3 months ago

    Wonder how these geographic decisions are made. It's available in my two homes: Thailand AND Sri Lanka....hmmmm.

  • @oryanol
    @oryanol 3 months ago

    Good video as usual. Thanks for the details

  • @chengalvalavenkata2401
    @chengalvalavenkata2401 3 months ago

    If you can upload a report with all your test runs (Claude vs GPT-4) that would be great. :)

  • @FamousTVvoice
    @FamousTVvoice 2 months ago

    Referring to 8:00: this might be very subjective, but historically the postscript dates back to when correspondence was handwritten or typed, which made it cumbersome to incorporate any afterthoughts or additional information into the body of the letter without rewriting the entire message. I get it, today it's used to stress or highlight a point, but then *better and more effective writing* would negate that. Just a thought ....

  • @seventyfive7597
    @seventyfive7597 3 months ago

    So you tested if Claude team fine tuned their model to the snake question, and that's nice that the people there are aware of repeated tests, but how about really testing it for code?

  • @SkillLeapAI

    @SkillLeapAI

    3 months ago

    I’m not a developer, but if you have recommendations I can test out, I’m happy to try

  • @seventyfive7597

    @seventyfive7597

    3 months ago

    @@SkillLeapAI Just ask it to perform any task you'd like it to do. Ask it to create a different game of similar complexity to snake. As long as the question has not been asked in the past, you're good to go.

  • @thaholylemon43
    @thaholylemon43 2 months ago

    I am sticking with ChatGPT, as once GPT-5 comes out there will be no competition.

  • @timooothy1234
    @timooothy1234 3 months ago

    I'll let this marinate for some weeks or months so it can be better trained by users' input

  • @Futurist_05
    @Futurist_05 3 months ago

    Who is actually testing each version of these AI models when they release it to the public? I mean the comparison table? Is there any regulation?

  • @SkillLeapAI

    @SkillLeapAI

    3 months ago

    As far as I know, those are internal benchmark tests they run.

  • @sfinford
    @sfinford 3 months ago

    YOOOO

  • @konrad3
    @konrad3 3 months ago

    Meanwhile in the European Union Claude is still not available... And you'll need a Phone Number to verify your country

  • @qu_entin

    @qu_entin

    3 months ago

    It works with US VPN and they sent me a code to my EU Number; the verification worked .. but I do not know if you are able to purchase the Pro Plan eventually .. did not try

  • @tuvichuanhangngay
    @tuvichuanhangngay 3 months ago

    GPT-4 also runs into problems and can freeze, but not to the extent of Claude's servers, which take about 10-15 minutes to answer a single question. One wishes they had servers like Google's; Gemini is always fast. The video may have overhyped Claude 3's capabilities with its bold claims of outperforming competitors in every area. While Anthropic's model may show strengths in some areas, large language models are very complex, and outright dominance is rare. A more balanced presentation would focus on the specific strengths Claude 3 has, compare them with its weaknesses, and acknowledge that performance can vary by task. It is important to wait for independent verification of these claims, because companies can be biased toward their own products, which casts doubt on sweeping claims.

  • @qu_entin
    @qu_entin 3 months ago

    at least Gemini Advanced gave me a 2 months free trial (and I am mind blown compared to GPT 4 and will switch in case OpenAI is not able to adapt) .. Asked Claude (free) a question and return was something "I'm too busy, please try pro version" .. thank you, but this is not the way to generate new customers.

  • @RobloxInsanity
    @RobloxInsanity 3 months ago

    Might keep my subscription if Claude is even better now. I actually use it mainly for helping me write my books, game coding, and a few other very small things. When it came to my book writing, ChatGPT sometimes did it better, like helping me expand a paragraph of story text by adding more detail to what I already typed.

  • @phen-themoogle7651
    @phen-themoogle7651 3 months ago

    Claude 3 is awesome but the servers are 💀.... now that everyone is there lol. And GPT-4 also had issues and would freeze a lot, but not as crazy as Claude's servers, which take 10-15 mins for one reply now. I wish they had Google's servers, Gemini is always ultra fast..

  • @adhumon55
    @adhumon55 3 months ago

    Not impressed. Claude 3.0 models sound more like GPT, rather than sounding human like 2.1 and 2.0 did! Very sad that they destroyed the strength of Claude

  • @TheHistoryCode125
    @TheHistoryCode125 3 months ago

    The video likely overhypes Claude 3's capabilities with its bold claim of outperforming competitors in every category. While Anthropic's model may show strengths in certain areas, large language models (LLMs) are complex, and outright dominance is rare. A more balanced presentation would highlight specific benchmarks where Claude 3 excels, compare its weaknesses, and acknowledge that performance can vary depending on the task. Additionally, it's important to await independent verification of these claims, as companies can be biased towards their own products, making skepticism towards sweeping statements advisable.

  • @SkillLeapAI

    @SkillLeapAI

    3 months ago

    Well, looks like ChatGPT is bad at commenting on KZread videos. Not at all what the video is.

  • @Apokalupsis88
    @Apokalupsis88 3 months ago

    Except it's not true. In an actual head to head vs GPT-4, it was shown to be a bit inferior: kzread.info/dash/bejne/pYxstMtsp5WzlbA.html&ab

  • @SkillLeapAI

    @SkillLeapAI

    3 months ago

    well everyone has had it for like 4 hours. So really can't make a real determination.

  • @SkillLeapAI

    @SkillLeapAI

    3 months ago

    Also Matt's video has the same title which is the claim of Claude and after watching it, doesn't sound like he came to a conclusive answer either.

  • @Apokalupsis88

    @Apokalupsis88

    3 months ago

    @@SkillLeapAI Right, but the claim is in quotation marks, indicating that it's just the claim and not necessarily reality. Matt's conclusion is that Claude didn't beat out GPT-4 and is more expensive. He does point out that GPT won out in logic and dialog use but Claude did very well in the technical portion (centipede game)

  • @SkillLeapAI

    @SkillLeapAI

    3 months ago

    I see. For some reason all his titles say shocking or breaking lately and I can’t keep track. On the consumer side, they are both $20 a month, and I usually compare the consumer-facing chatbot and not the API. But I understand the point. I just don’t think any of us can make any claim of our own with a couple of hours of testing. I do remember Gemini had similar claims and I ended up disagreeing with every benchmark. So we will see

  • @SkillLeapAI

    @SkillLeapAI

    3 months ago

    I added quotes too so it’s clear it’s their claim and not mine.

  • @alejandrones5238
    @alejandrones5238 3 months ago

    First ❤