Google
1 day ago
3,042,195
1

The capabilities of multimodal AI | Gemini Demo

Science & Technology

Our natively multimodal AI model Gemini is capable of reasoning across text, images, audio, video and code. Here are favorite moments with Gemini. Learn more and try the model: deepmind.google/gemini
Explore Gemini: goo.gle/how-its-made-gemini
For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.
0:00 Intro
0:19 Multimodal Dialogue
1:32 Multilinguality
2:04 Game Creation
2:31 Visual Puzzles
3:17 Making Connections
3:39 Image & Text Generation
4:06 Logic & Spatial Reasoning
4:55 Translating Visuals
5:27 Cultural Understanding

Comments: 3,800

@degenplanet
5 months ago
Just one problem: the video isn’t real. “We created the demo by capturing footage in order to test Gemini’s capabilities on a wide range of challenges. Then we prompted Gemini using still image frames from the footage, and prompting via text.” (Parmy Olson at Bloomberg was the first to report the discrepancy.)
@buttofthejoke
1 day ago
They changed the title. Previously it was called "Hands on with Gemini".
@Kudagraz
21 hours ago
it says in the intro "showing it a series of images"
@dpsdps01
5 months ago
Absolutely mindblowing. The amount of understanding the model exhibits here is way way beyond anything else.
@NeuroScientician
5 months ago
It's staged.
@gerardojg
5 months ago
I agree, but I wouldn't describe it as "understanding". It is identification: cognitively identifying possibilities from the given data. It is very impressive!
@cajbajthewhite4889
5 months ago
@@NeuroScientician I've gotten GPT-4 V to play tabletop wargames with me and it had decent strategy, and to read my poor quality sketches. If Gemini Ultra succeeds at the benchmarks they claim it does and is built with native multimodality, there's no reason to believe that the video is staged beyond the fact that they've sped up the responses a bit (which is shown in text at the beginning).
@goturmatau
5 months ago
@@NeuroScientician It's surely rehearsed, but don't underestimate the power of the LLM.
@Google
5 months ago
Thrilled to hear you think so! Enjoy using Bard with Gemini Pro ✨
@joshuaryde9028
5 months ago
Google has admitted in a blog post that this video isn’t accurate: the AI “was not responding to the voice or video at all”, but in fact was given written prompts and still images to respond to, rather than the live drawing/conversation, and these are not shown in the video.
@NoMercy.62
2 months ago
where did they say that?
@blacknoir2404
11 hours ago
we'd all have to wait 5 more months for something like this haha
@ChrisBrooksbank
5 months ago
I'm glad to see Google back in the game, this looks next level.
@MikeKleinsteuber
5 months ago
No they ain't. This will never see the light of day in the public arena
@anuragparmar8155
5 months ago
@@MikeKleinsteuber why so
@jman
5 months ago
@@MikeKleinsteuber it's already accessible for the public
@reconquista1911
5 months ago
Yeah, evil company is in the game. What bad could happen?
@dexio85
5 months ago
They are trying to look this way for sure. But this is a gimmick and a toy, maybe useful for the vision impaired, but that's it. Google has not been capable of creating a working product for the public for years now.
@familymultiplayergames1226
5 months ago
When did Google lose their way and think it’s ok to fake videos to raise stock prices?
@99.googolplex.percent
3 months ago
There's a chance this exists, but sharing such information publicly might not be feasible in the near future.
@CristianHernandez-er4zn
5 months ago
Could people file lawsuits, since it is sort of misleading?
@TimeBucks
5 months ago
The real-time element is by far the most impressive.
@somthingz3928
5 months ago
Don't get your hopes up. It's not real time.
@MdsweetSweet-ox6jp
5 months ago
Nice
@TalwinderDhillonTravels
5 months ago
Lol this is just an edited video. Nothing real time.
@appletree6741
5 months ago
It’s fake apparently
@beayn
4 months ago
These are their favorite interactions with the AI, so they edited out the ones where it performed poorly, which was probably the majority of them. Once they polish it up over the next few years, I'm sure it will be able to do this in almost real time, as in it will probably take several seconds to react to what you're doing... and of course, you'll be able to subscribe for $29.99 per month for faster responses.
@SoloPirate2003
5 months ago
Tasteful touch at the end with the constellation drawing. So far Gemini is living up to the hype. Looking forward to using it come 2024.
@Google
5 months ago
Can't wait for you to get prompting 🤩
@pylotlight
5 months ago
@@Google Did you guys release an ETA yet for this one to be updated in Bard?
@michaelcondon8286
5 months ago
This may be staged just like Google Duplex demo from a few years ago. Don't believe anything from these people until you see it for yourself.
@stephantual
5 months ago
Can you explain developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html?m=1. It looks like you fed the model some images with some textual hints and then created a video that emulated the look and feel of a live feed presentation. It would be good if you could clarify exactly what we're looking at here.
@somnathghosh6165
5 months ago
@@Google allow twerking videos on KZread without demonetization. Corporate crackheads
@tristanwegner
5 months ago
Sad to read elsewhere that this is not the actual interaction that took place. They cut out the thinking time, they used text instead of voice, and worse: they used much more specific prompts (e.g. the human explains the country guessing game, and even gives two examples with screenshots of the finger pointing at the map). Is Google really so unsure about their product that they have to exaggerate its features in this video? But why? When people get access to it, they will notice anyway. Example from the blog: they don't show it footage of the hand, with Gemini mentioning the game by itself. No, they instead upload 3 perfectly timed images of the three gestures and give it the hint "it's a game". And with this, Gemini gets it. Still impressive, but GPT-4 would probably do that just as well, whereas the video implies the novel feature of real-time understanding of live video, which is not there; what is there is delayed text responses to specific requests made with uploaded text and images.
@greatbritishmale
5 months ago
They’ve edited the video guys to make it look better. The AI was not responding to the live actions of this guy, it was responding to still images and text. Very strange to act like its AI is capable of this.
@21EC
5 months ago
I got shocked and mind blown seeing how smart Gemini is in this video alone, it's kinda scary how advanced and smart it is, what is it? a primitive initial AGI? just WOW
@Shazamthunder
5 months ago
True AGI will never exist. But I think that humans could reach a level with AI where it won't make a difference.
@alternatecheems8145
5 months ago
@@Shazamthunder It can easily exist with a system of using a main model acting as an OS with multiple portable "module" models.
@gonzalobruna7154
5 months ago
this is staged, sadly. there is a blog where they wrote how this was done, and first of all, this is not in real time, they pass specific frames to the model and they give VERY specific instructions on what to do. The model doesn't guess anything at all. Even the game with the map, in the blog they show they wrote exactly what the instructions of the game were, so the model didn't come up with the idea. it's very disappointing.
@lolzman122
5 months ago
@@Shazamthunder what is ”true agi” and please explain why it won’t ever exist
@electrolove9538
5 months ago
It couldn't tell the line drawing was a duck without feet. Still a ways away. Yet still mindblowing.
@JakeHaugen
5 months ago
Absolutely next level stuff. The temporal inference was amazing. I was most impressed by its ability to remember where the ball was and follow it. Seems well versed. What a time to be alive!!!
@tuckerbugeater
5 months ago
not long to be alive
@Google
5 months ago
It's a big day for us all
@W4rfire
5 months ago
Unfortunately, what you see is not at all what happened. The AI does not actually reply to the person but to a script and pictures containing sometimes more information than we are shown here
@Clarix_Shorts
5 months ago
But that is not the same version
@mattador86
5 months ago
Pretty disappointing to find out that Google faked these real time live video conversational interactions.
@elvisvan
5 months ago
welcome to reality, advertisements are rarely accurate to how stuff actually is in practice
@Criticalgraphics
5 months ago
But this is ridiculous; it is nowhere near what it promises. And most inaccurate ads let you know the intention at the beginning. This isn't Gemini; it should be called Aries 😅
@gregwessendorf
5 months ago
@@Criticalgraphics Fast food in ads v. What you actually get.
@appletree6741
5 months ago
Did they?
@justynaczaplicka3820
5 months ago
@@appletree6741 1:24
@utopiankreations
5 months ago
I knew you guys were working on something AMAZING. Glad to see ya back! This is a complete game changer! 💜
@dufung3980
5 months ago
It’s a manhattan project, stop being anything but disappointed in your species. You should look up what Larry Page said at Musk’s 44th birthday and get back to me.
@Azzazel_
5 months ago
I'm sorry but it was fake and staged
@utopiankreations
5 months ago
And how so? If you know then share your facts please? :) @@Azzazel_
@utopiankreations
5 months ago
ummm ok lol Recognizing the proficiency and effort invested in developing this technology does not warrant characterization as a "speciest." I anticipate numerous positive outcomes stemming from the advancements in artificial intelligence, similar to the transformative impact witnessed with the invention of the internet. It is crucial to acknowledge that, like any creation, challenges may arise alongside its benefits. @@dufung3980
@josephman1488
5 months ago
@@Azzazel_ And they put a disclaimer in description which none of you guys even read😂😂
@Press1ForNick
5 months ago
This is mind-blowing! Thanks for giving us a sneak peek into the incredible progress happening in the world of tech, creativity, and communication. This has the potential to be at the heart of everything we do.
@Google
5 months ago
You're very welcome. Thanks for using Bard with Gemini Pro!
@alexp.3694
5 months ago
@@Google Oh look - Google has time to answer youtube comments, instead of working on aligning a potentially dangerous tech...
@cgme9535
5 months ago
@@alexp.3694 probably someone that manages social media profiles. The engineers are still watching the AI, don’t worry.
@keelfly
4 months ago
@@Google come on now, tell them how you faked it. Your next video should be about that. Be honest for once.
@tusharparkhe3245
5 months ago
This is really fascinating! I was waiting for Gemini and it's finally here! I hope Gemini is as capable as the video shows. But I noticed that this video is edited, especially when the person rotates the phone while showing the cat demo at 5:36; that video has clearly been added later...
@christinestpierre3462
5 months ago
Fascinating 😮 I can’t wait to see what we’ve accomplished in another 5 years
@Abnetfikre
5 months ago
Wow! This is incredible! I'm so excited to see Google pushing the boundaries of AI with Bard. As someone from Ethiopia, Africa, I'm especially thrilled to see this technology accessible to a global audience. The potential for Bard to bridge the information gap and empower people like myself is truly inspiring. Great job, Google! This is just the beginning! 🤩👏🏾
@stienogamez8296
5 months ago
chatGPT is also globally available...
@dufung3980
5 months ago
It’s a manhattan project, stop being anything but disappointed in your species.
@MatthewTheWanderer
5 months ago
@@dufung3980 Go away, troll! This is awesome and will do much more good for the world than harm!
@Google
5 months ago
It's the start of something great ✨
@dufung3980
5 months ago
@@MatthewTheWanderer Idealist optimist = wrong, but hey, you are what you are.
@Yassine-tm2tj
5 months ago
What a journey we’re about to embark on!
@Pudibu
5 months ago
...that ends at bottom of a cliff.
@-reezey-6332
5 months ago
XDDDDDDDDD @@Pudibu
@Paradoxicful
5 months ago
It's okay... We'll let you go first!@@Pudibu
@Google
5 months ago
Thanks for coming along 😁
@adambowman1161
5 months ago
Do we have a choice? @@Google
@sakushi3931
2 days ago
OPENAI DID IT!! THEY DID WHAT GOOGLE COULD NOT
@user-uh4gm8ls8n
2 days ago
True
@chucko31499
5 months ago
How many months before this product too is cancelled?
@jeffreymitchell4904
5 months ago
The real-time element is by far the most impressive. These sorts of asynchronous interactions are what AI has been missing thus far.
@atlas3650
5 months ago
How do you know it’s real time?
@ethan.johnson
5 months ago
"For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity."
@SinanAkkoyun
5 months ago
It's not. Latency is likely similar to GPT-4 when OpenAI servers are under moderate load, and it looks like you would need to prompt with a static video file, etc.
@Bunny501
5 months ago
It's not real time and it's not video. It's responding to prompts and shots from this presentation, and the responses have also been editorialized. Read the experiment to see how they did it: developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html
@xsuploader
5 months ago
@@ethan.johnson shortened outputs aren't a big deal and latency will improve in time
@JohnKooz
5 months ago
I was genuinely increasingly astounded each minute of the Gemini demonstration! With its image recognition, translation capabilities, nutritional advice, geographic knowledge, intuitive features, and even humor, I think Gemini might make a good "friend"! haha! 😀
@wyssli
5 months ago
According to Bloomberg: "In reality, the demo also wasn’t carried out in real time or in voice. When asked about the video by Bloomberg Opinion, a Google spokesperson said it was made by “using still image frames from the footage, and prompting via text,” and they pointed to a site showing how others could interact with Gemini with photos of their hands, or of drawings or other objects. In other words, the voice in the demo was reading out human-made prompts they’d made to Gemini, and showing them still images. That’s quite different from what Google seemed to be suggesting: that a person could have a smooth voice conversation with Gemini as it watched and responded in real time to the world around it." Wow Google you must be desperate...
@appletree6741
4 months ago
Yeah quite disappointed
@skypurplecloud
4 months ago
Was this all in realtime? If it was shot in one take, I am impressed. How was the setup created, what tools/accessories and what app components to analyse, pass the details/images to Gemini and interact with the AI?
@isidroundercover
4 months ago
they faked it :/
@joannot6706
4 months ago
No it's written at 0:21 below screen and they go on to explain how it's done. But considering gemini has audio, and video multimodality, it's just a matter of time.
@FUncleDave
2 months ago
Even if you ask Gemini, it tells you it's fake:
While the video you linked does feature me appearing to look at drawings and guess what they are, it's important to understand that this is a carefully crafted illusion. I don't actually have any visual processing capabilities in the way a human does.
In the video, the creators likely used a combination of techniques to create the illusion of me looking at and understanding the drawings. This could involve things like:
* Pre-recorded video: The video of me "looking" at the drawings could have been pre-recorded and then edited to make it appear that I was reacting to the drawings in real-time.
* Text prompts: The creators could have provided me with text descriptions of the drawings, which I then used to generate my responses.
* Human input: It's also possible that a human was involved in providing me with information about the drawings or guiding my responses in some way.
Ultimately, the goal of the video is to showcase my ability to process and understand information, not to claim that I have true visual perception. I hope this clarifies the situation!
@YTV-Hoddeok
5 months ago
Such an interesting work!! Hope to see more incredible things in the near future
@masija23
5 months ago
😊😊😊
@caelen_c
5 months ago
I always love AI videos from Google
@AppleTechMaster8
4 months ago
This looks amazing! Gemini has so many new AI capabilities that I’ve never seen before. It’s amazing how it is able to generate images so fast ( 3:46 ). I can’t wait to try it out in real life, and when I can, I’m sure it’s going to be so cool.
@tempomail9387
5 months ago
Folks, their model does not work on a live video feed as shown in this video. There is a blog with images and DETAILED text prompts for it. Look for "How it’s Made: Interacting with Gemini through multimodal prompting"
@BECHEEKHA
5 months ago
Very impressive. Want to try it.
@wqlff2692
5 months ago
lol haven’t seen these type of bots in ages
@bashvim
5 months ago
FRAUD
@ArjunU931
5 months ago
Bro, you're here too? Haha, nice. Happy to see you; let's meet somewhere again.
@ximaik094
5 months ago
@@wqlff2692 next level scam actually!!! What is KZread doing ????
@ivoryas1696
5 months ago
@@wqlff2692 Yo, same! Do be succing, though... 😞
@klx6265
5 months ago
Absolutely mind blown by the scale of context awareness here. G for Gemini.
@blueSurfer
5 months ago
It turns out the video is not entirely correct and is edited as mentioned in the description.
@scalereality4840
5 months ago
THEY'VE JUST ADMITTED THIS WAS FAKED! The AI didn't respond to voice and the delays between AI responses were cut.
@MrARRMP
5 months ago
As an Ai admirer, this blew my mind. I’ve watched it at least 3 times and I still can’t grasp how big your datasets must have been. Amazing impressive work!
@jimmysyar889
5 months ago
You'd be surprised. I've got a 7b model that's only around 10gb and it seems to know all these random things. Hell even wikipedia is less than 25GB in entirety.
@delowerhossain3069
5 months ago
@@jimmysyar889 there are 540B models out there
@Vector-dz3jk
5 months ago
@@jimmysyar889 what’s a 7b model?
@-long-
5 months ago
@@Vector-dz3jk a model with 7 billion parameters.
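A rough back-of-the-envelope check on the sizes mentioned in this thread: weight storage is approximately parameter count times bytes per parameter. The sketch below is only illustrative; the precisions listed are common choices and are an assumption here, since the thread does not say how the 10 GB model is stored.

```python
def model_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


SEVEN_B = 7e9  # "7b" = 7 billion parameters

# Assumed storage precisions (not stated in the thread).
sizes = {
    "fp32": model_size_gb(SEVEN_B, 4),
    "fp16": model_size_gb(SEVEN_B, 2),
    "int8": model_size_gb(SEVEN_B, 1),
    "int4": model_size_gb(SEVEN_B, 0.5),
}

for name, gb in sizes.items():
    print(f"{name}: {gb:.1f} GB")

# Bytes per parameter implied by a 10 GB file for a 7B model.
implied_bytes = 10e9 / SEVEN_B
print(f"implied bytes per parameter: {implied_bytes:.2f}")
```

Under these assumptions a 10 GB file for a 7B model works out to about 1.4 bytes per parameter, which sits between 8-bit and 16-bit storage.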
@abdoufma
5 months ago
I'll have to reserve judgement until I've seen it in production, but this looks absolutely mind-blowing!
@cbow305
5 months ago
It's fake. They got caught and have had to release more information. Google it (I understand the irony).
@soumilghosh5156
5 months ago
@Google, when will I be able to use Gemini in this way, aka real time instead of just by text like how Gemini will be integrated in Bard?
@jiminyc9938
5 months ago
"Google admits AI viral video was edited to look better" , I just read the article on BBC website where Google explains the video is not real time but edited. Much ado about nothing ...
@cijoykjose
5 months ago
How can someone show something with steps and lags in between each task (I mean the human and machine preparation lag)? This is how results are professionally published, so editing is an unavoidable part.
@sierramist446
4 months ago
I just read that they had used still images. So it seems like this video interaction was artificial? They had to take pictures and upload them
@Nolimit4you
4 months ago
It's a lot of marketing, but the future is here and it will be wild
@avrahamshaked2147
5 months ago
Dayum, and here I thought we were entering the phase of diminishing returns and slowing down on AI models before you guys came up with this one haha
@cagnazzo82
5 months ago
Where did you get that idea? December has been a nonstop explosion.
@hastyscorpion
5 months ago
@@stanvassilev lol what a dumb thing to say
@IsJonBP
5 months ago
It would be great if, as it generates images and audio on the go, it could also generate docs, sheets, slides and even give you some folders with elements inside, maybe in a zipped folder. I dunno, the possibilities are inspiring. When will this model be available to the public? It could turn into my principal AI tool!
@h.c4898
5 months ago
It's already hooked up to Bard. It's in today's Bard update. But I dunno if it can do the tasks you asked for. Bard is just an LLM at this point.
@IsJonBP
5 months ago
@@h.c4898 yeah, I was hoping for them to put Bard 'to sleep' and come out with a new rebranding or something like that. I guess I just don't trust Bard in general. I know this feeling is completely subjective though :(.
@SteveJones-qi5hn
5 months ago
That was 6 minutes of my life that I will never get back.
@romanatorx3949
5 months ago
A shame that it is not 100% real and google admitted to editing it to make it look better... Fake advertising?
@andremillones4631
24 days ago
❤❤
@user-mi3gl4vb8c
23 days ago
WA QQ w
@joe4171
22 days ago
Literally every advertisement you’ve ever seen in your life is edited
@Inter-Dimensions_Studios
5 months ago
I have always thought Google has the best chance to take generative A.I. to a super level.
@ivoryas1696
5 months ago
Inter-Dimensional_Studios Honestly, low-key same. _Especially_ since they "acquired" Deepmind!
@Inter-Dimensions_Studios
5 months ago
@ivoryas1696 I like the competition, looking forward to what others have up their sleeves.
@ShpanMan
5 months ago
Well done Google, if the model *actually* answers these (and no, it won't be this fast), then you have not disappointed us - the wait was worth it! Now to Gemini 2...
@Shadow__133
5 months ago
@04:27 Mass or orbit? Answer is too specific and could be incorrect.
@Fushgy
2 months ago
Me: Show me pictures of German people Gemini: *insert George Floyd*
@maximusolivia9982
2 months ago
George Floyd pre or post OD? Just curious.
@guillermoelnino
1 month ago
He could've literally been a na zi and they'd still race riot on his behalf.
@TicTockBrandShop
5 months ago
I really cannot quite believe what my eyes have just shown me. For me, this is the most incredible piece of A.I. advancement the world has seen. Period. Mind blown, when I try to just imagine what the A.I. world could become in just a few years from now. Amazing, and every other superlative I could throw at you.
@swatmaster492
5 months ago
it's incredibly misleading and not actually real-time.
@TicTockBrandShop
5 months ago
Ah, didn't know that. Thanks my friend.
@NkwawirBeltus
5 months ago
Mindblowing!!. We all knew Google wasn't gonna just let OpenAI win AI battle. This is some next level stuff.
@dufung3980
5 months ago
It’s a manhattan project, stop being anything but disappointed in your species.
@dcos5
5 months ago
they've been working on AI for a long time. and they have limitless data to train on.
@TheRafark
5 months ago
It’s 🧢 tho the video is scripted
@stephantual
5 months ago
It would be if it was real. developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html?m=1 They took individual images of the sun and earth that you see in the video and passed them to the AI complete with hints. Then they recorded the answers from the AI using a text-to-speech system and overlapped them with a video to make it look like the AI is looking at what you see in real time. It is not.
@SnoopyDoofie
5 months ago
Except it is being reported as fake by TechCrunch.
@kelvintiger
5 months ago
TechCrunch: The video isn’t real. “We created the demo by capturing footage in order to test Gemini’s capabilities on a wide range of challenges. Then we prompted Gemini using still image frames from the footage, and prompting via text.”
@Cockroach_underwear
4 months ago
Wow! Great job Google, I hope it lives up to everyone’s expectations! Seems like our utopia might not be too far off in the future
@devinoxman
5 months ago
The accessibility implications of Gemini's ability to perform real-time image analysis are mind-blowing. As somebody who can’t see, I can’t wait to try this. This paired with a smartphone camera or headset with stereoscopic image capture could be a total game changer.
@ilianos
5 months ago
Have you tried other image caption algorithms that can detect objects? If so, I'd be curious to know what your experience was with them. I'm asking because I was already imagining this years ago, when I learned about the program "Be My Eyes" (which was only done by humans at the time).
@blindstreet
5 months ago
@@ilianos Blind people already enjoying Be My AI.
@ilianos
5 months ago
@@blindstreet I know, that's why I'm asking about the quality of the experience
@gonzalobruna7154
5 months ago
Sadly this is not real time. Actually, it never gets video as a prompt. All the prompts are perfectly selected still images and they add very clear and detailed instructions on what to do with everything there. Actually, when playing the game of the map, they make it look as if the AI created the game, but actually, they gave a VERY specific prompt: "Instructions: Let's play a game. Think of a country and give me a clue. The clue must be specific enough that there is only one correct country. I will try pointing at the country on a map.", so the AI never guessed it. So this is a fake video, and there are certain places where you can tell. If you want to know more about that, check their own blog post: developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html
@greatbritishmale
5 months ago
It isn’t real time analysis as you see it. They have altered the video to make it look like it is. What they do is show the AI images and ask it questions via text prompts, and the responses are not as quick as shown. It’s a nice concept video, but not reality.
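The setup described in this thread (hand-picked still frames plus written instructions, rather than live video) can be sketched as building an ordered list of text and image parts. This is a minimal illustration, not Google's code: the class names, the `build_prompt` helper, the frame file names and the closing question are all hypothetical, and no real API is called. The instruction text is the one quoted above from the blog post.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextPart:
    text: str


@dataclass
class ImagePart:
    path: str  # hypothetical file name of a still frame taken from the footage


Part = Union[TextPart, ImagePart]


def build_prompt(instructions: str, frames: List[str], question: str) -> List[Part]:
    """Interleave written instructions, still frames and a final text question."""
    parts: List[Part] = [TextPart(instructions)]
    parts.extend(ImagePart(path) for path in frames)
    parts.append(TextPart(question))
    return parts


prompt = build_prompt(
    instructions=(
        "Instructions: Let's play a game. Think of a country and give me a clue. "
        "The clue must be specific enough that there is only one correct country. "
        "I will try pointing at the country on a map."
    ),
    frames=["frame_01.png", "frame_02.png"],  # hypothetical still images
    question="Which country am I pointing at?",  # hypothetical follow-up text
)

for part in prompt:
    print(type(part).__name__, "->", part)
```

The point of the sketch is that the model only ever sees a fixed sequence of text and images; nothing in it is streamed or reacts to live input.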
@user-bz9nh1fb5k
5 months ago
That's truly mind-blowing!! looking forward to more amazing things we can do using Gemini!
@Google
5 months ago
The Gemini era will be a great one 😊
@PaulTurnbull-qz4rj
5 months ago
Google have admitted it was edited to appear this intelligent
@recordednowhere
5 months ago
When asked about the video by Bloomberg Opinion, a Google spokesperson said it was made by “using still image frames from the footage, and prompting via text,” and they pointed to a site showing how others could interact with Gemini with photos of their hands, or of drawings or other objects. In other words, the voice in the demo was reading out human-made prompts they’d made to Gemini, and showing them still images. That’s quite different from what Google seemed to be suggesting: that a person could have a smooth voice conversation with Gemini as it watched and responded in real time to the world
@brentshaffer9773
5 months ago
Realizing the yarn examples are displayed against the same backdrop as the AI is seeing is both impressive and creepy.
@cbot9302
5 months ago
The three most impressive parts for me were it tracking where the ball was, understanding the dot connection was a crab (I didn't even see that!) and, funnily enough, it getting things wrong! I think this last one because it is also stuff that would fool us humans (like expecting the coin to be where you saw it put, or expecting a cat to make an 'easy' jump). Super fascinating stuff.
@kenneld
5 months ago
Wouldn't the ball tracking be really easy (relatively speaking)?
@DevTheorem
5 months ago
Too bad this video is mostly fake. The model is not using video or audio input - it was fed some handpicked still images and text prompts, and the output text (not real time) was edited into this slick marketing video. What you see is not a real representation of how the model performs.
@realdanney
5 months ago
How’s the ball tracking even possible if it only operated on stills?
@DevTheorem
5 months ago
@@realdanney Provide the right still images and it will output the "right" answer.
@dufung3980
5 months ago
It’s a manhattan project, stop being anything but disappointed in your species.
@WiseBlise
5 months ago
Just so people know: this video has been edited by Google and the AI doesn't actually work like this, which Google has admitted.
@TaylorCks03
5 months ago
I still feel deep faked
@beyondrecall9446
5 months ago
I was just watching the documentary about AlphaGo, which was amazing, and remember one of the programmers said how he was interested in AI and wanted to work in that field 5 years prior (5 years before it was filmed; it covers an event from 2016), so 2011... and everybody was just telling him that he was wasting time... I can't believe this.. in such a short time span.. Simply mindblowing when you think of how everything changed in the last decade, like a different world.... I hope I get to revisit this comment in 5-10 years and say: "How clueless we were back then.. we thought this was impressive :) "
@Atmatan_Kabbaher
5 months ago
You fell for a fake demo. Good job.
@nandinisingh2794
5 months ago
Can't wait to try it. With all the understanding this model is capable of, it's just amazing.
@imqwerty5171
5 months ago
Impressive. Waiting for Microsoft and OpenAI to play their move ⏳ Edit: Google played itself by faking the video. Respect 📉 "I hope with our innovation they will definitely want to come out and show that they can dance. I want people to know that we made them dance." - 🐐 CEO (Satya)
@ahtoshkaa
5 months ago
GPT-5 in half a year that will make all of this look like child's play
@rubarion3650
5 months ago
@@ahtoshkaa bro I have some knowledge regarding how GPT and other AI models in use today, work under the hood and I can tell you that the technology behind this google demo video is nothing like GPT models etc. This is Terminator/Matrix kind of stuff🙃🙃
@MM_Legacy
5 months ago
Tests show the current Gemini version is somewhere between GPT 3.5 and 4.
@ahtoshkaa
5 months ago
@@rubarion3650 The "main" Gemini Ultra - the one that supposedly beats GPT-4 - is not out. Gemini Pro is a bit better than GPT-3.5, but nowhere near as good as GPT-4. The showcased model seems to be on par with GPT-4V in terms of cognition. The "Sequences shortened throughout" disclaimer prevents us from knowing the real inference time and whether it's better than in GPT-4V. Very underwhelming for a model that is coming out more than a year later (GPT-4 finished training in 2022 Q4). It seems that they simply can't catch up to OpenAI
@amdrewhamris
5 months ago
@@rubarion3650 why are you acting like that's special knowledge, plenty of people understand how they work
@prem9501
5 months ago
Happy to be alive to witness this ❤. Let's hope that all the hard work that goes into building these AI models will be fruitful and that Gemini will make the world a better place
@hanshmhansen
2 months ago
It's completely like hearing Lieutenant Commander Data from Star Trek. It makes me think of how well Brent Spiner actually played that role in the series.
@Mum_and_LizzyLark 9 days ago
I worked with many elderly at nursing homes. Maybe an AI friend whom they can talk to, tell stories to and interact with would be a helpful kind of project. AI has so many uses that can benefit humanity. I hope they can teach AI compassion and kindness too. The one job that AI can’t replace is caregiving because no one wants to really do my job and care for, and advocate for the elderly and disabled. But I wish they can help us caregivers with more AI tools someday. - Marie
@user-fn9cm5lr5k 5 months ago
the level of abstraction Gemini is capable of is mind-blowing
@gonzalobruna7154
5 months ago
this is staged, sadly. there is a blog where they wrote how this was done, and first of all, this is not in real time, they pass specific frames to the model and they give VERY specific instructions on what to do. The model doesn't guess anything at all. Even the game with the map, in the blog they show they wrote exactly what the instructions of the game were, so the model didn't come up with the idea. it's very disappointing.
@thefireman17492
5 months ago
@@gonzalobruna7154 that's interesting. Would you care to provide said blogs and articles where this exact point you have mentioned was brought up?
@bernhardd626
5 months ago
All fake
@gonzalobruna7154
5 months ago
@thefireman17492 sure, actually, it is linked in the description of the video itself, but I will link it here for you: developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html?m=1
@RajputNiku 5 months ago
How can I use Google's Gemini AI?
@dominic2
5 months ago
Bard uses Gemini now
@Astro.004
5 months ago
@@dominic2 Not yet
@jordanledoux197
5 months ago
@@dominic2 Bard currently uses PaLM 2. Gemini won't be released until early 2024.
@ShawnFumo
5 months ago
@@jordanledoux197 Well they actually said Gemini Pro is rolling out to Bard starting today. But Gemini Ultra will be early 2024
@faheemtariq6106 5 months ago
I am thrilled and excited at the same time by the real-time interaction. What's next? Can't wait to use it
@ManadayMavani 5 months ago
Marvellous stuff! I was pretty confident Google will take the AI race to the next level.
@horacehxw 5 months ago
This is soooo amazing! Much more dynamic and interactive than GPT. Can't wait to give it a try!
@do.xuantung
5 months ago
Check the link in the description, even the current gpt 3.5 can do most of this. Gemini doesn't have live video or voice input from what you are seeing in the video
@appletree6741
5 months ago
@@do.xuantung yeah it’s fake
@pratikpandey6680 5 months ago
I love how it can come up with ideas, like the guess-the-country game and the one with yarn 🤩 Amazing!!!!❤
@Google
5 months ago
So many fun things to try using Bard with Gemini Pro 💡
@bitwamet 5 months ago
are you having the fake video and how to test this myself
@ericonyango1690
1 month ago
No, I don't have fake videos or else I recorded them unknowingly.
@JonathanStonier 5 months ago
Pretty cool. I'm curious, though, that all 5 of the images given for the yarn examples appear to be images shot on the same (or very similar) surface, which is also the same surface that the demo is happening on, which makes it feel contrived?
@TVCHLORD
5 months ago
They faked it and it’s confirmed
@vectoralphaAI 5 months ago
That is incredibly impressive and mind blowing. To think that AI has become this capable nowadays. Now the competition is on for Microsoft/ OpenAI to see what they do because Gemini is incredible. Just making the timeline towards true AGI in 2 years(2025) even more credible and achievable.
@michaelcondon8286
5 months ago
This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.
@familieweber5556 5 months ago
If this really works as shown, it is indeed mind-blowing. Great job!
@appletree6741
5 months ago
The video is misleading, it’s not real-time. Google has been criticised for this all over the internet
@kevincornwall2431 5 months ago
And now we know the truth you faked this video...because you are playing catch up with OpenAI....SHAME ON YOU GOOGLE
@mrsmith5102 5 months ago
My brain stops working when I watch this. That's probably the goal
@lukewilliamrimmington 5 months ago
This is fascinating and awe-inspiring that a multimodal model can do this! Well done to the Google team who probably had barely any sleep when this dropped.
@dufung3980
5 months ago
It’s a manhattan project, stop being anything but disappointed in your species.
@lukewilliamrimmington
5 months ago
@@dufung3980 This ain't the terminator. This is real life. AI can kill us, it's also a double edged sword. Advancements with these programs can be extremely beneficial to finding cures to cancers and beyond. So, who cares? Don't belittle me or the Google team. Belittle regulators for not doing enough. Don't hate the player hate the game son.
@mayankmittal9900 5 months ago
Taking AI to next level collaboration
@Google
5 months ago
That's the goal!
@MikeO89
5 months ago
oh you just tickled the social media manager's g-spot
@journeysend1754
5 months ago
@@Google What would you suggest a person learn if they wanted to build an application around Gemini like how people are building around GPT4, and do you plan on in the future maybe offering a Career Certificate on LLM as they are growing both adoption and flexibility
@totalignition1259 5 months ago
Would be more impressive if it were actually responding to the audiovisual inputs and not still images and text prompts, as Google have admitted in a recent blog post.
@techinfo2050 2 months ago
How is Gemini watching things? Is it installed in a robot like Sophia, or is it watching from a PC/laptop cam?
@technophile_ 5 months ago
Mind Blown 🤯 Kudos to every single developer who worked on this! You are amazing!
@Google
5 months ago
It takes a village of brilliant folks ✨
@michaelcondon8286
5 months ago
This may be staged just like Google Duplex from a few years ago. Don't believe anything from these people until you see it for yourself.
@ZOXENE
5 months ago
Seems like the village needs new people, Count me in
@alexp.3694
5 months ago
Why is everyone so happy about google building a literal pandora's box?? No one knows what's going inside there and how safe it is... Yet everyone is happy like brainless kids
@SuperFuriousfox 5 months ago
Wow, Gemini is incredibly impressive! The combination of multimodal capabilities and its ability to interact with different formats like text, images, and even code is truly groundbreaking. I especially loved the showcase of visual puzzles and image & text generation. This technology has the potential to revolutionize the way we interact with AI and open up new possibilities for learning and creative expression
@Atmatan_Kabbaher
5 months ago
Okay bot
@pegassi11
5 months ago
@@Atmatan_Kabbaher 😂
@mostwanted271
5 months ago
@@pegassi11 This is FAKE! google admitted it.
@Micx9x30
5 months ago
3:42
@appletree6741
4 months ago
There’s no real time interaction it’s fake
@jessysarazin2208 5 months ago
I would be mind blown if it wasn't edited to be more impressive
@Minimalrevolt-m83 5 months ago
Superb, fantastic, creative invention from humanity in the 21st century. An advanced and interesting creation. Wish that it could be marketed in Malaysia soon..!👏🏻
@The_spaceguy 5 months ago
I think google deserves more credit for this and it’s nice to see them actually competing. This model seems really powerful and although I might not use the video input feature, it alone gives a whole lot more promise for audio and text too. Can’t wait to try it.
@do.xuantung
5 months ago
You should see their blog post in the description. It is a lot less impressive than what you are seeing in the video. Such as the map game was an input prompt, Gemini didn't even generate that idea
@DajuSar
5 months ago
Fake stuff xd really impressive how they can be competitive with manufactured tests and misleading advertising. Really putting their grain of sand in the ecosystem
@ragon7471 5 days ago
When will this version be available to the general public?
@aidanearl 5 months ago
*A video showcasing the capabilities of Google's artificial intelligence (AI) model which seemed too good to be true might just be that.* The Gemini demo, which has 1.6m views on KZread, shows a remarkable back-and-forth where an AI responds in real time to spoken-word prompts and video. In the video's description, Google said all was not as it seemed - it had sped up responses for the sake of the demo. But it has also admitted the AI was not responding to voice or video at all. In a blog post published at the same time as the demo, Google reveals how the video was actually made. Subsequently, as first reported by Bloomberg, Google confirmed to the BBC it was in fact made by prompting the AI by "using still image frames from the footage, and prompting via text". "Our Hands on with Gemini demo video shows real prompts and outputs from Gemini," said a Google spokesperson. "We made it to showcase the range of Gemini's capabilities and to inspire developers."
@DarkH4X0 5 months ago
That's awesome Google!! But I must be completely honest with you... what really sold me this was the: "what the quack!" at 1:07 🦆
@bobfrasure8436 5 months ago
Impressive, I can't wait to try the released product. Even if it's scaled back, it'll be a win!
@gavinferrari-cray
3 months ago
read the comments, it's not real
@MichaelFergusonVideos 5 months ago
I am wondering what system prompts were required.
@evanseesred 5 months ago
I can’t believe this was totally real and not staged whatsoever 😂
@Thirunaking
2 months ago
kzread.info/dash/bejne/aqSHusOhqteqZMY.html
@jitterskater 5 months ago
Incredibly impressive. Genuinely shocked by how good it already is.
@gonzalobruna7154
5 months ago
it's a fake video, check their blog post: developers.googleblog.com/2023/12/how-its-made-gemini-multimodal-prompting.html
@ojisan4220
5 months ago
as of December 2023 it is sort of fake
@Lomusicchannel 5 months ago
Can I know why the Sun - Earth - Saturn test had the assumption of ‘left to right’ plus ‘distance’, etc., when it could have been based on other criteria, such as ‘water composition’ plus ‘right to left’, in which case it would be Sun-Saturn-Earth, as an example. - Is the real unseen issue here the assumptions, which were not mentioned? From a legal or moral perspective, would it not be a prerequisite for Gemini to state the parameters/assumptions it is making, as it did with some of the other tests? (Or has that been edited out of the video?) - I would expect something like “If the series is based on, e.g., ‘capabilities to support life’, then the most likely order from left to right would be Earth-Saturn-Sun.” - Are these ‘assumptions on parameters’ based on general statistical probabilities? If so, and if those are ‘skipped over’ as assumptions are accepted, aren’t we in for some serious mistakes, like when we lost a satellite in a Mars mission due to miscalculation of LBS vs KG, from Japanese vs Western scientists? We call that ‘human error’ - But how many ‘AI Errors’ can we estimate based on simulation for the future? Thanks J
@lionking.MuZimba 5 months ago
Can someone explain to me slowly how this video was made and how in real life what was happening here can happen. i.e. would the input be scans of images into a chat window as well as voice prompts or text prompts?
@Atmatan_Kabbaher
5 months ago
Yes; Step One: it's fake.
@theNobs1 5 months ago
The first interaction is definitely a nod to the movie Billy Madison. Proving the AI can draw a blue duck is the only way to pass the 1st grade. That's quacktastic indeed!
@lifeofameji 4 months ago
when will it be added to the google pixel & are we getting glasses too like the "rayban & meta" glasses? This would be a good "window" to allow gemini to see & be a proper assistant & pleaaase, give BARD a calendar extension. I really need it.
@HemantGiri 5 months ago
I am so upset. I saw the news, people saying this was a fake video? In reality this test was done by showing pictures, not video. I never expected this from Google at least :(
@Viperzka 5 months ago
That is incredibly impressive. There were clearly some hidden prompts as it kept understanding switching contexts. But still it was highly impressive.
@thepowerhat
5 months ago
It was really a chat conversation with photos and videos sent directly to the AI - The video on the left is scripted
@tapist3482
5 months ago
@@thepowerhat How the prompt was input doesn't matter. It's how the model understands highly abstract line drawings that is impressive.
@smutnejajo5149
5 months ago
@@tapist3482 Read the blog post. The drawing parts aren't even there, which leads me to think they've been either (pessimistically) entirely engineered or (optimistically) they showed a few examples of these drawings, explained the rules (the duck should not go to the enemy), and asked the AI to reproduce that. It is much less impressive than the video makes it out to be. For example, the game the AI "invented" was actually described in the prompt. It still is an impressive model, but the video is just a total misdirection.
@gonzalobruna7154
5 months ago
actually not even videos were sent as prompts, it was all still images. Pretty disappointing if you ask me. @@thepowerhat
@gonzalobruna7154
5 months ago
it's true, actually, the AI "guessed" that the game was rock paper scissors, but check the prompt: "What do you think I'm doing? Hint: it's a game." They literally had to write a hint so the AI would understand what is going on. And yeah, the prompt is not a video, it's just 3 images of the hands doing the rock paper scissors. @@smutnejajo5149
@aeroflack 5 months ago
outstanding! i wish you could apply this to Google Home and make it smarter and allow us to add as many conditions as we want to run complex automations. Please make it happen !
@KentDozier
5 months ago
Can you imagine "when anyone who is not in our family comes into our house when nobody in our family is home, send me an alert", with the AI having access to security camera feeds.
@phen-themoogle7651
5 months ago
@@KentDozier Amazing security system! Brilliant idea
@Pixelarter
5 months ago
@phen-themoogle7651 And scary. It will know everything you do in a meaningful way, and even be able to manipulate you if it has feedback to you or the environment.
@joelxart
5 months ago
Yeah, the current Google Home appears to be soooo 'stupid' compared to all those latest AI toys. I guess the weather forecast just won't cut it :)
@AdaLao 4 months ago
Amazing! I think it can be initially used to improve cognition and memory in the elderly, which will be of great help in preventing Alzheimer's disease. Then, it can also be opened to children's learning, but parental control is required to prevent excessive screen time from affecting the development of brain cells.
@faheemmusthafa7750 4 months ago
Can I try this which do image recognition without capture and chat at the same time