Positional encoding was like black magic to me that just works. Now you've introduced integrated positional encoding, which is black magic on top of black magic. How do you guys understand what is happening here?
@michaelcopeman9577 7 months ago
Simply brilliant.
@user-sc5xk6vy8j 9 months ago
I don't understand why the distortion loss can be written like this. Doesn't a cubed term appear? Does anyone know?
@PunxTV123 9 months ago
how to use this?
@kristoferkrus 9 months ago
Sweet, looks like it works well and it's simple to use! Have you tried using a mixture of Gaussians as output and KL-divergence as loss function? It would be interesting to see how well that performs against your method. Granted, even a mixture of Gaussians will be sensitive to outliers unless you include some Gaussian with extremely large standard deviation; we just need some way to enable the network to be able to easily output values that can be interpreted as very large standard deviations.
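For anyone curious what this suggestion might look like concretely, below is a minimal sketch of a per-pixel mixture-of-Gaussians negative log-likelihood (with a single observed value per pixel, a KL-style objective against the predicted density reduces to the NLL up to a constant). This is an illustration of the commenter's proposal, not the method from the video, and all names are hypothetical.

```python
import numpy as np

def gmm_nll(y, weights, means, log_sigmas, eps=1e-12):
    """Negative log-likelihood of targets y under a per-sample mixture of
    Gaussians. y: (N,); weights, means, log_sigmas: (N, K) for K components,
    with each row of weights summing to 1. Uses log-sum-exp over components
    for numerical stability."""
    sigmas = np.exp(log_sigmas)
    # Log-density of each target under each Gaussian component: (N, K).
    log_comp = (-0.5 * np.log(2.0 * np.pi) - log_sigmas
                - 0.5 * ((y[:, None] - means) / sigmas) ** 2)
    a = np.log(weights + eps) + log_comp
    m = a.max(axis=1, keepdims=True)  # log-sum-exp trick
    log_mix = m[:, 0] + np.log(np.exp(a - m).sum(axis=1))
    return -log_mix.mean()
```

A component with a very large predicted sigma acts like the commenter's outlier escape hatch: it contributes an almost-flat density, so outliers stop dominating the loss.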
@patricksullivan3372 10 months ago
This is unreal. So impressive. Thank you for your work!
@MikeBarron 10 months ago
Crazy impressive! Approximately how many input images were needed to render the movies at the end?
@jon_barron 10 months ago
Hey thanks Mike! Those results are around a thousand images: someone walking through the space holding down the shutter button on a DSLR, waving it around. It's a lot of images, but it's surprisingly fast.
@mene2172 3 months ago
@@jon_barron So these are not frames grabbed from a video? Are they hi-res images?
@TheMazyProduction 10 months ago
We’re so back
@greg.skvortsov 10 months ago
That's a damn nice graphical explanation!
@iratemusic4575 a year ago
AI getting smarter, deep genius, and faster. Can AI or ChatGPT help autism get smarter?
@sinlife484 a year ago
Amazing! I'm curious if it works for indoor scene reconstruction; could you please tell me?
@saemranian a year ago
Awesome 😶🌫
@uttamg911 a year ago
Great work!! Are there any accepted submissions that didn’t make the cut for the highlight reel?
@TyroneHilpert a year ago
"PromoSM" ❗
@jeffreyalidochair a year ago
a practical question: how do people figure out the viewing angle and position for a scene that's been captured without that dome of cameras? the dome of cameras makes it easy to know the exact viewing angle and position, but what about just a dude with one camera walking around the scene taking photos of it from arbitrary positions? how do you get theta and phi in practice?
@addincui2617 a year ago
It's a great honor to share some work at CVPR as a non-scientist. Best wishes to CVPR.
@ratdreamsai6324 a year ago
I love all of the great AI creative works highlighted here, and I feel honored to be featured alongside these other creative minds! 🐀❤
@ratdreamsai6324 a year ago
I love both the ai & non-ai work! It's the blend of tools, craft, and creative vision that can make these pieces extra special.
@StormOrMelody a year ago
@@ratdreamsai6324 we're entering a new era of unbounded human creativity <3
@ratdreamsai6324 a year ago
@@StormOrMelody Yes - what a time to be alive! People of all capabilities will be able to bring their visions to life in new ways that were never before possible!
@abenedict85 a year ago
zoom. enhance. Zoom. Enhance. ZOOM! ENHANCE!!!!
@user-hi2bz8ds2h a year ago
Hi Jon, I saw the newest project from you and your colleague Ben Mildenhall. I want to say that it is very impressive and simultaneously useful for plenty of new ideas and projects. We are working on some interesting projects where we think Anti-Aliased Grid-Based Neural Radiance Fields would be the best option to increase effectiveness and productivity. Hence, could we have a direct conversation about the project? I am really looking forward to hearing from you.
@user-by6fr4dj4k a year ago
Which lab did you cooperate with at Harvard?
@magicnifties a year ago
The views of this vid will blow up very soon! 🙌 Great explanation on an advanced topic of AI. 🤓
@shubhashish7090 a year ago
Can we extract the mesh from this to be used in any traditional game engine?
@Instant_Nerf a year ago
Is video playback possible with NeRF?
@jon_barron a year ago
Yeah check out kzread.info/dash/bejne/rHaHqo-kaarIhpc.html
@mahmoodmasarwa3374 a year ago
Any way we can test it?
@jon_barron a year ago
Sure! github.com/google-research/multinerf
@frenchmarty7446 2 years ago
Insanely cool
@ncmasters 2 years ago
Is the code available yet?
@jon_barron 2 years ago
Yep, here you go: github.com/google-research/multinerf
@ncmasters 2 years ago
@@jon_barron Thanks
@changgongzhang6641 2 years ago
For the regularizer at 6:40, the minimum of Loss_dist is achieved by setting w(u) = 0 everywhere, right? Wondering how it can become a delta function?
@jon_barron 2 years ago
You are correct; the reason w(u) doesn't get set to zero everywhere is that doing so would cause the data term of the loss to be extremely high. In that animation I normalized w to sum to 1, which in practice is what happens during training because of the data term.
@changgongzhang6641 2 years ago
@@jon_barron Thanks a lot for your explanation!
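For reference, here's a small NumPy sketch of the distortion loss discussed in this thread, written directly from its two terms (a pairwise term over interval midpoints plus a per-interval self-term). It's an illustration with my own variable names, not the official implementation.

```python
import numpy as np

def distortion_loss(s, w):
    """Distortion loss for one ray.
    s: (N+1,) normalized endpoints of the N ray intervals.
    w: (N,) rendering weights of those intervals.
    The first term penalizes weight spread across intervals; the second
    penalizes weight spread within each interval, pushing w toward a delta."""
    mid = 0.5 * (s[:-1] + s[1:])                # interval midpoints, (N,)
    gaps = np.abs(mid[:, None] - mid[None, :])  # pairwise midpoint distances
    loss_inter = np.sum(w[:, None] * w[None, :] * gaps)
    loss_intra = np.sum(w ** 2 * (s[1:] - s[:-1])) / 3.0
    return loss_inter + loss_intra
```

As the reply above notes, this term alone is minimized by w = 0 everywhere, so it only shapes w into a spike when combined with a data term that keeps the weights from vanishing.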
@Askejm 2 years ago
Will it be publicly available? Is someone working on an implementation?
@jon_barron 2 years ago
Yes, we'll be releasing code soon.
@Askejm 2 years ago
@@jon_barron Does it run on Windows, and will it have a pretrained model?
@saltygamer8435 10 months ago
@@jon_barron Have you released the code?
@bharatsingh430 2 years ago
Looks pretty amazing! To import this into a graphics engine and be able to render new objects in this scene, we would also need the light sources, material properties (diffuse/specular, etc.), and normals (unless the depth maps are super accurate) - wonder if those could also be recovered by modifying this technique...
@timsousa3860 a year ago
Seems quite the challenge
@superatomic9761 2 years ago
this is amazing. does it work with images of people?
@willhart2188 2 years ago
I'd like to try this in VR.
@legoworks-cg5hk 2 years ago
What a time to be alive
@BAYqg a year ago
dear fellow scholars
@Neptutron 2 years ago
This is amazing! Given that the precision of the depth maps is greater than that of SVS's input, could this be used for more accurate photogrammetry?
@legoworks-cg5hk 2 years ago
Exactly what I was wondering
2 years ago
@@legoworks-cg5hk hopefully RealityCapture or Agisoft adopts this fast
@legoworks-cg5hk 2 years ago
@ Is there a way to use RealityCapture without Nvidia?
2 years ago
@@legoworks-cg5hk sadly no.... Agisoft can be used with any GPU, but is slow as hell without CUDA... one solution is cloud processing
@legoworks-cg5hk 2 years ago
@ The only problem with Agisoft is that you have to pay for it to export models.
@HaroldR 2 years ago
This is great!
@Halcyox 2 years ago
Quite excellent work being done here!
@Padoinky 2 years ago
Have to google this to even understand what it is about.
@GmZorZ 2 years ago
Quite an interesting way of representing depth; how is it done? It's about time black-and-white maps became a thing of the past!
@jon_barron 2 years ago
This is the "turbo" color map: ai.googleblog.com/2019/08/turbo-improved-rainbow-colormap-for.html
@GmZorZ 2 years ago
Thank you so much! Very informative; I'd have guessed more bit variation in an image presents much more detail. For me it's really all about what you can fit in standardized 32-bit formats! I will incorporate this in my own projects to further normalize the standard.
@marknadal9622 2 years ago
@@jon_barron Is color/density at a depth position determined by the training? In all the videos it seems to be implied that it's known as input, but that clearly isn't possible from a 2D image without prior object-size training. Sorry for the dumb Q.
@kwea123 2 years ago
@@jon_barron Thanks for the great article! I always used jet, and was wondering how turbo is different. Now turbo looks better to me!
@pretzelboi64 2 years ago
If you don't even know what color mapping is, I don't think you're qualified to talk about what should be things of the past lol
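On the visualization question earlier in this thread: matplotlib ships the same turbo colormap, so this style of depth rendering is easy to reproduce. A quick sketch (the depth array here is a synthetic stand-in):

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in; in practice this would be a depth map from the model.
xx, yy = np.meshgrid(np.linspace(-1, 1, 256), np.linspace(-1, 1, 256))
depth = np.sqrt(xx ** 2 + yy ** 2)

# Normalize to [0, 1] and render with the turbo colormap.
d = (depth - depth.min()) / (depth.max() - depth.min())
plt.imshow(d, cmap="turbo")
plt.axis("off")
plt.savefig("depth_turbo.png", bbox_inches="tight")
```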
@ThetaPhiPsi 2 years ago
The depth map is amazing, wow!
@khaledbouzaiene3959 2 years ago
Can we extract 3D model maps?
@HA-cy4vx 2 years ago
wowwwwwww
@exhibitscotland360 2 years ago
Very Interesting.
@cailihpocisuM 2 years ago
Late to the party, but incredible work!
@cailihpocisuM 2 years ago
Amazing results, truly compelling!
@yasserothman4023 3 years ago
@1:42 what is x?
@Q_20 3 years ago
amazing
@luke2642 3 years ago
Does changing the activation function to SIREN help at all in larger NeRF networks? In my little Colab experiments it seems to train fast, but it might not scale. Also, I'm interested to see if a modern Hopfield network could represent the NeRF data well too.
@jon_barron 3 years ago
So far I haven't seen any results where SIREN improves NeRF's test-set performance. SIREN primarily targets quickly minimizing training loss (and does a great job at it!) but doesn't really focus on generalization, and performance in NeRF is largely determined by how well the model generalizes to new views.
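For readers who haven't seen SIREN, here's a minimal sketch of the layer being discussed: a sine nonlinearity applied to a scaled affine map, with the uniform weight initialization proposed in the SIREN paper. It's an illustration, not the code behind these experiments.

```python
import numpy as np

class SirenLayer:
    """One SIREN layer: y = sin(w0 * (W @ x + b)).
    The init bounds follow the SIREN paper so pre-activations stay
    well-distributed as layers are stacked."""
    def __init__(self, d_in, d_out, w0=30.0, is_first=False, seed=0):
        rng = np.random.default_rng(seed)
        bound = 1.0 / d_in if is_first else np.sqrt(6.0 / d_in) / w0
        self.W = rng.uniform(-bound, bound, size=(d_out, d_in))
        self.b = rng.uniform(-bound, bound, size=d_out)
        self.w0 = w0

    def __call__(self, x):
        return np.sin(self.w0 * (self.W @ x + self.b))
```

The high-frequency sine activations are what let SIREN fit training signals so quickly, and plausibly also why it tends to memorize rather than generalize to held-out views, as the reply above suggests.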
@JoshuaKolden 3 years ago
I don't want to be disparaging this or any of the other work on NeRF, but I don't understand where the innovation is. I get the very strong impression it is just rehashing work that has already been done in graphics, and long ago at that. What is the compelling improvement that neural networks bring over prior work, other than novelty? We've been able to generate new camera angles and volumetric reconstructions from still photography for literally decades.
@jon_barron 3 years ago
I think an important distinction is that NeRF lets you construct models from images.
@eelcohoogendoorn8044 3 years ago
'We've been able to generate new camera angles, and volumetric reconstructions from still photography for literally decades.' Yes, there is prior work in this area. Have you actually looked at any of the comparisons in the NeRF papers? Do you feel they make an unfair comparison? They seem like very compelling improvements over the state of the art to me.
@kwea123 3 years ago
I just found that the positional encoding has some similarity to a traditional 2nd-order differential equation used in physics (e.g. the harmonic oscillator, en.wikipedia.org/wiki/Harmonic_oscillator ).
Original positional encoding: PE(x) = (sin(x), cos(x), sin(2x), cos(2x), ..., sin(2^L x), cos(2^L x)).
Harmonic oscillator w/o damping: x'' + w^2 x = 0, whose solution is A sin(wt) + B cos(wt). So PE(x) is actually the solution evaluated at t = 1, 2, ..., 2^L with different initial conditions (that lead to A=1, B=0 or A=0, B=1).
IPE here: IPE(x, u, s) = (sin(u)e^(-s^2/2), cos(u)e^(-s^2/2), sin(2u)e^(-2s^2), cos(2u)e^(-2s^2), ...).
Harmonic oscillator w/ damping: x'' + 2kwx' + w^2 x = 0 (with k < 1), whose solution is A e^(-kwt) sin(w1 t) + B e^(-kwt) cos(w1 t), where w1 = w*sqrt(1-k^2). Again, IPE(x, u, s) corresponds to this solution evaluated at different t's with different initial conditions. Is this just a coincidence?
@jon_barron 3 years ago
Great insight! I can't tell if it's a coincidence or a meaningful connection at first glance, but I'll investigate further.
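To make the parallel concrete, here is a small NumPy sketch of both encodings as written in the comment above, where u and s are the mean and standard deviation of the coordinate being encoded (sin/cos are grouped rather than interleaved, which doesn't matter for the comparison):

```python
import numpy as np

def pe(x, L):
    """NeRF positional encoding at frequencies 2^0 ... 2^(L-1)."""
    freqs = 2.0 ** np.arange(L)
    return np.concatenate([np.sin(freqs * x), np.cos(freqs * x)])

def ipe(u, s, L):
    """mip-NeRF integrated positional encoding: the expected sin/cos of a
    Gaussian with mean u and std s, i.e. PE damped by exp(-(2^l * s)^2 / 2).
    The damping plays the same role as the e^(-kwt) envelope above."""
    freqs = 2.0 ** np.arange(L)
    damp = np.exp(-0.5 * (freqs * s) ** 2)
    return np.concatenate([damp * np.sin(freqs * u), damp * np.cos(freqs * u)])
```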
@sheetalborar6813 3 years ago
Would this loss work in classification tasks as well? It doesn't match the shape of the cross-entropy loss function.
Please apply this to Google Street View.