Running Neural Networks on Meshes of Light

I want to thank Alex Sludds for his efforts in helping me research and produce this video. Check out his work here: alexsludds.github.io
Links:
- The Asianometry Newsletter: asianometry.com
- Patreon: / asianometry
- The Podcast: anchor.fm/asianometry
- Twitter: / asianometry

Comments: 332

  • @Hassanmohamed31152
    @Hassanmohamed31152 A year ago

    I almost commented a few videos ago that you have single-handedly staffed all US semiconductor fabs with engineers for the next 10 years just by posting. Happy to see you grow so much even in your own niche without clickbait.

  • @geneballay9590
    @geneballay9590 A year ago

    "you have single-handedly staffed all US semiconductor fabs with engineers for the next 10 years just by posting" THE SAME THOUGHT OCCURRED TO ME. BY PRODUCING THESE VIDEOS HE IS OPENING UP A WORLD OF OPPORTUNITIES FOR OTHERS TO SEE AND CONSIDER.

  • @baptistedelplanque8859
    @baptistedelplanque8859 A year ago

    This video is sponsored by chipactVPN

  • @wale7342
    @wale7342 A year ago

    I'm getting my comp eng degree rn

  • @jaazz90
    @jaazz90 A year ago

    Definitely one of the topics I've always been interested in but never had a good source of information about. If I were younger, I'd strongly consider taking part in it.

  • @harrykekgmail
    @harrykekgmail A year ago

    A big Thank You to Alex Sludds too (from grateful audience)!

  • @outerspaceisalie
    @outerspaceisalie A year ago

    @Nobody Important that's none of your business

  • @oscarruorochmolinacansino5907
    @oscarruorochmolinacansino5907 A year ago

    @Nobody Important Time travel.

  • @stefanklaus6441
    @stefanklaus6441 A year ago

    @Nobody Important The video may have been on private before?

  • @ivoryas1696
    @ivoryas1696 A year ago

    @@outerspaceisalie Indeed. Should we take him out? He might have learned too much...

  • @clemenkok5758
    @clemenkok5758 A year ago

    As an ECE sophomore in college, I just wanted to say that you're playing an amazing role in developing the next generation of semiconductor engineering! :)

  • @geneballay9590
    @geneballay9590 A year ago

    Wow, your videos just get better and better. As I watched this I kept having flashbacks to my university math/physics discussions on matrix mechanics of more than 50 years ago, and realizing that those concepts remain important in today's world.

  • @guaposneeze
    @guaposneeze A year ago

    FWIW, that observation that IO consumes more power than MAC operations in an AI accelerator is pretty universal across problem domains. I often quip that it's a silly accident of history that we call the metal boxes "computers" since almost none of the power, gate count, mass, etc., is actually used directly for computation. Most of computing in practice is about getting the right data to the right place at the right time. I have a patent on internet-scale CDN configuration. But it's all the same at every scale. Pushing configuration data across the globe to the right server. Pushing weight data across the chip to the right IO pin. The memory/storage hierarchy instantly becomes the constraint as soon as you try to scale compute at any scale, in any domain. The ideas driving photonic compute for AI will be directly applicable to more seemingly mundane use cases.

  • @Soken50
    @Soken50 A year ago

    It always comes down to logistics and thermodynamics, humanity's two biggest nemeses, doesn't it?

  • @0MoTheG
    @0MoTheG A year ago

    That was not always the case; leakage could also be a major contributor. But it was always known that the wires and power density would become the main problem, in both power and delay.

  • @watermans7357
    @watermans7357 A year ago

    I work in a research group which develops simulation tools for these photonic circuits. This video was very well explained. I can't wait to see what photonic circuits will be used for in the future. Thanks for making this video!

  • @raphaelcardoso7927
    @raphaelcardoso7927 A year ago

    What types of tools do you develop?

  • @Soken50
    @Soken50 A year ago

    He said at the end that a 1D row of interferometers can perform like a 2D array by using time instead. Would the same principle apply for 2D to 3D, if the accuracy for 2D can be improved?

  • @watermans7357
    @watermans7357 A year ago

    @@raphaelcardoso7927 We develop tools that leverage artificial neural networks to simulate the performance of photonic devices. All of our software is free and open source.

  • @watermans7357
    @watermans7357 A year ago

    @@Soken50 I am not sure, since I do not deal with the theory sort of stuff; my work is mostly on the software development side of things.

  • @chandrasekarank8583
    @chandrasekarank8583 A year ago

    Hi, can we connect on LinkedIn?

  • @aberroa1955
    @aberroa1955 A year ago

    Electrical signals are essentially sent at the speed of light (because it's not the charge carriers that transmit the signal, it's the electric fields), so it's not signal propagation speed that allows high throughput, it's the ability to distinguish signals. Electric fields get "smudged" along the way, but so do electromagnetic signals in photonics; the latter just smudge remarkably less. Also, electrical logic gates take some time to transition from one state to another, and that's the major factor limiting throughput. If there were faster-switching transistors, higher frequencies would be available. I don't know about photonics, but it seems that either its transitions are much faster, or the architecture is completely different: instead of switching state, light is split along the way and goes through preconfigured logic gates, so processing is faster while going through the same transformations, but it takes some time to switch from one configuration to another. But there's a possibility that the same results could be achieved using electronic components.

  • @davidb5205
    @davidb5205 A year ago

    I wouldn't say electrical signals travel at "essentially the speed of light." That applies to maybe radio waves in free space. But velocity factor/wave propagation speed is typically ~64% the speed of light (Cat 5 data cables) to ~90% the speed of light (RF signals). Without taking insulation into account which reduces VF further. Even if the jump is from 90% to 99% the speed of light, that optimization would result in huge improvement. But like you said, it's about ability to distinguish signals, the accuracy of detection at the receiving end. Without that it's unusable.

  • @aberroa1955
    @aberroa1955 A year ago

    @@davidb5205 True, but overcomplicated; that's why there was the word "essentially" - because it's still multiple orders of magnitude faster than the charge carriers move. Also, in photonics light travels significantly slower than c too, because it does so in a medium (glass, or whatever), which slows down electromagnetic wave propagation.

  • @XCSme
    @XCSme A year ago

    If gate transition time is the bottleneck, shouldn't FPGAs or static circuits (simple wires) be free of this limitation? Or can you not implement matrix multiplication without using gates?

  • @aberroa1955
    @aberroa1955 A year ago

    @@XCSme FPGAs aren't simple wires; they too use transistors and switch state each clock cycle based on input. Static circuits... well, as long as they do not use any capacitors and do not have too much capacitance or inductance of their own, they'd be lightning fast, but about as useful as a plain wire or resistor, unable to compute anything. One could say that a transistor is a teeny-tiny capacitor, and transistors take time to charge. The less capacitance they have, the faster they charge. And although modern nanometer-scale transistors have negligible capacitance, they still have some, and they need time to charge or discharge. If you have tons of transistors but they're mostly in parallel, then you can raise the clock to maybe tens of GHz and it would be fine, but you won't be able to do many operations per cycle, only the basic ones. If, on the other hand, you have the same number of transistors interconnected with each other like in a CPU, then your frequency is limited by the last one in the chain: you need to be sure that each one in series before it has had enough time to charge or discharge, otherwise the last one could end up in an incorrect state.

  • @XCSme
    @XCSme A year ago

    @@aberroa1955 Thanks a lot for the response! I realized that my initial comment was a bit silly in suggesting an FPGA without logic gates, as that's what the G in the acronym stands for... That being said, is there really no way to compute anything without using transistors? What if you use the voltage value as the output? Say you have to do an addition: if you feed 0.5V and 0.3V at the inputs and link them in series, you should get 0.8V (maybe this is just an analog computer? en.wikipedia.org/wiki/Analog_computer ). Also, division could be done with resistors: say you want to divide by 3, you feed 1V at the input and have 3 equal resistors in series; measuring the voltage across one of them gives the input divided by 3.
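
The analog divide-by-three idea in the comment above is essentially an ideal voltage divider; a minimal sketch (the function name and resistor values are illustrative only, not from any real design):

```python
# Ideal voltage divider: tapping across one of three equal series
# resistors yields one third of the input voltage.
def divider_output(v_in, r_tap, r_total):
    return v_in * r_tap / r_total

r = 1000.0  # ohms; any equal value works for an ideal divider
print(divider_output(1.0, r, 3 * r))  # one third of the 1 V input
```

A real circuit would of course be limited by component tolerances and loading effects at the tap point.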

  • @paxdriver
    @paxdriver A year ago

    You did a really good job on this one, man. That's no small feat, bravo

  • @alexsludds1377
    @alexsludds1377 A year ago

    Great work Jon!

  • @lilacswithtea
    @lilacswithtea A year ago

    thanks to you, he really made light work of this topic!

  • @suntemple3121
    @suntemple3121 A year ago

    Thank you Alex, all the best blessings to you and yours.🌟🌟🌟🌟🌟🌟🌟🌟

  • @PlanetFrosty
    @PlanetFrosty A year ago

    Excellent work! My company is one of those working on photonics/quantum compute InFlight as optical networks transit the world. Though quite different, great progress has been made.

  • @stevengill1736
    @stevengill1736 A year ago

    So there's an analog aspect to these calculators as well? Very cool... exactly what was wanted. Can't wait to see how this tech works out... cheers

  • @QSecty
    @QSecty A year ago

    I had the idea of computing with light 5 years ago, but had no clue how to get it done. Glad to see a big step in computing!

  • @johanlarsson9805
    @johanlarsson9805 A year ago

    Thanks for mentioning the paper! I knew I recognized this, and when you showed it I realized it had been 4 years since I read it.

  • @satadrudas3675
    @satadrudas3675 A year ago

    This was a very informative video. I am in fact working on a time-multiplexed SiPh matrix multiplication design like the one you mentioned towards the end of your video.

  • @WildEngineering
    @WildEngineering A year ago

    I'm really glad to see this tech being mentioned more and more.

  • @billwhoever2830
    @billwhoever2830 A year ago

    1) Electrical signals also travel at the speed of light (the speed of light inside the conducting material); the signal is transmitted by photons. The main limiting factor of electronic computers is the capacitances inside them. The most basic one is the capacitance of the FET gates: for a FET to function, the gate needs to reach the desired charge, and although this gets smaller and smaller with the new nanoscale transistors, it is still there. The same applies to discharging those capacitances, which still takes time and also dumps all of their energy into heat. 2) The speed of light is a limiting factor even in photonics: a 4 GHz chip, something that might be in a modern computer, has a 4 GHz clock and a period of 0.25 ns between clock cycles; light can only travel 75 mm in such a period, and that is the best-case scenario (in vacuum). A theoretical 40 GHz photonic computer would have a 0.025 ns (25 ps) period, in which light can only cover 7.5 mm. This means that even in a 40 GHz chip, the maximum length of the datapath inside a computation core is 7.5 mm in the best case. Photonic computers working at terahertz rates are almost certainly sci-fi. And of course this type of CPU, with such a small distance covered between clocks, will have very big memory bottlenecks (the time in cycles it takes for data to be stored and recovered from memory) and will require the memory to be very, very close to the chip.
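
The distance figures in this comment check out; a quick sketch of the same arithmetic, assuming the best case of propagation at the vacuum speed of light:

```python
C = 299_792_458  # speed of light in vacuum, m/s

def distance_per_cycle_mm(clock_hz):
    """How far light travels in vacuum during one clock period, in mm."""
    return C / clock_hz * 1000

print(distance_per_cycle_mm(4e9))   # ~75 mm at 4 GHz
print(distance_per_cycle_mm(40e9))  # ~7.5 mm at 40 GHz
```

In a waveguide the refractive index shrinks these distances further, so the real budget is even tighter.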

  • @tf_d
    @tf_d A year ago

    This.

  • @billwhoever2830
    @billwhoever2830 A year ago

    @@tf_d I just noticed a mistake in my comment. I said that the datapath on a 40 GHz core would be 7.5 mm. In reality datapaths are normally pipelined, so each individual stage of the pipeline would be limited to that length (this totally applies to electronic computers). Pipelines in CPUs today are around 8-20 stages long. I'm not sure if pipelining would work in photonics, and I think there would need to be electronic circuits between the stages anyway.

  • @tf_d
    @tf_d A year ago

    @@billwhoever2830 I don't see why pipelining wouldn't be possible with photonics, they're technically able to do anything that an electronic circuit can.

  • @bakedbeings
    @bakedbeings A year ago

    Quick posts! Really enjoying your silicon rabbit hole.

  • @infinitumneo840
    @infinitumneo840 A year ago

    Silicon photonics represents a quantum leap in technological speed and power efficiency. One major issue when dealing with light is the fact that you're reading the probabilities of the light waves. You run into quantum mechanics at this level. Light is sensitive to interference from the environment through quantum decoherence. I believe there will be a solution to this problem as our understanding of quantum systems evolves.

  • @Primarkka
    @Primarkka A year ago

    @@rufushawkins3950 Very simplified, it means something is quantized, as in a photon's energy is discrete in a way.

  • @venerable_nelson
    @venerable_nelson A year ago

    @@rufushawkins3950 When used as a noun it means small, when used as an adjective it means big. English!

  • @marilynlucas5128
    @marilynlucas5128 A year ago

    Geometry is the key to solving the quantum mechanics problem

  • @unvergebeneid
    @unvergebeneid A year ago

    Wow, there is so much wrong with this one small comment that it would take an essay to pick it apart.

  • @benjybo
    @benjybo A year ago

    How does this interference from the environment behave? Can it be algorithmically modeled? If so, I believe it might be possible to create a noise generator which mimics this behavior during neural network training. This can help "robustify" the neural networks, to prepare them for inference on such optical devices. This would provide a software rather than hardware approach to mitigating the accuracy issue.
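
The noise-injection idea above can be sketched minimally, assuming a simple Gaussian noise model of the hardware (the function name and noise level here are hypothetical, not from any published training recipe):

```python
import random

# During training, perturb each activation with noise drawn from a model of
# the analog hardware, so the trained network tolerates similar noise at
# inference time. At inference, activations pass through unchanged.
random.seed(0)

def noisy_activations(activations, noise_std=0.01, training=True):
    if not training:
        return list(activations)
    return [a + random.gauss(0.0, noise_std) for a in activations]

clean = [0.5, -1.2, 0.8]
print(noisy_activations(clean))                  # slightly perturbed values
print(noisy_activations(clean, training=False))  # unchanged at inference
```

A realistic version would fit the noise distribution to measurements of the actual optical device rather than assume a fixed Gaussian.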

  • @tortugatech
    @tortugatech A year ago

    Great video as always! Keep them coming, love it!

  • @Erik-gg2vb
    @Erik-gg2vb A year ago

    I watched a YouTube video called "The next big step in computing" by Anastasi; she mentions how they are trying to use light in an analog form, with different intensities, as a new way to compute. Not as in-depth as here, but still over my head.

  • @rahulmathew4970
    @rahulmathew4970 A year ago

    Happy to know that I am not the only one following her

  • @mclilzenthepoet2331
    @mclilzenthepoet2331 A year ago

    Oy, another Anastasi follower, nice

  • @stefanklaus6441
    @stefanklaus6441 A year ago

    I recently saw a great video on why our AND/OR gates will always dissipate energy. The answer "boils down" to entropy. Depending on how far into theory one wants to dabble, this might be pretty interesting content.

  • @SianaGearz
    @SianaGearz A year ago

    Can you give a better set of keywords or a full title?

  • @stefanklaus6441
    @stefanklaus6441 A year ago

    @@SianaGearz "Why pure information gives off heat" by Up and Atom

  • @VicenteSchmitt
    @VicenteSchmitt A year ago

    @@stefanklaus6441 Watched it yesterday, great video

  • @Soken50
    @Soken50 A year ago

    Gates are used to make "bits" interact and potentially effect a change of state, and that change will of course necessitate a certain amount of work, however tiny. We can't get a system to change states without expending energy somewhere.
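
The entropy argument in this thread has a concrete floor: Landauer's principle says erasing one bit of information dissipates at least kT·ln 2 of heat. A quick sketch of that number (the function name is illustrative):

```python
import math

# Landauer limit: minimum heat dissipated per bit of information erased.
K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_limit_joules(temperature_k=300.0):
    return K_B * temperature_k * math.log(2)

print(landauer_limit_joules())  # ~2.9e-21 J per bit erased near room temperature
```

For scale, that is many orders of magnitude below the picojoules per MAC discussed in the video, which is why the entropy floor is not yet the binding constraint.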

  • @benjaminlynch9958
    @benjaminlynch9958 A year ago

    Awesome video. Another reminder of why I’m subscribed. 👍🏼 This technology is really cool. It seems like the use case to make this commercially viable is training massive neural networks rather than inference. It’s the training that is computationally expensive and requires stupid amounts of computing power. That’s a challenge that needs to be solved. Inference on the other hand is trivial by comparison. Almost every smartphone these days has a built in neural engine that can run inference in real time at less than a watt for relatively simple problems, and even moderate to large problems can be run through inference on a traditional modern CPU with no dedicated matrix multiplier.

  • @JamEngulfer
    @JamEngulfer A year ago

    I wonder if, were the photonics cheap enough and small enough, the accuracy could be improved by running the same calculation multiple times and averaging. Though the extra electronics and redundancy might offset any gains made…
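
The averaging idea can be sketched with a toy noise model: with independent noise, averaging N repeats shrinks the error by roughly √N (the values here are purely illustrative):

```python
import random

# Repeating a noisy analog multiply N times and averaging reduces the
# noise standard deviation by about sqrt(N).
random.seed(0)
true_value = 1.0
noise_std = 0.1
n_repeats = 100

samples = [true_value + random.gauss(0.0, noise_std) for _ in range(n_repeats)]
estimate = sum(samples) / n_repeats
# standard error of the mean ~ noise_std / sqrt(n_repeats) = 0.01 here
print(abs(estimate - true_value))
```

The commenter's caveat stands: each repeat costs time or hardware, so the √N gain trades directly against throughput.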

  • @cerebralm
    @cerebralm A year ago

    At 5:04, did you mean to write picojoule instead of petajoule?

  • @punditgi
    @punditgi A year ago

    Excellent video! Learned a lot. Well done! 😃

  • @TaylorAlexander
    @TaylorAlexander A year ago

    Thank you for this! I have been seriously wondering about Lightmatter and I just checked up on them recently. Looks like they’re hiring some powerful folks and hopefully going to be able to offer real products soon!

  • @mapp0v0
    @mapp0v0 A year ago

    Have you heard of BrainChip's Akida chip? Currently in production. Akida is a neuromorphic system-on-chip designed for a wide range of markets, from edge inference and training at sub-1 W power to high-performance data center applications. The architecture consists of three major parts: sensor interfaces, the conversion complex, and the neuron fabric. Akida incorporates a neuron fabric along with a processor complex used for system and data management as well as training and inference control. The chip's efficiency comes from its ability to take advantage of sparsity, with neurons only firing once a programmable threshold is exceeded. NNs are feed-forward. Neurons learn through selective reinforcement or inhibition of synapses. Sensory data such as images are converted into spikes. The Akida NSoC has a neuron fabric comprised of 1.2 million neurons and 10 billion synapses. For training, both supervised and unsupervised modes are supported. In the supervised mode, initial layers of the network are trained autonomously, with the labels being applied to the final fully connected layer. This makes it possible for the networks to function as classification networks. Unsupervised learning from unlabeled data as well as label classification is possible.

  • @x2ul725
    @x2ul725 A year ago

    Such a fun video! Great work guys!

  • @anteconfig5391
    @anteconfig5391 A year ago

    "Photonic Neural Networks". That's a yummy combo of words. I hope this video doesn't disappoint.

  • @gustavderkits8433
    @gustavderkits8433 A year ago

    Good that you started looking at this. More presentations in this area should follow. Talk to more experts.

  • @ajeybs4030
    @ajeybs4030 A month ago

    I can't thank this channel enough. Good job.

  • @hugod2000
    @hugod2000 A year ago

    Thank you for these fascinating videos.

  • @kice
    @kice A year ago

    6:28 High bandwidth is not due to the physical transfer speed; in fact, electrical signals also move at close to the speed of light. Bandwidth is usually determined by how many bits per transfer and how many transfers per second. A normal GPU transfers a couple hundred bits at a few GHz.

  • @Steven_Edwards
    @Steven_Edwards A year ago

    It's the speed of light through a medium... Switching to photonics removes the need for conductive metals and voltage transformation. Light TX/RX is a lot simpler, and the medium is much clearer, so the speed of light is faster in it.

  • @mystifoxtech
    @mystifoxtech A year ago

    Small correction (I may be nitpicking): electrical signals can transmit data at 50%-99% of the speed of light

  • @nicholasgrippo1754
    @nicholasgrippo1754 A year ago

    This is very interesting. Excited to see what the future brings in this space.

  • @norik1616
    @norik1616 A year ago

    From what I've read (ML is my main field), even AlphaZero (and definitely MuZero) runs on a "high end PC". The training was done on TPUs and simulation on CPU servers.

  • @norik1616
    @norik1616 A year ago

    Also, the problem is how the DL model is queried in a reinforcement learning scenario - it is queried thousands of times per step to simulate the game in its "state space" (evaluation of a tree of future steps)

  • @pc_screen5478
    @pc_screen5478 A year ago

    KataGo, which is a Go AI based on AlphaGo Zero with some extra improvements, is superhuman at just a couple hundred playouts, which on my computer (GTX 1650) only takes a couple of seconds to achieve (about 3-5). On a high-end computer this is achieved in less than a second per move. The original AlphaGo was a frankenstein of neural networks and needed a lot of MCTS rollouts to make up for it; subsequent Go AIs can be superhuman running on an iPhone even

  • @paulmichaelfreedman8334
    @paulmichaelfreedman8334 A year ago

    Excellent channel. Objective, serious and extremely informative. Channels like these are what make YouTube great. Not those bonehead vloggers.

  • @Dr7-1
    @Dr7-1 A year ago

    Welcome back to Asianometry. Thanks for your answer. I'm not so ready! First my family. See you soon, I hope! DV

  • @gspaulsson
    @gspaulsson A year ago

    When Deep Blue beat Garry Kasparov, some wag said: "Sure, but how did it do in the post-game interview?" Probably wouldn't be hard to train a neural network to give trite answers to trite questions, with a few quips thrown in. "Mr Deep. Can I call you Deep, or do you prefer Blue". "Whichever you like." "OK Deep, how do you think Mr Kasparov played?" "Pretty well - for a human.". "Why didn't you take his pawn at move 35?" "It wins at depth 6, but loses at 16. Humans are so slow."

  • @jimurrata6785
    @jimurrata6785 A year ago

    And today we have Meta's chatbot dissing Zuck! 🤣

  • @JorgetePanete
    @JorgetePanete A year ago

    the bit-flip that caused that move really broke Kasparov

  • @benjybo
    @benjybo A year ago

    Great video! Thank you very much for making it! I'm currently working on a research project with ultra-low-precision neural networks. I wanted to ask: would reducing the number of bits in the activations and/or weights to about 2-3 bits each (using state-of-the-art quantization methods) help with the accuracy and scale issues with photonic accelerators raised in this video? In general, most neural networks these days can be quantized down to 4 bits with almost no loss of performance, using the latest quantization methods. So 8 bits might be a bit unnecessary, if these methods are used.
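
As a rough illustration of the quantization being discussed, here is a minimal sketch of plain uniform symmetric quantization (not any specific state-of-the-art method the comment refers to):

```python
# Quantize weights to n bits by snapping each value to the nearest of a
# small set of evenly spaced levels, then measure the worst-case error.
def quantize(weights, bits):
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive codes for 4 bits
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

w = [0.9, -0.45, 0.12, 0.031, -0.8]
for bits in (8, 4, 2):
    err = max(abs(q - x) for q, x in zip(quantize(w, bits), w))
    print(bits, round(err, 4))  # worst-case error grows as bits shrink
```

Real quantization-aware methods also learn the scale and clip outliers, which is how they reach 2-4 bits with little accuracy loss.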

  • @Y2Kmeltdown
    @Y2Kmeltdown A year ago

    Great video, really interesting to see how other fields are tackling the issue of power consumption. From what I understand, it is not a fair comparison to make between conventional neural networks and the human brain. The human brain works on a completely different mode of computation, where data storage and computation are unified and signals are carried through spike potentials. Hopefully photonics can be applied to designing analogs of spiking neural networks.

  • @theAadi47
    @theAadi47 A year ago

    Amazing and insightful video. I take solace in the fact that the brain is much more efficient, if not the best, at specialised tasks. Let's hope the photonic innovators are able to find product-market fit, and who knows, for efficiency reasons we just might be able to simulate quantum computers before we actually design quantum computing at scale!

  • @salma-amlas
    @salma-amlas 10 months ago

    Woah this is blowing my mind! It's amazing, the things nature has provided for us humans. And the human scientific collaborative effort never ceases to impress me. Thank you for this video.

  • @miklov
    @miklov A year ago

    Fascinating. Thank you!

  • @lachlanperrier2851
    @lachlanperrier2851 A year ago

    This is one of my favourites; I don't know why it doesn't have more views.

  • @chavita4321
    @chavita4321 A year ago

    love this video! cheers from California

  • @aniksamiurrahman6365
    @aniksamiurrahman6365 A year ago

    Man, you are remarkable. Btw, do you do financial consulting for tech companies? Or plan to in the future?

  • @hugoboyce9648
    @hugoboyce9648 A year ago

    The caliber of this video was very impressive!

  • @Kengur8
    @Kengur8 A year ago

    In my favourite sci-fi movie Bicentennial Man they kinda show a photonic brain, even though it's called positronic. I love it now...

  • @taktoa1
    @taktoa1 A year ago

    A few mistakes: 1. ML typically consists of many matrix-vector multiplication steps, not matrix-matrix multiplications. 2. At 5:02 you meant picojoules, not petajoules 3. As I understand it (not an expert in photonics, though I have worked on an ML accelerator), for a given level of accuracy a photonic matrix-vector multiplication circuit will consume more power than a digital one, mostly because of the digital-to-analog and analog-to-digital steps. So I think it's somewhat misleading to say that power is not the problem. 4. I think the last point about replacing one of the axes with time is also misleading. That can be done for any circuit ("time-multiplexing") and will proportionally decrease throughput. So it's far from a solution to the density problem.
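
The time-multiplexing point in item 4 can be made concrete with a small sketch: a matrix-matrix product streamed through a stationary-weight matrix-vector engine, one input vector per "time step" (the values here are arbitrary examples):

```python
# A matrix-vector engine applied repeatedly: each column of the input batch
# is pushed through the stationary weight matrix on its own time step,
# trading throughput for hardware area, exactly as the comment describes.
def matvec(W, x):
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

W = [[1, 2], [3, 4]]                 # stationary weights (2x2)
X_cols = [[1, 0], [0, 1], [1, 1]]    # three input vectors arriving over time

Y_cols = [matvec(W, x) for x in X_cols]  # one matrix-vector product per step
print(Y_cols)  # [[1, 3], [2, 4], [3, 7]]
```

Three steps here do the work of one 2x3 matrix-matrix product, which is the throughput cost the comment points out.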

  • @trulyUnAssuming
    @trulyUnAssuming A year ago

    1. A 1xn matrix is a vector so eh... plus if you do batch learning you end up with true matrix-matrix products

  • @taktoa1
    @taktoa1 A year ago

    yeah, but I still feel like mentioning matrix-matrix multiplication is going to confuse the average viewer more than illuminate, compared to matrix-vector. most ML accelerators are built to accelerate matrix-vector products (e.g.: they use weight stationary systolic arrays). this is because accelerators rarely have the memory bandwidth to support matrix-matrix products at full throughput; they require the higher operational intensity of the static matrix/dynamic vector product.

  • @leonfa259
    @leonfa259 A year ago

    2. 27 orders of magnitude is a lot

  • @daniel_960_
    @daniel_960_ A year ago

    Petajoules in a chip sounds fun

  • @jadeaffenjaeger6361
    @jadeaffenjaeger6361 A year ago

    Convolutions are typically expressed using im2col, which makes them an instance of matrix-matrix multiply. They are extremely common in vision-based applications, so I think the statement is absolutely justified!? I would consider the question of whether a matrix-matrix product is decomposed into matrix-vector multiplications in a given accelerator an implementation detail, rather than an inherent feature of the underlying problem.

  • @MrJazzCigar
    @MrJazzCigar A year ago

    You are producing some excellent content, never a dull video…thank you!

  • @brandonblue2994
    @brandonblue2994 A year ago

    Was wondering when you would cover this.

  • @uirwi9142
    @uirwi9142 A year ago

    The part about AlphaGo and how many TPUs were used: it's no wonder I can't find anywhere to build an AI for StarCraft on my PC at home. Ambitious, but just not gonna happen, it seems. Never mind that, this talk/video was spectacular and incredibly informative. Thank you.

  • @Rockyzach88
    @Rockyzach88 4 months ago

    When I was getting my chemistry degree I noticed multiple labs working on materials for things like this. It's cool seeing it hit YouTube.

  • @pmk_
    @pmk_ A year ago

    You mention that the 2016 AlphaGo was run on 48 TPUs. Were these required for the inference step used during the matches? Or was the final trained version running on just the laptop we saw in the documentary? Thanks for the great video!

  • @randomhandle721
    @randomhandle721 A year ago

    Great video. I enjoyed watching it.

  • @zane62135
    @zane62135 A year ago

    Wow...this is incredible!

  • @htomerif
    @htomerif A year ago

    It would be interesting to know some numbers. So far as I can find out, Google's TPUs use a slightly off-standard 16-bit floating point format for all of their data. You don't need the high accuracy of a 32- or 64-bit float, at least for inference. If the silicon photonics ADC/DAC has an effective end-to-end precision of an 8-bit float, then the gap between them and what is useful for AI is very, very large. If it's equivalent to 12 bits, then it's not as much of a problem. The other thing that would be nice to know is how much process variation there is in individual interferometers. One nice thing about digital electronics is that you fabricate a chip, you test it at speed, and if it gives you the right digital answers, the chip is good. With analog electronics, you might have an interferometer with a reliable 32-bit-equivalent signal-to-noise ratio, but non-linearity and variation between interferometers on the same chip might push the effective precision way down into the single digits, especially with a light path passing through multiple optical elements; testing every possible light path may be functionally impossible. With digital electronics, all you have to know is that one device's output falls within certain bounds to know that you can chain together an unlimited number of them with no loss of accuracy. With analog electronics, chaining them together always compounds the error, whether it's the RMS error of the noise floor being added or the multiplicative error in the actual signal. Anyway, I don't expect answers to these questions, but I think the answers to them determine whether digital photonics will be a thing in the future.
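
The effective-precision question above can be made concrete with the standard effective-number-of-bits relation for an ideal converter, ENOB = (SNR_dB − 1.76)/6.02, plus the fact that chaining stages with independent noise lowers the SNR (the 50 dB and 8-stage numbers below are illustrative, not measured):

```python
import math

# ENOB: the bit width of an ideal quantizer with the same signal-to-noise
# ratio as the analog stage being characterized.
def enob(snr_db):
    return (snr_db - 1.76) / 6.02

def chained_snr_db(snr_db, n_stages):
    # independent noise powers add, so SNR drops by 10*log10(n_stages)
    return snr_db - 10 * math.log10(n_stages)

print(enob(50.0))                     # ~8 effective bits for one stage
print(enob(chained_snr_db(50.0, 8)))  # ~6.5 bits after 8 chained stages
```

This is the compounding the comment describes: a light path through multiple optical elements behaves like the chained case, not the single-stage one.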

  • @dmurphydrtc
    @dmurphydrtc A year ago

    Excellent summary, thanks.

  • @JorgetePanete
    @JorgetePanete A year ago

    I saw that analog computing could have conversion to digital and back between a few steps to recover accuracy, with some circuit tradeoffs

  • @T3hderk87
    @T3hderk87 A year ago

    Holy crud... that is insane. This reminds me of the x64 jump, and I think it will be as significant, if not more so.

  • @BB-nz9rp
    @BB-nz9rp 10 months ago

    Hi there, I really appreciate your content. Just a side note: I believe you meant 20 or 1 'pico'joules/MAC and not peta, which would be about 278 GWh = 19k households/year?

  • @MrTonypace
    @MrTonypace A year ago

    Lightmatter has a talk about doing this at wafer scale coming up in two weeks at Hot Chips. I hope you can tell us what they're up to! (And Ranovus.)

  • @BRUXXUS
    @BRUXXUS A year ago

    I see this being a much more viable path to future computing than quantum computers. Even if the chips are substantially bigger, they'll use far less energy and won't require cooling in the same way as traditional transistors. I think it's really exciting and I hope to see this continue to grow and advance!

  • @thomaspluck1515
    @thomaspluck1515 A year ago

    Check out Xanadu Photonics; squeezed-state photons make quantum computing possible in photonics too - although the photodetectors have to be cryogenically cooled.

  • @lionelcliff
    @lionelcliff A year ago

    In the challenges section, what does Jon mean when he states that the photonic chips aren't used for training, but only for 'inference' due to their lower accuracy? Great presentation btw!

  • @TheRiskyBrothers
    @TheRiskyBrothers 1 year ago

    This is some real Metamorphosis of Prime Intellect shit. Also this channel is great, keep it up 👍

  • @itonylee1
    @itonylee1 1 year ago

    I wonder if it is possible to have a multi-stack layer of LED film do a similar task, since LEDs can both emit light and act as photodetectors?

  • @animeshthakur5693

    @animeshthakur5693

    1 year ago

    LEDs aren't sensitive enough

  • @itonylee1

    @itonylee1

    1 year ago

    @@animeshthakur5693 Sure, but in theory it is possible to integrate LEDs within a semiconductor die process.

  • @AngDavies

    @AngDavies

    1 year ago

    Yes, but you probably wouldn't want to: for this to work you want coherent light, which for LEDs means throwing away most of it. A laser is really what you want here.

  • @AngDavies

    @AngDavies

    1 year ago

    @@itonylee1 If I recall, integrating the light source well is actually one of the major pitfalls/cost centres that is as yet unresolved. Integrating it with the design means you don't need to align or tune it. But making light sources out of silicon is really hard.

  • @gator1984atcomcast
    @gator1984atcomcast 5 months ago

    Light doesn’t travel faster than electrons, but it’s the frequency of light that allows more information to be transmitted and processed faster.

  • @rohanofelvenpower5566
    @rohanofelvenpower5566 1 year ago

    Cloud Tensor Processing Units (TPUs) are Google's custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads. TPUs are designed from the ground up with the benefit of Google's deep experience and leadership in machine learning.

  • @DaT0nkee
    @DaT0nkee 1 year ago

    Parallel operation can be achieved using different frequencies of light on the same chip simultaneously.

  • @matttaylor2009
    @matttaylor2009 1 year ago

    Excellent channel

  • @thegame4027
    @thegame4027 1 year ago

    Small detail, but electrons don't move through the chip/wires. They just wiggle around; the energy is transmitted by the electric field around the conductor, not by the electrons. Doesn't really matter, as your point is still valid, just a technicality.

  • @signalworks

    @signalworks

    1 year ago

    Electrons do flow in low-frequency conductors, especially with DC power. The energy is indeed in the fields; the motion of the charge carriers is described by the edge of the fields (current and magnitude).
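    For a sense of scale on the "electrons barely move" point in this thread, the classic drift-velocity estimate v = I / (n·q·A) can be checked directly. The current and wire size below are assumed values, purely for illustration.

    ```python
    # Electron drift velocity in a copper wire: v = I / (n * q * A)
    I = 1.0          # current in amperes (assumed)
    n = 8.5e28       # free-electron density of copper, per m^3
    q = 1.602e-19    # elementary charge, coulombs
    A = 3.14159e-6   # cross-section of a ~2 mm diameter wire, m^2

    v = I / (n * q * A)
    print(f"drift velocity ≈ {v * 1000:.4f} mm/s")
    ```

    The result is on the order of hundredths of a millimetre per second, while the field (and hence the signal) propagates at a large fraction of c.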

  • @krimsonsun10
    @krimsonsun10 1 year ago

    I saw an article in high school in the early 2000s on photonics research from MIT for replacing buses on motherboards. The idea was to reduce heat loss and latency. I wonder if this is an offshoot of that research?

  • @kasuha
    @kasuha 1 year ago

    There's some inconsistency in the argument. At the start you note that most energy is lost on data transfers, yet these are untouched by photonics; they tackle the multiplication instead. And I can't help but notice that an important part of the photonic circuit is a heater, presumably to adjust the length of one of the paths and thereby the interference. So while there seems to be an obvious advantage in the speed of the multiplication itself, it's not clear how much energy, if any, it saves.

  • @10-AMPM-01
    @10-AMPM-01 1 year ago

    12:35 - That's really clever.... Easy to get bogged down by the "right and wrong" ways to use tools.

  • @seditt5146
    @seditt5146 1 year ago

    I created a neural network that I trained to work as a binary ALU. Even better for this, I trained "cells" which act as logic gates, and I would love to see my data encoded into glass so that it could function as a full ALU in light.

  • @antiprime4665

    @antiprime4665

    1 year ago

    What is the point of using a neural network as an ALU?

  • @TaeruAlethea
    @TaeruAlethea 1 year ago

    What would be pretty wild would be using both time offsetting and wavelength multiplexing to increase throughput. If I understand it, it would be like light-based hyperthreading, except you could do 3, 4, or more threads all independently. I guess it would just depend on how passive the structures actually are.

  • @markkyn7851

    @markkyn7851

    1 year ago

    I worked on the datacom side of photonic switches. The thing about wavelength multiplexing (WDM) when used with MZIs is that crosstalk can be a killer, depending on the MZIs used. Depending on the interconnect topology of the MZI mesh, crosstalk can cascade through the mesh, ultimately raising the "noise" level beyond practicality. This also inhibits scaling these meshes out, as you can imagine!

  • @Andrew-rc3vh
    @Andrew-rc3vh 1 year ago

    I think there is a bit of an error in this video. The MZI is a passive device which uses a half-silvered mirror to create interference patterns, so there is no voltage applied to the MZI itself. What I suspect you may be referring to is the Kerr effect, where the refractive index changes with applied voltage; used with an MZI, this is likely what gives you the desired properties.

  • @jannegrey593
    @jannegrey593 1 year ago

    OK. It seems like an old video, but also just released. It will probably be fantastic.

  • @jacoblara4175
    @jacoblara4175 1 year ago

    I wonder how this compares to the analog circuits that are being used to run neural networks.

  • @2black1white3blue
    @2black1white3blue 10 months ago

    This is very interesting. Thanks

  • @HexerPsy
    @HexerPsy 1 year ago

    Do photonics require extremely low temperatures, as qubits currently do? Quantum bits pick up noise from temperature, so those chips work most reliably at very low temperatures, close to 0 K. You end up with a machine that's mostly a multi-stage cooler with a chip on the tip. Are photonics the same?

  • @profdc9501
    @profdc9501 1 year ago

    A small note: a petajoule is the amount of energy unleashed by a 250 kt nuclear bomb. You probably mean femtojoule. :)

    The structure of feedforward deep neural networks is unfortunately very sensitive to computation error, which is why they typically employ at least 32-bit floating point arithmetic. Backpropagation through many layers to update the weights can accumulate error, which limits model performance. For optical scaling operations there are additional error sources: quantum detection fluctuations, flaws in the optical system that cause scattering and coherent noise, sampling and quantization error, not to mention power consumption from electro-optical interfaces, which can be quite substantial.

    There may be neural networks for which optical scaling operations are suitable. However, the conventional feedforward deep neural network, because of its reliance on precise matrix multiplication so that backpropagation can be performed using the adjoint operation, is going to be quite challenging. There are plenty of ideas and simulations floating around for this, but very little in the way of actually attacking the real issues surrounding optical neural network implementations, just mostly hype.

  • @taktoa1

    @taktoa1

    1 year ago

    I don't think anyone is interested in training on photonic accelerators, it's all inference. Quantization is very commonly employed to make inference cheaper, which results in errors similar to photonic accelerators, though smaller in magnitude (IIRC current photonic accelerator designs get 2-4 bits of precision, classical inference accelerators are typically in the 8-16 bit range). So I think most of what you're saying here is a non sequitur with regard to the published research.
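    A rough feel for what those precision figures mean: a toy uniform quantizer over [-1, 1] shows how RMS weight error grows as bit width shrinks. The bit widths and range here are illustrative, not taken from any specific accelerator.

    ```python
    import math
    import random

    def quantize(x, bits, lo=-1.0, hi=1.0):
        # Uniform quantizer over [lo, hi] with 2**bits - 1 steps
        levels = 2 ** bits - 1
        step = (hi - lo) / levels
        return lo + round((x - lo) / step) * step

    random.seed(1)
    weights = [random.uniform(-1, 1) for _ in range(100_000)]
    for bits in (8, 4, 2):
        err = math.sqrt(sum((w - quantize(w, bits)) ** 2 for w in weights)
                        / len(weights))
        print(f"{bits}-bit quantization RMS error: {err:.4f}")
    ```

    The RMS error of a uniform quantizer is roughly step/√12, so each lost bit doubles the error, but unlike analog noise it is deterministic and can be learned around during training.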

  • @profdc9501

    @profdc9501

    1 year ago

    @@taktoa1 Run something as simple as MNIST on an optical accelerator and get 99% accuracy, and then we'll talk. The key with digital quantized neural networks is that despite being quantized they are also deterministic: given an input, the output is the same each time, as there is no measurement noise. Therefore if you train with quantization error, the network can learn that error. Analog physical systems, however, have measurement error. It's not just that the optical system achieves the "equivalent" of 2-4 bits of precision; it's that no matter how many average photons are used to represent a signal, there are going to be measurement outliers. Due to the nonlinear operations of ReLU and Maxpool, outliers caused by measurement error can accumulate across deep network layers. So it seems to me that having many deep layers and nonlinear operations like ReLU and Maxpool makes it extremely difficult for an analog multiplier, especially one susceptible to quantum noise, to produce reproducible, reliable inference.

    Because of the extreme sensitivity of feedforward neural networks to cumulative error, if training is performed digitally for inference that is to occur on an analog/optical computer, the training model must be extremely accurate, including the effects of quantization, noise sources (Poisson, thermal, coherent), system manufacturing error, etc., and even then the variation due to measurement error may limit the ultimate inference accuracy. It may be necessary to train a neural network for each physical system, because the manufacturing tolerances of two different optical chips may be too different for a network trained on one chip to work on another. Biological neural networks seem to work quite effectively without being deterministic, despite being implemented on analog wetware.

    Deep feedforward neural networks seem like a poor fit for analog computing, especially quantum-noise-limited computing, where power consumption is directly set by the number of photons required to achieve a certain SNR: Poisson noise makes SNR proportional to the square root of power, so SNR increases only slowly with increased power consumption. Even other solutions that use electric charge (mythic.ai/), with similar charge-quantization problems, are limited in the number of layers they can implement. The whole reason feedforward deep neural networks were created in the first place is that backpropagation is possible using a bit of clever calculus and the chain rule. Training is the problem: if you don't have another kind of neural network that you can effectively train and that is resistant to measurement error, analog computation is not going to be a viable solution for neural network inference. Neural network accelerators like the Tensor processor have sucked all the air out of the room for research into any other kind of neural network architecture, and as long as this is the case, the market will not care about analog computers, because current feedforward deep neural networks were created for deterministic, digital machines.
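    The SNR-versus-power point in the comment above follows directly from Poisson statistics: with shot-noise-limited detection of N photons, the noise is √N, so SNR = N/√N = √N, and each extra bit of precision (~6 dB) costs roughly 4x the optical power. A minimal sketch with assumed photon counts:

    ```python
    import math

    # Shot-noise-limited detection: the photon count N is Poisson-distributed,
    # so noise ~ sqrt(N) and SNR = N / sqrt(N) = sqrt(N).
    def shot_noise_snr_db(photons):
        return 20 * math.log10(math.sqrt(photons))

    # Quadrupling the photon budget buys only ~6 dB (~1 bit) of SNR.
    for n in (1e4, 4e4, 1.6e5):
        print(f"{n:9.0f} photons -> {shot_noise_snr_db(n):5.1f} dB")
    ```

    This is why analog optical precision is so expensive to buy back with laser power.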

  • @10-AMPM-01
    @10-AMPM-01 1 year ago

    8:23 - I'm not very surprised. I figured it could be done. That kind of manufacturing isn't in my wheelhouse. But, architecture is, haha.

  • @quaidcarlobulloch9300
    @quaidcarlobulloch9300 1 year ago

    12:41 LET'S GO, literally called it, because rate coding is how our neurons are organized!

  • @jasonkocher3513
    @jasonkocher3513 1 year ago

    I'm far, far away from this area of study, but I am an EE nonetheless... could they replace the "thin film heater" with a piezo element on each of those interferometers to slightly deform one leg? This stuff is so cool.

  • @googacct
    @googacct 1 year ago

    One thing that seems to be overlooked in the video is the use of a nonlinear activation function as part of the computation. I do not think matrix multiplies all by themselves give the desired effect.

  • @raphaelcardoso7927

    @raphaelcardoso7927

    1 year ago

    Usually the nonlinearity is achieved outside of the photonic part :/

  • @pathos48
    @pathos48 1 year ago

    Congratulations on a very interesting and informative video. However, I guess you probably meant femto- rather than petajoule per MAC.

    Furthermore, the speed of light is invoked inopportunely, both to justify the very large frequencies and the short computation time. In optical fibres light propagates 1.5 times more slowly than in empty space, and in SOI waveguides even 2.8-3 times more slowly; by contrast, the RF or microwave signal in a modulator travels faster. And in general, electricity propagates at a speed comparable to c, because it is the electromagnetic field that propagates, not the electrons in the metal, which, as a whole, drift at cm/h (under DC). The point is that in photonics you use dielectrics like glass, silica or intrinsic Si, so absorption is much smaller than in a conductive material; this would be evident if the same circuit were implemented with microwaves on a microstrip. However, the problem with metal connections at high clock rates in digital circuitry is that you have to charge and discharge the parasitic capacitance of those lines. Regarding the second misconception: an electrical circuit would be much slower than the MZI mesh because of its RC time constants. It's not that the electric signal propagates more slowly; it's that the transient is much longer.

    Regarding the 1980s Bell Labs research you mentioned, I guess that they made an optical computer; I doubt (but I should check) that it was based on optical transistors, as that technology is still at the proof-of-concept phase. However, this does not change your point: announcements about silicon photonics neural networks replacing TPUs must be taken with caution.
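    The RC-time-constant point in the comment above can be made concrete with a back-of-the-envelope rise-time calculation. The resistance and capacitance values are assumed, purely for illustration:

    ```python
    import math

    # RC-limited interconnect: the bottleneck is charging the parasitic
    # capacitance of the line, not how fast the wave itself propagates.
    R = 100.0   # ohms, driver output + wire resistance (assumed)
    C = 1e-12   # farads, ~1 pF parasitic line capacitance (assumed)

    # Exponential charging: V(t) = Vdd * (1 - exp(-t / RC)),
    # so the time to reach 90% of Vdd is t = RC * ln(10).
    t90 = R * C * math.log(10)
    print(f"90% rise time ≈ {t90 * 1e12:.1f} ps")
    ```

    Even with a 100 ps transit time far below a nanosecond, every extra picofarad of parasitic capacitance stretches the transient, which is the "much longer transient" the comment refers to.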

  • @superpie0000
    @superpie0000 11 months ago

    With what you said at about 10:10 about analog not having the accuracy: could I theoretically use multiple streams of analog to convey greater resolution? Like, if I want more accuracy, I could have a 1's channel and a 1/2 channel for 2x the accuracy, the way binary (or any other number system) has place values. I'd imagine that if the error is on the reading side, and not in the light-multiplication part, then that could work (I don't really understand light; spooky stuff, like magnets and electrons). However, in an op-amp implementation I'd imagine the lower places would leak more noise, as they carry heavier weight (this could be reduced by differential noise reduction, but EMF is unbeatable). Another solution would be to fill up the smaller place with value, then, as you fill the first place, use the second as an extension, letting you stack the channels into one massive in-depth channel made up of many streams of light or whatever the medium is. I don't know how any of this works whatsoever, but I would love to know if this method is of any use for improving accuracy at the expense of complexity.

  • @darthmoomoo
    @darthmoomoo 1 year ago

    5:03 Are you sure it's petajoules? That's the energy equivalent of about a quarter of a megaton of TNT.

  • @prabhatp654
    @prabhatp654 8 months ago

    Uhmm, that photon will be generated by a laser, and lasers are known to be big power churners, so how does that make them better?

  • @nanobrains
    @nanobrains 1 year ago

    Thanks!

  • @nexusyang4832
    @nexusyang4832 1 year ago

    Thursdays I fry my brain with First We Feast in the morning and then educate myself at night with Asianometry.

  • @elliott614
    @elliott614 1 year ago

    Correction: signals on high-speed electrical transmission lines also travel at close to the speed of light. Electronic circuits are physically electromagnetic waves; circuit theory is just an adequate, very simplified tool much of the time. Meanwhile the overall movement of the electrons themselves, i.e. electron drift, is incredibly slow (shockingly slow if you didn't know already). Typical FR4 dielectric on a PCB slows the waves by roughly a factor of 2 (the square root of its dielectric constant). Fibre optics also slow the light; how much depends on the material.

  • @John.S92
    @John.S92 1 year ago

    "old" silicon can still compete well into the future, well, if we replace silicon for another material (yes that is in research since +10 years) we could get the same result with less electricity used, thus faster and more efficient computers. The "Real" fun begins when scientists achieves room-temperature superconductivity, that would enable computers running Much, much faster than current computers while using close to zero in electricity (as superconductivity would allow electrons to flow with no resistance, thus using electricity solely for the calculations/data movement through the material)

  • @youngmonk3801
    @youngmonk3801 1 year ago

    Are these light matrices forming AND, NOR, OR, XOR gates, etc.? Or is this a different type of computing that isn't "Turing style"? In other words, are neural networks different from these logic gates?

  • @Andrew-rc3vh
    @Andrew-rc3vh 1 year ago

    Hmm, I get the hunch that these chips will go the way of transputers: a nice idea at the time, but I feel something far smaller is around the corner. There is a Chinese researcher who is doing it at the molecular scale using crystal lattices, doping them with different atoms to change their topological properties. The light therefore behaves according to the structure of the lattice. You can use entangled photons to send signals, and the photons themselves are in a squeezed-light state. I understand the doping is done using a femtosecond laser. I'm a little unclear on the details, but it is the leading edge in photonic computers. The work is published in respectable journals but still highly experimental.

  • @shapelessed
    @shapelessed 1 year ago

    6:20 - Wrong. It's not strictly because the signals move at the speed of light; electricity travels at about 270,000 km/s, really close to light speed. What is really holding electronic processors back is that electricity generates heat, and exponentially more as you increase throughput, whereas light does not.

  • @runforitman
    @runforitman 1 year ago

    6:19 voltage potential is also transmitted at the speed of light

  • @cubertmiso
    @cubertmiso 1 year ago

    Which companies making some of the axes for the photonic/silicon era would you consider investing in?

  • @firstlast9504
    @firstlast9504 1 year ago

    Enjoy your YouTube video subjects! Silicon photonics, cool. ✌

  • @ahmedatef6017
    @ahmedatef6017 1 year ago

    Something that most of the pro-photonic-accelerator crowd often ignores, whether in academia or industry, is the power consumed by the LASER!!! They often leave it out to make their efficiency numbers look hot!

  • @JorgetePanete

    @JorgetePanete

    1 year ago

    How much is it, and is it getting lower?