Can you give an example of threads, like the one you mentioned for processes (each tab in a browser is a process)? And can there be user and kernel processes?
@AdiTeman 2 days ago
Errata: At time @23:51 the number of MACs should be 10,275 for an 89% reduction and not as stated. Thanks @FritzKissa for noticing this mistake.
@lalit6001 4 days ago
Can anyone help me understand the Body Effect intuitively, without using equations?
@AdiTeman 2 days ago
Hi, The easiest thing in my opinion is just to remember the following: "Negative Body Biasing raises VT". For an NMOS, that means that if I put a negative voltage on the bulk, my VT is higher. For a PMOS, the opposite - if the NWELL is biased higher than VDD, the |VT| of the PMOS becomes higher (i.e., the VT becomes more negative). To understand in general, physicists have many ways of explaining things that are all different ways of looking at them. One thing you could try is that there is a certain amount of voltage you need to apply to the gate in order to invert the channel. If I put a negative voltage on the body, I need to apply more voltage on the gate to bring the channel to the same voltage. This can be supported by the two capacitor drawing I show many times in the slides, where the gate-to-channel capacitor is what we want to control the channel, but the channel-to-body (or channel-to-others) capacitance is "fighting" with the gate voltage. If we apply a negative body voltage (on an NMOS), we are effectively increasing the gate-to-body cap and making it harder to invert the channel. Hope that helps...
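For readers who do want to see numbers behind "Negative Body Biasing raises VT", here is a minimal sketch using the common textbook body-effect expression VT = VT0 + gamma * (sqrt(2phiF + VSB) - sqrt(2phiF)). The parameter values (vt0, gamma, two_phi_f) are illustrative assumptions, not values from the lecture or from any real process.

```python
import math

def threshold_voltage(v_sb, vt0=0.4, gamma=0.4, two_phi_f=0.6):
    """NMOS threshold voltage (V) for a given source-to-body voltage v_sb (V).

    vt0 (V), gamma (V^0.5) and two_phi_f (V) are assumed, illustrative values.
    """
    return vt0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))

# Source grounded, so a negative body voltage means a positive V_SB.
for v_body in (0.0, -0.5, -1.0):
    v_sb = -v_body
    print(f"V_body = {v_body:+.1f} V -> VT = {threshold_voltage(v_sb):.3f} V")
```

With these assumed numbers, each step of more negative body voltage prints a higher VT, which is the trend described in the reply above.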
@FritzKissa 7 days ago
At the 24-minute mark I don't understand where the MAC count of 16,675 comes from. You have 3 pcs of 3x3 kernels that move (convolve) 5x5 times: 3x3x3x5x5 = 675 MACs in the group convolution part, and then 128 pcs of 1x1x3 kernels that move 5x5 times: 128x1x1x3x5x5 = 9,600 MACs in the pointwise part, i.e. the total is 10,275 MACs. Can you check this?
@AdiTeman 2 days ago
Hi @FritzKissa - You are 100% correct. I don't know where this error came from. I am pinning a comment with the correction. Thanks!
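The arithmetic in this thread can be double-checked with a few lines of Python. This is only a sketch based on the numbers quoted in the comment (3 input channels, 3x3 depthwise kernels, 128 pointwise kernels, 5x5 output positions); the function name and parameters are made up for illustration.

```python
def depthwise_separable_macs(in_channels, out_channels, kernel, out_h, out_w):
    """Return (depthwise, pointwise) MAC counts for one separable conv layer."""
    # One kernel x kernel filter per input channel, applied at every output position.
    depthwise = in_channels * kernel * kernel * out_h * out_w
    # out_channels filters of size 1 x 1 x in_channels, at every output position.
    pointwise = out_channels * in_channels * out_h * out_w
    return depthwise, pointwise

depthwise, pointwise = depthwise_separable_macs(3, 128, 3, 5, 5)
print(depthwise, pointwise, depthwise + pointwise)  # 675 9600 10275
```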
@umarnadeem4074 11 days ago
Can you recommend a textbook for this course as well?
@AdiTeman 2 days ago
Hi, Actually, that is not an easy thing. I don't know of any comprehensive textbook that provides everything I cover in this course. That said, there are many books and other material that you can find that together will provide everything. What my main suggestion is, is to look at the list of references at the end of each lecture. I have tried to put the main sources that I relied on (other than my own experience and knowledge), so for each lecture, some relevant text book may be available.
@qeq167 11 days ago
I can't believe this course is free. Thank you very much! Lecture 1 is one of the best introductions to a course I have ever taken.
@AdiTeman 11 days ago
Thank you for the appreciation. I have a lot more content available on YouTube for free, which you can access through my channel or see in a more convenient layout on the EnICS Labs website enicslabs.com/education/ I'd love to hear more feedback and I hope to find time to create more content over the summer.
@Editzzor109 12 days ago
Can we do cell padding for pin density?
@AdiTeman 2 days ago
Hi, Yes, indeed you can pad cells and that will reduce the pin density. Of course, this comes at the cost of extra area, but it could be useful to apply to certain hierarchies. I guess this is the same as applying a lower target utilization, though going at it from "a different direction". As a side note, cell padding is usually used for things like leaving space to put decaps next to flip flops and clock buffers to improve the dI/dt drop near the toggling clocks. But it could be used as you suggested, as well.
@menakaa6405 14 days ago
Could you please explain what a timing model is?
@AdiTeman 2 days ago
Hi, A timing model is a simplification of how to calculate the delay (and a few other things) through a digital gate. This is fully covered in Lecture 3 of this series. Specifically, I suggest you watch this video: kzread.info/dash/bejne/k594ls58YMfLfc4.html
@haziqiqbalhussain 17 days ago
Hey Professor. How can we identify intuitively (without tools) which net is aggressor and which is victim? Does it depend on frequency or any other parameters?
@AdiTeman 2 days ago
Hi, So, my intuitive, straightforward answer is that if two nets are routed close to each other for a substantial distance, then there is a good chance that an SI issue will occur. But it's very hard to actually see this without a tool, since you have lots (...millions... billions...) of nets with different segments and so forth. Which is the victim and which is the aggressor? Well, first of all, they can "trade places". In other words, one of them can be the aggressor to the other for a certain timing path and can be the victim for a different timing path. But usually, the victim will be the one that is weakly driven, which can be seen as a slow transition on the net. You can look at DRVs - max capacitance and max transition (and max fanout) reports to find high potential candidates for SI problems. But really, just use the tool...
@shauryachandra2323 22 days ago
I have been following these lectures and absolutely love them. It would be great if you can please record a video series of using the Cadence tools to perform FPR and CTS on a small design in real time, maybe a live stream if possible so as to get some hands on approach.
@AdiTeman 2 days ago
Hi, Thank you so much for the kind words. Indeed, I do not provide any hands on/live material at this time. This is due to commercial restrictions - both of the CAD/EDA tools and of the IP that is used (standard cell libraries). In the future, I may get permission from the two sides and work on providing such material, but this is also really dependent on my time (both to go through the bureaucracy and then to actually go and make the recordings...). And as you can probably see from the amount of material I have uploaded lately, time is something I don't have a lot of. Who knows, maybe things will settle down and I'll find the time (though I have been promising my wife that "I will have more time next month" for about 15 years :).
@shauryachandra2323 2 days ago
@AdiTeman Thank you for the detailed response and for the incredible content you’ve already provided. I completely understand the constraints around commercial restrictions and time commitments. Your lectures have been immensely valuable, and I genuinely appreciate the effort you put into them. I have completed the DVD lecture series and I am going to start with your SoC playlist soon. If you ever do get the chance to navigate the bureaucracy and find the time, a hands-on series would be fantastic. Meanwhile, I’ll continue learning from your existing materials and look forward to any new content you can share. It would be great if you could suggest some online courses or any other references you know of that could help me gain this hands-on experience before I actually enter the industry. And I hope you manage to find that elusive free time soon :) Thanks again, and best wishes! Regards!
@user-ex3ub6bj1e 27 days ago
Thanks a lot for this video, Adi! I am more of an experienced Digital Design Engineer myself, so I know 90% of this stuff already. But for beginners this course is very valuable, Linux concepts are explained very clearly and with good examples. I only wish I had something like this course when I first started working with VLSI CAD tools and Linux. My path would have been so much easier! I will now watch other lectures on your channel for sure
@AdiTeman 27 days ago
Great to hear! Please give me feedback on my other videos and I'm open to suggestions for more material (though, who knows when I'll find time to record more :)
@purplesky2402 1 month ago
What is an EDA tool?
@AdiTeman 27 days ago
EDA stands for Electronic Design Automation. This is the general name of the programs used to design chips. We also call them CAD (computer-aided design) tools, but CAD is used in other fields, whereas EDA is usually used for hardware design utilities. I suggest watching my other courses to learn all about this field. You can find my material at enicslabs.com/education/
@VuThanhNinh 1 month ago
Thank you sir, your explanation is very easy to understand
@AdiTeman 27 days ago
You are most welcome
@thangdaoviet419 1 month ago
How can I get the slides?
@AdiTeman 1 month ago
All of the slides are available on the EnICS Labs website. The webpage for this course is at: enicslabs.com/academic-courses/dvd-english/
@thangdaoviet419 29 days ago
@AdiTeman thanks for your reply, but I can't find the direct link to download it. Can you please provide it?
@AdiTeman 29 days ago
@thangdaoviet419 No problem. There is a button on the right panel of each lecture that says "Lecture X Slides". For this specific lecture, the link is: www.dropbox.com/scl/fi/d5sqn83htkyifkbed7nfu/Lecture-3-Synthesis-Part-1.pdf?rlkey=e0jfxerycb03brp3feq3q265t&dl=0
@thangdaoviet419 29 days ago
@AdiTeman thank you very much, have a nice day ^^
@slimjimjimslim5923 1 month ago
I've been working in VLSI for 7 years, and I still come back to your videos to refresh or relearn something! A heck of a lot faster than reading my Neil Weste textbook too lol
@AdiTeman 1 month ago
Thank you! That's what makes me motivated to provide more material (hopefully, I will have time to make some more videos later this year).
@slimjimjimslim5923 1 month ago
Thank you so much, professor, for putting this online for free. There are many universities in the USA that only cover parts of VLSI, but nothing as clear and complete as your lectures. Some focus more on architecture, others more on circuit design. I found myself lacking in some areas when I entered industry 7 years ago. I got very good, but also very segmented, focusing on just some circuit design, timing calculation and analysis. But to become a good VLSI engineer, we also need breadth, and that doesn't happen naturally by staying in one place. Your classes are helpful in gaining that breadth. : - )
@AdiTeman 1 month ago
You are so welcome. That is what they are there for! Hopefully I will find time to prepare more videos later this year. I have a long queue of lectures waiting to be recorded - I just have to find the free time to get around to it.
@Kiladikannadiga123 1 month ago
Sir, after learning from these 73 videos, will I have learned enough to apply for VLSI designer jobs?
@AdiTeman 1 month ago
Well, I don't know if just watching the videos is enough, but it is a good start!
@AdiTeman 1 month ago
Errata: At time @9:54 there shouldn't be a "dx" in the expression for current. Thanks to @arghya.7098 for pointing this out.
@arghya.7098 1 month ago
At 9:54, shouldn't the expression for drift current be: I_d = −v(x)⋅Q(x)⋅W since current is velocity times charge, and the charge is proportional to the width of the channel. I don't understand why the infinitesimal channel length dx is included. Can you please clarify this point?
@AdiTeman 1 month ago
Yes, you are right. There is an "extra" dx there. Thank you for pointing this out (it "magically" disappeared on the next slide ;)
@arghya.7098 1 month ago
@AdiTeman Thank you for the clarification, Professor. I really enjoyed the lecture and appreciate your guidance on this point.
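For anyone following this thread, here is a sketch of the usual textbook derivation, which shows why no dx belongs in the current expression itself and where the dx legitimately shows up one step later. The symbols $\mu_n$, $C_{ox}$, $V_{GS}$, $V_T$, and $V_{DS}$ are standard textbook notation assumed here, and $Q(x)$ is taken as the channel charge per unit area; the slides may use different notation.

```latex
\begin{align}
I_D &= -v(x)\, Q(x)\, W \\
v(x) &= -\mu_n\, \xi(x) = \mu_n \frac{dV}{dx}, \qquad
Q(x) = -C_{ox}\left[V_{GS} - V(x) - V_T\right] \\
I_D\, dx &= \mu_n C_{ox} W \left[V_{GS} - V - V_T\right] dV \\
I_D\, L &= \mu_n C_{ox} W \left[(V_{GS} - V_T)\, V_{DS} - \frac{V_{DS}^2}{2}\right]
\end{align}
```

The dx only appears after the velocity is written in terms of dV/dx and both sides are multiplied through, just before integrating along the channel from 0 to L.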
@jaeyupchung 1 month ago
lifesaver
@AdiTeman 1 month ago
Thanks!
@sapandeepsandhu4410 1 month ago
GRT GET NEW STUFF
@mdomarfaruque493 1 month ago
Hello Sir, could you please provide relevant lab work? It would be great.
@AdiTeman 1 month ago
Hi, Sorry that I haven't provided it as of yet. I may be updating the course recordings soon (a lot has changed since 2020...) and maybe then I could add some labs.
@VIKRAMANDEVARAJ-wq4mf 1 month ago
Hi sir, thank you for the playlist. Is there any particular tool with which I can start building practical knowledge? There is SkyWater 130, but I use Windows, so I can't use it. What type of tools would you recommend?
@AdiTeman 27 days ago
Hi. Tough question. These things mostly run on Linux and aren't too friendly for home computers. But you can start by buying a starter FPGA and programming it. The FPGAs come with a tool suite that runs on Windows and you can learn a ton from it (and FPGA design is a very popular and required skill on its own). Try starter kits from Xilinx or Altera.
@haziqiqbalhussain 1 month ago
I think Din and Dout, and TX and RX, are mistakenly swapped at 10:15. If not, it would be great if you could explain the naming convention.
@AdiTeman 1 month ago
Hi Haziq, Indeed, this is a really bad naming convention, but I didn't invent it. Maybe I should have changed it, because it's so upside down, but I kept what was in a reference that I based it on. You can see in the bottom figure that the usage of DIN/DOUT is the same (pay attention that the PAD is connecting outside the chip, while the ports are connected to the chip core). I can try to give a makeshift explanation, but really, it is going to be a bad one, because if I were to design these I/Os, I would label the pins in the opposite way. Here it goes - "we are looking at the I/O from the perspective of the other chip, so its outputs are connected to DOUT and its inputs to DIN. Where the other chip is receiving is the RX I/O and where it is transmitting is the TX one". So after we got that horrible explanation out of the way, I will say that these things just depend on whatever the vendor who developed the circuit decided to call it. So you have to read the manual (good luck ;) and adhere to it.
@haziqiqbalhussain 1 month ago
@AdiTeman Well, the explanation isn't that horrible. And by not changing it, you saved the students from future confusion. Thanks!
@haziqiqbalhussain 1 month ago
Man whoever came up with this is a genius. Thank you Dr. Teman. Could you please confirm one thing. Connectivity matrix saves us from calculating each quadratic wirelength and differentiating them partially in order to arrive at A matrix, right?
@AdiTeman 2 days ago
Hi, I guess you could look at it this way, but I think it is more straightforward than that. The solution to the optimization problem is to differentiate the entire system. This is a set of equations that you can collect into matrix notation. The connectivity matrix is basically the outcome of writing down all equations and collecting them together. But the "observation" is that this matrix has features that "are intuitive" and "make sense" and represent the connectivity of the netlist and therefore it's just straightforward to write it down. Note that there is some amazing work by University of Texas (David Pan), where they use deep learning inspired optimization to solve the placement problem ("DREAMPlace"). I highly suggest looking into this, because it's really really beautiful.
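To make the "just write the matrix down from the connectivity" point concrete, here is a minimal one-dimensional sketch. The netlist (two movable cells between two fixed pads, all weights 1) and the helper names build_system and solve_linear are made up for illustration and are not from the lecture. Setting the derivative of the sum of w*(xi - xj)^2 terms to zero gives A x = b, where each diagonal entry is the total weight connected to that cell and each off-diagonal entry is minus the weight between two cells.

```python
def build_system(n_cells, cell_nets, pad_nets):
    """Build A and b directly from connectivity (1-D quadratic placement).

    cell_nets: (i, j, weight) between movable cells i and j.
    pad_nets:  (i, pad_x, weight) between movable cell i and a fixed pad.
    """
    a = [[0.0] * n_cells for _ in range(n_cells)]
    b = [0.0] * n_cells
    for i, j, w in cell_nets:
        a[i][i] += w
        a[j][j] += w
        a[i][j] -= w
        a[j][i] -= w
    for i, pad_x, w in pad_nets:
        a[i][i] += w
        b[i] += w * pad_x
    return a, b

def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

# Hypothetical netlist: pad at x=0 -- cell 0 -- cell 1 -- pad at x=3.
a, b = build_system(2, [(0, 1, 1.0)], [(0, 0.0, 1.0), (1, 3.0, 1.0)])
print(a, b)                # A = [[2, -1], [-1, 2]], b = [0, 3]
print(solve_linear(a, b))  # cells end up evenly spaced between the pads
```

Real placers use sparse solvers on millions of cells and handle x and y separately; the dense elimination here is only to keep the sketch self-contained.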
@omersayag8909 1 month ago
Where can I download the presentation? The link that appears only has the video lectures, without the presentation.
@AdiTeman 1 month ago
In the link, there is a button next to each lecture with a link to the presentation.
@AdiTeman 1 month ago
Errata: At time @23:24 the default state of the Mux should (of course) be 1'bx (and not 4'bx on a one bit signal). Thanks @atharvaagiwal6051 for paying attention to this.
@Engineer884 1 month ago
You didn't mention what 456 and 789 are. Is it a single number, 456789, or two different numbers?
@AdiTeman 1 month ago
Hi, Actually, your question made me go look this up again and make sure I didn't make a mistake in the lecture and it also showed me something that I'm not sure how deeply I thought about it before. Luckily, there is no mistake in the lecture (as far as I can tell), but maybe I should have given the example a bit differently to make it more clear. What we want to do is fill a number into a register that doesn't fit in the immediate field of addi (which holds a 12-bit signed value, so anything outside -2048 to 2047). What we have to do is break it down into two numbers. The first, we multiply by 4096 and then we add the second to it. The number, therefore, that gets stored in the example is 456x4096+789. This can be a bit complicated due to sign extension and two's complement numbers. Therefore, we usually wouldn't write this sequence by hand, but rather use the assembler pseudoinstruction "li" (load immediate). This is, assuming that you actually are hand writing your assembly code, rather than doing what most people do, which is using a high level language ;). But for the purpose of understanding this, if I would have done lui 0x456 followed by addi 0x789, then I would have gotten 0x456789 in the register (because a <<12 operation is 3 hexa positions).
@AdiTeman 1 month ago
Just another point, There is a great discussion about this here: stackoverflow.com/questions/50742420/risc-v-build-32-bit-constants-with-lui-and-addi
@Engineer884 1 month ago
@AdiTeman Thanks sir. Sorry for not mentioning the time stamp while asking the question.
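The lui/addi arithmetic in this thread, including the sign-extension wrinkle, can be sketched in Python. The function names are made up for illustration; the only facts relied on are that lui places a 20-bit value in the upper bits (a shift left by 12) and that addi adds a sign-extended 12-bit immediate.

```python
def lui_addi(upper20, lower12):
    """Emulate `lui rd, upper20` followed by `addi rd, rd, lower12` (32-bit)."""
    return ((upper20 << 12) + lower12) & 0xFFFFFFFF

def split_for_lui_addi(value):
    """Split a 32-bit constant into (upper20, lower12) operands."""
    value &= 0xFFFFFFFF
    lower = value & 0xFFF
    if lower >= 0x800:       # addi would sign-extend this to a negative number,
        lower -= 0x1000      # so use the negative value and compensate in upper
    upper = ((value - lower) >> 12) & 0xFFFFF
    return upper, lower

print(lui_addi(456, 789))                        # 456 * 4096 + 789 = 1868565
print(hex(lui_addi(0x456, 0x789)))               # 0x456789
print([hex(v & 0xFFFFF) for v in split_for_lui_addi(0x12345FFF)])
```

The last line shows the tricky case: for 0x12345FFF the low 12 bits would be read as -1, so the upper part has to be bumped to 0x12346. This is the bookkeeping that the li pseudoinstruction takes care of.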
@vivekartist6893 1 month ago
Hi Professor Adam. Your videos are simply wonderful. Huge fan! One query, could you please provide an example scenario of exclude pin usage for better understanding? Thanks.
@AdiTeman 1 month ago
Wow, that is such a great question. It is something I have been teaching for years as an option, but seemed to never have thought about what it's good for! Your question made me look it up. I will say that clock trees are a bigger subject than I discuss in this lecture - much bigger. Every chip has new "surprises" in the clock tree that make you use these different definitions, and every time I have run into one, I say "I'll use that as an example in a lecture on CTS", but by the time it becomes relevant (usually, this is in the middle of a tapeout or other stressful times ;), I totally forget the scenario and why we needed these weird commands... Anyway, back to your question, I found an answer. The general high-level reason is that there can be cases where a net is both on a clock path and a data path. In such a case, you want to buffer the clock part of the net, but not the data part of the net, so you would put "exclude" on the data pins. That is a great explanation, right? Well, the obvious question is "why the heck would a clock net be a data net as well???". And the answer is not too obvious. One answer could be that you may have some observation circuitry on the clock and you treat this as data. But one of the user manuals shows a more common case that actually makes sense and that is the case of a clock divider. A clock divider is just a bunch of flops, where the output of one drives the clock signal of the other with a toggled input. In this case, we have the Q pin of the flop driving a clock net and so it needs to be handled by CTS. Until now, all is good - we want these to be buffered and such. So this is not the case of mixing clock and data. HOWEVER, we perform scan insertion (which I didn't cover in this course), where all flops are connected in a shift register configuration (scan chain) for testing. 
In this case, the clock net emanating from the Q pin of the divider flop goes to the CK pin of the other divider flop, but also goes to the SI pin of the next flop in the scan chain. This IS a data path and shouldn't be treated as a clock net. So the SI pin should be regarded an EXCLUDE pin. (note, this may be handled inherently by the CTS engine). Thanks for pointing this out so I learned something new.
@haziqiqbalhussain 1 month ago
Thanks for the amazing lectures, sir. I have a question. At 5:00 the calculated HPWL is 7, while if we count the units manually it's 9, as each of the two bottom-right cells also takes one unit of wirelength. You said that these cells don't add much to the wirelength, but what if they are a little farther away (inside the bounding box)? The actual wirelength could be 10 or 11 units and our estimation would be quite off.
@AdiTeman 1 month ago
Hi Haziq, Your question is legitimate, because I can say that I have been confused by this more than once (don't tell anyone 😂). But actually, it's almost irrelevant. The reason is that we are building an estimator that is trying to somehow quantify the wirelength, so we can define a cost function and optimize it. It is very clear that our estimator is far from accurate - it just needs to represent something that is better or worse than something else, so we can change it and see if that helps. So counting the distance (|x1-x2|) or the number of blocks that the wire occupies (which comes out |x1-x2|+1) is essentially the same for the purpose we are using it. In addition to that, as you will see in the following parts of the lecture, we are actually using other estimators more predominantly (such as the quadratic wirelength estimator). But the point is the same - we are trying to put a number that represents how good or bad our solution is for comparison to a different solution. You could look at it as a rating system that gives you 1-5 stars or a grade of 1-10 - as long as the better option has a higher rating, it doesn't really matter what the scale is.
@haziqiqbalhussain 1 month ago
@AdiTeman Thanks. I have completed the following parts of the lecture and got your point. Thank you for taking the time to explain.
@AdiTeman 1 month ago
@haziqiqbalhussain Great
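The point about the estimator only needing to rank solutions can be illustrated with a small sketch. The pin coordinates below are hypothetical (they are not the ones from the slide), and hpwl_counting_blocks is just a made-up name for the "count the occupied blocks" convention.

```python
def hpwl(pins):
    """Half-perimeter wirelength of one net; pins is a list of (x, y)."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def hpwl_counting_blocks(pins):
    """Same bounding box, but counting occupied blocks (+1 per dimension)."""
    return hpwl(pins) + 2

placement_a = [(0, 0), (4, 3), (3, 1), (4, 1)]  # hypothetical pin locations
placement_b = [(0, 0), (6, 4), (3, 1), (4, 1)]  # same net, one corner pin moved

for name, pins in (("A", placement_a), ("B", placement_b)):
    print(name, hpwl(pins), hpwl_counting_blocks(pins))
```

Both conventions differ by a constant, so they agree on which placement is better. Pins inside the bounding box do not change either number, which is exactly the inaccuracy raised in the question and accepted as part of using a cheap estimator.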
@iliachakarov7285 1 month ago
These lectures are really well done!!
@AdiTeman 1 month ago
Thanks!
@Engineer884 1 month ago
At 3:31, it's written that the frame pointer points to the beginning of the frame, but you said it points to the base of the frame. Is the base of the frame really the beginning of the frame, or is there a problem with what's written?
@AdiTeman 1 month ago
Hi, I don't think there's a problem. The "base" and "beginning" that I refer to are the same thing. Basically, it's the opposite of the stack pointer that points to the top of the stack. The frame pointer points to the bottom so we can return to where it started.
@Engineer884 1 month ago
@AdiTeman oh I see, thank you for replying
@iliachakarov7285 1 month ago
Now this lecture I like!! gj
@AdiTeman 1 month ago
Glad to hear that!
@user-np8ut6zv3m 2 months ago
Nice explanations, sir... Thank you 😊
@AdiTeman 1 month ago
Thank you!
@atharvaagiwal6051 2 months ago
In the default statement, the output should be 1'bx for a 4:1 mux.
@AdiTeman 1 month ago
Haha, Great catch! This slide (and video) has been around for quite some time and no one ever pointed that out. This is not an error, per se, since it's just an X (and 4'bx is basically the same as 1'bx) and the simulator and synthesizer would (hopefully) disregard this, but it was for sure unintentional in the slide. Thanks for finding this. I will pin the errata!
@cahitskttaramal3152 2 months ago
Hi! I coded my own SPICE and this series helped me a lot! Thank you! The hardest part is nonlinear elements. A wrong first guess causes lots of trouble. I wonder if there is a method for guessing the starting point as well. I think I will add the bisection method after some failed iterations. Also, do you know anything about modified trap integration? Mike Engelhardt (creator of LTspice) claims that it is the best method (simply a true trapezoid without the oscillation effects) and that only he knows it.
@AdiTeman 1 month ago
Hi, I think you just went beyond my pay grade :). I'm far from being an expert - and for sure when it comes to deep down things that affect analog simulations - that's out of my expertise. There are many gurus out there, but I've had the pleasure to meet two of them - Andrew Beckett of Cadence (he answers many questions on online Cadence forums and is amazingly knowledgeable) and Prof. Andrei Vladimirescu, one of the designers of SPICE (and author of "The SPICE Book"). Andrei is a good friend and would probably be happy to answer your question if you reached out to him (...and bonus points if you told him that I sent you ;).
@LamNguyen-te2lq 2 months ago
Hi there, I have a question: do we actually apply this methodology ourselves during the placement step, or do the tools do it automatically? And thank you for your videos, they are so good.
@AdiTeman 2 months ago
Hi Lam, Luckily, the algorithms are deeply implemented into the tools, so we don't have to do anything ourselves in terms of implementing them ;)
@zichendu5565 2 months ago
Awesome lesson! A thermal engineer here, trying to learn semiconductors to do my job (electronics thermal management) better. I would be glad to know if anyone else here has a similar background.
@AdiTeman 1 month ago
Great to know that my lectures are reaching other disciplines. I'd also love to hear if there are any other thermal engineers watching these.
@Shahidsoc 2 months ago
Why are multiplexers needed? To choose between different frequencies? For dynamic voltage and frequency scaling?
@AdiTeman 1 month ago
Hi, Yes, these can be cases for multiplexers on the clock tree. There are actually many cases where you would multiplex several clocks onto the same clock tree. Just as an example that is commonly found on SoCs: We usually provide two clocks: (1) An external reference (from a crystal) with a low frequency (usually less than 100MHz and sometimes much lower) (2) An internally generated clock from a PLL or other clock generator. When we boot the system, the generated clock is not available. It takes time for the PLL to "lock". Additionally, we want a backup in case the PLL doesn't work properly or something like that. So we drive both clocks into a Mux, boot with the external clock, and transfer to the internal clock once it is stable.
@eda1058 2 months ago
At 1:32:20, why do we draw the data like that, with one line going up and the other going down at the same time, while for the clock there is only one line going up and down in a row?
@AdiTeman 1 month ago
Hi, Thanks for the question. Some things we - experienced engineers - take for granted, since we're used to seeing them so often and we may not explain them (though I do vaguely remember explaining this in one of my lectures somewhere). The clock is very deterministic. Every clock period, it goes up once and down once and it does this repeatedly at a constant rate. Therefore, this is how we draw the clock signal. On the other hand, we don't know what the data is in a general case. It could be '1', could be '0', could stay constant and could toggle. Therefore, we draw it in that "changing" or "unknown" kind of way. It is supposed to represent "all cases" where we must take into account that any level can be there and where the X's are, it may be changing (it will be stable where the lines are straight).
@anuragharidasu5746 2 months ago
Thank you so much for the detailed explanation
@AdiTeman 1 month ago
You are welcome!
@harshitsrivastav5217 2 months ago
Hello sir, your explanation is wonderful and I appreciate your efforts. Could you help me understand the details of the CCS model and how it differs from the WLM model?
@AdiTeman 1 month ago
Hi, So I think this is covered in Lecture 3 (Synthesis - Part 1). Specifically, you can find the video about Liberty timing models here: kzread.info/dash/bejne/X4KbudWKeqatnNY.html I don't go into great detail about CCS models (they're actually very complicated), but I explain where they come from and such. Wire load models (WLM) are something a bit different and I cover them in this lecture, as well. They basically are just a (really poor) estimation of the RC load of a net based on the fanout.
@ndjarnag 2 months ago
good
@AdiTeman 2 months ago
Thanks
@visheshjain9044 2 months ago
Can you please suggest a course on layout design?
@AdiTeman 2 months ago
Hi, Sorry, but I don't know of any freely available courses on layout design. I'm not saying that there aren't any of these - I just am not aware of them. I am pretty sure that the leading EDA companies (e.g., Cadence, Synopsys, Siemens-EDA) provide such training and these are often free to academic participants. In addition, I get emails from KalTech (kaltech.co.il/ic-training/) with offers for their layout courses, but I have never taken one of them, so I cannot recommend them.
@visheshjain9044 2 months ago
Analog on Top
@AdiTeman 2 months ago
Hi, I'm not sure what you are commenting. I would like to point out that this short presentation is for digital-on-top integration. In other words, the flow is run in a digital place and route tool, such as Cadence Innovus or Synopsys Fusion. The point of the presentation is to discuss how to prepare a custom-designed block for integration in a digital flow. This is as opposed to analog-on-top, which is preparing the entire design in a transistor-level tool, such as Cadence Virtuoso. Here, the custom block will be designed in such a tool, but the overall tapeout will be prepared in the digital tool.
@ndjarnag 2 months ago
this is good
@AdiTeman 2 months ago
Thanks!
@Shahidsoc 2 months ago
Different cells have different transitions, so which value should be chosen, and from which cell's table? And what should we do if we have CCS model libraries?
@AdiTeman 2 months ago
Hi. I'm not sure I understand the question exactly, but I'll try to answer. If you mean what to set as a default input transition - this is a good point. We want to model the input transition, but we don't know what is connected to the input and how. Therefore, the calculation will not be accurate. This is a limitation that cannot be overcome, but the point here is to get a "good estimate". In general, digital delay modeling is NOT ACCURATE (as opposed to SPICE), but provides a tradeoff between run-time and accuracy. So we are trying to get a number that is "good enough", while it is clear that this is not 100% right. In this case, we have two options - 1) provide a number that characterizes the process. This is very inaccurate and has nothing to do with CCS or the different gates. It's just some number that could be a reasonable transition so the delay of the next gate falls within the timing tables. 2) provide a typical gate from the library that may be connected to the input. In this case, the .lib (including CCS) of the gate is used for the delay calculation. This is not the gate that is actually connected, but is typical of the technology/library and therefore is a good estimate. In any case, these are just estimations. Not accurate. But better than assuming something that is not based on anything...
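To show why the assumed input transition matters, here is a minimal sketch of a table-based delay lookup with bilinear interpolation, in the spirit of the lookup-table timing models discussed above. The axis values and delays are made-up placeholders, not data from any real library, and real tools may interpolate or extrapolate differently.

```python
from bisect import bisect_right

def interp_delay(slews, loads, table, slew, load):
    """Bilinear interpolation of delay over (input slew, output load)."""
    def bracket(axis, v):
        # Index of the lower grid point and the fractional position above it.
        i = min(max(bisect_right(axis, v) - 1, 0), len(axis) - 2)
        return i, (v - axis[i]) / (axis[i + 1] - axis[i])
    i, ts = bracket(slews, slew)
    j, tl = bracket(loads, load)
    return (table[i][j] * (1 - ts) * (1 - tl) + table[i][j + 1] * (1 - ts) * tl
            + table[i + 1][j] * ts * (1 - tl) + table[i + 1][j + 1] * ts * tl)

# Hypothetical table: rows = input slew (ns), columns = load (pF), values = delay (ns).
slews = [0.01, 0.05, 0.20]
loads = [0.001, 0.010, 0.050]
table = [[0.020, 0.045, 0.150],
         [0.030, 0.055, 0.160],
         [0.065, 0.090, 0.200]]

for assumed_slew in (0.01, 0.05, 0.20):
    print(assumed_slew, interp_delay(slews, loads, table, assumed_slew, 0.010))
```

With these placeholder numbers, the same gate driving the same load gets a noticeably different delay depending on the input transition assumed at the port, which is why a typical driving cell is usually a better guess than an arbitrary number.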
@sayanbaidya9724 2 months ago
Wow, best explanation
@AdiTeman 2 months ago
Thank you!
@dGooddBaddUgly 3 months ago
Your lectures have been GOLDEN Professor. Thank you so so much! God Bless you and your family!
Пікірлер
can you given an example or threads, like you mentioned for process, each tab in a browser is a process. And can there be user and kernel process?
Errata: At time @23:51 the number of MACs should be 10,275 for an 89% reduction and not as stated. Thanks @FritzKissa for noticing this mistake.
Can anyone help me understanding Body Effect intuitively without using equations?
Hi, The easiest thing in my opinion is just to remember the following: "Negative Body Biasing raises VT". For an NMOS, that means that if I put a negative voltage on the bulk, my VT is higher. For a PMOS, the opposite - if the NWELL is biased higher than VDD, the |VT| of the PMOS becomes higher (i.e., the VT becomes more negative). To understand in general, physicists have many ways of explaining things that are all different ways of looking at them. One thing you could try is that there is a certain amount of voltage you need to apply to the gate in order to invert the channel. If I put a negative voltage on the body, I need to apply more voltage on the gate to bring the channel to the same voltage. This can be supported by the two capacitor drawing I show many times in the slides, where the gate-to-channel capacitor is what we want to control the channel, but the channel-to-body (or channel-to-others) capacitance is "fighting" with the gate voltage. If we apply a negative body voltage (on an NMOS), we are effectively increasing the gate-to-body cap and making it harder to invert the channel. Hope that helps...
At 24 minute mark I don't understand where the MAC count of 16,675 comes from. You have 3 pcs of 3x3 kernels, that move (convolve) 5x5 times; 3x3x3x5x5=675 MAC in the group convolution part and then 128 pcs of 1x1x3 kernels that move 5x5 times; 128x1x1x5x5 = 9,600 MAC in the pointwise part, i.e. the total MAC is 10,275. Can you check this?
Hi @FritzKissa - You are 100% correct. I don't know where this error came from. I am pinning a comment with the correction. Thanks!
Can you recommend a textbook for this course as well?
Hi, Actually, that is not an easy thing. I don't know of any comprehensive textbook that provides everything I cover in this course. That said, there are many books and other material that, together, will provide everything. My main suggestion is to look at the list of references at the end of each lecture. I have tried to list the main sources that I relied on (other than my own experience and knowledge), so for each lecture, some relevant textbook may be available.
I can't believe this course is free. Thank you very much! Lecture 1 is one of the best introductions to a course I have ever taken.
Thank you for the appreciation. I have a lot more content available on KZread for free, which you can access through my channel or see in a more convenient layout on the EnICS Labs website enicslabs.com/education/ I'd love to hear more feedback and I hope to find time to create more content over the summer.
Can we use cell padding to reduce pin density?
Hi, Yes, indeed you can pad cells and that will reduce the pin density. Of course, this comes at the cost of extra area, but it could be useful to apply to certain hierarchies. I guess this is the same as applying a lower target utilization, though going at it from "a different direction". As a side note, cell padding is usually used for things like leaving space to put decaps next to flip flops and clock buffers to improve the dI/dt drop near the toggling clocks. But it could be used as you suggested, as well.
Could you please explain what a timing model is?
Hi, A timing model is a simplification of how to calculate the delay (and a few other things) through a digital gate. This is fully covered in Lecture 3 of this series. Specifically, I suggest you watch this video: kzread.info/dash/bejne/k594ls58YMfLfc4.html
Hey Professor. How can we identify intuitively (without tools) which net is aggressor and which is victim? Does it depend on frequency or any other parameters?
Hi, So, my intuitive, straightforward answer is that if two nets are routed close to each other for a substantial distance, then there is a good chance that an SI issue will occur. But it's very hard to actually see this without a tool, since you have lots (...millions... billions...) of nets with different segments and so forth. Which is the victim and which is the aggressor? Well, first of all, they can "trade places". In other words, one of them can be the aggressor to the other for a certain timing path and can be the victim for a different timing path. But usually, the victim will be the one that is weakly driven, which can be seen as a slow transition on the net. You can look at DRVs - max capacitance and max transition (and max fanout) reports - to find high-potential candidates for SI problems. But really, just use the tool...
I have been following these lectures and absolutely love them. It would be great if you can please record a video series of using the Cadence tools to perform FPR and CTS on a small design in real time, maybe a live stream if possible so as to get some hands on approach.
Hi, Thank you so much for the kind words. Indeed, I do not provide any hands on/live material at this time. This is due to commercial restrictions - both of the CAD/EDA tools and of the IP that is used (standard cell libraries). In the future, I may get permission from the two sides and work on providing such material, but this is also really dependent on my time (both to go through the bureaucracy and then to actually go and make the recordings...). And as you can probably see from the amount of material I have uploaded lately, time is something I don't have a lot of. Who knows, maybe things will settle down and I'll find the time (though I have been promising my wife that "I will have more time next month" for about 15 years :).
@@AdiTeman Thank you for the detailed response and for the incredible content you’ve already provided. I completely understand the constraints around commercial restrictions and time commitments. Your lectures have been immensely valuable, and I genuinely appreciate the effort you put into them. I have completed the DVD lecture series and I am going to start with your SoC playlist soon. If you ever do get the chance to navigate the bureaucracy and find the time, a hands-on series would be fantastic. Meanwhile, I’ll continue learning from your existing materials and look forward to any new content you can share. It would be great if you could suggest some online courses or any other references in your knowledge that could help gain this hands on experience before I actually enter the industry. And I hope you manage to find that elusive free time soon :) Thanks again, and best wishes! Regards!
Thanks a lot for this video, Adi! I am more of an experienced Digital Design Engineer myself, so I know 90% of this stuff already. But for beginners this course is very valuable, Linux concepts are explained very clearly and with good examples. I only wish I had something like this course when I first started working with VLSI CAD tools and Linux. My path would have been so much easier! I will now watch other lectures on your channel for sure
Great to hear! Please give me feedback on my other videos and I'm open to suggestions for more material (though, who knows when I'll find time to record more :)
What is an EDA tool?
EDA stands for Electronic Design Automation. This is the general name of the programs used to design chips. We also call them CAD (computer-aided design) tools, but CAD is used in other fields, whereas EDA is usually used for hardware design utilities. I suggest watching my other courses to learn all about this field. You can find my material at enicslabs.com/education/
Thank you sir, your explanation is very easy to understand
You are most welcome
How can I get the slides?
All of the slides are available on the EnICS Labs website. The webpage for this course is at: enicslabs.com/academic-courses/dvd-english/
@@AdiTeman Thanks for your reply, but I can't find the direct link to download it. Can you please provide it?
@@thangdaoviet419 No problem. There is a button on the right panel of each lecture that says "Lecture X Slides". For this specific lecture, the link is: www.dropbox.com/scl/fi/d5sqn83htkyifkbed7nfu/Lecture-3-Synthesis-Part-1.pdf?rlkey=e0jfxerycb03brp3feq3q265t&dl=0
@@AdiTeman thank you very much, have a nice day ^^
I've been working in VLSI for 7 years, and I still come back to your videos to refresh or relearn something! A heck of a lot faster than reading my Neil Weste textbook too lol
Thank you! That's what makes me motivated to provide more material (hopefully, I will have time to make some more videos later this year).
Thank you so much, professor, for putting this online for free. There are many universities in the USA that only cover parts of VLSI, but nothing as clear and complete as your lectures. Some focus more on architecture, others more on circuit design. I found myself lacking in some areas when I entered industry 7 years ago. I got very good but also very segmented, focusing on just some circuit design, timing calculation and analysis, but to become a good VLSI engineer, we also need breadth. And that doesn't happen naturally by staying in one place. Your classes are helpful in gaining that breadth. : - )
You are so welcome. That is what they are there for! Hopefully I will find time to prepare more videos later this year. I have a long queue of lectures waiting to be recorded - I just have to find the free time to get around to it.
Sir, after learning from these 73 videos, will I have learned enough to apply for VLSI designer jobs?
Well, I don't know if just watching the videos is enough, but it is a good start!
Errata: At time @9:54 there shouldn't be a "dx" in the expression for current. Thanks to @arghya.7098 for pointing this out.
At 9:54, shouldn't the expression for drift current be: I_d = −v(x)⋅Q(x)⋅W since current is velocity times charge, and the charge is proportional to the width of the channel. I don't understand why the infinitesimal channel length dx is included. Can you please clarify this point?
Yes, you are right. There is an "extra" dx there. Thank you for pointing this out (it "magically" disappeared on the next slide ;)
@@AdiTeman Thank you for the clarification, Professor. I really enjoyed the lecture and appreciate your guidance on this point.
lifesaver
Thanks!
GRT GET NEW STUFF
Hello Sir, could you please provide relevant lab work? It would be great.
Hi, Sorry that I haven't provided it as of yet. I may be updating the course recordings soon (a lot has changed since 2020...) and maybe then I could add some labs.
Hi sir, thank you for the playlist. Is there any particular tool with which I can start gaining practical knowledge? There is SkyWater 130, but I use Windows, so I can't use it. What type of tools would you recommend?
Hi. Tough question. These things mostly run on Linux and aren't too friendly for home computers. But you can start by buying a starter FPGA and programming it. The FPGAs come with a tool suite that runs on Windows and you can learn a ton from it (and FPGA design is a very popular and required skill on its own). Try starter kits from Xilinx or Altera.
I think Din and Dout and TX and RX are mistakenly swapped at 10:15. And if not, it would be great if you could explain the naming convention.
Hi Haziq, Indeed, this is a really bad naming convention, but I didn't invent it. Maybe I should have changed it, because it's so upside down, but I kept what was in a reference that I based it on. You can see in the bottom figure that the usage of DIN/DOUT is the same (pay attention that the PAD is connecting outside the chip, while the ports are connected to the chip core). I can try to give a makeshift explanation, but really, it is going to be a bad one, because if I were to design these I/Os, I would label the pins in the opposite way. Here it goes - "we are looking at the I/O from the perspective of the other chip, so its outputs are connected to DOUT and its inputs to DIN. Where the other chip is receiving is the RX I/O and where it is transmitting is the TX one". So after we got that horrible explanation out of the way, I will say that these things just depend on whatever the vendor who developed the circuit decided to call it. So you have to read the manual (good luck ;) and adhere to it.
@@AdiTeman well the explanation isn't that horrible. And by not changing it, you saved the students from future confusion. Thanks!
Man whoever came up with this is a genius. Thank you Dr. Teman. Could you please confirm one thing. Connectivity matrix saves us from calculating each quadratic wirelength and differentiating them partially in order to arrive at A matrix, right?
Hi, I guess you could look at it this way, but I think it is more straightforward than that. The solution to the optimization problem is to differentiate the entire system. This is a set of equations that you can collect into matrix notation. The connectivity matrix is basically the outcome of writing down all equations and collecting them together. But the "observation" is that this matrix has features that "are intuitive" and "make sense" and represent the connectivity of the netlist and therefore it's just straightforward to write it down. Note that there is some amazing work by the University of Texas (David Pan), where they use deep learning inspired optimization to solve the placement problem ("DREAMPlace"). I highly suggest looking into this, because it's really really beautiful.
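To make the "just write the matrix down" point concrete, here is a minimal one-dimensional sketch. The netlist is made up for illustration (it is not from the lecture): two movable cells in a chain between a pad at x = 0 and a pad at x = 3, all nets with weight 1. The function names are also illustrative. Each cell-to-cell net adds its weight to two diagonal entries and subtracts it from the two off-diagonal entries; each cell-to-pad net adds its weight to one diagonal entry and weight times pad position to the right-hand side. Solving A x = b then gives the same result as differentiating the quadratic wirelength by hand.

```python
def build_system(num_cells, cell_nets, pad_nets):
    """Build A and b for 1-D quadratic placement.

    cell_nets: list of (cell_i, cell_j, weight) two-pin nets between movable cells.
    pad_nets:  list of (cell_i, pad_position, weight) nets to fixed pads.
    """
    A = [[0.0] * num_cells for _ in range(num_cells)]
    b = [0.0] * num_cells
    for i, j, w in cell_nets:
        A[i][i] += w
        A[j][j] += w
        A[i][j] -= w
        A[j][i] -= w
    for i, pos, w in pad_nets:
        A[i][i] += w
        b[i] += w * pos
    return A, b

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x

# Hypothetical netlist: pad(0) -- cell0 -- cell1 -- pad(3)
A, b = build_system(2, cell_nets=[(0, 1, 1.0)], pad_nets=[(0, 0.0, 1.0), (1, 3.0, 1.0)])
x = solve(A, b)
print(A, b, x)
```

For this toy chain the cells end up evenly spread between the two pads, which matches setting the partial derivatives of (x0 - 0)^2 + (x0 - x1)^2 + (x1 - 3)^2 to zero.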
Where can I download the slides? The link that is shown only has the video lectures, without the slides.
At that link, there is a button next to each lecture with a link to the slides.
Errata: At time @23:24 the default state of the Mux should (of course) be 1'bx (and not 4'bx on a one bit signal). Thanks @atharvaagiwal6051 for paying attention to this.
You didn't mention what 456 and 789 are. Is it a single number, 456789, or two different numbers?
Hi, Actually, your question made me go look this up again and make sure I didn't make a mistake in the lecture and it also showed me something that I'm not sure how deeply I thought about it before. Luckily, there is no mistake in the lecture (as far as I can tell), but maybe I should have given the example a bit differently to make it more clear. What we want to do is fill a number into a register that doesn't fit in the immediate field of addi (a 12-bit signed immediate, so anything outside the range -2048 to 2047). What we have to do is break it down into two numbers. The first, we multiply by 4096 and then we add the second to it. The number, therefore, that gets stored in the example is 456x4096+789. This can be a bit complicated due to sign extension and two's complement numbers. Therefore, we usually wouldn't write this sequence by hand, but rather use the assembler pseudoinstruction "li" (load immediate). This is, assuming that you actually are hand writing your assembly code, rather than doing what most people do, which is using a high level language ;). But for the purpose of understanding this, if I had done lui 0x456 followed by addi 0x789, then I would have gotten 0x456789 in the register (because a <<12 operation is a shift of 3 hex digits).
Just another point, There is a great discussion about this here: stackoverflow.com/questions/50742420/risc-v-build-32-bit-constants-with-lui-and-addi
@@AdiTeman Thanks sir. Sorry for not mentioning the time stamp while asking the question.
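The lui/addi arithmetic discussed in this thread can be checked with a small emulation. This is a sketch, not actual assembler or toolchain behavior: the function names are illustrative, and it only models the 32-bit arithmetic (lui places a 20-bit value in the upper bits, addi adds a sign-extended 12-bit immediate). The split function shows the sign-extension complication mentioned above: when the low 12 bits are 0x800 or more, addi effectively subtracts, so the upper part has to be bumped by one.

```python
def lui_addi(upper, lower):
    """Emulate `lui rd, upper` followed by `addi rd, rd, lower` on a 32-bit register."""
    return ((upper << 12) + lower) & 0xFFFFFFFF

def split_constant(value):
    """Split a 32-bit constant into (upper20, lower12) suitable for lui + addi."""
    value &= 0xFFFFFFFF
    lower = value & 0xFFF
    if lower >= 0x800:        # addi sign-extends, so this immediate is negative
        lower -= 0x1000
    upper = ((value - lower) >> 12) & 0xFFFFF
    return upper, lower

# The decimal example from the lecture: 456 * 4096 + 789
print(lui_addi(456, 789))                 # 1868565

# The hex variant: lui 0x456, addi 0x789
print(hex(lui_addi(0x456, 0x789)))        # 0x456789

# A case where sign extension matters
upper, lower = split_constant(0x12345FFF)
print(hex(upper), lower)                  # 0x12346 -1
```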
Hi Professor Adam. Your videos are simply wonderful. Huge fan! One query, could you please provide an example scenario of exclude pin usage for better understanding? Thanks.
Wow, that is such a great question. It is something I have been teaching for years as an option, but seemed to never have thought about what it's good for! Your question made me look it up. I will say that clock trees are a bigger subject than I discuss in this lecture - much bigger. Every chip has new "surprises" in the clock tree that make you use these different definitions, and every time I have run into one, I say "I'll use that as an example in a lecture on CTS", but by the time it becomes relevant (usually, this is in the middle of a tapeout or other stressful times ;), I totally forget the scenario and why we needed these weird commands... Anyway, back to your question, I found an answer. The general high-level reason is that there can be cases where a net is both on a clock path and a data path. In such a case, you want to buffer the clock part of the net, but not the data part of the net, so you would put "exclude" on the data pins. That is a great explanation, right? Well, the obvious question is "why the heck would a clock net be a data net as well???". And the answer is not too obvious. One answer could be that you may have some observation circuitry on the clock and you treat this as data. But one of the user manuals shows a more common case that actually makes sense and that is the case of a clock divider. A clock divider is just a bunch of flops, where the output of one drives the clock signal of the other with a toggled input. In this case, we have the Q pin of the flop driving a clock net and so it needs to be handled by CTS. Until now, all is good - we want these to be buffered and such. So this is not the case of mixing clock and data. HOWEVER, we perform scan insertion (which I didn't cover in this course), where all flops are connected in a shift register configuration (scan chain) for testing. 
In this case, the clock net emanating from the Q pin of the divider flop goes to the CK pin of the other divider flop, but also goes to the SI pin of the next flop in the scan chain. This IS a data path and shouldn't be treated as a clock net. So the SI pin should be regarded as an EXCLUDE pin (note: this may be handled inherently by the CTS engine). Thanks for pointing this out so I learned something new.
Thanks for the amazing lectures, sir. I have a question. At 5:00 the HPWL calculated is 7, while if we count the units manually it's 9, as each of the two bottom right cells is also taking one unit of wirelength. You said that these cells don't add much to the wirelength, but what if they are a little farther away (inside the bounding box)? The actual wirelength can be 10 or 11 units and our estimation would be quite off.
Hi Haziq, Your question is legitimate, because I can say that I have been confused by this more than once (don't tell anyone 😂). But actually, it's almost irrelevant. The reason is that we are building an estimator that is trying to somehow quantify the wirelength, so we can define a cost function and optimize it. It is very clear that our estimator is far from accurate - it just needs to represent something that is better or worse than something else, so we can change it and see if that helps. So counting the distance (|x1-x2|) or the number of blocks that the wire occupies (which comes out |x1-x2|+1) is essentially the same for the purpose we are using it. In addition to that, as you will see in the following parts of the lecture, we are actually using other estimators more predominantly (such as the quadratic wirelength estimator). But the point is the same - we are trying to put a number that represents how good or bad our solution is for comparison to a different solution. You could look at it as a rating system that gives you 1-5 stars or a grade of 1-10 - as long as the better option has a higher rating, it doesn't really matter what the scale is.
@@AdiTeman Thanks. I have completed the following parts of the lecture and got your point. Thank you for taking your time out and explaining.
@@haziqiqbalhussain Great
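The 7-versus-9 difference in this thread can be sketched as follows. The pin coordinates here are made up to reproduce those two numbers and are not the ones on the slide. Both functions are illustrative: one measures the half-perimeter as distances, the other counts occupied grid cells, which simply adds one per dimension.

```python
def hpwl(pins):
    """Half-perimeter wirelength of a net: width plus height of the pins' bounding box."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def hpwl_grid_cells(pins):
    """Same estimate, but counting occupied grid cells (|x1 - x2| + 1 per dimension)."""
    return hpwl(pins) + 2

# Hypothetical net: the two extra pins sit inside the bounding box.
net_a = [(0, 0), (4, 3), (3, 1), (4, 1)]
net_b = [(0, 0), (2, 2)]

print(hpwl(net_a), hpwl_grid_cells(net_a))   # 7 9
print(hpwl(net_b), hpwl_grid_cells(net_b))   # 4 6
```

Either way of counting ranks net_b as shorter than net_a, which is the point made above: the estimator only needs to order candidate solutions consistently.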
These lectures are really well done!!
Thanks!
At 3:31, it's written that the frame pointer points to the beginning of the frame, but you said it points to the base of the frame. Is the base of the frame really the beginning of the frame, or is there a problem in what's written?
Hi, I don't think there's a problem. The "base" and "beginning" that I refer to are the same thing. Basically, it's the opposite of the stack pointer that points to the top of the stack. The frame pointer points to the bottom so we can return to where it started.
@@AdiTeman oh I see, thank you for replying
Now this lecture I like!! gj
Glad to hear that!
Nice explanations, sir.... Thank you 😊
Thank you!
In the default statement the output should be 1'bx. For 4:1 mux
Haha, Great catch! This slide (and video) has been around for quite some time and no one ever pointed that out. This is not an error, per se, since it's just an X (and 4'bx is basically the same as 1'bx) and the simulator and synthesizer would (hopefully) disregard this, but it was for sure unintentional in the slide. Thanks for finding this. I will pin the errata!
Hi! I coded my own SPICE and this series helped me a lot! Thank you! The hardest part is nonlinear elements. A wrong first guess causes lots of trouble. I wonder if there is a method for guessing the starting point as well. I think I will add the bisection method after some failed iterations. Also, do you know anything about modified trap integration? Mike Engelhardt (creator of LTspice) claims that it is the best method (simply a true trapezoid without the oscillation effects) and that only he knows it.
Hi, I think you just went beyond my pay grade :). I'm far from being an expert - and for sure when it comes to deep down things that affect analog simulations - that's out of my expertise. There are many gurus out there, but I've had the pleasure to meet two of them - Andrew Beckett of Cadence (he answers many questions on online Cadence forums and is amazingly knowledgeable) and Prof. Andrei Vladimirescu, one of the designers of SPICE (and author of "The SPICE Book"). Andrei is a good friend and would probably be happy to answer your question if you reached out to him (...and bonus points if you told him that I sent you ;).
Hi there, I have a question: do we actually apply this methodology ourselves when we do the placement step, or will the tools do it automatically? And thank you for your videos, they are so good.
Hi Lam, Luckily, the algorithms are deeply implemented into the tools, so we don't have to do anything ourselves in terms of implementing them ;)
Awesome lesson! A thermal engineer here, trying to learn semiconductors to do my job (electronics thermal management) better. Would be glad to know if anyone else here has a similar background.
Great to know that my lectures are reaching other disciplines. I'd also love to hear if there are any other thermal engineers watching these.
Why are multiplexers needed? To choose between different frequencies? For dynamic voltages and frequencies?
Hi, Yes, these can be cases for multiplexers on the clock tree. There are actually many cases where you would multiplex several clocks onto the same clock tree. Just as an example that is commonly found on SoCs: We usually provide two clocks: (1) An external reference (from a crystal) with a low frequency (usually less than 100MHz and sometimes much lower) (2) An internally generated clock from a PLL or other clock generator. When we boot the system, the generated clock is not available. It takes time for the PLL to "lock". Additionally, we want a backup in case the PLL doesn't work properly or something like that. So we drive both clocks into a Mux, boot with the external clock, and transfer to the internal clock once it is stable.
At 1:32:20, why do we draw the data like that, with one line going up and the other going down at the same time, while for the clock there is only one line going up and down in a row?
Hi, Thanks for the question. Some things we - experienced engineers - take for granted, since we're used to seeing them so often and we may not explain them (though I do vaguely remember explaining this in one of my lectures somewhere). The clock is very deterministic. Every clock period, it goes up once and down once and it does this repeatedly at a constant rate. Therefore, this is how we draw the clock signal. On the other hand, we don't know what the data is in a general case. It could be '1', could be '0', could stay constant and could toggle. Therefore, we draw it in that "changing" or "unknown" kind of way. It is supposed to represent "all cases" where we must take into account that any level can be there and where the X's are, it may be changing (it will be stable where the lines are straight).
Thank you so much for the detailed explanation
You are welcome!
Hello sir, your explanation is wonderful, I appreciate your efforts. Sir, can you help me understand the details of the CCS model and how it is different from the WLM model?
Hi, So I think this is covered in Lecture 3 (Synthesis - Part 1). Specifically, you can find the video about Liberty timing models here: kzread.info/dash/bejne/X4KbudWKeqatnNY.html I don't go into great detail about CCS models (they're actually very complicated), but I explain where they come from and such. Wire load models (WLM) are something a bit different and I cover them in this lecture, as well. They basically are just a (really poor) estimation of the RC load of a net based on the fanout.
good
Thanks
Can you please suggest a course on layout design?
Hi, Sorry, but I don't know of any freely available courses on layout design. I'm not saying that there aren't any of these - I just am not aware of them. I am pretty sure that the leading EDA companies (e.g., Cadence, Synopsys, Siemens-EDA) provide such training and these are often free to academic participants. In addition, I get emails from KalTech (kaltech.co.il/ic-training/) with offers for their layout courses, but I have never taken one of them, so I cannot recommend them.
Analog on Top
Hi, I'm not sure what you are commenting. I would like to point out that this short presentation is for digital-on-top integration. In other words, the flow is run in a digital place and route tool, such as Cadence Innovus or Synopsys Fusion. The point of the presentation is to discuss how to prepare a custom-designed block for integration in a digital flow. This is as opposed to analog-on-top, which is preparing the entire design in a transistor-level tool, such as Cadence Virtuoso. Here, the custom block will be designed in such a tool, but the overall tapeout will be prepared in the digital tool.
this is good
Thanks!
Different cells have different transitions, so which value should be chosen, and from which cell's table? And what should we do if we have CCS model libraries?
Hi. I'm not sure I understand the question exactly, but I'll try to answer. If you mean what to set as a default input transition - this is a good point. We want to model the input transition, but we don't know what is connected to the input and how. Therefore, the calculation will not be accurate. This is a limitation that cannot be overcome, but the point here is to get a "good estimate". In general, digital delay modeling is NOT ACCURATE (as opposed to SPICE), but provides a tradeoff between run-time and accuracy. So we are trying to get a number that is "good enough", while it is clear that this is not 100% right. In this case, we have two options - 1) provide a number that characterizes the process. This is very inaccurate and has nothing to do with CCS or the different gates. It's just some number that could be a reasonable transition so the delay of the next gate falls within the timing tables. 2) provide a typical gate from the library that may be connected to the input. In this case, the .lib (including CCS) of the gate is used for the delay calculation. This is not the gate that is actually connected, but is typical of the technology/library and therefore is a good estimate. In any case, these are just estimations. Not accurate. But better than assuming something that is not based on anything...
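To illustrate why the assumed input transition matters, here is a rough sketch of how a table-based delay lookup uses it. Everything in it is an assumption for illustration: the axis values, the delay numbers, and the function names are made up, and real .lib tables and tools are more involved (CCS in particular). The sketch does plain bilinear interpolation between the four surrounding table entries, indexed by input transition and output load.

```python
from bisect import bisect_right

def _bracket(axis, value):
    """Indices (i, i + 1) of the axis segment used for value, clamped to the table edges."""
    i = bisect_right(axis, value) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1

def lookup_delay(table, slews, loads, slew, load):
    """Bilinear interpolation in a 2-D delay table (linear extrapolation outside the axes)."""
    i0, i1 = _bracket(slews, slew)
    j0, j1 = _bracket(loads, load)
    ts = (slew - slews[i0]) / (slews[i1] - slews[i0])
    tl = (load - loads[j0]) / (loads[j1] - loads[j0])
    d0 = table[i0][j0] + tl * (table[i0][j1] - table[i0][j0])
    d1 = table[i1][j0] + tl * (table[i1][j1] - table[i1][j0])
    return d0 + ts * (d1 - d0)

# Hypothetical characterization data: rows = input transition (ps), columns = load (fF).
slews = [10.0, 50.0, 100.0]
loads = [1.0, 5.0, 10.0]
delay_ps = [
    [20.0, 40.0, 65.0],
    [30.0, 50.0, 75.0],
    [45.0, 65.0, 90.0],
]

# Same gate and load, two different assumed input transitions.
print(lookup_delay(delay_ps, slews, loads, slew=30.0, load=3.0))   # 35.0
print(lookup_delay(delay_ps, slews, loads, slew=80.0, load=3.0))   # 49.0
```

The two calls differ only in the assumed input transition, yet the delay changes noticeably, which is why a reasonable default (or a typical driving cell) is preferable to an arbitrary value.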
Wow, best explanation
Thank you!
Your lectures have been GOLDEN Professor. Thank you so so much! God Bless you and your family!
Thank you so much!
Does "ps" stand for picoseconds?
Yes, indeed, in these slides, ps is picoseconds