The UNCOL Problem - Computerphile
Can there be a universal intermediate programming language? Sounds like Esperanto to us - Professor Brailsford has more.
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
Comments: 202
If one has to choose a modern-day "winner" for Best Intermediate Code, among the 40 entries on the Wikipedia "Bytecodes" page, then LLVM IR would be a very strong contender. The sheer number of organisations that have adopted it, and the 2012 ACM Software System Award for its authors, Vikram S. Adve, Evan Cheng and Chris Lattner, speak for themselves.
@GilesBathgate
4 years ago
There is also Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL), but that's more about multiple languages targeting a single bytecode instead of multiple architectures.
@stIncMale
4 years ago
exactly, LLVM was my first thought
@UltimatePerfection
4 years ago
I think Java is also a very strong one due to how many environments it supports.
@stIncMale
4 years ago
@@UltimatePerfection Well, true: Java bytecode, WebAssembly and CIL (aka MSIL) were all created to act as an UNCOL. But even though my main language is Java, I still thought of LLVM IR first, for whatever reason :)
@sebastianelytron8450
4 years ago
I vote for Clang!
"There are 14 standards. We need one to unify them all. We now have 15 standards"
@chrimony
4 years ago
That's the great thing about standards. There's so many to choose from...
@shelivsbaxters
4 years ago
@@chrimony
@shelivsbaxters
4 years ago
@Evi1M4chine I don't know who programmed you but it's a fail obviously.
@blain20_
2 years ago
Truest statement in the universe.
I once saw a ray tracer implemented in Postscript, and out of curiosity I submitted it to my university’s laser printer. About 5 hours later the printer finally spat out a rather underwhelming image, and then other students could finally print their homework again. (For whatever reason it was impossible to remove this ludicrously-long-running job from the print queue; it was no longer in lpq and power-cycling the printer didn’t help at all since it just restarted from scratch. It would have probably only taken half an hour if people had just been more patient!)
@SoniEx2
4 years ago
oof
The attenborough of programming
@gregoriysharapov1936
4 years ago
Here we have the Attenborough Programming Language, abbreviated as APL. I have a pen, and an APL. APL-pen. No idea how I made this joke
@gregoriysharapov1936
4 years ago
@Neel Shukla Oh bruh hahaha
@lawrencedoliveiro9104
4 years ago
Richard or David?
One other thing I feel is worth mentioning, since PostScript was mentioned in this video: in the late 80s or so, Sun Microsystems, and I think NeXT Computer and maybe one or two other companies, tried to create a graphical display standard called Display PostScript to compete with the X Window System. It added a number of constructs to PostScript so that it could handle interactive events and establish listeners for those events. Programming in Display PostScript was, like PostScript itself, done in a postfix language; it felt a lot like FORTH with runtime typing attached to data values and somewhat higher-level control constructs. Once you got the hang of postfix programming and the concept of the display "context", it was actually fun and quite fast to write programs that would interact directly with the user.
These videos by Professor Brailsford are a joy to watch, informative and insightful. I cannot thank him enough for sharing his knowledge.
It's worth noting that generating assembly directly from C is pretty much impossible without going through a representation where you've flattened out subexpressions but haven't yet picked registers. So you definitely want some sort of intermediate language, and, since you haven't yet used any details of the target architecture, it's tempting to consider it universal.
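A minimal sketch of the representation described above, assuming Python expression syntax as a stand-in for C and hypothetical temporary names `t0`, `t1`, ...: nested subexpressions are flattened into one operation per line, and no machine registers have been chosen yet. The function name `flatten` is illustrative only.

```python
import ast
from itertools import count

# Map Python AST operator node types to printable symbols.
_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def flatten(expression):
    """Lower an arithmetic expression to three-address code.

    Returns (instructions, result): each instruction is a string such as
    't0 = b * c', and result names the operand that holds the final value.
    Temporaries are virtual; nothing here depends on a target machine.
    """
    instructions = []
    fresh = count()  # source of unique temporary numbers

    def visit(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left = visit(node.left)    # operands are lowered first
            right = visit(node.right)
            temp = f"t{next(fresh)}"
            instructions.append(f"{temp} = {left} {_OPS[type(node.op)]} {right}")
            return temp
        raise ValueError(f"unsupported construct: {ast.dump(node)}")

    result = visit(ast.parse(expression, mode="eval").body)
    return instructions, result


if __name__ == "__main__":
    code, result = flatten("a + b * c - 4")
    for line in code:
        print(line)
    print("result in", result)
```

Because the temporaries are unlimited and unnamed by any architecture, the same output could in principle be handed to different register allocators, which is what makes this form feel "universal".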
"...if we can only agree what it is." Well, there's your problem right there.
In the 1980s there was the Amsterdam Compiler Kit (ACK), which was designed to let you build a compiler for any language you designed, generating an intermediate language (byte codes). The second phase of ACK was the OS/hardware code generation, where you defined your runtime platform. So the general idea was that if a new language came along, you would use ACK to build the compiler that generated the intermediate code and then use an existing ACK code generator for the runtime environment. Check out the Wikipedia page on the Amsterdam Compiler Kit! It brings back memories of my 2nd-year computer science compilers course.
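The economics of that two-phase design can be sketched as follows. The language and machine names are placeholders chosen for illustration, not a list of what ACK actually supported, and the function names are hypothetical.

```python
from itertools import product


def direct_compilers(languages, machines):
    """One dedicated compiler for every (language, machine) pair."""
    return [f"{lang} -> {machine}" for lang, machine in product(languages, machines)]


def compilers_via_intermediate(languages, machines, intermediate="IL"):
    """One front end per language plus one code generator per machine."""
    front_ends = [f"{lang} -> {intermediate}" for lang in languages]
    back_ends = [f"{intermediate} -> {machine}" for machine in machines]
    return front_ends + back_ends


if __name__ == "__main__":
    languages = ["C", "Pascal", "Modula-2"]          # placeholder names
    machines = ["VAX", "68000", "PDP-11", "8086"]    # placeholder names
    print(len(direct_compilers(languages, machines)), "direct compilers")
    print(len(compilers_via_intermediate(languages, machines)), "pieces via an IL")
```

With M languages and N machines the direct approach needs M × N compilers, while the intermediate-language approach needs M + N components; adding a new language then costs one new front end.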
@lawrencedoliveiro9104
4 years ago
Also known as the “Free University Compiler Kit”. Richard Stallman wanted to use that as the basis for GCC, but soon discovered that their idea of “Free” wasn’t “Free” at all. So today we have GCC, and nobody uses ... that other thing, any more.
I just worked it out. Professor Brailsford is the Man from UNCOL... I'll see myself out xD
No discussion on LLVM IR? That seems to be the closest thing we have, and it generally works pretty well for me.
@zeejenkins
4 years ago
Was going to mention LLVM as well.
@profdaveb6384
4 years ago
Quite so. Please see top comment on this list which Sean has now "pinned"
@DoDoENT
4 years ago
I have a hunch that they are preparing an entire video dedicated to LLVM IR...
This is a very interesting set of videos. I've been having to go back and find the previous parts. It would be great to have a playlist that collected them all together.
The uncol problem is definitely a catching title cause you caught me like a fish
@sadhlife
4 years ago
same my dude
I would love to hear Dr. Brailsford speak more about Java, and specifically its choice of 'banning' pointers, in future! Great video, as ever =)
@lawrencedoliveiro9104
4 years ago
Java doesn’t “ban” pointers, it only obscures them. As a consequence, it suffers from the aliasing problem, e.g. a = b; a.field = «value»; System.out.println(b.field); /* what does this print? */
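The aliasing effect in that snippet can be reproduced in Python, whose object references behave like Java's in this respect. This is only an illustrative sketch; `Box` is a hypothetical class, not anything from the video.

```python
class Box:
    """A minimal mutable object with a single field."""

    def __init__(self, field):
        self.field = field


b = Box("original")
a = b                 # copies the reference, not the object
a.field = "changed"   # mutates the one object that both names refer to
print(b.field)        # prints "changed": b observes the write made through a
```

Both names denote the same object, so the assignment through `a` is visible through `b`, which is exactly what a pointer alias would do.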
The solution to UNCOL problem is the Forth programming language. Invented by Charles Moore between 1968-1970. I've been using the language as a universal assembly language for years.
Computerphile is such a great channel, this video proves it again! Professor Brailsford, please don't stop sharing these interesting thoughts interspersed with historical repetitions they are fantastic. LLVM IR wasn't really mentioned in this one but I can see it's pinned as the top comment. I just wondered how much it "fit" into the UNCOL problem and how much of "UNCOL" it really implements.
@michaeltyniec7010
4 years ago
Professor Brailsford's videos are my favorite - but I love all of Computerphile. Makes me wish I studied Computer Science in college instead of Engineering.
When I was at Rutgers, there was a rumour that one of our operators had written a Mandelbrot-set generator in PostScript. Legend has it that the program would tie-up an HP LaserJet for _hours_ (quite an ornery way of doing a DoS against a printer).
the "Technology Independent Machine Interface" of IBM i (former OS/400) is also pretty nice
Would enjoy hearing the professor's thoughts on LLVM.
Interesting. Web assembly is an interesting addition to this story. It's a specification being implemented by browser vendors which supplies a compiler target. While its intended use is the browser, some people are exploring its use as a universal binary target for server-side applications (enabled by node - the JavaScript engine from Google Chrome). I couldn't tell you about performance, but it is interesting
@StuartThomson
4 years ago
I'm actually reasonably excited to see what happens with WASM, especially if it breaks out of the browser/Node space entirely (compiled to x86, for example). One correction: Node is not the JS engine of Chrome. Both Chrome and Node use V8.
What a fantastic bit of shade at the end :)
But what about LLVM Intermediate Representation? It seems pretty promising.
@JoshuaBarretto
4 years ago
It's not really universal though... We only see it as universal because almost all modern languages have their origins in C. For functional languages, it's almost useless without a secondary abstraction layer between.
@cmdlp4178
4 years ago
LLVM IR is useful as the middle end when writing a platform-independent compiler/code transformer, but LLVM IR is only supported by LLVM itself, which might not be available for every platform. There should be a standard for such an intermediate language, similar to the C standard, which enforces more compatibility.
@dealloc
4 years ago
@@JoshuaBarretto > We only see it as universal because almost all modern languages have their origins in C. While modern languages usually have some roots in C, it's not because of the limitations of LLVM. You could decide on your own syntax and still use LLVM as the middle layer to provide the machine code. As cmdLP rightly points out, its universality is largely based on whether it is available on more platforms. However, the end result that LLVM generates is only limited by what LLVM offers, which you can extend yourself if you need to.
@AntonioPetrelli
4 years ago
@@cmdlp4178 I don't see availability on every platform as the problem, since it is probably already available for almost every platform. However, there are differences that have to be taken into account. LLVM IR must be *generated for a platform* to conform to some OS specifications. For example, exception handling on Linux and Windows is so different that there are even different IR primitives to support one or the other. Moreover, LLVM is tied to certain ABIs that have to be respected to make it work.
@lawrencedoliveiro9104
4 years ago
@@cmdlp4178 I think LLVM, like GCC is available on every platform on which the Linux kernel runs. That should be about two dozen major processor architectures. No other equivalent would even come close to those two.
2:02 The I/O problem mostly went away of its own accord. Once punch cards, paper tapes, magtapes, line printers and all the rest of it went away, everybody pretty much settled on something Unix/POSIX compatible (stdio). Writing to that gives you a nice, portable common denominator.
@lawrencedoliveiro9104
2 years ago
Which perhaps is why mainstream development is moving away from Windows. And why Microsoft is trying so hard to turn it into Linux.
This has probably been mentioned _to death_, but just in case it hasn't: * The term "byte code" (maybe not one word) predates Gosling by a long time. The Smalltalk-80 "Blue Book" ("Smalltalk-80: The Language and Its Implementation") was available in the early 80s, and it defined a virtual machine that understood a set of byte codes sufficient to implement the Smalltalk-80 runtime system. GNU Smalltalk was implemented according to this, which, at the time, was believed to be more of a "spec" for how Smalltalks were actually built than it actually was. * Version 6 Unix (and probably earlier versions) had the C compiler emit assembler code, which was then run through `as` to produce machine code a.out files. The main driver program for the C compiler, `cc`, would run the compile step, then the assemble step, then the linking step, and finally produce the a.out file; you could tell it to stop before executing any of these steps and leave the intermediate file on disk. This existed in the late 70s and probably much earlier. So Bjarne's approach of compiling C++ to C and letting another tool compile that was nothing new or novel. You may recall that the C preprocessor for "macro" directives like #define was also a separate step that `cc` ran, feeding THAT output to the actual C compilation step.
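The staged pipeline described above can be sketched as a list of commands. This assumes the `-E`, `-S`, `-c` and `-o` flags as accepted by present-day GCC and Clang drivers; it makes no claim about the exact flags of Version 6 Unix `cc`. The function `driver_stages` is hypothetical and only builds the command lists without running anything.

```python
def driver_stages(source, compiler="cc"):
    """Return the commands a driver could run to build `source` step by step.

    Each stage leaves its output on disk, so the build can be stopped
    after any step and the intermediate file inspected.
    """
    stem = source.rsplit(".", 1)[0]
    return [
        [compiler, "-E", source, "-o", f"{stem}.i"],        # preprocess only
        [compiler, "-S", f"{stem}.i", "-o", f"{stem}.s"],   # compile to assembly
        [compiler, "-c", f"{stem}.s", "-o", f"{stem}.o"],   # assemble to object code
        [compiler, f"{stem}.o", "-o", "a.out"],             # link
    ]


if __name__ == "__main__":
    for command in driver_stages("hello.c"):
        print(" ".join(command))
```

Each command could be passed to `subprocess.run` on a system with a C compiler installed; the sketch stops short of that so it stays runnable anywhere.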
Professor Brailsford's music choice is on point. John Phillips is a great album and songwriter.
LLVM IR, don't really know what it is but I had to say it too.
@dealloc
4 years ago
It's cool. That's all you need to know!
@Gooberpatrol66
4 years ago
LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR LLVM IR
@tedchirvasiu
4 years ago
@ Wow, thanks for the explanation, today I learned.
How come nobody asked the most pertinent question: Why on earth did you use sudo for your nano?
Microsoft uses Common Intermediate Language for all dotNet compilers. I know C# is the closest language to the intermediate, but it's not the intermediate language. Any chance of Computerphile covering dotNet properly? I know you'd have to lower yourselves to do so, but dotNet is very widely used.
@lawrencedoliveiro9104
4 years ago
It’s not clear what the future of Dotnet is these days. Didn’t Microsoft try to replace it with Silverlight? Then WinRT? Then UWP? And now (almost) full circle back to Dotnet Core? Proprietary platforms do not have a long shelf life.
@lawrencedoliveiro9104
4 years ago
@MichaelKingsfordGray None of them are in the same category as a decent platform API.
@lawrencedoliveiro9104
4 years ago
@MichaelKingsfordGray You were trying to suggest that DotNet should be taken seriously, when even Microsoft is no longer so keen on it.
@lawrencedoliveiro9104
4 years ago
@MichaelKingsfordGray “Gross blunder” hahaha. Consider why Microsoft is trying so desperately to make Windows more like Linux, and why they are now trying to push Dotnet Core which is open-source and cross-platform. Because the developers are deserting Windows for Linux. Windows-specific APIs don’t cut the mustard any more.
A universally-understood synthesized computer language? So, a sort of eSperanto?
I've seen the future, and it runs on ActionScript 1.0.
Not quite the same, but this reminds me of "The Last One", which was a unique software program in 1981 that took input from a user and generated a program in BASIC which could then be run. (Wikipedia)
Another effort was the intermediate code generated by MINT, which was the unlikely acronym for Machine Independent Organic Software Tools. I ported it to the BBC Micro, and spent a very enjoyable afternoon with one of its authors, Michael Godfrey, trying to get the source code off a magnetic tape in the Cambridge University Computer Lab.
@markwilliamhumphries
4 years ago
MINT seemed very Forth inspired
@PeteC62
4 years ago
@@markwilliamhumphries Yes, but with opcode generation (for a simple VM) rather than the threaded code generation of the traditional FORTH implementations.
Wow, I didn't know that PS was Turing complete.
@betlamed
4 years ago
Tbh, being turing complete isn't really all that hard.
@whuzzzup
4 years ago
Well, watch more Computerphile then :)
The man from UNCOL - Thursdays @ 3:30pm
@NeilRoy
4 years ago
This will fly right over the heads of the younger crowd. ;) My brothers were huge fans of this in the '70s.
Having written python-numpy sort at pathogen-host cleansed lawn in amino template, I am astonished at the depth of this author/speakers grasp of computational methods. I now return to my EDSAC emulator, wearing my T-rex graphic executed in paint. M.
I think the GNU compilers (GCC, etc.) do compile to a sort of intermediate language, so porting the compiler to a new hardware platform, or optimizing for a specific generation of hardware platform, is easier. As for Java, one thing that happened with the bytecode that Gosling perhaps did not foresee was hardware platforms that implement the bytecode as their assembly, as on many mobile devices. This would likely happen with any "ideal" universal intermediate code, that it would be implemented in hardware.
@RedwoodRhiadra
4 years ago
Yes, he mentions that about gcc at the end.
@menachemsalomon
4 years ago
@@RedwoodRhiadra Yeah, he threw it in, I missed it my first time through. Thanks.
@SimonBuchanNz
4 years ago
Which platforms have adopted a bytecode? That sounds interesting.
@menachemsalomon
4 years ago
@@SimonBuchanNz Unless I'm mistaken, Android devices implement the JVM (Java Virtual Machine, the bytecode interpreter) in hardware.
@SimonBuchanNz
4 years ago
@@menachemsalomon Nope, pretty much all are ARM. Android is sort of "natively" JVM, the sort of because they have to be different enough to not be sued, but that's only the app to OS interface, not hardware.
Huh… Ok. That seems to answer a question I had from a previous video. Of course, it also raises more questions…
You said Bjarne Stroustrup really well actually. I’m a Dane btw.
What's the difference between a universal language and something like Ada? In Ada the input and outputs and storage mechanisms are removed from the compiler and made into separate compilation units, but there is no intermediate code, the Ada compiler compiles to binary and the Ada input/output libraries are also compiled to binary. So I'm guessing this is portable rather than universal! Could you make a video of pros and cons of a portable language (i.e. compiler does not worry about input/output as that is in separate libraries) vs a universal language (with an intermediate form)?
@boring7823
4 years ago
The Prof did mention this in passing, when he said that C++ (and many, many other projects) use C as CIL (C Intermediate language). C's I/O is in the libraries too. To answer your question, there is no difference except the "level" of the language you're using in the middle. Oh, and as for Ada not having intermediate code, that depends entirely on the compiler, GNU's "Gnat" compiler of course uses the same compilation methods and IL that GCC does.
How about llvm? And specifically building to WebAssembly?? We're in the future now when it comes to compiling!
I remember coding on OpenVMS in the late 80's early 90's. It was possible to write code in C, COBOL, Fortran programs or modules which could call each other i.e. COBOL could call C which could call FORTRAN etc. Does that mean the compilers compiled down to a common standard before the linker resolved the remote calls?
@SaraWolffs
4 years ago
Not necessarily. Or rather, the common standard could be linkable machine code. Universal for every language that could be compiled for that machine, for obvious reasons, but hardly a proper intermediate language. If the compilers for those languages used the same calling conventions (or knew what calling convention to use when), you can easily have binary compatibility between languages without any compatible intermediate.
@lawrencedoliveiro9104
4 years ago
Yes, there was a common ABI on VMS, which even included exception handling. So a language could throw an exception which got handled in another language! (Assuming your language notion of “exception” bore some resemblance to the VMS notion...) Nowadays it seems a platform ABI is defined for C and C++, and other languages have to conform to that.
0:33 "Not as high as C"! Times have changed :P
@uddagisko
4 years ago
@MichaelKingsfordGray What are you talking about? Machine code is literally direct instructions to the processor in the form of binary sequences. You can't go lower than that. C is very much a high-level programming language (you don't work directly with specific processor instructions), albeit quite low-level by today's standards.
@silkwesir1444
4 years ago
@@uddagisko I think it was a joke.
Well if we had UNCOL, we would have to re-optimise or re-compile it when converting to the target. The true current UNCOL is bytecode, which is a software virtual machine, so each architecture is written (or just compiled) separately.
People keep forgetting about icon. It was, and is, a great language and runs on all kinds of machines!
There's a language called C--, invented by the same guy who invented Haskell. It's like C, but lower level. IIRC, it doesn't make any assumptions about having a stack, which might make it an interesting candidate for those who want Tail Call Optimisation, for example.
The radio, wonderful invention.
Web assembly?
The man from uncol
What is the most surprising turing-complete thing that exists?
@lawrencedoliveiro9104
4 years ago
C++ templates?
The Torelli's conjecture: the current best IL will never cover the abilities of the next best IL.
Honestly, I think it is underappreciated that C++ (for me), or plain C for others, truly gives us the closest thing to this dream universal language, and perhaps Python does for practical day-to-day use, in a limited sense.
Well, if it's going to convert everything down to C code, I may as well stick to coding in C. ;)
Why do so many people in the comments mention LLVM IR but not GIMPLE or for that matter Java bytecode or CIL?
@JECastle4
4 years ago
I immediately thought of CIL, clearly a big bias towards Unix and Java.
@WorBlux
4 years ago
GIMPLE isn't really used outside of GCC, in part due to licencing issues.
Double compile and Bobs yer UNCOL
Decades ago I was learning PS and realized I could send my own crafted PostScript program to my company's big printers. Next thing you know, people were wondering why the big, fast, expensive printer was taking half an hour to print a weird picture. My first PostScript program was "obviously" one that drew a Mandelbrot fractal, computing it on the printer itself. What else?
8:32 I chuckled
Prof. Brailsford, presented by Weird Fish :D
sudo nano, yeah.
LLVM IR. Now I’ve said it too.
@profdaveb6384
4 years ago
Please see top comment in this list - which Sean has now "pinned"
@casperes0912
4 years ago
ProfDaveB Thanks for the heads-up
This man is the OG of programming.
You kinda look like my uncol
Thought it was "unicorn" for a sec.
woah dude. that was uncol
So basically bytecode, eh?
@boring7823
4 years ago
Yes, but bytecode also implies that the IL is a byte based binary "language", while this is a very popular variation, there is nothing that says an IL has to be binary or byte based. IIRC, bytecode also often implies the language is stack centric, this also is not a requirement for an IL.
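To make the "byte-based" and "stack-centric" points concrete, here is a minimal sketch of a toy stack machine. The opcode values and the names `PUSH`, `ADD`, `MUL`, `HALT` and `run` are invented for this illustration and do not correspond to the JVM, CIL or any other real bytecode.

```python
# Hypothetical one-byte opcodes for a toy stack machine.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(program: bytes) -> int:
    """Interpret `program` and return the value left on top of the stack."""
    stack = []
    pc = 0  # index of the next byte to decode
    while True:
        opcode = program[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(program[pc])  # the operand is the following byte
            pc += 1
        elif opcode == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {pc - 1}")


# (2 + 3) * 4 encoded as nine bytes.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(run(program))
```

Nothing in the idea of an intermediate language requires this shape: a register-based or purely textual IL would serve the same role, as the comment above points out.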
I like how he struggles when he gets into C++
UNCOL stands for "Universal Computer Oriented Language"
> many of you will not know it came off the back of Adobe's very successful language called PostScript Well, I did. Because Prof. Brailsford taught me.
I vote for Forth! :)
@gustinian
4 years ago
The elephant in the room...
@telecorpse1957
4 years ago
@@gustinian Why?
@telecorpse1957
4 years ago
@Vladimir Why?
The LLVM problem
It's pretty UNCOoL.
0:33 "Not as high as C" hides in javascript
Write once, run anywhere! Oh, sh... that didn't work out as expected
Web Assembly has some potential. Being single purpose (web browser language) discards a lot of the issues of having to support an impossible number of platforms.
@iwikal
4 years ago
It's not just for the web though. Its purpose is to be a sandboxed VM, with inputs and outputs defined by the program that runs the engine. A browser would expose web APIs, but that's not part of wasm itself. They must have looked at nodejs and what it did with JavaScript, and wanted to make a language better suited for that kind of thing.
@dealloc
4 years ago
@@iwikal Yep, despite the name WebAssembly is designed with anything but the web in mind as a first class citizen. WebAssembly has a lot of potential to support future architectures, as it isn't constrained to a specific architecture like other assembly languages are. Although it lacks some notable features like a garbage collector and has some significant design constraints like only supporting 4 (numeric) types at the moment, so you will have to do a lot of heavy lifting at the moment. But it has a lot of potential.
@iwikal
4 years ago
@@dealloc I fail to be excited about the talks of bringing gc to wasm. Feels like it defeats the purpose. If I wanted a gc I could always compile one and ship it within the wasm executable. I do hope it's not mandatory to use it.
@dealloc
4 years ago
@@iwikal However, the GC will be immensely useful in the browser. Whether you then use it in other architectures would be your choice.
Something something Yavascript?
So it's llvm ir?
LLVM IR?
Isn't LLVM the UNCOL solution?
It's funny how this dude asks and answers his own questions.
The compiler LLVM defines LLVM-IR or intermediate representation, as the intermediate language to which you compile everything and then can compile to anything, and it includes optimizers.
uncol; don't think you're buff cause you're wearing contour
Did anyone mistake this guy for Danny Devito in the thumbnail?
it's what GCC has been doing with RTL.
LLVM IR or MLIR
As an engineer who uses Python and MATLAB for day-to-day scripting and calculations, and in my best Sergeant Schultz voice: C is high?
@SaraWolffs
4 years ago
Sure. You get to call a function by just writing its name, don't you? You can even calculate the arguments in-line.
Clang IR does that by making front and back modular.
It's weird, I've spent a lot of the last few years working with Java and didn't realise the intellectual background to it. It would be interesting to a see a video about JIT compilation because that is really what made modern ICs usable for business purposes.
JavaScript
6:48 The PostScript graphics model was only ever designed to put marks on paper. It didn’t work very well for interactive graphics on a display screen. Also the language, while cleverly designed to be implementable with reasonable efficiency on common hardware of the time, was pretty crap (e.g. no lexical binding).
GCC comes to mind.
Uncle problem hahaha
@tedchirvasiu
4 years ago
hehehehhehe
Do you mean the jvm??
"provocative and mysterious title" aka clickbait lol
17 seconds ago... nice
WebAssembly is the new UNCOL language
We are still suffering from the over expressiveness of adobe languages.
Transpiling to JS is becoming a thing, and perhaps that's as close as we get to the UNCOL vision
Yes perl is the universal language. I'll see myself out.
@RonJohn63
3 years ago
It's the universal line noise...
MSIL (Microsoft Intermediate Language) There is even a book
C-- I guess...
C is high? damn!
@thoyo
4 years ago
I was thinking the same thing 🙃 . With such an intimate level of knowledge like Prof Brailsford's, I guess anything above compiler starts to seem high level.
@retepaskab
4 years ago
Well, it can do scopes, like BASIC can't.
@nosuchthing8
4 years ago
Compared to assembly, of course!
@pratikkore7947
4 years ago
there was a time when I used to think batch script (cmd) was a programming language