Map Reduce Paper - Distributed data processing
The paper that inspired Hadoop. This video explains the MapReduce concept, which is used for distributed big data processing.
This video takes some liberties to explain the underlying concept as simply as possible. For example, the map process for song count is typically implemented as: emit the number 1 for each song title. A combiner function is then used to locally aggregate/sum these counts per song.
Also, this video leaves out many interesting implementation details. I encourage you to read the paper for them.
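The emit-1-plus-combiner pattern mentioned above can be sketched roughly as follows. This is an illustrative simulation only, not the paper's actual implementation; the function and variable names here are hypothetical.

```python
from collections import Counter
from itertools import groupby

# Map phase: emit (song, 1) for every play in the input split.
def map_phase(plays):
    return [(song, 1) for song in plays]

# Combiner: locally sum counts per song before sending anything over the network.
def combine(mapped):
    mapped.sort(key=lambda kv: kv[0])
    return [(song, sum(c for _, c in group))
            for song, group in groupby(mapped, key=lambda kv: kv[0])]

# Reduce phase: merge the partial counts from all map workers.
def reduce_phase(partials):
    totals = Counter()
    for song, count in partials:
        totals[song] += count
    return dict(totals)

# Two "map workers", each with its own input split.
split1 = ["songA", "songB", "songA"]
split2 = ["songB", "songB", "songC"]

partial1 = combine(map_phase(split1))  # [("songA", 2), ("songB", 1)]
partial2 = combine(map_phase(split2))  # [("songB", 2), ("songC", 1)]

result = reduce_phase(partial1 + partial2)
print(result)  # {'songA': 2, 'songB': 3, 'songC': 1}
```

The combiner is the optimization the video glosses over: it shrinks each worker's output from one record per play to one record per distinct song before the shuffle.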
Thanks for watching.
Channel
----------------------------------
Complex concepts explained in a short & simple manner. Topics include Java Concurrency, Spring Boot, Microservices, Distributed Systems, etc. Feel free to ask any doubts in the comments. Also happy to take requests for new videos.
Subscribe or explore the channel - / defogtech
New video added every weekend.
Popular Videos
----------------------------------
What is an API Gateway - • What is an API Gateway?
Executor Service - • Java ExecutorService -...
Introduction to CompletableFuture - • Introduction to Comple...
Java Memory Model in 10 minutes - • Java Memory Model in 1...
Volatile vs Atomic - • Using volatile vs Atom...
What is Spring Webflux - • What is Spring Webflux...
Java Concurrency Interview question - • Java Concurrency Inter...
Comments: 29
It's incredible how you compress a complex paper that can take days or even weeks to fully grasp into a ten minute video. You are an amazing teacher. Props to your animation that is on point.
I can't even begin to explain the level of clarity I achieved after watching this video!! Thanks a lot, sir! Please keep posting more videos; it is very helpful for students like us :)
Really very well explained in a very short amount of time! Much appreciated
I highly appreciate the work you do. Keep up the great work
Thank you for the video! Very clear explanation. I especially liked the examples part.
The best explanation and pictorial representation of MapReduce I have come across. I saved this playlist. It is very good and useful.
Dude, this was an amazing explanation!!
The best explanatory video on this concept I have ever seen. Thanks :)
Thank you! Can't wait for the Bigtable design review. Please do a ZooKeeper/etcd one.
You are an excellent teacher! Please keep making more such videos.
short and crisp explanation, thank you
awesome and crystal clear explanation. Such a big topic condensed to 10 minutes video. kudos to your work
One of the best explanations you can find on the internet! Please make a video on HDFS.
Thanks for such awesome explanation. Keep doing the great work 😁👏
That was really good !!!
Great explanation!!
Excellent!!
After a long time I found excellent videos. May I request you to create videos/playlists on Kafka, Cassandra, and AWS Cloud? I find them very tricky and hard to understand. Thanks for making awesome videos.
Very good that you are also covering newer technologies like the Hadoop ecosystem. Expecting more things like these. 🙂
Excellent explanation....👍👍👍
You are brilliant
I have certain questions related to Java memory management and out-of-memory errors... where can I send them?
Good
Need the Google Bigtable video, as you promised in the GFS video.
Really good explanation! However, I have one question. I may have missed something, but how exactly does it deal with chunks replicated over a couple of nodes? There may be a case where we process some data twice, which could impact the result.
@architsaxena3792
3 years ago
I think that's why the client informs the master, right? The master has all the info on where the chunks are duplicated, so it can avoid duplicate processing.
@user-em9mw9ch3y
2 years ago
Operations are run on only one of the 3 replicas (remember that out of 3 servers, 1 is primary and the others are secondary). If the primary fails, the GFS master sends the operation (map function) to a secondary replica holding the data, keeping the data and the final result on the same server. My humble answer; corrections are welcome.
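The failover idea in the comment above can be sketched as a toy simulation, assuming a master that tracks which replica servers hold each chunk. All names here are hypothetical; GFS's real protocol is considerably more involved.

```python
# Each chunk is stored on 3 replica servers; the master prefers the
# first live replica in the list (the "primary") for running the map task.
chunk_replicas = {"chunk-7": ["server-1", "server-2", "server-3"]}
alive = {"server-1": False, "server-2": True, "server-3": True}  # primary failed

def assign_map_task(replicas, alive, chunk):
    """Master picks the first live replica holding the chunk and schedules
    the map task there, so computation stays co-located with the data."""
    for server in replicas[chunk]:
        if alive[server]:
            return server
    raise RuntimeError(f"all replicas of {chunk} are down")

print(assign_map_task(chunk_replicas, alive, "chunk-7"))  # server-2
```

The key point is that only one replica runs the map task for a given chunk, which avoids the double-counting worry raised in the question above.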
Can't we get the read/write frequency count from the GFS master's log files themselves, which are stored remotely, since they have read/write logs for files? I'm just learning, so I might have understood it wrongly.
@DefogTech
1 year ago
GFS's responsibility is to act as a massive hard disk; it has no understanding of what is written in the files. If you check the GFS video, clients store data directly on individual machines, and the GFS master is not aware of what is being written.