Dealing with Dynamic Data - Computerphile
Big Data is one thing, but what do you do if that data is constantly changing? Rebecca Tickle on Dynamic data.
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
Comments: 92
Coming from someone (me) who often has a difficult time wrapping my head around some of these concepts, you've done a great job at breaking this problem up into easily understandable bits. Thx
Fascinating! Thank you Rebecca. It would be cool to see an object lesson or a contextual representation of how this works; a real world example. Thanks again for this video!
I wish I could increase the window size for my KZread recommendations. They are way too dependent on what I just watched.
@modanq
5 years ago
Meta
Crisp clear and fascinating video! Thanks a lot
This single video was so informative that I had to take notes. Thanks for this gem of a video.
Thanks for the clear explanation.
Am I the only one loving the sound bite "later data"?
@Ethernet480
5 years ago
later dater
@S7AINLESS
5 years ago
@@Ethernet480 it's just fun to say....
@esquilax5563
3 years ago
"Stick it on the laterbase" - Alan Johnson, Peep Show
That was a really, really good video. I enjoyed watching it.
What about that rotating/moving rocket though?
Yaaaay!!! Thank you Rebecca!
Thanks for the useful videos you are sharing
Awesome video
Software mentioned at the end of the interview: MOA (moa.cms.waikato.ac.nz) and WEKA (cs.waikato.ac.nz/ml/weka/index.html)
Very engaging.
Could we apply such rules to, for example, a constant update of our dataset instead of a stream of data coming in? I mean, usually we have k-fold cross validation, but we could, for example, continuously increase the sample size in order to simulate such stream. It could be useful, for example, when dealing with historical data and when we do not have access to live-streaming data.
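One common way to treat a fixed historical dataset as if it were a stream is prequential (test-then-train) evaluation: each instance is first used to test the model and only then used to update it. The sketch below is a minimal illustration, not something shown in the video; `MajorityClassLearner`, `prequential_accuracy`, and the toy `historical` data are all hypothetical names made up for this example.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental learner (hypothetical): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # No prediction is possible before the first label has been seen.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Replay (x, y) pairs in order: test on each instance first, then train on it."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:
            correct += 1
        learner.learn(x, y)
        total += 1
    return correct / total if total else 0.0


# Historical records replayed in their original order (made-up data).
historical = [(0, "a"), (1, "a"), (2, "b"), (3, "a"), (4, "a")]
accuracy = prequential_accuracy(historical, MajorityClassLearner())
print(accuracy)
```

Unlike k-fold cross-validation, this keeps the time order of the records, so a change in the data part-way through shows up as a dip in the running accuracy.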
Can you have a few models performing the same classification and correcting each other on the go? That would address both the analysis time for big instances, by managing incoming traffic and distributing the data between the models based on their load, and the dips in accuracy, by assigning a weight to each model and performing a self-check if the result is thought to be false. So it can run in parallel (when different models are looking at different data) and in sequence (when one model is suspected and passes its data to some of the others so they can all learn and correct). Am I spewing nonsense or is that a thing?
@asganaway
5 years ago
No, you are not, but if you are using something like a NN you can probably have a single model that behaves in a similar way, and yes, your model/models can be adaptive in different ways.
@RamkrishanYT
5 years ago
Aren't Ensemble models something like that?
@girors
5 years ago
@@RamkrishanYT well they are, but most of them use a combiner at the end to get one prediction. In my mind it's a switch between having a combiner at the end of some models at times when they are struggling, and not having it in other cases, to get the power of parallelism and maintain the accuracy. I've seen some that can sort and choose which model to use for each data segment; idk if it's the same thing or not, because the other models there were just not in use. In other words, I've described a dynamic ensemble model that can choose where and when to mix which models for better performance and accuracy.
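The "weight per model, corrected on the go" idea in this thread resembles a weighted-majority ensemble, where every model votes and a model's weight is cut each time it turns out to be wrong. The sketch below is only an illustration under that assumption, not the scheme from the video or any particular library; the function names, the three toy `models`, the `beta` penalty, and the `stream` data are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label whose supporting models have the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def update_weights(predictions, weights, truth, beta=0.5):
    """Multiply the weight of every model that was wrong by beta (0 < beta < 1)."""
    return [w * (1.0 if p == truth else beta) for p, w in zip(predictions, weights)]


# Three toy "models" (hypothetical): simple threshold rules on one number.
models = [lambda x: x > 0, lambda x: x > 5, lambda x: True]
weights = [1.0, 1.0, 1.0]

# Made-up labelled stream: (input, true label).
stream = [(1, True), (3, True), (7, True), (-2, False)]

votes = []
for x, truth in stream:
    predictions = [model(x) for model in models]
    votes.append(weighted_vote(predictions, weights))
    # The true label arrives after the prediction, so weights adapt over time.
    weights = update_weights(predictions, weights, truth)

print(votes, weights)
```

Models that keep making mistakes fade out of the vote without being switched off, which is one simple way to get the self-correcting behaviour described above.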
Thank you.
the marker scratches against the paper gave me the goose bumps
Big Data seems like the abstract mosaic art form of Computer Science. A lot of large, annoying, and often not straightforward patterns to sort out.
@digitplays4008
5 years ago
That's why I love it.
Sublime explanation
@talhatariqyuluqatdis
5 years ago
Lol python boi
Good
In my opinion this video should have been split into two different videos. One about the classification problem and classification methods and another video about dynamic data. Those are two different topics and they can (and should) be explained separately. Constantly jumping between them makes it hard to understand. I know most of this from university but I think for someone new to the topic this video may not be as helpful as it could.
Why not sum sliding window drifts of different sizes, then weight the functions by window size like ex moving averages used in market analysis?
@TheSam1902
5 years ago
I'd say because you could just pick a large window to begin with, add random dropouts of data and obtain the same result without recomputing n-times your model for n different window sizes.
@paxdriver
5 years ago
@@TheSam1902 it wouldn't be the same result but they'd both be approximating the same function, except my method could be processed with streaming data and the other would need to be fed into tensors I think.
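One possible reading of the multi-window suggestion in this thread is to keep several sliding windows of different sizes over the same stream and combine their estimates with weights proportional to window size. The sketch below assumes that reading and uses a plain mean as the per-window estimate; `MultiWindowMean`, the window sizes, and the sample values are hypothetical.

```python
from collections import deque


class MultiWindowMean:
    """Hypothetical helper: one bounded window per size, combined by size-weighted mean."""

    def __init__(self, sizes):
        # deque(maxlen=s) silently drops the oldest value once s items are stored.
        self.windows = [deque(maxlen=s) for s in sizes]

    def update(self, value):
        # Each new value costs one append per window; no recomputation of history.
        for window in self.windows:
            window.append(value)

    def estimate(self):
        weighted_sum = 0.0
        total_weight = 0.0
        for window in self.windows:
            if window:
                weighted_sum += window.maxlen * (sum(window) / len(window))
                total_weight += window.maxlen
        return weighted_sum / total_weight if total_weight else None


tracker = MultiWindowMean(sizes=(2, 4))
for value in [1, 1, 1, 1, 5, 5]:  # made-up stream with a jump from 1 to 5
    tracker.update(value)

combined = tracker.estimate()
print(combined)
```

After the jump, the short window already sits at 5 while the long window is still at 3, so the combined estimate lands in between; a different weighting would shift how quickly it reacts.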
What has become of me? I watched and understood this entire video with rapt attention.
Cos dynamic data.
Nice, clear explanation. What happens if our data stream is dynamic but also correlated? For example, we have two subset data streams with different patterns coming sequentially, but the first pattern is connected with the second pattern in an unknown way, like: XYZYYZ021001. So for this we need one machine learning model that can adapt to the second stream, but the correlated part is somehow hidden in the stream. How do we handle drifting?
I guess this is used for antivirus programs and spam email detecting. Great video, thanks
@player-8740
5 years ago
This comment will acquire hundreds of likes and replies over time
Cool
@Gooberpatrol66
5 years ago
Beans
Rebecca "Tickle"! What a wonderful name!
15:25 “There’s a few Bits of software that you could look at.” Hahahah!
Don't video encoding algorithms deal with a similar issue when anything moves around in a video?
@sabriath
5 years ago
I haven't looked at the code directly, but I would imagine that it would go off of frame-to-frame differences, and if those differences went over a threshold, it would just create a new key frame at location. I mean, that's how I would code it.
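The guess in this reply (start a new key frame whenever the frame-to-frame difference exceeds a threshold) can be written down as a toy sketch. This only illustrates the commenter's idea and makes no claim about how real video encoders work; `choose_key_frames`, the tiny four-pixel frames, and the threshold value are hypothetical.

```python
def choose_key_frames(frames, threshold):
    """Return indices of frames whose mean absolute difference from the previous frame exceeds threshold.

    Frames are equal-length lists of pixel intensities; the first frame is always a key frame.
    """
    if not frames:
        return []
    key_indices = [0]
    for i in range(1, len(frames)):
        previous, current = frames[i - 1], frames[i]
        diff = sum(abs(a - b) for a, b in zip(current, previous)) / len(current)
        if diff > threshold:
            key_indices.append(i)
    return key_indices


# Made-up frames: a small change, a scene cut, then another small change.
frames = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [9, 9, 9, 9],
    [9, 9, 8, 9],
]
key_frames = choose_key_frames(frames, threshold=2.0)
print(key_frames)
```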
Jeez, Rebecca really knows what she's talking about.
Rebecca Tickle lol
Dat desync thou
@misterhat5823
5 years ago
What desync? Must be you.
Couldn't the same be accomplished with a graph database of the data, with no training, just a distance formula?
"I don't know what she said, but she's lying." You have to be god-tier at memes to understand.
There's like no math in this video :(. I was hoping for some classical algorithms; even some simple Bayesian algorithm would have been great. The video is still a great explanation of the high-level factors at play in streaming algorithms.
Ok so there's a free data mining toolkit
i love how you avoid meteorological data as we all know the best predictor of weather is the window.
@je9625
5 years ago
Funny but not true
James May had a daughter?
Who really watches till the end and understands everything? I certainly don't. :D
Whenever I need to deal with dynamic data, I just use malloc()
@CJBurkey
5 years ago
no
@totlyepic
5 years ago
@@CJBurkey Seriously. How does such an anti-intellectual comment get upvoted on a channel dedicated to education?
@satyris410
5 years ago
@@totlyepic lulz? I'm sorry I'll see myself out.
@deoxal7947
5 years ago
@@totlyepic Because we understand it's a joke.
@DutchmanDavid
5 years ago
@@totlyepic because it's not an anti-intellectual comment, but a joke instead. :p
There is a little bit too much terminology in the beginning and too many abstract and conceptual ideas. More than 15 minutes... This is too much for me. Does it have to be so dry and without half-concrete examples?
I tried to Google "adorkable" and this video came up. Weird.
ok, I'll say it.... so if youtube uses such logic for its recommendations, it's clearly an ugly mess of a system and doesn't work so well at all! ;)
@asganaway
5 years ago
welcome into machine learning pal :D
@misterhat5823
5 years ago
It appears to use random()
ps: no one on earth cares if you were first or fifth commenting nothing but your line position - quit making an even bigger mess of youtube's data processors, please & thank you!
@mikakorhonen5715
5 years ago
Stop telling people what to do with their lives. Thanks.
@daverhodes382
5 years ago
@@misterhat5823 Mature comment.
@misterhat5823
5 years ago
@@daverhodes382 Who cares?
@Khepramancer
5 years ago
xD
I'm early :)
Well, third
This doesn't tickle my fancy.
Fourth 👽👽
oneth
the sound of pencil irritates me very badly
Second
Intelligence makes attractive
She's very pretty, I didn't understand the math, but I just felt like saying that
8th cus everyone gets a 🏆
yeah im first
@Petertronic
5 years ago
and last to get laid
Well, fifth
Interesting topic, but a bit boring presentation.
You are beautiful!!!!!
Boring 💤 Way to butcher such a cool topic. This is why statisticians hate computer scientists.
@mrhamgolian9006
5 years ago
Because statistics is so fucken exciting