dplyr: Joins
Science and Technology
Joins let you combine two data tables based on a shared column, known as a key column, that identifies which records match. When your data is spread across multiple tables, you may need to perform one or more joins to get it all into one table before doing other data cleaning and analysis tasks.
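As a rough illustration of the idea, here is a minimal sketch using two small made-up tables. The table names, the column names, and the values are all assumptions for the example and are not taken from the video's notebook.

```r
library(dplyr)

# Hypothetical example tables; "id" is the shared key column.
patients <- data.frame(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
visits   <- data.frame(id = c(2, 3, 4), n_visits = c(5, 1, 2))

# inner_join keeps only the keys that appear in both tables.
inner <- inner_join(patients, visits, by = "id")

# left_join keeps every row of the left table; unmatched rows get NA.
left <- left_join(patients, visits, by = "id")

# full_join keeps every key that appears in either table.
full <- full_join(patients, visits, by = "id")
```

Which join to use depends on whether rows without a match in the other table should be kept or dropped.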
Link to the Kaggle Notebook code used for this video series:
www.kaggle.com/hamelg/dplyr-in-r
View the whole dplyr in R playlist here:
• dplyr: Getting Started
dplyr cheat sheet from RStudio:
www.rstudio.com/wp-content/up...
dplyr documentation:
cran.r-project.org/web/packag...
Follow DataDaft on social media for news and updates:
Twitter: / datadaft
Join the DataDaft Discord to discuss all things data science:
/ discord
#dplyr #rprogramming #datascience
Comments: 21
This is awesome! You have a gift for teaching. Thank you
Thank you! Excellent explanation and coverage of the joins in dplyr. I learnt a lot from coding along with you. I hope you make more videos to share your knowledge of R, and data science.
Straight to the point! Very helpful. cheers
Thank you so much! This was really helpful. Hugs from Argentina!
Thank you so much!! The join multiple columns was really helpful.
Great video on joins. Thanks
Very basic, very good.
THANK YOU
Thank you genius :)
Very helpful
FINALLY A VIDEO ABOUT THIS SUBJECT IN AMERICAN ENGLISH
thank you
Thank you for making these videos on dplyr and RStudio, it's really helping me out! If I may also ask, why is full_join the only one of the joins that uses by = c() instead of a plain by = ?
@DataDaft
3 years ago
When joining on only one column, you can pass in that single column name using by = "join_col". In the full_join example, where we are joining on two columns, we have to pass a vector of columns, so we use by = c("join_col1", "join_col2"). This is not specific to full_join: every dplyr join accepts either form.
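A minimal sketch of the two forms of the by argument, using made-up tables (the store, year, sales, costs, and city columns are hypothetical and only serve the illustration):

```r
library(dplyr)

# Hypothetical tables; sales and costs are keyed by "store" and "year".
sales  <- data.frame(store = c("A", "A", "B"),
                     year  = c(2020, 2021, 2020),
                     sales = c(10, 12, 7))
costs  <- data.frame(store = c("A", "B", "B"),
                     year  = c(2020, 2020, 2021),
                     costs = c(4, 3, 5))
stores <- data.frame(store = c("A", "B"),
                     city  = c("Lima", "Oslo"))

# One key column: a single string is enough.
one_key <- left_join(sales, stores, by = "store")

# Two key columns: pass a character vector built with c().
two_keys <- full_join(sales, costs, by = c("store", "year"))
```

In the second join a row only matches when both store and year agree, so combinations that appear in just one table come through with NA in the other table's columns.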
Thank you very clear presentation. Can dplyr replace SQL?
awesome
which IDE are you using to run the codes? Thanks
Please, what if you have two datasets with different column names and rows, but you are asked to merge the two?
@DataDaft
3 years ago
If you have two columns with different names but that contain the same information/unique identifiers, such as "P_ID" in one data set and "Patient_ID" in the other, you can use the argument by = c("P_ID" = "Patient_ID") to join on that column despite the different column names. If the data sets don't have any variables in common (regardless of whether they actually have the column names) I'm not sure how they can be joined/merged in a meaningful way.
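A small sketch of that named-vector form, reusing the "P_ID" and "Patient_ID" column names from the reply above; the two tables and their values are invented for the example:

```r
library(dplyr)

# Hypothetical tables whose key columns have different names.
labs <- data.frame(P_ID = c(101, 102, 103), glucose = c(90, 110, 95))
demo <- data.frame(Patient_ID = c(102, 103, 104), age = c(40, 55, 31))

# The name on the left of "=" refers to the first table,
# the name on the right refers to the second table.
joined <- inner_join(labs, demo, by = c("P_ID" = "Patient_ID"))
```

The joined table keeps the key column name from the first table, so here the result has a P_ID column and no Patient_ID column.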
i fucking love u you saved me