Compare Two Excel Sheets with Different Number of Rows and Find Differences using Python
👉 Explore All My Excel Solutions: pythonandvba.com/solutions
𝗗𝗘𝗦𝗖𝗥𝗜𝗣𝗧𝗜𝗢𝗡
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
In this video, I will show you how to use Python to compare two Excel sheets with different numbers of rows and find the differences between them.
🌍 𝗟𝗜𝗡𝗞𝗦:
▶ Source code: github.com/Sven-Bo/compare-tw...
⭐ 𝗧𝗜𝗠𝗘𝗦𝗧𝗔𝗠𝗣𝗦:
00:00 - Introduction
00:25 - Install & load dependencies
01:07 - Data with the same shape using pandas
04:31 - Data with the same shape using xlwings
06:32 - Data with different shape using pandas
09:28 - Highlight the difference using xlwings
10:13 - Outro
𝗧𝗢𝗢𝗟𝗦 𝗔𝗡𝗗 𝗥𝗘𝗦𝗢𝗨𝗥𝗖𝗘𝗦
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🆓【𝗙𝗥𝗘𝗘】Excel Add-in (𝗠𝘆𝗧𝗼𝗼𝗹𝗕𝗲𝗹𝘁): pythonandvba.com/mytoolbelt
📊 Data Visualization Excel Add-In (𝗚𝗿𝗮𝗳𝗹𝘆): pythonandvba.com/grafly
🤪 Fun Emoji Excel Add-In (𝗘𝗺𝗼𝗷𝗶𝗳𝘆): pythonandvba.com/emojify
📑 Excel Templates: pythonandvba.com/go/excel-tem...
🎓 My Courses: pythonandvba.com/go/courses
📚 Books, Tools, and More: pythonandvba.com/resources
𝗖𝗢𝗡𝗡𝗘𝗖𝗧 𝗪𝗜𝗧𝗛 𝗠𝗘
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔗 LinkedIn: / sven-bosau
📸 Instagram: / codingisfun_official
💻 GitHub: github.com/Sven-Bo
💬 Discord: pythonandvba.com/discord
📬 Contact: pythonandvba.com/contact
☕ 𝗕𝘂𝘆 𝗺𝗲 𝗮 𝗰𝗼𝗳𝗳𝗲𝗲?
If you want to support this channel, you can buy me a coffee here:
▶ pythonandvba.com/coffee-donation
Пікірлер: 108
this has plenty of day-to-day uses, thanks!
@CodingIsFun
2 жыл бұрын
Great to hear & thanks for watching the video!
exactly what I was looking for. good job and thank you
@CodingIsFun
2 күн бұрын
Glad I could help! Cheers, Sven ✌️
Thank you, i was looking for something like this
@CodingIsFun
Жыл бұрын
Happy to hear that it was useful. Thanks for watching! 👍
Fabulous video! Excellent explanation and perfect solutioning..
@CodingIsFun
Жыл бұрын
Thank you very much. Happy to hear that you liked the video!
Thank you very much for these incredible tutorials.
@CodingIsFun
2 жыл бұрын
An absolute pleasure, Yasser Khalil! Thanks for the kind comments! 👍
awsome in all ways : big thanks and big kudos for your clear & clever explanations !
@CodingIsFun
9 ай бұрын
Thanks for the positive feedback! Appreciate you taking the time to leave a comment.
Great content. Thanks for the code. !!
@CodingIsFun
Жыл бұрын
You're very welcome! Thank you for watching and for the comment!
Thank you so much very useful to me.
@CodingIsFun
10 күн бұрын
Happy to hear that it was useful; thank you for taking the time to leave a comment and for watching the video! Cheers, Sven ✌️
Thanks!
@CodingIsFun
Жыл бұрын
And I thank YOU, Alberto Montero. It appreciate your support. 🙏
Hi! Thank you for this great video! It has helped me out a lot! I just had one question though, I used the code from the second part of the video however when i execute my code, the rows on my worksheet don't get colors.
@CodingIsFun
Жыл бұрын
Thanks for watching and for your question. Hard to tell from a distance why the cells do not get a color. Sorry that I cannot help.
thanks for the video. If you could also make a video on how to compare two JSON files with different actions and find differences using Python. Thank you
@CodingIsFun
4 ай бұрын
Thanks for watching and your video suggestion! Cheers, Sven 👍
Thank you! I run into this type of task pretty regularly. Excel doesn't have a built-in formula / tool for this, right? It would take a bunch of LOOKUPs and stuff?
@CodingIsFun
2 жыл бұрын
You could also replicate it with Excel formulas, but adding (automatically) comments to cells would only be possible using VBA.
Hey awesome video. You have put a lot of effort here. Thanks for that. Just a query how to compare 2 Excel with some common and some different columns details
@CodingIsFun
8 ай бұрын
Thanks so much for watching the video and leaving a comment! Your request is definitely noted. However, I get a ton of requests for custom solutions and, as much as I'd love to help everyone out, I just don't have the time in my schedule to develop and test all of them. I hope you can understand. Happy Coding!
Hey , this is a great video ,my task was majorly done ,but xlwings does not run when we use the highlight code for very large datasets. Moreover if you could provide a video where we can compare files having different number of columns it would be very helpful.
@CodingIsFun
Жыл бұрын
Thank you for watching the video & your suggestion.
if there is only column difference can you highlight with different color and if the entire row is not present in any of the file can you highlight with different color? This will give you an idea if there is only one column that is different or the entire row is not present in any of the file
@CodingIsFun
2 жыл бұрын
Highlighting the entire row that was not in the initial file has been covered in the video (09:28 min). You could take that example as a starting point to check further (column by column) if the change is only in one column to use a different colour. I hope you understand that I am missing the time to code out an example for you. Happy Coding!
Hi, I have a question… in the data with different shapes sections, what if the difference in the shape of the excel file was that additional columns were added or both columns and rows changed, how would you handle that?
@CodingIsFun
Жыл бұрын
It's a good question but difficult to answer because it depends on many factors. Before comparing the data, you would probably need to check the shape of the data range and/or the column header (names). That said, I do not have the time to code an example.
Hello, what is your take on comparing XLSXs that have cells with multiline string values (ie, 7 lines of text). Pandas seems to be strangling with those.
@CodingIsFun
6 ай бұрын
Thanks for watching. I haven't tried that, but I guess the xlwings method shown in the video should work . Happy coding!
thanks for the video & explanation , if possible can you pls share the gitrepo or sample utility to test locally ???
@CodingIsFun
3 ай бұрын
Thank you for watching. You can always find the relevant links from my videos in the video description. Here you go: github.com/Sven-Bo/compare-two-excel-sheets-python. Happy coding and testing! Cheers, Sven ✌️
If I have a template workbook with 20+ shape-protected sheets, can I still use your code to find the diff between two ver? Or, what else should I do?
@CodingIsFun
2 жыл бұрын
Thanks for watching. Not sure. You might just want to give it a try to find out.
It's a great video, When I test this in my mac, it is throwing me permission issue in the line "initial_wb = app.books.open(initial_version)". Googled various websites couldn't find any helpful information.
@CodingIsFun
Жыл бұрын
Thanks for watching and for your question. Hard to tell from a distance why you are facing an error. Sorry that I cannot help.
for each row in source1.csv ,find all matching values from source2,csv how its done please help me to solve this. thank you
@CodingIsFun
2 жыл бұрын
Thanks for watching and for your question. By merging the dataframes - similar to what I have demonstrated at 7:04 min. Happy Coding! :)
Great video, thank you.. I have a question please, if Excel starts at 0 and pandas starts at 2, should not we add 3 no 2? please advise
@CodingIsFun
Жыл бұрын
Thanks for watching. I recommend downloading the Jupyter Notebook and 'playing' around with the figures. Happy Coding!
@serageibraheem2386
Жыл бұрын
@@CodingIsFun I got it as I was practicing... do you think that we can highlight values only not yhe whole row as you did in the first part of the video?
@serageibraheem2386
Жыл бұрын
@@CodingIsFun I got it as I was practicing... do you think that we can highlight values only not yhe whole row as you did in the first part of the video?
@CodingIsFun
Жыл бұрын
@@serageibraheem2386 It might be possible, but I do not know how to implement that right off the bat.
@serageibraheem2386
Жыл бұрын
@@CodingIsFun I will try and let u know.... thank uuuu so much 👏👏
I’m not very good at all in except. If I’m working on two spreadsheets one being the one that is uploaded with changes found in the second spreadsheet. If I use one these functions to compare do the column names have to be identical in both excel files or can they be similar. Sorry I’m so new to this and I just want to cry 😢
@CodingIsFun
2 жыл бұрын
Thanks for watching. When using pandas, the column headers need to be identical. However, you could also rename the column headers before using the compare method. Here is how you do it: stackoverflow.com/a/11354850 I hope it helps! Happy learning & coding! :)
👏👏👏👏🤌🤌🤌👍👍👍😊😊😊 Muito bom! Parabéns!
@CodingIsFun
Жыл бұрын
Glad you liked it. Thanks for watching and taking the time to leave a comment! 👍👍
After getting the data, which exists in right_only sheet, (8:41) I am getting all the index as NaN and I am not able to convert that to python list. It still remains NaN. How can I proceed?
@CodingIsFun
2 жыл бұрын
Hard to tell from a distance. Not sure what might be causing the issue.
Hai Can you help to find the difference in two excel sheet having multiple tabs also that tab having differences in the rows and columns.?
@CodingIsFun
2 жыл бұрын
Thanks for watching the video. Unfortunately, I do not have the time to code out an example tailored to your specific use case. I hope you can understand. Thank you!
@tpalexander1549
2 жыл бұрын
@@CodingIsFun I know you very busy. But you can do as a video it self ☺️ it will help me like people's i hope you understand the issue.thank you
Hi, I have a doubt, if our initial version file has 100 rows and updated version has two to three-rows deleted in between of that file. when try to follow your code, it skip that deleted details. Difference highlighted csv not showing this deleted details. pls help. I am new to this
@CodingIsFun
9 ай бұрын
Thanks for watching. I am not sure to which method you are referring to. In the video, showed different methods and also the use cases (and limitations) for each of those methods.
this can help bank reconcilation in excel
@CodingIsFun
2 жыл бұрын
Glad you think so! Thanks for watching & your comment! :)
in my code, i want to compare two elements if they are in the same column, how could i achieve that? thank you
@CodingIsFun
Жыл бұрын
Thank you very much for watching the video and your comment. Your requirements are well noted. Yet, I receive many requests for creating individual solutions. As much as I want to help, I simply do not find the time in my daily schedule to develop & test all the different requests. I hope you can understand. Thank you!
Hello. GREAT video, and I think this is just the code I am looking for, but I am just starting to learn Python, and how to use GitHub. when I went to the link for this code, I could not figure out how to get just the Python code. Is there a way to extract it from from the Compare_Workbooks.ipynb file that is contained in the ZIP file?
@CodingIsFun
Жыл бұрын
Thanks for watching. Here a couply of ways to convert the Notebook to a Python file: mljar.com/blog/convert-jupyter-notebook-python/ I hope it helps! Happy Coding!
@mikeniemotka9547
Жыл бұрын
@@CodingIsFun I am using jetbrain, not sure if that is the same as a jupyter? I tried to import the ipynb file that I I downloaded from you, but don't see any of those commands in the link you showed me. And other help would be greatly appreciated. Also, did you ever write a version where you had the a final file that would show the original AND the revised content of any changed cells?
how can i do this comparison if my excel files have different column names and different rows?
@CodingIsFun
2 ай бұрын
Thank you very much for watching the video and your comment. Your requirements are well noted. Yet, I receive many requests for creating individual solutions. As much as I want to help, I simply do not find the time in my daily schedule to develop & test all the different requests. I hope you can understand. Thank you!
Hii, Can we change the self and other to any other names?
@CodingIsFun
9 ай бұрын
Thanks for watching. Yes, you can using the result_names parameter as written in the pandas docs: pandas.pydata.org/docs/reference/api/pandas.DataFrame.compare.html
I have source sheet 60000 rows and target sheet having 32000 i want to find which rows are missing based on three column like item_code, start-date and cost. How to do this??
@CodingIsFun
4 ай бұрын
Thank you very much for watching the video and your comment. Your requirements are well noted. Yet, I receive many requests for creating individual solutions. As much as I want to help, I simply do not find the time in my daily schedule to develop & test all the different requests. I hope you can understand. Thank you!
At 4:18. What if i want Last Name as Index no instead of [10,21,56,85,99]. i have used this code so many times but i want the above solution to make my work easy. Kindly revert back with help. Thank You.
@CodingIsFun
Жыл бұрын
Thanks so much for watching the video and leaving a comment! Your request is definitely noted. However, I get a ton of requests for custom solutions and, as much as I'd love to help everyone out, I just don't have the time in my schedule to develop and test all of them. I hope you can understand. Happy Coding!
I wonder if the logic work if say the datasets have no unique identifiers ?
@CodingIsFun
2 жыл бұрын
Thanks for watching. What exactly do you mean by unique identifiers? In the xlwnigs example, the header names do not matter. The script will compare cell to cell, e.g. A1 (WB1) to A1 (WB2). When using pandas compare, you want to use the same column headers.
@calm_compost
2 жыл бұрын
@@CodingIsFun By unique identifier i meant to say that one field that distinguishes my row from the rest of the rows and which stays same throughout any point of time. Anyway I will try to replicate this logic in Excel and see if it helps. Thanks for the video though.... :)
hey thanks for the tutorial. I am unable to import path from pathlib. this throws me an error, 'ImportError: cannot import name 'path' from 'pathlib' ! can you please help me
@CodingIsFun
2 жыл бұрын
capitalize Path from pathlib import Path
@howard_roark1210
2 жыл бұрын
@@CodingIsFun that worked!! thank you . You earned a subscriber!!
@howard_roark1210
2 жыл бұрын
can you help me with the code that actually compares two excel files and copies the data only between corresponding cells which has a common name in the first column.
@CodingIsFun
2 жыл бұрын
@@howard_roark1210 What have you tried so far?
What if these 2 excels are different shpe? one is 32 row, another is 33 row? And I want to find what the differences ..
@CodingIsFun
2 жыл бұрын
06:32 min
Can we do comparison if had more than one worksheet?
@CodingIsFun
Жыл бұрын
That should be possible, but I have not tired it yet.
its showing that Book object does not have used_range attribute pls help
@CodingIsFun
2 жыл бұрын
Probably because you are using 'used_range' on a workbook and not on a worksheet object.
How to compare a specific number of rows of the excel sheets instead of all the rows?
@CodingIsFun
11 ай бұрын
Thank you very much for watching the video and your comment. Your requirements are well noted. Yet, I receive many requests for creating individual solutions. As much as I want to help, I simply do not find the time in my daily schedule to develop & test all the different requests. I hope you can understand. Thank you!
@sunilkumar-gs3ex
11 ай бұрын
@@CodingIsFun Okay! Thanks for the reply!
Hey I am facing issue with updated_wb.save() thing. Its giving me lot of errors
@CodingIsFun
2 жыл бұрын
What does the error say?
@saurabhsharma-bd1xf
2 жыл бұрын
@@CodingIsFun Traceback(most recent call last): file "",line 1 in Files "C\users\sharma\appdata\local\program\python\python310\lib\site-packages\xlwings\main.py", line 903 , in save with self.app.properties(display_alerts_false): Files "C\users\sharma\appdata\local\program\python\python310\lib\site-packages\xlwings\main.py", line 867 , in app return(app(impl=self.impl.app) Files "C\users\sharma\appdata\local\program\python\python310\lib\site-packages\xlwings\_xlwindows.py", line 551 , in app return App(xl=self.xl.application) Files "C\users\sharma\appdata\local\program\python\python310\lib\site-packages\xlwings\_xlwindows.py", line 135 , in __getattr__ v= getattr(self._inner, item) Files "C\users\sharma\appdata\local\program\python\python310\lib\site-packages\wincom\client\__init.py__", line 645 , in __getattr__ return getattr(d, attr) Files "C\users\sharma\appdata\local\program\python\python310\lib\site-packages\wincom\client\__init.py__", line 548, in __getattr__ return self.ApplyTypes(*agrs) Files "C\users\sharma\appdata\local\program\python\python310\lib\site-packages\wincom\client\__init.py__", line 572, in _ApplyTypes_ self.oleobj_InvokeTypes(dispid, 0, wflags, retType, argTypes, *args), pywintypes.com_errror:(-2147023170, 'The remote procedure call failed.', None, None)
@saurabhsharma-bd1xf
2 жыл бұрын
I really apprecaite if you can help me in resolving the error. I used the word to word code of yours and everything is working perfectly fine but i don't know why its giving error on save function. i tried several ways but nothing worked out for me. I hope if you can trace the error in it. Thanks for your help.
@CodingIsFun
2 жыл бұрын
@@saurabhsharma-bd1xf Please paste the error trace log here: pastebin.com/ Once done, share your paste bin ID (NOT THE FULL URL [links will be deleted by YT]), an example of pastebin ID: *MsEChyJw* ( www.screencast.com/t/7Q8W3zE8Rc )
@saurabhsharma-bd1xf
2 жыл бұрын
@@CodingIsFun aaJDTjyV
Nope. didnt get the merged excel for different lengths excels --------------------------------------------------------------------------- AttributeError Traceback (most recent call last) in ----> 1 with xw.App(visible=False) as app: 2 updated_wb = app.books.open(prod_version) 3 updated_ws = updated_wb.sheets(1) 4 rng = updated_ws.used_range 5 AttributeError: __enter__
@kevindandrade5307
2 жыл бұрын
ok my bad. needed to update xlwings==0.25.3
@travelwithprinceandekta7364
Жыл бұрын
@@kevindandrade5307 is there another way to fix this error with xlwings version 0.16.0 ?
why you used hoe="outer" ? why not "inner"
@CodingIsFun
Жыл бұрын
Because I am looking for any additions or deletions between the two datasets (df_initial and df_update)
@amitkhandelwal8030
Жыл бұрын
@@CodingIsFun Ohk , but Inner join also gives the difference between two csv tables , so my doubt is how to identify if many rows deleted and added in df_update how to extract that particular rows
@CodingIsFun
Жыл бұрын
@@amitkhandelwal8030 why not simply trying it out. Feel free to share your solution here