PyPDF2 Crash Course - Working with PDFs in Python [2023]

In this tutorial we will explore how to use PyPDF2 to read PDFs, extract text from PDFs, split PDFs, merge PDFs, and more.
💻 Code: github.com/Jch...
📝 Blog: blog.jchariste...
⏲️===TimeStamps===⏲️
0:01 Introduction & Demo
01:30 Setup and Installing Packages
02:00 PdfReader vs PdfFileReader
02:50 Workflow of PyPDF2
03:40 How to Read a PDF File In Python
04:40 Metadata of PDF
06:20 How to get number of pages
07:10 How to Extract Text From PDF
08:50 Get PDF Metadata
10:15 Extract Text From PDF
14:46 How to Split PDFs
15:40 Split PDF Function
22:50 PdfWriter Position
24:10 How to Split PDF up to A Specific Page
33:20 Get Last Page of PDF
38:16 Merging Multiple PDFs
39:00 How to Fetch All PDFs in A Directory
41:35 How to Merge PDFs
45:20 Rotating a PDF Page
51:30 Recap
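
The steps listed in the timestamps (reading a file, getting its metadata and page count, extracting text, fetching all PDFs in a directory, merging) can be sketched roughly as follows. This is a minimal sketch, assuming PyPDF2 3.x, where `PdfReader` replaces the deprecated `PdfFileReader`. The function names `find_pdfs`, `summarize_pdf`, and `merge_pdfs` are hypothetical and not taken from the video, and the merge here uses `PdfWriter.add_page` rather than whatever merging class the video uses.

```python
from pathlib import Path


def find_pdfs(directory):
    """Return the PDF files found directly inside a directory (hypothetical helper)."""
    return sorted(p for p in Path(directory).iterdir() if p.suffix.lower() == ".pdf")


def summarize_pdf(path):
    """Read one PDF and return its metadata, page count, and first-page text."""
    # Imported lazily so find_pdfs() stays usable without PyPDF2 installed.
    from PyPDF2 import PdfReader

    reader = PdfReader(str(path))
    info = reader.metadata  # may be None when the file carries no metadata
    num_pages = len(reader.pages)
    return {
        "title": info.title if info else None,
        "author": info.author if info else None,
        "num_pages": num_pages,
        "first_page_text": reader.pages[0].extract_text() if num_pages else "",
    }


def merge_pdfs(paths, output_path):
    """Copy every page of every input PDF, in order, into one output file."""
    from PyPDF2 import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in paths:
        for page in PdfReader(str(path)).pages:
            writer.add_page(page)
    with open(output_path, "wb") as out_file:
        writer.write(out_file)
```

A call such as `merge_pdfs(find_pdfs("my_pdfs"), "merged.pdf")` would combine the files in whatever order `find_pdfs` returns them; the directory and output names are placeholders.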
JCharisTech

Comments: 45

@ushasingh7752 1 year ago
Best tutorial I have ever taken, with lots of exercises and detailed explanations. Thank you so much💗💗💗♥
@JCharisTech
1 year ago
Glad it was helpful, Singh!
@asheeshmathur 7 months ago
Very good tutorial. How can I read a Bulgarian PDF and extract data from a specific section? Any pointers will be helpful.
@pariabr4027 7 months ago
This topic was tough for me, but you explained it really well!
@JCharisTech
7 months ago
Glad it was helpful
@imthebearimthebear3316 1 year ago
Excellent variable names and a clean display really help with the example.
@JCharisTech
1 year ago
Glad it was helpful
@MateFast_Oficial 7 months ago
Friend, do you know how to extract comments that contain an image or audio from a PDF? Thanks in advance.
@gamerk88 1 month ago
Finally I have found a good tutorial
@rehanadgrt 1 month ago
Could you please explain how to compare two PDFs and, if they are not identical, extract the extra parts?
@IThinkItsMe 1 month ago
This is a high quality tutorial 👌
@harshit_singh19 1 year ago
@JCharis Tech how do I read the number of sections in a PDF file?
@HRTG1234 1 year ago
Great tutorial! Thank you so much!
@JCharisTech
1 year ago
Glad it was helpful!
@mangalwedeshrinivas7249 1 year ago
At 29:10, I think the two statements, filename = os.path.splitext(...) and output_filename = f"... can be taken out of the loop.
@hnahler
4 months ago
Definitely! They should be outside. The way it is written, the file will be saved for each page but then will be overwritten at the next iteration. Also, the range function in the for loop should add 1 so that the function uses 0 and 1 when the user wants to output the first two pages.
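
The point made in this thread can be illustrated with a short sketch. The video's actual code is not reproduced here; the names `build_output_name`, `page_indices`, and `split_upto_page` are hypothetical, and the sketch assumes PyPDF2 3.x and that `upto_page` means "how many pages to keep, counted from the first". Under that assumption `range(upto_page)` already yields indices 0 and 1 for the first two pages; if the argument were instead a zero-based index, the range would need the extra 1.

```python
import os


def build_output_name(pdf_path, upto_page):
    """Derive the output filename, e.g. 'report.pdf' -> 'report_first_2_pages.pdf'."""
    base, _ext = os.path.splitext(pdf_path)
    return f"{base}_first_{upto_page}_pages.pdf"


def page_indices(upto_page, total_pages):
    """Zero-based indices of the first `upto_page` pages, clamped to the document length."""
    return list(range(max(0, min(upto_page, total_pages))))


def split_upto_page(pdf_path, upto_page):
    """Write the first `upto_page` pages of `pdf_path` into a single new PDF."""
    # Imported lazily so the two helpers above work without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(pdf_path)
    writer = PdfWriter()
    # Computed once, outside the loop, as suggested in the comments.
    output_filename = build_output_name(pdf_path, upto_page)

    for index in page_indices(upto_page, len(reader.pages)):
        writer.add_page(reader.pages[index])

    # Written once, after all selected pages have been added.
    with open(output_filename, "wb") as out_file:
        writer.write(out_file)
    return output_filename
```

Writing the file once after the loop avoids saving and overwriting it on every iteration.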
@Jon-bk2bw 1 year ago
Really thorough and updated methods, thank you!
@JCharisTech
1 year ago
Glad it was useful!
@francescovecchio3931 1 year ago
I have a simple problem: I have to read a PDF, change some words in the text, and then save it to a new PDF that keeps the same layout as the original PDF. I don't need anything else, but I can't find working examples on the web! Can you help me? Thanks, Francesco
@shreenaths6598 1 year ago
Hi, awesome video. I have one question: I want to create a replica of a PDF through an automated Python script and save it in the cloud. Can you suggest how to do that? Thanks in advance.
@niv8880 1 year ago
I really, really need to know how to flatten a PDF. Ghostscript doesn't work, and Magick merges all my pages onto the first. I can't use pdftk on a Silicon Mac. All roads lead nowhere!
@adan8657 10 months ago
How can I get the underlined text?
@user-sv3bk3jf5j 1 year ago
Hi, your video was very helpful! Please create a video showing the code to put an image and/or user-defined text as a watermark into a PDF. If not a video, then please share the code. The other videos use the old libraries and not the new ones.
@wasima4463 1 year ago
You did not cover real scenarios, where text extracted from research paper PDFs contains weird fonts, non-homogeneous spacing, newlines, and sometimes letters that overlap each other. Can PyPDF2 deal with that?
@mrinkahok1522 1 year ago
Hello, I almost succeeded in this. I have one more problem: I want to add more pages to the PDF file, but each new page overwrites the previous one. I want to append pages but I can't.
@drunkpy1590 1 year ago
How do you extract data from a PDF to text and then systematically show the extracted text in Excel?
@mrinkahok1522 1 year ago
I have a file with 15 pages and I want to write page 2 through page 15 to another file because I no longer need page 1. So I want to throw away page 1.
@NazeerAhmad-bk1ul 1 year ago
Thanks
@LeandroAuzier 1 year ago
Do you know how to convert XML to PDF? I was looking at pyxml2pdf but I don't really get it. I don't know if it's a stopped project or if I'm getting it wrong.
@bc4198 1 year ago
Awesome!
@JCharisTech
1 year ago
Glad you think so!
@andrewmarty6001 1 year ago
Fantastic Tutorial
@JCharisTech
1 year ago
Glad it was helpful
@cheonglily3992 1 year ago
Very good video. I have a question: I have a PDF file and I only want 3 things from it: the airway bill no., the total amount, and whether the goods are coming in (in which case it's an import). How can I extract them?
@JCharisTech
1 year ago
Hello Lily, you can use the `extract_text` function and an `if` condition to achieve that. I hope this helps.
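
One way to read the suggestion in this reply is sketched below. It is only an illustration: the labels "Airway Bill No", "Total Amount", and "import" are assumed wording, real documents will word these fields differently, and the function names `extract_fields` and `extract_fields_from_pdf` are hypothetical. It assumes PyPDF2 3.x.

```python
def extract_fields(text):
    """Scan extracted text line by line and pick out a few labelled values."""
    fields = {"airway_bill_no": None, "total_amount": None, "is_import": False}
    for line in text.splitlines():
        lowered = line.lower()
        # Assumed layout: "<label>: <value>" on a single line.
        if "airway bill no" in lowered:
            fields["airway_bill_no"] = line.split(":", 1)[-1].strip()
        elif "total amount" in lowered:
            fields["total_amount"] = line.split(":", 1)[-1].strip()
        if "import" in lowered:
            fields["is_import"] = True
    return fields


def extract_fields_from_pdf(pdf_path):
    """Extract text from every page with PyPDF2, then apply extract_fields()."""
    # Imported lazily so extract_fields() can be tried on plain strings first.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    return extract_fields(text)
```

Trying `extract_fields` on a pasted sample of the extracted text first makes it easier to adjust the labels before running it on whole files.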
@aphadke77 1 year ago
How to extract embedded files information from a pdf file?
@drunkpy1590 1 year ago
Great video!
@JCharisTech
1 year ago
Glad you enjoyed it
@lucasmonta1 1 year ago
What is the IDE you are using?
@Munichandra_Reddy 1 year ago
I want to build a donation website. Please help me: how do I do it, how do I add a UPI option to the website, how do I store the data in Excel or a SQL Server database, and how do I share the website with my friends? Please explain and don't skip it. Please help me do this using Streamlit, and please teach the code.
@JCharisTech
1 year ago
Thanks for the suggestion
@glass7933 1 year ago
28:42 О_о. Do you speak Russian? The phrase "Это очень важно" ("This is very important") in the middle of the video was really surprising.
@mrinkahok1522 1 year ago
Can you help me with this?
@KwameBrakoAsante 3 months ago
Hi. Are you Ghanaian? I just had to ask.
@mrinkahok1522 1 year ago
for root, dirs, files in os.walk(main_data_path):
    for dir1 in dirs:
        huidige_file = path + "\\" + dir1 + "\\" + FILE
        if os.path.exists(huidige_file):
            pdf = PdfReader(huidige_file)
            with open(huidige_file, "rb") as pdf_reader:
                file_new = path + "\\" + dir1 + "\\" + outpdffilename
                pdfreader = PdfReader(pdf_reader)
                for i in range(1, len(pdfreader.pages)):
                    selected_page = pdfreader.pages[i]
                    # page = pdfreader.pages(i, num_of_pages)
                    pdf_writer = PdfWriter()
                    pdf_writer.add_page(selected_page)
                    with open(file_new, "wb") as outPdf:
                        pdf_writer.write(outPdf)
                        print("Created a pdf: '{}'".format(file_new)), print(i)
                        # pdf_writer.write(outPdf)
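
In the snippet above, a new `PdfWriter` is created and the output file is reopened in "wb" mode on every pass through the page loop, which would explain why each page overwrites the previous one. A possible fix, sketched under the assumption of PyPDF2 3.x, is to create the writer once, add pages 2 through the last, and write the file once. The names `pages_to_keep` and `drop_first_page` are hypothetical, and the directory walk from the original snippet is left out for brevity.

```python
def pages_to_keep(total_pages, skip=1):
    """Zero-based indices of the pages left after dropping the first `skip` pages."""
    return list(range(skip, total_pages))


def drop_first_page(input_path, output_path):
    """Copy every page except the first from input_path into output_path."""
    # Imported lazily so pages_to_keep() works without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(input_path)
    writer = PdfWriter()  # created once, before the loop
    indices = pages_to_keep(len(reader.pages))
    for index in indices:
        writer.add_page(reader.pages[index])

    # Opened and written once, after all pages have been added.
    with open(output_path, "wb") as out_pdf:
        writer.write(out_pdf)
    return len(indices)
```

For a 15-page file, `pages_to_keep(15)` gives indices 1 through 14, which correspond to pages 2 through 15.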
