Detect Text in Images with Python - pytesseract vs. easyocr vs keras_ocr

In this video we learn how to extract text from images using Python. We compare three popular libraries: pytesseract, easyocr, and keras_ocr. Examples are run in a Kaggle notebook on the TextOCR dataset.
Notebook used in the video: www.kaggle.com/code/robikscub...
Timeline:
00:00 Extracting Text
00:42 TextOCR Dataset
01:31 Outline and Loading Data
04:19 Plotting Text Images
06:12 pytesseract
07:38 easyocr
10:18 keras ocr
13:14 plot comparison
20:00 Results comparison
Follow me on twitch for live coding streams: / medallionstallion_
My other videos:
Speed Up Your Pandas Code: • Make Your Pandas Code ...
Speed up Pandas Code: • Make Your Pandas Code ...
Intro to Pandas video: • A Gentle Introduction ...
Exploratory Data Analysis Video: • Exploratory Data Analy...
Working with Audio data in Python: • Audio Data Processing ...
Efficient Pandas Dataframes: • Speed Up Your Pandas D...
* YouTube: youtube.com/@robmulla?sub_con...
* Discord: / discord
* Twitch: / medallionstallion_
* Twitter: / rob_mulla
* Kaggle: www.kaggle.com/robikscube
#python #computervision #deeplearning

Comments: 111

  • @lphilippeabadieo298
    @lphilippeabadieo298 8 hours ago

    Hi there! I wanted to thank you for the incredible video on text detection in images using `pytesseract`, `easyocr`, and `keras_ocr`. It was exactly what I needed to resolve all my doubts about which tool to use in my projects. Your clear explanation and comparative analysis helped me understand the advantages and disadvantages of each, allowing me to make an informed decision. Thanks again for sharing your knowledge and for the quality of your content. I would love to see more videos like this, where you explain and compare different tools and technologies. Keep up the excellent work! Best regards and well done! Philippe

  • @pcargolo1
    @pcargolo1 9 months ago

    I had fun watching this video! Great job. Also very nice to see you making typos and then fixing them. Many people on YouTube cut those parts and it gives some sort of weird feeling like the person is pretending they never make mistakes. It's impossible!

  • @cyberhard
    @cyberhard 2 years ago

    Great video! Clear and concise. You've earned my subscription.

  • @robmulla

    @robmulla

    2 years ago

    Thanks so much for watching and subscribing!

  • @cyberhard

    @cyberhard

    2 years ago

    @@robmulla you're welcome

  • @tanvirhossain18
    @tanvirhossain18 1 year ago

    Thanks a lot, Rob. This is a great Tutorial. Hats off!

  • @robmulla

    @robmulla

    1 year ago

    I appreciate the feedback. Glad to hear you found it helpful.

  • @MM-un7wg
    @MM-un7wg 2 years ago

    Great job as always mate!

  • @robmulla

    @robmulla

    2 years ago

    Thanks MM!

  • @IntenseRouge
    @IntenseRouge 1 year ago

    Your video is really great, thank you Rob!

  • @robmulla

    @robmulla

    1 year ago

    Thanks so much for the feedback. Hope it helped you out.

  • @richardbloemenkamp8532
    @richardbloemenkamp8532 11 months ago

    Thanks for showing how to use all three methods. For the comparison part I think you could go a lot more in depth on the results. For most applications the results seem largely insufficient to me, but for some applications it is already fine.

  • @zaheerbeg4810
    @zaheerbeg4810 11 months ago

    Highly appreciated

  • @Vietnamcamping89
    @Vietnamcamping89 9 months ago

    Cool explanation, thank you

  • @MegaArti2000
    @MegaArti2000 6 months ago

    amazing libraries; thx for sharing

  • @linuxmill
    @linuxmill 1 year ago

    fantastic work!

  • @robmulla

    @robmulla

    1 year ago

    Thank you! Cheers!

  • @ashaykatrojwar880
    @ashaykatrojwar880 2 years ago

    Nice explanation Rob.

  • @robmulla

    @robmulla

    2 years ago

    Thanks Ashay for the feedback!

  • @rushdamansuri8545
    @rushdamansuri8545 1 year ago

    Thank you so much, your code removed my days of frustration.

  • @robmulla

    @robmulla

    Жыл бұрын

    Glad I could help!

  • @travezripley
    @travezripley 2 years ago

    I love that you had a “Chain of Strength” image!!!! Straight Edge Hardcore Lives! Youth Crew!

  • @robmulla

    @robmulla

    2 years ago

    Glad you liked it. To be honest I had never heard of them before you mentioned it. But checked it out!

  • @tusharniras
    @tusharniras 1 year ago

    Hey Rob, thank you for the video! This helped me a lot!!! To my Indian developers: I have tried these libraries for Indian languages, and `pytesseract` seems to be the winner for reading Marathi and Hindi.

  • @robmulla

    @robmulla

    1 year ago

    Thanks for watching and good to know about that library being best. Didn’t know it could do Hindi!

  • @crytex1747
    @crytex1747 1 year ago

    Great Video !

  • @robmulla

    @robmulla

    1 year ago

    Glad you enjoyed it

  • @ibraheem1224
    @ibraheem1224 1 year ago

    Which library would you recommend using to extract all handwritten text from like a doctors prescription or a diary page or something like that?

  • @nalinbranden
    @nalinbranden 1 year ago

    Hey Rob, Amazing content, thanks for making this. Can you suggest the best method to detect words from a printed text? also like to isolate a single word out of a paragraph. Keep up the good work!

  • @riansyahtohamba8215
    @riansyahtohamba8215 1 year ago

    Thanks rob!

  • @robmulla

    @robmulla

    1 year ago

    Thanks for watching!

  • @anshaaa320
    @anshaaa320 10 months ago

    Hello Rob. Can you please tell what you think about which one of these 3 will work best if I have to extract text from images of products of inventory? for example soft drinks pictures, chips, snacks, chocolates, tissue boxes, vanity products, etc.

  • @ruksharalam173
    @ruksharalam173 8 months ago

    For document files, how do easy ocr and keras ocr perform as compared to Tesseract?

  • @arthurart2402
    @arthurart2402 10 months ago

    It was helpful

  • @tigranmkrtchyan7346
    @tigranmkrtchyan7346 2 years ago

    Hey Rob, thanks a ton for the awesome job you do. I have learnt a lot of new cool stuff (I was only aware of pytesseract, thought it was the best one, and tried it on pictures); now I will definitely give the other libraries a try as well. As I already mentioned, overall you are doing an extremely great job. I just have 1 idea and 1 suggestion : )

    1) Idea: as you mentioned in the video, we have the annotations (the ground truth) already provided in the dataset, right? Wouldn't it be a good idea to check the results against the ground truth? E.g. something like: lib2 gets 3 out of 10 annotations correct (recall 0.3) out of 6 predicted texts (precision 0.5), plus something like accuracy, etc. I understand this is not perfect, as a library could extract the text only partially (miss 1-2 letters) or recognize the letter 'G' instead of 'C'. The question is: what's a fair way to get a numerical result based on the dataset? Say library 1 has accuracy X, precision Y and recall Z; based on these values and the confusion matrix, one could select a library for a particular dataset.

    2) Suggestion: I am mentioning this for the 3rd time already, but your videos are just marvellous: extremely informative, to the point, no second wasted at all. Just perfect. So what I would really love to see (as you asked for suggestions for a next video) is almost the same kind of video for some audio task. There is this new BLOOM model out today, I haven't checked it yet, but maybe you could pick some models (like ones based on wav2vec2) that recognize voice (ASR) and compare them using transformers and huggingface, for example? It would really be nice to see a comparison of different models on some audio-related task, where the whole pipeline (loading audio, extracting the numerical features, feeding them to some pretrained model, and finally prediction) would be implemented and validated based on, say, word error rate. Thanks in advance ; )

  • @robmulla

    @robmulla

    2 years ago

    Great comment. Wow! Lots to respond to. For #1 that’s a great idea and I had considered doing a more formal evaluation metric to compare the models but ended up deciding it would be too much for this video. I think most of these models are released with their metrics on similar datasets. For #2 I’ll have to check out the model you are referring to; I’ve never heard of it before but will read up. I was considering making a video on the new language translation model that Meta released last week. Thanks for watching and I appreciate the feedback!
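
The evaluation idea raised in this thread (compare each library's predicted words against the dataset annotations) could be sketched roughly as below. This is only an illustration, not the method used in the video or notebook: the function names `normalize` and `word_precision_recall` are hypothetical, and matching is done on exact lowercase words, so a partially correct word ('G' read as 'C') counts as a miss.

```python
from collections import Counter


def normalize(word):
    """Lowercase and trim a word so that 'STOP ' matches 'stop'."""
    return word.strip().lower()


def word_precision_recall(predicted, ground_truth):
    """Word-level precision and recall for one image.

    predicted:    words returned by an OCR library (hypothetical input)
    ground_truth: annotated words for the same image
    Repeated words are counted, so two 'the' predictions only match
    two 'the' annotations.
    """
    pred = Counter(normalize(w) for w in predicted if w.strip())
    truth = Counter(normalize(w) for w in ground_truth if w.strip())
    matched = sum((pred & truth).values())  # multiset intersection
    precision = matched / sum(pred.values()) if pred else 0.0
    recall = matched / sum(truth.values()) if truth else 0.0
    return precision, recall
```

With 10 annotated words and 6 predictions of which 3 match, this gives precision 0.5 and recall 0.3, the numbers used in the comment above. A softer score based on edit distance would be needed to give credit for near misses.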

  • @YounessNet
    @YounessNet 1 year ago

    Hi Rob, a very informative video, thank you! After we extract the data, what is the best solution to save it into a table without missing the context of future images or PDFs? For example, if I save the customer's first name from the first file, it should be easier to detect it in the next file and save it under the same column.

  • @FrancescoZaccaria-uv2zm
    @FrancescoZaccaria-uv2zm 1 year ago

    Great video! I was asking myself if I could set "math" as a language. I'm trying to get a prediction of numbers and text together. Could I use one of these libraries, or do I have to use a totally different library for handwritten numbers?

  • @josephmeyer6107
    @josephmeyer6107 5 months ago

    Great video, thanks. Which one do you like the best for invoices and business documents. any for handwriting

  • @xoblm4938
    @xoblm4938 2 years ago

    Hi Medallion, do you know how to improve the accuracy of the easyocr ? I have a image , like a dataframe image, it can only read 95% of the data in the image. Thanks in advance

  • @robmulla

    @robmulla

    2 years ago

    Great question. I’m not sure how you could improve it without training on an additional set of labeled data. You could attempt to use different models and somehow merge the results.

  • @4notheruser450
    @4notheruser450 1 year ago

    good job thanks

  • @robmulla

    @robmulla

    1 year ago

    Thanks for watching and giving feedback!

  • @vesper8
    @vesper8 5 months ago

    Great video but man I would have loved to see another half hour where you compare the results to the original annotations, give a score to each technique, and then try to make improvements to get the score to increase

  • @gustavojuantorena
    @gustavojuantorena 2 years ago

    Nice tutorial! I have only tried pytesseract in the past, with PDF files. As a future video recommendation, I would like to see something about how you organize your ML experiments. In the competitions, do you use some tool to try multiple models and do hyperparameter tuning?

  • @robmulla

    @robmulla

    2 years ago

    Yes, I think pytesseract is really good for documents, but not so much for extracting text from images. Great recommendation for future videos. There are a lot of great resources for organizing ML experiments these days. I used to just append all my experiment results to a CSV but now things like Weights & Biases or Neptune.ai make these things a lot easier. I'll have to think about how to make a video for that....

  • @asdasdasasdasdas9073
    @asdasdasasdasdas9073 8 months ago

    Hello, do you mind telling which GPU you used for this test ?

  • @rahulsrivastav7014
    @rahulsrivastav7014 1 year ago

    awesome lets try this.......

  • @robmulla

    @robmulla

    1 year ago

    Have fun

  • @cs.mohitmakhija4230
    @cs.mohitmakhija4230 2 months ago

    How to extract select data values for example an invoice image into a template seeking selected values from that image

  • @notmyname6452
    @notmyname6452 3 months ago

    hi, i'm trying to find a way to process bulk ai generated art, and flag any images that contain even the tiniest watermark or text. I fumbled around with some OCR previously in python (don't remember what i used) but it wasn't very close to what i needed it to do considering the text is often just random scribbles or gibberish. any chance you could point me in the right direction to a possible solution? thanks.

  • @seumptrust
    @seumptrust 1 year ago

    Hi Rob, could you do a video on extracting text from the Crosswords puzzle, compare it to the correct answers image, and show the results?

  • @maggiezhang145
    @maggiezhang145 1 year ago

    Thank you Rob for another great tutorial! I've been following along for one of my OCR project. Qq -- when I tried to pip install keras-ocr in Kaggle. I kept getting error message of "ERROR: Could not find a version that satisfies the requirement keras-ocr (from versions: none)". Do you happen to know why? Thanks!

  • @robmulla

    @robmulla

    1 year ago

    Oh no. Good question but I don't know the answer, check the keras-ocr github to make sure there isn't an issue with the latest release maybe?

  • @Levy957
    @Levy957 2 years ago

    ur amazing

  • @robmulla

    @robmulla

    2 years ago

    Thanks Levy. Hope you found the video helpful.

  • @korescoworld6118
    @korescoworld6118 20 days ago

    Wow, nice work... My question is: in a situation where there is a lot of text in the image, is it possible for me to get just one word? Can it work by automatically capturing and converting to text?

  • @TugimanS
    @TugimanS 1 year ago

    thats cool, but why keras ocr result doesn't use any capitalization?

  • @sanamrajchaudhary4513
    @sanamrajchaudhary4513 1 year ago

    Can you please make a video on how to prepare our own custom dataset with annotations, and how to use such a dataset to train a pretrained pytesseract, easyocr, or keras_ocr model?

  • @muhammadsyaukitarmizi9184
    @muhammadsyaukitarmizi9184 1 year ago

    I have a problem: DataFrame.__init__() got an unexpected keyword argument 'coloumn'. What should I do?

  • @robmulla

    @robmulla

    1 year ago

    I think you spelled "columns" wrong, you said "coloumn"

  • @muhammadsyaukitarmizi9184

    @muhammadsyaukitarmizi9184

    1 year ago

    @@robmulla thanks Sir. Then, how do I fix the error (-215:Assertion failed) !_src.empty() in function 'cv::filter2D'?

  • @Udayanverma
    @Udayanverma 1 year ago

    Keras takes 3 secs on rtx 4070. I have like 100k frames. How to make it faster?

  • @giriprasath6030
    @giriprasath6030 1 year ago

    Hello sir, I tried to install keras-ocr through pip in Kaggle, but it won't install and throws an error saying: "ERROR: Could not find a version that satisfies the requirement keras-ocr (from versions: none) ERROR: No matching distribution found for keras-ocr". What should I do about this?

  • @anishvikramvarma9952

    @anishvikramvarma9952

    8 months ago

    Hey, even I am facing the same error. Did you find any solution to solve that?

  • @giriprasath6030

    @giriprasath6030

    8 months ago

    @@anishvikramvarma9952 no I didnt find any. Try it in anaconda maybe

  • @dandyiy
    @dandyiy 1 year ago

    what is the best for form validation and for high performance, i want to use it with 100K form per day ?

  • @robmulla

    @robmulla

    1 year ago

    Depends. Validation do you mean inference? It depends on your hardware. GPUs can make things faster.

  • @dandyiy

    @dandyiy

    1 year ago

    @@robmulla yes i want to check if name or address are not empty and maybe extract form and value in object.

  • @TechDemocracy_ArturGorczynski
    @TechDemocracy_ArturGorczynski 1 year ago

    Hey! I wanted to ask how you add yourself to the video without a background?

  • @robmulla

    @robmulla

    1 year ago

    Thanks for watching. I’m using a green screen during the tutorials and OBS has filters that filter out the green.

  • @TechDemocracy_ArturGorczynski

    @TechDemocracy_ArturGorczynski

    1 year ago

    @@robmulla Big thanks for both film, and answer :)

  • @ikubaru03s66
    @ikubaru03s66 1 year ago

    Hi, can you explain this in ReactJS or JavaScript?

  • @robmulla

    @robmulla

    1 year ago

    No I can’t. I don’t know those languages, sorry.

  • @vasupatel7013
    @vasupatel7013 1 year ago

    Hey, can you please make something that can identify how many pages in a PDF have images and how many pages are non-image? Thanks in advance. Or at least guide me through the process of doing so.

  • @user-ek4ck8os3k
    @user-ek4ck8os3k 1 year ago

    What do I do if I have png images? glob doesn't work for that?

  • @robmulla

    @robmulla

    1 year ago

    Glob is just used to identify all the complete file paths. cv2 should be able to read png files the same as jpeg.
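
As a rough illustration of the reply above: glob only matches file names against a pattern, so picking up png files is a matter of adding another pattern. The helper name `list_image_paths` and the default extension list are assumptions made for this sketch, not code from the notebook.

```python
import os
from glob import glob


def list_image_paths(folder, extensions=("jpg", "jpeg", "png")):
    """Collect the full paths of all files in `folder` with the given extensions."""
    paths = []
    for ext in extensions:
        # One glob pattern per extension, e.g. "<folder>/*.png"
        paths.extend(glob(os.path.join(folder, f"*.{ext}")))
    return sorted(paths)


# Each returned path can then be read the same way regardless of format,
# for example with cv2.imread(path) if OpenCV is installed.
```

Note that the match is case-sensitive on Linux, so files named `.PNG` would need their own entry in `extensions`.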

  • @poojabhandari631
    @poojabhandari631 1 year ago

    What code editor you are using

  • @robmulla

    @robmulla

    1 year ago

    This is a Kaggle notebook. The link is in the description.

  • @MehdiRS6
    @MehdiRS6 5 months ago

    Hello Rob, I saw your video on text detection from images, and I have some questions. If you can help me, please, I appreciate it. Thank you.

  • @srikanthkoltur6911
    @srikanthkoltur6911 1 year ago

    Can you make a video about custom training on our own datasets with EasyOCR and especially keras_ocr? That would be helpful.

  • @robmulla

    @robmulla

    1 year ago

    Thanks. I’ll see what I can do. The hardest part is finding good training dataset to use! If it exists there is probably a pretrained model that exists.

  • @srikanthkoltur6911

    @srikanthkoltur6911

    1 year ago

    I am trying to mess with japanese data which is cc100 datasets U can find manga_ocr that's pretty good but i think if somehow we use that data we can make better in EasyOCR for handwriting

  • @mukhtarbimurat5106
    @mukhtarbimurat5106 1 year ago

    thanks for video! what about paddle ocr?

  • @robmulla

    @robmulla

    1 year ago

    Thanks for watching. Never heard of that. Is it any good?

  • @mukhtarbimurat5106

    @mukhtarbimurat5106

    1 year ago

    @@robmulla I heard that it's faster than easyocr. It also has several features: text detection, text identification, document identification (like tables, ...), and support for mobile and embedded devices, and it looks like it's easy to fine-tune (for a new language). I'm going to use it and will share my experience.

  • @tommasoseneca9189
    @tommasoseneca9189 2 years ago

    Hi, I'm a new sub here! I've tried many times to decode a datamatrix code in order to get the text string represented by that code but my Python script seems not able to decode it... I use opencv, libdmtx and pil but nothing... Probably once the script opens the picture and tries to find the code (the pattern) in the image nothing occurs, and it happens well before the decoding effort that should come right after a pattern has been recognized... Please help 😅 Thanks!

  • @robmulla

    @robmulla

    2 years ago

    Hey. Thanks for subbing. I’m not sure what you mean by data matrix. Are you able to first convert the file to something like a jpg first? That’s what I’d suggest.

  • @tommasoseneca9189

    @tommasoseneca9189

    2 years ago

    @@robmulla So instead of opening it as a png image, you suggest first converting it into a jpg image and then using the decode command... Working with a png image and then using decode generally gives me an empty output []... Thanks! P.S. For the sake of clarity, my png image is a photo of an electronic device with a datamatrix printed on it

  • @anilsharma32g
    @anilsharma32g 9 months ago

    Dear Sir, I am your Subscriber I want to create a tool that finds text errors in the image. For Example: I forgot to write CONTACT US, BUY NOW, CONTACT NUMBER, SPELLING MISTAKE, etc... in my social media post. that the tool finds error and suggests what are missing or what is incorrect in social media post. 🙏 Please guide me and suggest what course I need to buy or what I need to learn to create this tool Thank you

  • @yokevideo
    @yokevideo 1 year ago

    Could you show how to OCR hand writing text

  • @robmulla

    @robmulla

    1 year ago

    Hi! This is a hard task and depends on the handwriting and language. Unfortunately I haven’t looked into to it much! Good luck.

  • @Helloch3421
    @Helloch3421 4 months ago

    Please help keras-ocr showing error while installing 😢😢😢 Error: couldn't find a version that satisfies the requirement Error: no matching distribution found for keras-ocr

  • @davidhugenberg2389

    @davidhugenberg2389

    1 month ago

    Verify your Kaggle profile with your phone number (one-time key/code). Then copy the notebook again, and in the session options (right side of the notebook) you will see that "internet is on". This will also let you use GPU resources.

  • @Helloch3421

    @Helloch3421

    1 month ago

    @@davidhugenberg2389 thx bro❤️

  • @StepanSkladanovskii
    @StepanSkladanovskii 1 year ago

    is it possible to get a text from a video using this method?

  • @robmulla

    @robmulla

    1 year ago

    Great question. It should be possible. I have a YouTube video about working with video data. Each frame is essentially just an image that you could apply these techniques to. Check it out!
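
A minimal sketch of the frame-by-frame idea from this reply, assuming OpenCV (`cv2`) is installed. The helper names `frame_indices` and `iter_sampled_frames` and the one-frame-per-second default are hypothetical choices for illustration, not code from the video; running OCR on every single frame would usually be far slower than needed.

```python
def frame_indices(total_frames, fps, every_seconds=1.0):
    """Return the indices of frames spaced roughly `every_seconds` apart."""
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def iter_sampled_frames(video_path, every_seconds=1.0):
    """Yield (index, frame) pairs; each frame is an image array."""
    import cv2  # imported here so frame_indices works without OpenCV installed

    cap = cv2.VideoCapture(video_path)
    try:
        fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # assumed fallback if FPS is unknown
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        for idx in frame_indices(total, fps, every_seconds):
            cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # jump to the sampled frame
            ok, frame = cap.read()
            if not ok:
                break
            yield idx, frame
    finally:
        cap.release()
```

Each yielded frame can be handed to an OCR library in the same way as an image loaded from disk. OpenCV returns frames in BGR channel order, which may matter for models that expect RGB.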

  • @semireddy5108
    @semireddy5108 2 months ago

    is there anyone have an idea how to extract table data from image by maintaining the table format

  • @eightbo

    @eightbo

    1 month ago

    Hi, did you ever figure this out?? I want to extract only column 4 of the table and output to .txt file, any idea how? 😅

  • @fahmidanial
    @fahmidanial 1 year ago

    Can I run this on Google colab?

  • @robmulla

    @robmulla

    1 year ago

    You should be able to, although I haven't tried it. Let me know if you get it working!

  • @fahmidanial

    @fahmidanial

    1 year ago

    @@robmulla thanks for the reply!

  • @Lolatyou332
    @Lolatyou332 11 months ago

    I was trying to use EasyOCR to accelerate some software of mine by using the GPU instead of pytesseract, but the results are absolute dog shit on obviously readable text. Pytesseract is getting it word for word, yet EasyOCR can barely get even a couple of the letters correct. Pretty disappointing, wish pytesseract was GPU enabled from the start.

  • @guocity
    @guocity 3 months ago

    they can't extract text in 90 degree

  • @luvjain4145
    @luvjain4145 4 months ago

    can u give source code please ?

  • @robmulla

    @robmulla

    4 months ago

    All the code can be found here (also in the video description): www.kaggle.com/code/robikscube/extracting-text-from-images-youtube-tutorial

  • @samannwaysil4412
    @samannwaysil4412 11 months ago

    hi

  • @marcusvincent3023
    @marcusvincent3023 1 year ago

    Does anyone else feel completely stupid watching this guy fly through this?! 🤣

  • @robmulla

    @robmulla

    1 year ago

    We all start somewhere- also I edit out all the bad parts :D