Extract text from a Microsoft Word document using Python

Data science often depends on information stored in Office file formats.
Course materials for the Spring 2020 semester are available at
github.com/umbcdata601/spring...
and
most.oercommons.org/coursewar...
The Fall 2019 materials are at
most.oercommons.org/coursewar...
Those are an update to the Spring 2019 materials:
most.oercommons.org/coursewar...
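
As a rough illustration of the topic (not necessarily the approach taken in the video): a .docx file is a ZIP archive whose main text is stored in `word/document.xml`, so the Python standard library is enough for plain-text extraction. The helper name `docx_paragraphs` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p (paragraph) and w:t (text).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(path):
    """Return the text of each paragraph in a .docx file as a list of strings."""
    with zipfile.ZipFile(path) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # iter() walks all descendants, so paragraphs inside tables are included too.
    for p in root.iter(f"{{{W_NS}}}p"):
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Calling `docx_paragraphs("report.docx")` (the file name is a placeholder) returns one string per paragraph, with empty strings for blank paragraphs. Headers, footers, and footnotes live in other parts of the archive and are not read by this sketch.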

Comments: 22

  • @ulicesmoravaldovinos
    22 days ago

    Being able to teach complex topics in such straightforward language and at such a pace is a talent. Thank you for sharing your talents to grow and teach the community of newcomers.

  • @samueldennis6501
    1 year ago

    Great job. I am an absolute novice self-starting with Python, and this provided incredible insight.

  • @chibuikeewenike
    1 month ago

    This doesn't handle .doc files. It only works for .docx files. What package can handle .doc files?
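
The thread does not answer this question. The legacy .doc format is a binary format rather than a ZIP archive, so one possible workaround is to convert the file to .docx first. The sketch below assumes LibreOffice is installed and its `soffice` executable is on the PATH; the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_convert_command(doc_path, out_dir):
    """Build the LibreOffice command line that converts one file to .docx."""
    return [
        "soffice", "--headless",
        "--convert-to", "docx",
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def convert_doc_to_docx(doc_path, out_dir):
    """Run the conversion (assumes soffice is installed) and return the expected output path."""
    subprocess.run(build_convert_command(doc_path, out_dir), check=True)
    return Path(out_dir) / (Path(doc_path).stem + ".docx")
```

The converted file can then be read like any other .docx file.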

  • @TheSuperUser
    2 years ago

    Thank you for a very informative video! I really like the way you build up the solution step by step which is more realistic and has better learning value.

  • @andreajulius2711
    3 years ago

    WOW, this video is EXACTLY what I was looking for! Thank you sooooooo much! Awesome work! Keep it up!

  • @applieddatascience3424
    3 years ago

    You're welcome. I've posted a link to the course materials in the video description.

  • @andreajulius2711
    3 years ago

    @applieddatascience3424 Fantastic! Thanks a lot! 😎

  • @canaltrabalhointeligente
    2 years ago

    Hello, your video is awesome!! Helped me a lot :)

  • @edwinportal4125
    2 years ago

    Thanks for the tutorial

  • @doctari1061
    2 years ago

    Very nice. Thanks. I'm just starting with Python, coming from older C languages. I see they utilize dictionaries instead of arrays. Your video helped me start processing some of the differences I'll need to deal with. It is also going to take time to process that I no longer need to strongly type my variables or use curly braces. Interesting.

  • @somekilgoretrout
    2 years ago

    You're welcome. For typing in Python, take a look at type hints with mypy. They aren't enforced, but they can help in some cases.
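
A small sketch of what "not enforced" means in practice; the function `double` is made up for illustration.

```python
def double(x: int) -> int:
    """Return x * 2. The annotations document intent but are not checked at runtime."""
    return x * 2


# Python runs this call without complaint even though a str is passed;
# a static checker such as mypy would report the argument type mismatch.
result = double("ab")  # result is "abab"
```

Running a checker such as mypy over the file is what surfaces the mismatch; the interpreter itself ignores the hints.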

  • @sloperspinches3122
    3 years ago

    Thanks for the tutorial. Just curious, is it possible to extract the text from a Word document into a pandas DataFrame? Of course, there will be some data cleaning/preprocessing involved. I have an interview transcript document on which I'm trying to do some automatic text summarization and sentiment analysis between an interviewer and interviewee(s).
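
One possible way to approach this, sketched under a strong assumption: that each turn in the transcript starts with a `Speaker: utterance` paragraph. The layout of the actual transcript is not described in the comment, and `transcript_rows` is a hypothetical helper.

```python
def transcript_rows(paragraphs):
    """Turn paragraph strings into row dicts with 'speaker' and 'text' keys.

    Assumes each turn starts with 'Speaker: utterance'. A paragraph without a
    colon is treated as a continuation of the previous turn, so a continuation
    that happens to contain a colon would be misread as a new speaker.
    """
    rows = []
    for text in paragraphs:
        speaker, sep, utterance = text.partition(":")
        if sep and utterance.strip():
            rows.append({"speaker": speaker.strip(), "text": utterance.strip()})
        elif rows and text.strip():
            rows[-1]["text"] += " " + text.strip()
    return rows


# With pandas installed, the rows load directly into a DataFrame:
#   import pandas as pd
#   df = pd.DataFrame(transcript_rows(paragraphs))
```

Each row then holds one speaker turn, which is a convenient unit for per-speaker summarization or sentiment scoring.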

  • @jackzero5230
    2 years ago

    How do I read the font, text size, colour, etc. from docx files into an HTML object using Python?
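
This question is also unanswered in the thread. In the document XML, formatting applied directly to a run is stored in its `w:rPr` element: `w:rFonts` for the font, `w:sz` for the size in half-points, and `w:color` for the colour. The sketch below converts one run element to an HTML span; `run_to_html` is a hypothetical helper, and formatting inherited from paragraph or document styles is not resolved.

```python
import html
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def run_to_html(run):
    """Convert one w:r element to HTML, using only its direct formatting."""
    text = "".join(t.text or "" for t in run.iter(W + "t"))
    styles = []
    props = run.find(W + "rPr")
    if props is not None:
        fonts = props.find(W + "rFonts")
        if fonts is not None and fonts.get(W + "ascii"):
            styles.append("font-family: " + fonts.get(W + "ascii"))
        size = props.find(W + "sz")
        if size is not None and size.get(W + "val"):
            # w:sz is expressed in half-points, so 24 means 12pt.
            points = int(size.get(W + "val")) / 2
            styles.append(f"font-size: {points:g}pt")
        color = props.find(W + "color")
        if color is not None and color.get(W + "val") not in (None, "auto"):
            styles.append("color: #" + color.get(W + "val"))
    escaped = html.escape(text)
    if not styles:
        return escaped
    style_attr = "; ".join(styles)
    return f'<span style="{style_attr}">{escaped}</span>'
```

The run elements themselves can be obtained by parsing `word/document.xml` and iterating over its `w:r` elements.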

  • @TheSportify26
    2 years ago

    How do I extract text from a .doc file?

  • @trunglongng4237
    1 year ago

    How can I keep its format? Please help.

  • @annefernando7176
    3 years ago

    How to deal with txt files?

  • @kanishkmair2920
    4 years ago

    How to deal with very long docx files?

  • @applieddatascience3424
    4 years ago

    How is a long docx file different than a regular docx?

  • @shivrajshinde1738
    4 years ago

    How do I extract a table from Word using Python?

  • @applieddatascience3424
    3 years ago

    stackoverflow.com/questions/46618718/python-docx-to-extract-table-from-word-docx from www.google.com/search?q=extract+Table+from+Word+using+Python

  • @AmitGupta-ir3it
    3 years ago

    I'm getting empty text for row data that contains hyperlinks. How can I retrieve tables that contain hyperlinks in some of the rows? Please advise.
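
The thread leaves this unanswered. Hyperlink text is stored in runs nested inside a `w:hyperlink` element, so code that only looks at runs directly under a paragraph can miss it. A minimal sketch that reads the document XML directly and collects every `w:t` descendant of each cell, hyperlinks included, follows; `docx_tables` is a hypothetical helper.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_tables(path):
    """Return every table as a list of rows, each row a list of cell strings."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    tables = []
    # iter() also yields tables nested inside cells, as separate entries.
    for tbl in root.iter(W + "tbl"):
        rows = []
        for tr in tbl.findall(W + "tr"):
            cells = []
            for tc in tr.findall(W + "tc"):
                # Direct child paragraphs only, so nested tables are not mixed in.
                # p.iter() reaches w:t elements inside w:hyperlink as well.
                lines = [
                    "".join(t.text or "" for t in p.iter(W + "t"))
                    for p in tc.findall(W + "p")
                ]
                cells.append("\n".join(lines))
            rows.append(cells)
        tables.append(rows)
    return tables
```

This returns only the visible link text. The link targets are stored separately in the relationships part of the archive and are not resolved here.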