Transcription and speaker identification with OpenAI Whisper and Pyannote [Python]

Hello guys, in this video I will show you how to transcribe audio and identify the speakers using OpenAI Whisper, Pyannote and Pydub.
For Pyannote you must register on the Hugging Face website to get an access token.
Support me by subscribing to my channel and leaving a like.
GitHub repository for the source code:
github.com/Mastering-Python-G...
OpenAI Whisper GitHub link:
github.com/openai/whisper
Pyannote GitHub link:
github.com/pyannote/pyannote-...
Pydub GitHub link:
github.com/jiaaro/pydub
#openai
#openai_whisper
#pyannote
#pydub
#python
#speaker_identification
#transcription
#diarization
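
A minimal sketch of the kind of pipeline the description refers to, not the repository's actual code. Whisper produces timed text segments, pyannote produces speaker turns, and each segment is labelled with the speaker whose turn overlaps it most. The function names, the "pyannote/speaker-diarization" checkpoint name and the use_auth_token keyword are assumptions (the keyword is believed to match pyannote.audio 2.x/3.x; newer releases may differ). The overlap matching is a swapped-in technique for illustration; the video itself may cut the audio with Pydub instead.

```python
def diarize(audio_path, hf_token):
    """Return speaker turns as (start_s, end_s, speaker) tuples."""
    from pyannote.audio import Pipeline  # heavy import kept local

    # Assumption: checkpoint name and keyword as in pyannote.audio 2.x/3.x.
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization", use_auth_token=hf_token
    )
    annotation = pipeline(audio_path)
    return [
        (turn.start, turn.end, speaker)
        for turn, _, speaker in annotation.itertracks(yield_label=True)
    ]


def transcribe(audio_path, model_name="small.en"):
    """Return Whisper segments as (start_s, end_s, text) tuples."""
    import whisper  # heavy import kept local

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return [(s["start"], s["end"], s["text"].strip()) for s in result["segments"]]


def label_segments(segments, turns):
    """Attach to each text segment the speaker with the largest time overlap."""
    labelled = []
    for seg_start, seg_end, text in segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn_start, turn_end, speaker in turns:
            overlap = min(seg_end, turn_end) - max(seg_start, turn_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labelled.append((best_speaker, text))
    return labelled
```

With real audio, the three calls would be chained as label_segments(transcribe(path), diarize(path, token)), where token is the Hugging Face access token mentioned above.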

Comments: 36

  • @Yacine_zaki_abderrazzak (1 year ago)

    Thanks man, you deserve the best

  • @bootneck2222 (8 months ago)

    Great video. Thank you. Can the output be displayed on screen whilst it is processing?

  • @positivevibe142 (6 months ago)

    Masha'Allah, God bless you, brother Mohammed... Thank you.

  • @chungrandy780 (3 months ago)

    Is there a colab version?

  • @hrishikeshnamboothiri.v.n2195 (8 months ago)

    Please try to include a requirements.txt as well... Thanks

  • @leoncezammit2502 (6 months ago)

    I'm really struggling to get this working. Would I be able to send you my output log?

  • @ryanschwartz3340 (10 months ago)

    Nice video. Is the repo hard-coded to your directory structure? When I tried to change it, it said the format wasn't recognized.

  • @masteringpython (10 months ago)

    Do you mean the segment file?

  • @ThePikkutyyppi (10 months ago)

    Can I use this program to split speakers into their own files, or is this only for transcription?

  • @masteringpython (10 months ago)

    Read more about pyannote to see how to split speakers.

  • @ThePikkutyyppi (10 months ago)

    @masteringpython What? Where?
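
One possible way to split speakers into their own files, sketched under assumptions: the speaker turns are already available as (start_ms, end_ms, speaker) tuples (for example, parsed from the pyannote output), and Pydub does the cutting. The function names and the output naming scheme are hypothetical and not taken from the video.

```python
import os
from collections import defaultdict


def group_by_speaker(turns_ms):
    """Map each speaker label to its list of (start_ms, end_ms) spans."""
    grouped = defaultdict(list)
    for start, end, speaker in turns_ms:
        grouped[speaker].append((start, end))
    return dict(grouped)


def export_speakers(audio_path, turns_ms, out_dir="."):
    """Write one WAV per speaker containing only that speaker's turns."""
    from pydub import AudioSegment  # pydub needs ffmpeg for non-WAV input

    audio = AudioSegment.from_file(audio_path)
    for speaker, spans in group_by_speaker(turns_ms).items():
        track = AudioSegment.empty()
        for start, end in spans:
            track += audio[start:end]  # pydub slices are in milliseconds
        track.export(os.path.join(out_dir, f"{speaker}.wav"), format="wav")
```

Overlapping speech is not handled here: if two turns overlap, the shared audio ends up in both files.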

  • @lawrencemedina5593 (8 months ago)

    "conda activate open_chatting" does not work on my computer. I get: "EnvironmentNameNotFound: Could not find conda environment: open_chatting You can list all discoverable environments with `conda info --envs`."

  • @masteringpython (8 months ago)

    Install Conda, then create an environment called open_chatting by typing: conda create --name open_chatting. After that, install the libraries that I mentioned in the video, then run the code.

  • @user-iu8le1pl3x (6 months ago)

    Hi, thanks for the video. I need an approach for applying this solution to a large audio file with a duration of 3 hours.

  • @KamilKaczmarekSolutions (6 months ago)

    chunks

  • @KamilKaczmarekSolutions (6 months ago)

    Use chunks and save the .txt output of each chunk to a file. Add logic to check which chunks are already done, so that if you hit an error or something similar and come back, you don't have to start over and can just continue where it left off.
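
The chunk-and-resume idea above could be sketched as follows. This is an illustration under assumptions, not code from the video: the chunk file naming is hypothetical, and transcribe_chunk stands for any callable that takes (start_ms, end_ms) and returns text, for example one that cuts the audio with Pydub and passes the piece to Whisper.

```python
import os


def chunk_bounds(total_ms, chunk_ms):
    """Split [0, total_ms) into consecutive (start_ms, end_ms) chunks."""
    return [(s, min(s + chunk_ms, total_ms)) for s in range(0, total_ms, chunk_ms)]


def transcribe_in_chunks(total_ms, chunk_ms, out_dir, transcribe_chunk):
    """Transcribe chunk by chunk, skipping chunks that already have a .txt file.

    Returns the indices of the chunks processed during this run.
    """
    os.makedirs(out_dir, exist_ok=True)
    processed = []
    for index, (start, end) in enumerate(chunk_bounds(total_ms, chunk_ms)):
        txt_path = os.path.join(out_dir, f"chunk_{index:04d}.txt")
        if os.path.exists(txt_path):
            continue  # finished in an earlier run, so resume past it
        text = transcribe_chunk(start, end)
        # Write to a temporary file first so a crash never leaves a
        # half-written .txt that would be mistaken for a finished chunk.
        tmp_path = txt_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            fh.write(text)
        os.replace(tmp_path, txt_path)
        processed.append(index)
    return processed
```

Fixed-length chunks can cut a word or a speaker turn in half; cutting at diarization turn boundaries instead would avoid that at the cost of uneven chunk sizes.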

  • @kmillanr (3 days ago)

    no code in video

  • @user-ej4ol8zv9y (11 months ago)

    Does this model work on languages other than English?

  • @masteringpython (10 months ago)

    Only English.

  • @PaweDuzy (4 months ago)

    @masteringpython Only English? What if I change model = whisper.load_model("small.en") to "small", according to the Whisper GitHub documentation?
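
For reference, the Whisper README lists English-only checkpoints with an ".en" suffix (tiny.en, base.en, small.en, medium.en) next to multilingual ones without it. A small sketch of the switch suggested above; the helper names are hypothetical, and this only covers the Whisper side, not how pyannote behaves on other languages.

```python
# Sizes that the Whisper README lists with an English-only ".en" variant.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def whisper_model_name(size="small", english_only=False):
    """Return the checkpoint name for the multilingual or English-only model."""
    if english_only:
        if size not in ENGLISH_ONLY_SIZES:
            raise ValueError(f"no English-only variant for size {size!r}")
        return size + ".en"
    return size


def transcribe_any_language(audio_path, size="small", language=None):
    """Transcribe with a multilingual model; language=None lets Whisper detect it."""
    import whisper  # heavy import kept local

    model = whisper.load_model(whisper_model_name(size))
    return model.transcribe(audio_path, language=language)["text"]
```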

  • @WhiteShark010 (15 days ago)

    You have chance.

  • @Hirotodoroki (1 year ago)

    Trying to run this but getting "File contains data in an unknown format." I tried several files, including a WAV file, but no luck.

  • @masteringpython (1 year ago)

    I advise you to use Anaconda to create a development environment. Then install OpenAI Whisper, and after installing it run a simple test to check that everything works correctly. Then install the pyannote library and run a simple test for it as well. Read the installation guides carefully; maybe you missed something while installing the libraries.

  • @nadeembaig5943 (22 days ago)

    @Hirotodoroki Were you able to resolve the error ("File contains data in an unknown format")?
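
Nobody in this thread confirmed a fix. One thing that may be worth trying, as an assumption rather than a known solution, is to re-encode the input to a plain mono 16 kHz WAV with Pydub before handing it to the pipeline, since messages like this usually come from the audio decoding layer. The helper names and the output file suffix below are hypothetical.

```python
import os


def wav_path_for(path):
    """Derive an output path that never overwrites the input file."""
    root, _ = os.path.splitext(path)
    return root + "_mono16k.wav"


def to_wav(path, sample_rate=16000):
    """Re-encode any ffmpeg-readable file as mono WAV and return the new path."""
    from pydub import AudioSegment  # pydub relies on ffmpeg for MP3 and similar

    audio = AudioSegment.from_file(path)
    audio = audio.set_frame_rate(sample_rate).set_channels(1)
    out_path = wav_path_for(path)
    audio.export(out_path, format="wav")
    return out_path
```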

  • @user-zz3iv1qz6v (1 year ago)

    Thanks for the demo. I am getting the following error, even when using your audio.mp3 file:

        end = int(millisec(j[3]))
        return (int)((int(spl[0]) * 60 * 60 + int(spl[1]) * 60 + float(spl[2])) * 1000)
        ValueError: invalid literal for int() with base 10: ''

  • @user-zz3iv1qz6v (1 year ago)

    @mamido mami Yes, I did that, still getting the same error

  • @auflute (1 year ago)

    same problem

  • @user-uy7fc3sf8x (1 year ago)

    same problem

  • @jbatista2008 (10 months ago)

    From the error message and the code, it seems the error happens because the millisec function is trying to convert an empty string to an integer. The millisec function splits a time string in the format "hh:mm:ss.sss" into hours, minutes, and seconds, and then converts these components to milliseconds.

    Here is an example of the string being parsed, after the split:

        ['[', '00:00:00.998', '-->', '', '00:00:20.622]', 'G', 'SPEAKER_01']

    When this loop runs, it gets an empty 'end' string:

        for l in range(len(k)):
            j = k[l].split(" ")
            start = int(millisec(j[1]))
            end = int(millisec(j[3]))

    The array position you want for 'end' is 4, not 3. Plus, it has a ']' symbol, so it must be cleaned up:

        for l in range(len(k)):
            j = k[l].split(" ")
            start = int(millisec(j[1].rstrip(']')))  # remove trailing ']'
            end = int(millisec(j[4].rstrip(']')))  # remove trailing ']'
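
An alternative sketch that avoids depending on list positions at all: pull the two timestamps out of the line with a regular expression and take the last whitespace-separated token as the speaker. The line format is assumed from the example above; parse_turn is a hypothetical helper, and this millisec is a rewrite for illustration (it rounds instead of truncating), not the repository's function.

```python
import re

# Matches timestamps such as 00:00:20.622 (fractional part optional).
TIMESTAMP = re.compile(r"\d+:\d{2}:\d{2}(?:\.\d+)?")


def millisec(time_str):
    """Convert "hh:mm:ss.sss" to whole milliseconds."""
    hours, minutes, seconds = time_str.split(":")
    total_seconds = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    # round() guards against float results such as 997.9999999 being cut to 997.
    return int(round(total_seconds * 1000))


def parse_turn(line):
    """Parse one diarization line into (start_ms, end_ms, speaker)."""
    stamps = TIMESTAMP.findall(line)
    if len(stamps) != 2:
        raise ValueError(f"expected two timestamps in {line!r}")
    start, end = (millisec(s) for s in stamps)
    speaker = line.split()[-1]  # assumes the speaker label is the last token
    return start, end, speaker
```

Because the regular expression ignores brackets and extra spaces, the double space after "-->" and the trailing "]" no longer matter.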

  • @enriqueleonmacias249 (9 months ago)

    Wow, the transcript takes about twice the duration of the file to process. I guess this solution wouldn't work for monitoring hours of call recordings unless you use GPU servers.

  • @masteringpython (9 months ago)

    It is recommended to use CUDA (an NVIDIA GPU) for speed; the CPU is very slow.

  • @patoyrigoyen (11 months ago)

    Does this need a GPU?

  • @masteringpython (11 months ago)

    In this video I did not use a GPU, but if you want to use one, read the pyannote documentation.
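
A hedged sketch of what GPU use could look like, assuming PyTorch with CUDA support is installed. The device argument of whisper.load_model is part of Whisper's API; the checkpoint name, the use_auth_token keyword and the pipeline.to(...) call are assumptions based on pyannote.audio 2.x/3.x and may differ in other versions. The function names are hypothetical.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_models(hf_token, whisper_size="small.en"):
    """Load Whisper and the pyannote pipeline on the best available device."""
    import torch
    import whisper
    from pyannote.audio import Pipeline

    device = pick_device()
    model = whisper.load_model(whisper_size, device=device)
    # Assumption: checkpoint name and keyword as in pyannote.audio 2.x/3.x.
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization", use_auth_token=hf_token
    )
    pipeline.to(torch.device(device))
    return model, pipeline
```

On a machine without a GPU this falls back to the CPU, which matches the setup used in the video.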

  • @ghulamshabbir9532 (8 months ago)

    Does this work offline?

  • @masteringpython (8 months ago)

    yes