Voice to text with whisper

    I was looking for an application that could take an audio file, extract the speech, and write it to a text file. Whisper, from OpenAI, does this.
    Information about whisper, including installation instructions, is at https://github.com/openai/whisper.
    I have about a dozen Python projects in Jupyter Notebook, each with its own environment.
    Jupyter Notebook and a ton of modules are in the repository.

    If you install a Python module outside of a specific environment, it will install in your home account as a generally available module.
    Whisper is a big app, and I didn't want or need to install it separately in each of my projects, so from my home account I issued

    python3 -m pip install -U openai-whisper
    and a few minutes later it was installed without issues.
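
To check where a pip install is likely to land, a small sketch like the one below may help. It assumes the environments were made with venv or virtualenv; conda environments are not detected this way, and in_virtual_env is only an illustrative name.

```python
import site
import sys


def in_virtual_env():
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("inside an environment:", in_virtual_env())
# Per-user site-packages directory, typically somewhere under the home account.
print("per-user site-packages:", site.getusersitepackages())
```

If the first line prints False, pip is working outside any environment and will usually fall back to the per-user directory when it cannot write to the system one.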

    pip list
    showed whisper in my list of modules. It also showed several nvidia drivers. It seems that whisper is designed to use an nvidia GPU if it can find one, otherwise it defaults to using the CPU.
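
The GPU-or-CPU choice can be checked by hand. The sketch below asks PyTorch, which is installed as a whisper dependency, whether it can see a CUDA device. pick_device is an illustrative helper, not part of whisper, and falling back to "cpu" when PyTorch is missing is an assumption made here.

```python
def pick_device():
    """Return "cuda" when PyTorch can see an NVIDIA GPU, otherwise "cpu"."""
    try:
        import torch  # pulled in as a dependency of openai-whisper
    except ImportError:
        return "cpu"  # assumption: without PyTorch, just report the CPU
    return "cuda" if torch.cuda.is_available() else "cpu"


print("whisper would likely run on:", pick_device())
```

The whisper command also accepts --device, so passing --device cpu should force the CPU even when a GPU is present.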

    pip show openai-whisper
    showed the modules that whisper requires and those that require whisper.
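
Roughly the same information can be read from inside Python with the standard library's importlib.metadata (Python 3.8 or later). This is only a sketch: describe is an illustrative name, and unlike pip show it lists what the package requires, not what requires it.

```python
from importlib.metadata import PackageNotFoundError, requires, version


def describe(dist_name):
    """Return version and declared requirements, or None if not installed."""
    try:
        return {
            "version": version(dist_name),
            # requires() returns None when nothing is declared
            "requires": requires(dist_name) or [],
        }
    except PackageNotFoundError:
        return None


info = describe("openai-whisper")
print(info if info is not None else "openai-whisper is not installed here")
```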

    Then I tried it out on an mp3 file

    whisper /home/jerry/Videos/2016_12_24_8_christmas_service.mp3 --model medium --language English
    and a few minutes later I had the text of the voice on the mp3. It was 100% accurate.
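    After reading the documentation I discovered that specifying the language is optional: if --language is left out, whisper detects the language from the first 30 seconds of the audio. Also, "medium.en" is an English-only model. The documentation says the .en models tend to perform better on English audio, although the difference is less significant for small.en and medium.en.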
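
Since the projects here live in Jupyter Notebook, the same job can also be done from Python. The sketch below follows the usage shown in the whisper README (whisper.load_model and model.transcribe). The helper names transcript_path and transcribe_to_text are illustrative, and writing the .txt file next to the audio file is an assumption made for this example.

```python
from pathlib import Path


def transcript_path(audio_path):
    """Return the .txt path that sits next to the audio file."""
    return Path(audio_path).with_suffix(".txt")


def transcribe_to_text(audio_path, model_name="medium"):
    """Transcribe audio_path with whisper and save the text beside it."""
    import whisper  # imported here so the helper above works without it

    model = whisper.load_model(model_name)  # downloads the model on first use
    result = model.transcribe(str(audio_path))  # language is auto-detected
    out = transcript_path(audio_path)
    out.write_text(result["text"].strip() + "\n", encoding="utf-8")
    return out
```

Calling transcribe_to_text on an mp3 file should leave a .txt file with the same base name beside it. Passing "medium.en" as model_name selects the English-only model.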

    The full list of options is given by
    whisper --help
    Last edited by GreyGeek; Apr 11, 2023, 05:25 PM.
    "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people.”
    – John F. Kennedy, February 26, 1962.