Creating a cross-platform GUI for OpenAI's Whisper Part 5
GitHub link
Why use it?
The initial motivation for this repo was to have a GUI for driving Whisper, so that I can generate the initial video-to-text files used for transcripts on the kivyschool.com website. One good thing about this initial integration is that I can later switch models, improve the GUI, or even run it when I have no access to a terminal/cmd/powershell. Having complete control is really nice, and being able to use Kivy to rapidly prototype a GUI is invaluable (the app was built in 2 days; including all the time I spent documenting it, the total was 6 days). Later on I still need to integrate multiprocessing to push blocking code onto a Python subprocess.
What can you learn from this?
In the associated GitHub repo and YouTube playlist, you will learn:
- How to integrate Kivy and OpenAI's Whisper automatic speech recognition system.
- How to deal with various Poetry bugs very easily (assuming you followed the setup in the PREREQUISITE SETUP video for pyenv + python-poetry).
- How to integrate machine learning Python packages (Whisper itself depends on PyTorch) with Kivy and package them with PyInstaller.
- How to create and visualize a Kivy app, then easily build it.
- How to use and manipulate Kivy's FileChooser widget on desktop.
- How to get and set data on various Kivy widgets.
- How to manipulate Kivy popups.
- How to manipulate file paths with pathlib.
- How to package a complex Kivy app with PyInstaller
- See firsthand how to create an app from concept to completion.
- See the app at various stages by running the various midpointX.py files
AI Transcript provided by KivyWhisper
So hello and welcome to Kivy School. This is going to be part 5 of Kivy Whisper, which is: create an executable with PyInstaller. So how do you create a PyInstaller executable? Time to follow the Kivy School instructions on PyInstaller.
The first step is to poetry add pyinstaller, or you can pip install pyinstaller if you use pip. Now the problem is that Poetry basically says PyInstaller's Python requirement is not compatible with your project's Python requirement. So the answer is to do basic math and set Python to my current version. This is because my Python version in pyproject.toml is greater than 3.10, which means it can accept 3.14, which doesn't really exist as of the time of recording. But PyInstaller is asking for Python less than 3.13. So if I set Python to 3.10.9, it will satisfy PyInstaller's request. And all you have to do is go to pyproject.toml and fix it up. Right here, say Python 3.10.9 instead of greater than 3.10, and PyInstaller will install with Poetry. Nothing crazy so far. Now it's installed.
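The "basic math" here is a range check: every Python version the project allows must also be allowed by the dependency. Below is a minimal sketch of that idea, not Poetry's actual resolver. The function name fits_inside and the tuple encoding of ranges are made up for illustration, "greater than 3.10" is assumed to mean ">=3.10" with no upper bound, and only the "less than 3.13" bound mentioned above is modelled for PyInstaller.

```python
# Version ranges modelled as (inclusive lower bound, exclusive upper bound).
# None means "no bound on that side". This encoding is an illustration only.

def fits_inside(project_range, dependency_range):
    """Return True if every version the project allows is also allowed by the dependency."""
    p_low, p_high = project_range
    d_low, d_high = dependency_range
    low_ok = d_low is None or (p_low is not None and p_low >= d_low)
    high_ok = d_high is None or (p_high is not None and p_high <= d_high)
    return low_ok and high_ok


pyinstaller_range = (None, (3, 13))          # "<3.13", as quoted in the transcript
open_ended_project = ((3, 10), None)         # assumed ">=3.10": no upper bound
pinned_project = ((3, 10, 9), (3, 10, 10))   # "3.10.9" modelled as a one-release range

print(fits_inside(open_ended_project, pyinstaller_range))  # False: 3.14 would be allowed
print(fits_inside(pinned_project, pyinstaller_range))      # True: 3.10.9 is below 3.13
```

The open-ended range fails because it has no upper bound, which is why pinning the project to a single version makes the conflict go away.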
Next we will choose between one file and one directory. I like one file. It looks cooler. It's not as messy. So we run this command, right? python -m to run a module, and the module's name is PyInstaller. The argument is --onefile. The --name is KivyWhisper, which is going to be the name of the project. And then main.py, which is the actual Python file that I want to call. Next, we fix the spec file as per the instructions. We add the import from kivy_deps. All right, let's go to the spec file right here. We add from kivy_deps import sdl2, glew right here. Then add this line after a.datas right here: *[Tree(p) for p in (sdl2.dep_bins + glew.dep_bins)]. I have no .kv files, so I have nothing to add in datas so far.
However, Whisper AI might depend on some external files. So I have to check if the one-file build works out of the box or if it requires more files to be added. This requires manual testing. You can't create the PyInstaller executable as-is when Whisper AI is imported. So here it's going to say you're missing some hidden imports, and I will show you what I was missing. What did I manually add for Whisper AI to work? Basically, one thing you can do is add --recursive-copy-metadata openai-whisper to the PyInstaller command. And that will correctly collect all of the Whisper library and package it correctly when you create an executable with PyInstaller.
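The command described so far can be assembled in Python as a list of arguments, which makes each flag easy to see. This is only a sketch: the helper names are hypothetical, and the project name KivyWhisper and the script name main.py are taken from the transcript with an assumed spelling. The flags --onefile, --name and --recursive-copy-metadata are PyInstaller command-line options.

```python
import subprocess
import sys


def onefile_command(script="main.py", name="KivyWhisper",
                    metadata_packages=("openai-whisper",)):
    """Build the argument list for a one-file PyInstaller run."""
    # sys.executable keeps the build inside the current (virtual env) Python.
    command = [sys.executable, "-m", "PyInstaller", "--onefile"]
    for package in metadata_packages:
        # Copy the package's metadata (and its dependencies' metadata) into the bundle.
        command += ["--recursive-copy-metadata", package]
    command += ["--name", name, script]
    return command


def build():
    """Run the build; requires PyInstaller to be installed in this environment."""
    subprocess.run(onefile_command(), check=True)


print(" ".join(onefile_command()))
```

Calling build() performs the actual build, so it is left uncalled here; the print line only shows the command that would run.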
The solution was to add --recursive-copy-metadata openai-whisper to the PyInstaller command. This is the new PyInstaller command: python -m PyInstaller --onefile --recursive-copy-metadata openai-whisper --name KivyWhisper main.py. During manual testing, I found out that it's asking for win32timezone. And here is one little quick hint. The way I took the screenshot was literally just pressing print screen when it errors out. Maybe you want the window to stay open longer.
All you have to do is go right here and import time, then time.sleep(10). Ten seconds, right? Now, why will this work? Because, very basically, if your Python executable is not running anymore, the window will just exit. Kivy inherently is a loop, right? If your Kivy loop runs and then immediately exits, the window will exit as well. But if, after your Kivy app exits, you import time and call time.sleep(), your Python code is still running. It will hold open the command prompt, and you will be able to see the error without being, you know, super fast with print screen.
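The hold-the-window-open trick can be wrapped in a small helper. This is a sketch of one way to do it, not the code from the video: the name run_and_hold is hypothetical, and the try/except/finally is an addition so that the sleep still happens when the app raises instead of exiting cleanly.

```python
import time
import traceback


def run_and_hold(run_app, hold_seconds=10):
    """Run the app, print any unhandled error, then keep the console open for a while."""
    try:
        run_app()
        return True
    except Exception:
        # Show the error before the console window has a chance to close.
        traceback.print_exc()
        return False
    finally:
        # The process stays alive during the sleep, so the window stays open.
        time.sleep(hold_seconds)


# Typical use at the bottom of main.py (MyApp is a placeholder for your Kivy App subclass):
# run_and_hold(MyApp().run)
```

Remove the wrapper, or set hold_seconds to 0, once the packaged app starts cleanly.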
That's just a quick note in case you can't catch the error message in time. So the solution is to add win32timezone to hiddenimports in KivyWhisper.spec. So I'm not running this command anymore. All this command does is create the spec file. But now that I have a custom spec file, I need to actually run the spec file itself. So for that, let's go here.
So for that, you actually call the spec file. You need to say python -m PyInstaller, call the spec file, and then say --clean. I like using --clean because sometimes it actually fixes problems that I have, since you build from scratch. So another issue is that Whisper is asking for mel_filters.npz in the temp folder. Then another problem, which I put together with it, is that it's also asking for multilingual.tiktoken. So how did I fix this? Right here, this pathlib trick is just a complicated way of saying: from .venv/Lib/site-packages/whisper/assets, you send to whisper/assets. Why whisper/assets? Because the error says no such file or directory. This is my local temp folder. This is the temp folder that PyInstaller always extracts to. It's not the same numbers, but it's the same format. And it's looking for whisper/assets/mel_filters.npz. You send mel_filters.npz to whisper/assets, because that's where it's looking for mel_filters.npz. That's also where it's looking for multilingual.tiktoken.
So this is the path of sys.executable. And this is just a trick that I'm using. I know how to get from sys.executable to .venv/Lib/site-packages, but there's probably a smarter way to get this location. This is just quick and dirty, so I'm going to do it the way I know how to do it fast and easy. So this is sys.executable: import sys, then sys.executable. And this shows you the path of python.exe. And it's good, because this is the path of the python.exe in your virtual environment. This is not the path of my system Python. This is the path of my virtual environment Python, which is the one I'm using. So we know where sys.executable is, right here.
From there, we can go to Lib. And then from Lib/site-packages, we can go to whisper. Now there are two notes. Number one: this approach depends on how your Python is installed, aka this approach has only been tested with a Poetry plus pyenv setup, and it might fail if Python was installed through other means. Number two: there's probably a smarter, more Pythonic, more cross-platform way to get the location of the Lib folder.
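Both ideas can be sketched with pathlib. The first function walks from sys.executable to the Whisper assets folder, assuming a Windows virtual environment laid out as .venv/Scripts/python.exe next to .venv/Lib/site-packages (the exact steps used in the video are not shown in the transcript, so this layout is an assumption). The second is one candidate for the "smarter" way, using the standard library's sysconfig; the video does not use it. Both function names are hypothetical.

```python
import sys
import sysconfig
from pathlib import Path


def whisper_assets_from_executable(executable=sys.executable):
    """Quick and dirty: walk from python.exe to whisper/assets (Windows venv layout assumed)."""
    base_venv = Path(executable).parent.parent  # <venv>/Scripts/python.exe -> <venv>
    return base_venv / "Lib" / "site-packages" / "whisper" / "assets"


def whisper_assets_from_sysconfig():
    """Alternative: ask Python where pure-Python packages are installed."""
    site_packages = Path(sysconfig.get_paths()["purelib"])
    return site_packages / "whisper" / "assets"


print(whisper_assets_from_executable())
print(whisper_assets_from_sysconfig())
```

Neither function checks that the folder exists, so it is worth confirming that the printed path really contains the Whisper asset files before relying on it in a spec file.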
But I just wanted to finish it, so this is the way I did it. The solution is in the spec file, right here: the base venv, then the filters path, and then from the filters to whisper/assets. And then datas is just a list of tuples. The tuple is (source, destination). That's all there is to it. And the datas is right here: datas is equal to datas. Right. And then the explanation is right here. Base venv is the base .venv folder that holds the Lib folder where whisper resides.
Then filters path and filters path to are a manual join of the locations of mel_filters.npz and multilingual.tiktoken, based on what I assume about the Python Poetry installation. The final step is telling PyInstaller where to find each file and where to send it. This is done through datas, which is just a list of tuples in this form: (source, destination). The error messages are both looking for these files in the whisper/assets folder. So now we also know the destination.
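The list of (source, destination) tuples described above might be built like this. It is a sketch under stated assumptions: the function name whisper_datas is hypothetical, the two file names are the ones from the error messages, and the destination is the folder inside the bundle where Whisper looks for them. In a spec file the resulting list would be passed as datas to Analysis.

```python
from pathlib import Path


def whisper_datas(assets_dir):
    """Return PyInstaller-style datas: a list of (source file, destination folder) tuples."""
    assets_dir = Path(assets_dir)
    destination = "whisper/assets"  # folder inside the bundle, as named in the errors
    return [
        (str(assets_dir / "mel_filters.npz"), destination),
        (str(assets_dir / "multilingual.tiktoken"), destination),
    ]


# Example with a placeholder source folder; use your real site-packages path instead.
for source, destination in whisper_datas(".venv/Lib/site-packages/whisper/assets"):
    print(source, "->", destination)
```

The destination is a folder, not a file name: PyInstaller keeps each file's own name and places it inside that folder in the bundle.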
So here again, filters path is the source. Right. The destination is whisper/assets. And this destination is the destination that your standalone executable, or standalone one-folder build, is using. This is the source, and then this is the destination. So run this command to build the PyInstaller one-file executable for the final time: python -m, to run a module, PyInstaller KivyWhisper.spec --clean.
And then the executable is made in the dist folder. And then it works. And this is the transcript made with the Whisper AI medium model. Again, it works. A reminder to make a pip requirements.txt at the end. So if you don't like Poetry, I did make a requirements.txt.
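The final rebuild from the customised spec file can be sketched the same way as a list of arguments. The helper names are hypothetical and the spec file name KivyWhisper.spec is an assumed spelling of the name used in the video. The --clean flag is a PyInstaller option that clears its cache and temporary files before building.

```python
import subprocess
import sys


def spec_build_command(spec_file="KivyWhisper.spec"):
    """Build the argument list for rebuilding from an existing spec file."""
    return [sys.executable, "-m", "PyInstaller", spec_file, "--clean"]


def rebuild():
    """Run the rebuild; requires PyInstaller and the spec file to be present."""
    subprocess.run(spec_build_command(), check=True)


print(" ".join(spec_build_command()))
```

Options such as --onefile and --name are not repeated here because the spec file already records them.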
The command is poetry export -f requirements.txt --output requirements.txt, for people who like using pip. So congratulations. Thank you for watching the series. Now you can also make Kivy Whisper, and I hope you learned something new. Hopefully you learned something good. And I hope that you can use this for your own apps as well. Thank you for watching. This has been Kivy School. Bye.
Article Error Reporting
Message @BadMetrics on the Kivy Discord.