movie_to_text.py is a Python script that extracts audio from an MP4 video file, uploads the audio to Google Cloud Storage, and transcribes it to text using Google Cloud Speech-to-Text.
Before you begin, ensure you have the following:
- A Google Cloud Platform account.
- A project created in Google Cloud Platform.
- Billing enabled for your Google Cloud project.
- Google Cloud SDK installed on your local machine.
- A service account with the necessary permissions and a JSON key file.
Install the required Python libraries:
pip install google-cloud-storage google-cloud-speech ffmpeg-python
- Create a new project in the Google Cloud Console.
- Enable APIs:
  - Enable the Cloud Speech-to-Text API.
  - Enable the Cloud Storage API.
- Create a Service Account:
  - Go to "IAM & Admin" > "Service Accounts".
  - Create a new service account with the "Owner" role.
  - Generate a JSON key file for the service account and download it.
  - Upload the JSON key file to the environment where you will run the script.
  - Set the path to the JSON key file in the credentials_path variable in the script.
- Create a new bucket in the Google Cloud Storage section.
- Name your bucket and choose your storage class and location settings.
- Set the bucket_name variable in the script to the name of your bucket.
- Set the credentials_path variable to the path of your service account JSON key file.
- Set the bucket_name variable to the name of your Cloud Storage bucket.
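Before running the script, it can help to confirm that both settings point at something usable. The following is a minimal sketch and not part of movie_to_text.py: the helper name check_config is hypothetical, and it relies only on the fact that a service account key file is JSON with a "type" of "service_account" and a "client_email" field.

```python
import json
import os


def check_config(credentials_path: str, bucket_name: str) -> str:
    """Validate the two settings and return the service account email.

    Hypothetical helper for illustration; the real script may handle
    its configuration differently.
    """
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")

    # Raises FileNotFoundError or json.JSONDecodeError for a bad path or file.
    with open(credentials_path, encoding="utf-8") as fh:
        key = json.load(fh)

    # Service account key files identify themselves with these fields.
    if key.get("type") != "service_account":
        raise ValueError(f"{credentials_path} is not a service account key file")
    if "client_email" not in key:
        raise ValueError(f"{credentials_path} has no client_email field")

    # Google client libraries fall back to this variable when no explicit
    # credentials are passed to them.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(credentials_path)
    return key["client_email"]
```

Calling check_config(credentials_path, bucket_name) once at startup surfaces a wrong path or a wrong kind of key file before any audio is extracted or uploaded.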
Run the script with the path to your MP4 file as an argument:
python3 movie_to_text.py /path/to/your/target.mp4
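For reference, a single positional argument like this could be read with argparse. This is only a sketch of one way to do it; the function name parse_args and the help text are assumptions, and the actual script may read sys.argv directly.

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> Path:
    """Return the MP4 path given on the command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="movie_to_text.py",
        description="Extract audio from an MP4 file and transcribe it.",
    )
    # One required positional argument, converted to a Path object.
    parser.add_argument("video", type=Path, help="path to the MP4 file")
    args = parser.parse_args(argv)
    return args.video
```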
- Extract Audio: Uses FFmpeg to extract audio from the MP4 file.
- Upload to Google Cloud Storage: Uploads the extracted audio file to a specified Google Cloud Storage bucket.
- Transcribe Audio: Uses Google Cloud Speech-to-Text to transcribe the audio file and save the transcript as a text file.
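The three steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: the function names are hypothetical, the audio is extracted by calling the ffmpeg command line through subprocess rather than through the ffmpeg-python wrapper, and the mono 16 kHz settings and the "en-US" language code are assumed values. The transcription step also assumes the installed google-cloud-speech release exposes AudioEncoding.MP3.

```python
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_command(video_path: Path, audio_path: Path) -> List[str]:
    """Build an ffmpeg call that drops the video stream and keeps the audio."""
    return [
        "ffmpeg",
        "-y",                   # overwrite an existing output file
        "-i", str(video_path),  # input video
        "-vn",                  # discard the video stream
        "-ac", "1",             # mono (assumed setting)
        "-ar", "16000",         # 16 kHz sample rate (assumed setting)
        str(audio_path),
    ]


def extract_audio(video_path: Path, audio_path: Path) -> None:
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(ffmpeg_command(video_path, audio_path), check=True)


def upload_audio(credentials_path: str, bucket_name: str, audio_path: Path) -> str:
    # Imported here so the rest of the sketch works without the package.
    from google.cloud import storage

    client = storage.Client.from_service_account_json(credentials_path)
    blob = client.bucket(bucket_name).blob(audio_path.name)
    blob.upload_from_filename(str(audio_path))
    return f"gs://{bucket_name}/{audio_path.name}"


def transcribe(credentials_path: str, gcs_uri: str, language_code: str = "en-US") -> str:
    from google.cloud import speech

    client = speech.SpeechClient.from_service_account_file(credentials_path)
    config = speech.RecognitionConfig(
        # Assumption: this enum member is available in the installed release.
        encoding=speech.RecognitionConfig.AudioEncoding.MP3,
        sample_rate_hertz=16000,
        language_code=language_code,
    )
    audio = speech.RecognitionAudio(uri=gcs_uri)
    # Long-running recognition reads the audio from Cloud Storage.
    operation = client.long_running_recognize(config=config, audio=audio)
    response = operation.result(timeout=3600)
    return "\n".join(
        result.alternatives[0].transcript
        for result in response.results
        if result.alternatives
    )
```

Keeping the command construction in its own function makes the FFmpeg options easy to inspect or adjust without touching the code that talks to Google Cloud.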
To transcribe an MP4 file located at /path/to/your/video.mp4:
python3 movie_to_text.py /path/to/your/video.mp4
This creates two files in the same directory as the video file: an MP3 file containing the extracted audio and a TXT file containing the transcript.
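One way to derive those two output locations from the input path is shown below. It is a sketch only: the helper name output_paths is hypothetical, and it assumes the outputs reuse the video's base name, which the script may or may not do.

```python
from pathlib import Path


def output_paths(video_path):
    """Return (audio_path, transcript_path) next to the given video file."""
    video = Path(video_path)
    # with_suffix swaps only the extension, so the directory is preserved.
    return video.with_suffix(".mp3"), video.with_suffix(".txt")
```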
- Ensure that FFmpeg is installed on your system and accessible from the command line.
- The script currently assumes that the video file exists locally. If you need to download it first, uncomment and modify the download_video function as necessary.
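Both notes can be checked up front with the standard library. The helper below is a hypothetical sketch, not something the script is known to contain: it reports whether an ffmpeg executable is reachable on PATH and whether the video file exists locally.

```python
import shutil
from pathlib import Path
from typing import List


def preflight(video_path, ffmpeg_binary: str = "ffmpeg") -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # shutil.which returns None when the executable cannot be found.
    if shutil.which(ffmpeg_binary) is None:
        problems.append(f"{ffmpeg_binary} was not found on PATH")
    if not Path(video_path).is_file():
        problems.append(f"{video_path} does not exist locally")
    return problems
```

Printing the returned messages and exiting early avoids a partial run in which the upload or transcription step fails for a reason unrelated to Google Cloud.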