Yt2txt

Description

This project allows you to download YouTube videos, extract audio, and transcribe the audio files into text using the Faster Whisper library.
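The transcription step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `format_segment` and `transcribe` are hypothetical, and the download and audio-extraction steps are left out because the tool used for them is not named here. Only the `WhisperModel` class and its `transcribe` method come from the Faster Whisper library.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one transcript segment as '[start -> end] text'."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe(audio_path: str, model_size: str = "large-v3",
               device: str = "auto", compute_type: str = "default") -> list:
    """Transcribe an audio file and return one formatted line per segment.

    Hypothetical helper: the parameter names mirror WhisperModel's own,
    but the project's real entry point may look different.
    """
    # Imported lazily so format_segment works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device=device, compute_type=compute_type)
    # transcribe() returns a lazy generator of segments plus run metadata;
    # the audio is only decoded as the generator is consumed.
    segments, _info = model.transcribe(audio_path)
    return [format_segment(s.start, s.end, s.text) for s in segments]


print(format_segment(0.0, 1.5, " hello "))
```

Calling `transcribe("audio.mp3")` would load the model on first use, which may trigger a large download for `large-v3`.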

Python 3.8 or newer must be installed (available from python.org).
A GPU compatible with CUDA (e.g., NVIDIA RTX 4090) is recommended for better performance on Windows/Linux.
macOS users with Apple Silicon (M1/M2/M4) can use the default PyTorch build, which includes the Metal Performance Shaders (MPS) backend for GPU acceleration.

Create and activate a virtual environment.

On Windows:

python -m venv .venv
.\.venv\Scripts\Activate

On macOS/Linux:

python3 -m venv .venv
source .venv/bin/activate
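To confirm that the environment is active before installing anything, a quick check along these lines may help. It is only a sketch: the function name `in_virtualenv` is made up for illustration, and the check relies on the standard behaviour of `venv`, where `sys.prefix` differs from `sys.base_prefix` inside an environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```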

Install the required libraries:

pip install -r requirements.txt

On Windows/Linux with an NVIDIA GPU, install PyTorch with CUDA support. Replace cu118 with the tag for your CUDA version if necessary (for example, cu121 for CUDA 12.1).

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
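The cu118 tag is the CUDA major and minor version with the dot removed. The helper below is a hypothetical illustration of that naming pattern, not part of the project. It only builds the URL and does not check that PyTorch actually publishes wheels for the given version, so confirm the tag against the PyTorch install page.

```python
def cuda_index_url(cuda_version: str) -> str:
    """Build the PyTorch wheel index URL for a CUDA version such as '11.8'."""
    parts = cuda_version.strip().split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"expected a version like '11.8', got {cuda_version!r}")
    major, minor = parts[:2]
    # The tag is 'cu' followed by major and minor with no separator.
    return f"https://download.pytorch.org/whl/cu{major}{minor}"


print(cuda_index_url("11.8"))
```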

On macOS with Apple Silicon, install the default PyTorch build, which includes MPS support:

pip install torch torchvision torchaudio

For systems without a GPU:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

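Run the following command to check that PyTorch is configured correctly. It prints whether CUDA is available, whether MPS is available, and the installed PyTorch version: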

python -c "import torch; print(torch.cuda.is_available()); print(torch.backends.mps.is_available()); print(torch.__version__)"
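The same checks can be turned into a device choice inside a script. The sketch below is one possible approach under stated assumptions: `pick_device` and `detect_device` are hypothetical names, and the CUDA-then-MPS-then-CPU preference order is an assumption, not something the project prescribes. The returned string is a PyTorch device name.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a PyTorch device name, preferring CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to CPU if it is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in older PyTorch releases, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print("selected device:", detect_device())
```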

Run the project with the following command:

python your_script.py --url "https://youtube.com/..." -o output -m large-v3
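A command line like this could be parsed with argparse roughly as follows. Only the flags --url, -o and -m come from the command above. Their meanings (output location, model size), the attribute names `output` and `model`, and the default values are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags used in the example command."""
    parser = argparse.ArgumentParser(
        description="Download a YouTube video and transcribe its audio."
    )
    parser.add_argument("--url", required=True, help="YouTube video URL")
    # Assumed meaning: where the transcript is written.
    parser.add_argument("-o", dest="output", default="output",
                        help="output location (assumed)")
    # Assumed meaning: Whisper model size passed to Faster Whisper.
    parser.add_argument("-m", dest="model", default="large-v3",
                        help="Whisper model size (assumed)")
    return parser


# Parse the example command's arguments explicitly instead of reading sys.argv.
args = build_parser().parse_args(
    ["--url", "https://youtube.com/...", "-o", "output", "-m", "large-v3"]
)
print(args.url, args.output, args.model)
```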

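If using CUDA on an NVIDIA GPU, make sure that the installed PyTorch build matches your CUDA version and that torch.cuda.is_available() reports True.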

PyTorch leverages MPS (Metal Performance Shaders) to accelerate computations on Apple Silicon. No additional configuration is required.

If you'd like to contribute, test your changes on multiple platforms and configurations (Windows, macOS, GPU, CPU).

Project developed by [Your Name/Team].