Blabbernauts - Cricket Commentary Generator
Blabbernauts is an AI-driven system that generates engaging,
detailed cricket commentary from video footage. It uses a
multi-label, multi-class classification approach: for each
delivery, the model predicts seven attributes, each chosen from
its own set of classes: Bowling Hand, Bowler Type, Bowling Side,
Length, Shot, Field Position, and Outcome.
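As a rough illustration of the seven-head setup, the sketch below decodes one delivery by taking the highest-scoring class for each attribute independently. The class names inside each head and the `decode` helper are assumptions made for this example; the project only names the seven attributes, not their classes.

```python
from typing import Dict, List, Sequence

# Hypothetical class vocabularies: the seven head names come from the
# project description, but the classes listed here are illustrative.
HEADS: Dict[str, List[str]] = {
    "bowling_hand": ["left", "right"],
    "bowler_type": ["pace", "spin"],
    "bowling_side": ["over the wicket", "around the wicket"],
    "length": ["full", "good", "short"],
    "shot": ["drive", "pull", "defence"],
    "field_position": ["cover", "midwicket", "long on"],
    "outcome": ["dot", "single", "four", "six", "wicket"],
}


def decode(scores: Dict[str, Sequence[float]]) -> Dict[str, str]:
    """Pick the highest-scoring class independently for each head."""
    labels: Dict[str, str] = {}
    for head, classes in HEADS.items():
        head_scores = scores[head]
        if len(head_scores) != len(classes):
            raise ValueError(f"{head}: expected {len(classes)} scores")
        # Index of the largest score in this head.
        best = max(range(len(classes)), key=lambda i: head_scores[i])
        labels[head] = classes[best]
    return labels
```

Each head is decoded on its own, so a confident prediction for one attribute does not constrain the others.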
The model uses transfer learning: a pre-trained Vision
Transformer (ViT) extracts spatial features from individual video
frames, and an LSTM models how those features change across
frames to capture the sequence of events. The predicted labels are
passed to OpenAI's GPT-4 API, which generates commentary that fits
the context of the delivery. ElevenLabs' Text-to-Speech API then
converts the generated text into audio commentary.
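The hand-off from classifier to language model can be pictured as plain prompt construction. The sketch below only builds the prompt string; it makes no API call, and the `build_prompt` helper, the prompt wording, and the sample label values are assumptions for illustration rather than the project's actual prompt.

```python
from typing import Dict

# The seven attributes named in the project description, in order.
FIELDS = [
    "Bowling Hand",
    "Bowler Type",
    "Bowling Side",
    "Length",
    "Shot",
    "Field Position",
    "Outcome",
]


def build_prompt(labels: Dict[str, str]) -> str:
    """Turn the predicted labels into a text prompt (hypothetical wording)."""
    missing = [field for field in FIELDS if field not in labels]
    if missing:
        raise ValueError("missing labels: " + ", ".join(missing))
    lines = [f"- {field}: {labels[field]}" for field in FIELDS]
    header = (
        "You are a cricket commentator. "
        "Describe this delivery in one or two lively sentences."
    )
    return header + "\n" + "\n".join(lines)
```

The returned string would then be sent as the user message of a chat request, and the model's reply passed on to the text-to-speech step.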
Users upload a cricket video; the system analyzes it and returns
both text and audio commentary for it, offering an interactive
way to experience cricket matches.
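The end-to-end flow can be sketched as three stages chained together. Everything named here (`Commentary`, `run_pipeline`, and the three stage callables) is hypothetical; in a real deployment the stages would wrap the ViT and LSTM classifier, the GPT-4 request, and the ElevenLabs request.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Commentary:
    labels: Dict[str, str]  # predicted attributes of the delivery
    text: str               # generated commentary
    audio: bytes            # synthesized speech for the commentary


def run_pipeline(
    video_path: str,
    classify: Callable[[str], Dict[str, str]],
    write_text: Callable[[Dict[str, str]], str],
    synthesize: Callable[[str], bytes],
) -> Commentary:
    """Run the three stages in order: video -> labels -> text -> audio."""
    labels = classify(video_path)
    text = write_text(labels)
    audio = synthesize(text)
    return Commentary(labels=labels, text=text, audio=audio)
```

Passing the stages in as callables keeps the flow testable with stubs, without loading model weights or calling external APIs.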
Project information
- Project Demo: View
- Paper: View Paper
- Technologies: Vision Transformer, LSTM, OpenAI GPT-4, ElevenLabs, Streamlit