Blabbernauts - Cricket Commentary Generator

Blabbernauts is an AI-driven system that generates detailed cricket commentary from video footage. It treats each delivery as a multi-label, multi-class classification problem, predicting seven attributes of the delivery: Bowling Hand, Bowler Type, Bowling Side, Length, Shot, Field Position, and Outcome.
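One way to read "multi-label, multi-class" is as seven independent classification heads, one per attribute, each choosing a single class. The sketch below illustrates only the decoding step under that assumption. The class vocabularies in `LABEL_SPACE` and the function `decode_heads` are hypothetical and are not taken from the project.

```python
from typing import Dict, List, Sequence

# Hypothetical class vocabularies (assumption): the project's real label sets
# are not listed in this description.
LABEL_SPACE: Dict[str, List[str]] = {
    "Bowling Hand": ["right", "left"],
    "Bowler Type": ["pace", "spin"],
    "Bowling Side": ["over the wicket", "around the wicket"],
    "Length": ["full", "good", "short"],
    "Shot": ["drive", "pull", "defence"],
    "Field Position": ["cover", "mid-wicket", "third man"],
    "Outcome": ["dot", "single", "four", "six", "wicket"],
}


def decode_heads(logits: Dict[str, Sequence[float]]) -> Dict[str, str]:
    """Pick the highest-scoring class independently for each attribute."""
    decoded: Dict[str, str] = {}
    for head, classes in LABEL_SPACE.items():
        scores = logits[head]
        if len(scores) != len(classes):
            raise ValueError(
                f"{head}: expected {len(classes)} scores, got {len(scores)}"
            )
        # Argmax over this head only; the heads do not compete with each other.
        best = max(range(len(scores)), key=lambda i: scores[i])
        decoded[head] = classes[best]
    return decoded
```

Because each head is decoded separately, a delivery always receives exactly seven labels, one per attribute.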

The model uses transfer learning: a pre-trained Vision Transformer (ViT) extracts spatial features from each video frame, and an LSTM models the frame sequence to capture how the delivery unfolds over time. The predicted labels are passed to OpenAI's GPT-4 API, which turns them into contextually appropriate commentary. ElevenLabs' Text-to-Speech API then converts the generated text into audio commentary.
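The hand-off from classifier to language model can be pictured as a small prompt-building step. The sketch below is an assumption about how the seven predicted labels might be turned into text for the chat model. The function `build_commentary_prompt` and the prompt wording are hypothetical, and no API call is made here.

```python
from typing import Dict


def build_commentary_prompt(labels: Dict[str, str]) -> str:
    """Format predicted delivery labels as a plain-text prompt (hypothetical wording)."""
    # One bullet per attribute, in the order the labels were provided.
    facts = "\n".join(f"- {name}: {value}" for name, value in labels.items())
    return (
        "You are a cricket commentator. Describe this delivery in one or two "
        "lively sentences, using only the facts below.\n" + facts
    )
```

The returned string would be sent as the user message to the language model, and the model's reply would in turn be the input to the text-to-speech step.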

Users upload a cricket video, and the system analyzes it and returns both text and audio commentary to accompany the footage, offering an interactive way to experience cricket matches.

Project information

  • Project Demo: View
  • Paper: View Paper
  • Technologies: Vision Transformer, LSTM, OpenAI GPT-4, ElevenLabs, Streamlit