AI & ML Enabled Video Analysis and Interpretation
by Badal Bhushan, Mr. Suman Kumar Jha, Shani Rathore, Vivek Chauhan, Vivek Sharma, Yash Rajput
Published: January 16, 2026 • DOI: 10.51584/IJRIAS.2025.10120067
Abstract
Video content is now pervasive across learning platforms, business settings, and social media, and analyzing it all by hand has become impractical. This paper describes a framework we built that uses AI and machine learning to simplify video understanding, whether the user uploads their own footage or shares a link to an online video.
The system analyzes the visual content on screen together with the audio track, then combines both into a coherent summary. It uses a Transformer-based model to capture how different moments in a video relate to each other and what they mean in context. Once a summary has been generated, a lightweight language model lets the user hold a conversation about the video: they can ask questions and receive answers that reflect its content.
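As a rough illustration of the idea of relating moments in a video to one another, the sketch below fuses per-segment visual and audio features and applies one step of scaled dot-product self-attention, the core operation of a Transformer. It is not the framework described in this paper: the function names (`fuse`, `self_attention`, `top_segments`), the concatenation-based fusion, the identity projections, and the "attention received" ranking are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(visual, audio):
    """Concatenate visual and audio feature vectors segment by segment.

    Assumption: simple concatenation stands in for whatever fusion
    the actual system uses.
    """
    return [v + a for v, a in zip(visual, audio)]


def self_attention(features):
    """Single-head scaled dot-product attention weights.

    Assumption: queries and keys are the raw features (identity
    projections); a trained model would learn these projections.
    Returns a matrix where row i holds how strongly segment i
    attends to every segment.
    """
    d = len(features[0])
    weights = []
    for q in features:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in features
        ]
        weights.append(softmax(scores))
    return weights


def top_segments(weights, k):
    """Pick the k segments that receive the most total attention.

    Hypothetical heuristic for choosing summary-worthy segments;
    indices are returned in temporal order.
    """
    n = len(weights)
    received = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    ranked = sorted(range(n), key=lambda j: received[j], reverse=True)[:k]
    return sorted(ranked)
```

Given toy features for three segments, `top_segments(self_attention(fuse(visual, audio)), 1)` returns the index of the segment that the others attend to most. A real system would feed learned embeddings from vision and audio encoders rather than hand-written vectors.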