BinoSoRAs
AI-generated video detection through Binoculars on Sora
Joshua Bowden, Willy Chan
Stanford University
about us!
Willy
Josh
11 hours on conception + implementation
BinoSoRAs
BinoSoRAs is a novel algorithm and platform designed to authenticate the origin of videos through automated frame interpolation and deep learning techniques.
inspiration
Prompt: Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field.
Prompt: A Chinese Lunar New Year celebration video with Chinese Dragon.
Prompt: Historical footage of California during the gold rush.
Prompt: A white and orange tabby cat is seen happily darting through a dense garden, as if chasing something. Its eyes are wide and happy as it jogs forward, scanning the branches, flowers, and leaves as it walks. The path is narrow as it makes its way between all the plants. the scene is captured from a ground-level angle, following the cat closely, giving a low and intimate perspective. The image is cinematic with warm tones and a grainy texture. The scattered daylight between the leaves and plants above creates a warm contrast, accentuating the cat’s orange fur. The shot is clear and sharp, with a shallow depth of field.
current solutions
Current Solutions
Our Algorithm
bottom-up analysis
Customer Segment:
Revenue Per Customer
introducing BinoSoRAs
AI-generated video detection through Binoculars on Sora
detection results
Evaluation Threshold:
52.87 Frechet Inception Distance
Overall Accuracy: 91.67%
Algorithm
Drawing on the state-of-the-art Binoculars framework for detecting text generated by large language models, BinoSoRAs extends the same idea from text to video.
LLM Detection
Unknown Text
Generated Text
LLM 1
LLM 2
To me, how surprising is the unknown text compared to the generated text?
‘surprising’: log-perplexity; how far out of distribution tokens are on average
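The text-side comparison can be sketched as follows. This is a minimal sketch, assuming the published Binoculars formulation (log-perplexity under one model divided by the cross-perplexity between the two models); the function names and the toy three-word vocabulary are hypothetical, and real use would take the next-token distributions from two actual LLMs.

```python
import math

def log_perplexity(token_probs):
    """Average negative log-probability assigned to the observed tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def cross_log_perplexity(dists_1, dists_2):
    """Average cross-entropy between two models' next-token distributions."""
    total = 0.0
    for d1, d2 in zip(dists_1, dists_2):
        total -= sum(p1 * math.log(p2) for p1, p2 in zip(d1, d2))
    return total / len(dists_1)

def binoculars_score(observed_ids, dists_1, dists_2):
    """Ratio of LLM 1's surprise at the text to the two models' mutual surprise.

    observed_ids: index of the token that actually appeared at each position.
    dists_1, dists_2: per-position probability distributions over the vocabulary
    from LLM 1 and LLM 2 (assumed inputs, not computed here).
    """
    probs = [dist[i] for dist, i in zip(dists_1, observed_ids)]
    return log_perplexity(probs) / cross_log_perplexity(dists_1, dists_2)

# Toy example: two positions, vocabulary of three tokens.
llm_1 = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
llm_2 = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
score = binoculars_score([0, 1], llm_1, llm_2)
```

A lower score means the text is less surprising than the two models' disagreement would predict, which the Binoculars approach treats as evidence of machine generation.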
Video Detection
Unknown Video
Interpolated Video
FLAVR CNN
Frechet Inception Distance
To me, how well can I recognize the unknown video compared to the interpolated video?
‘recognize’: an Inception v3 image classifier is run on each frame, producing a 2048-dimensional feature vector per frame
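The distance between the unknown and interpolated videos can be sketched as below. This is a minimal sketch, assuming the per-frame feature vectors have already been extracted; the FLAVR interpolation and Inception v3 steps are not shown, the function name `frechet_distance` is hypothetical, and this is not necessarily how the authors computed their scores. It fits a Gaussian to each set of frame features and uses the standard Fréchet distance formula, with the matrix square-root trace obtained from eigenvalues.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two sets of frame features.

    feats_a, feats_b: arrays of shape (num_frames, feature_dim).
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))
    # Tr(sqrt(cov_a @ cov_b)) is the sum of the square roots of the product's
    # eigenvalues; tiny negative or imaginary parts are numerical noise.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)

# Stand-in features: 200 "frames" with 8 dimensions instead of 2048.
rng = np.random.default_rng(0)
unknown = rng.normal(size=(200, 8))
interpolated = unknown + rng.normal(scale=0.1, size=(200, 8))
distance = frechet_distance(unknown, interpolated)
```

The resulting distance would then be compared against the evaluation threshold to label the video.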
BinoSoRAs
SoRA vs. Reality
Real Video
Real Video with Generator
SoRA Video
SoRA Video with Generator
SoRA Video
SoRA Video with Generator
BinoSoRAs
Our system's efficiency, scalability, and effectiveness address the evolving challenges of digital content authentication in an increasingly automated world.
What’s next?
Photo: after 11 hours