Sieve
The video data research lab providing high-quality curated video datasets for AI model training, targeting frontier AI labs, Fortune 100 companies, and generative AI startups.
Executive Summary
Sieve is a YC- and Matrix-backed seed-stage company building curated video datasets for frontier AI labs. Its timing is favorable: the generative video wave (Sora, Runway, Kling) is creating acute demand for exactly what Sieve sells. The market thesis is solid: the global AI training dataset market is $3.2B and growing at a 22%+ CAGR, with video the fastest-growing sub-segment. The traction picture, however, is murky. Claimed customers such as "frontier AI labs and Fortune 100 companies" are unverified, revenue estimates conflict (one source pegs it at ~$145K, another at $1.8M), and the company has raised only $4M with no follow-on round in over two years. The single biggest risk is unresolved: Sieve's entire business model depends on sourcing video from "dozens of diverse data sources" at exabyte scale, yet the company has disclosed nothing about its content provenance or licensing framework. That is a critical omission in a year when the U.S. Copyright Office concluded that AI training use of copyrighted content likely exceeds fair use.