2/28/2026

Sieve

The video data research lab providing high-quality curated video datasets for AI model training, targeting frontier AI labs, Fortune 100 companies, and generative AI startups.

Disclaimer: This report is based on publicly available information and AI analysis. It does not constitute investment advice. Always conduct your own due diligence before making investment decisions.
Overall score: 54

Risk: 32 (execution, regulatory & market risk)
Team: 65 (experience, domain fit & gaps)
Market: 78 (TAM size, growth rate & timing)
Traction: 42 (evidence of demand & momentum)

Executive Summary

Sieve is a seed-stage company, backed by YC and Matrix, that builds curated video datasets for frontier AI labs. The timing is favorable: the generative video wave (Sora, Runway, Kling) is creating acute demand for exactly what Sieve sells. The market thesis is solid: the global AI training dataset market is $3.2B and growing at a 22%+ CAGR, with video the fastest-growing sub-segment. The traction picture is murky, however. The claimed customers ("frontier AI labs and Fortune 100 companies") are unverified, revenue estimates conflict (one source pegs revenue at ~$145K, another cites $1.8M), and the company has raised only $4M with no follow-on round in over two years. The single biggest risk is unresolved. Sieve's entire business model depends on sourcing video from "dozens of diverse data sources" at exabyte scale, yet the company has disclosed nothing about its content provenance or licensing framework. That is a critical omission in a year when the U.S. Copyright Office concluded that AI training use of copyrighted content likely exceeds fair use.
