Cactus Compute
On-device AI inference engine for smartphones, laptops, and edge hardware with automatic cloud fallback — a unified cross-platform SDK for developers building private, low-latency AI applications.
Executive Summary
Cactus Compute is a YC S25 seed-stage startup building a cross-platform on-device AI inference SDK with intelligent hybrid cloud fallback, a product position that is genuinely underserved in today's developer tooling market. Market timing is strong: the on-device AI software market is growing at roughly 29% CAGR, and every major OEM launched dedicated on-device AI features in 2024-2025. The team is credible: Roman Shemet built the C++ engine from scratch and turned down Nvidia to do it, and Henry Ndubuaku brings rare pre-seed operator credentials, having scaled two companies to $25M+ ARR. However, the traction picture is still thin (4.2k GitHub stars, mostly self-reported usage metrics, no verified revenue or enterprise logos), and the competitive risk is existential: Apple, Google, and Qualcomm ship native on-device inference runtimes for free, and llama.cpp has 70,000+ stars while covering much of what Cactus does today. The single biggest risk is platform obsolescence: if Apple exposes its Private Cloud Compute routing logic to developers at the next WWDC, Cactus's core differentiator on iOS evaporates overnight.