Company Spotlight: Haize Labs

Dec 03, 2025

Some startups chase the AI wave. Haize Labs chose to study the undercurrent. Before enterprises realized their GenAI pilots were drifting from excitement to exposure, Leonard Tang, Steve Li, and Richard Liu were already mapping the fault lines. Three Harvard grads with 15 peer-reviewed ML papers before turning 24, they saw a market racing ahead without a way to verify whether AI could hold its shape under pressure. They didn’t warn from the sidelines. They built the system that lets the rest of us trust the storm.

When Haize Labs stepped out of stealth in June 2024, they arrived with receipts. They had run full-spectrum red-team evaluations on major AI systems and disclosed thousands of vulnerabilities to Anthropic, OpenAI, Cohere, and others. No theatrics, no burner accounts, just clean research and responsible disclosures. That impact traveled fast. By August 2024, only eight months after founding, they closed a $12.5M seed led by General Catalyst at a $100M valuation, with investors reportedly competing on price to get into the round. Reliability may not trend on social feeds, but it moves capital when it matters.

The platform is the backbone: Judge turns business rules into measurable evaluation standards. Haize, the engine carrying the company’s name, runs dynamic adversarial testing that digs into every edge case. Monitor tracks real-time reliability from dev to prod. Robustify uses every discovered weakness to strengthen the system. Their ACG algorithm accelerates adversarial generation by ~38x while cutting GPU memory needs by 4x, shifting red-teaming from a costly art to a scalable workflow.

Adoption mirrors the ambition. Frontier labs like OpenAI, Anthropic, and AI21 Labs rely on Haize Labs to pressure-test models before release. Enterprises like Deloitte, Weights & Biases, and MongoDB treat the platform as reliability insurance. Their work with AI21 on the Business AI Code of Conduct showed that alignment can be transparent, testable, and industry-ready. Their partnership with Vogent pushed their Verdict library into voice systems, proving their methodology travels across modalities.

The team reads like a research lab that ships like a startup. Leonard Tang brings experience from the Allen Institute for AI, NVIDIA, Snap, and Amazon. Steve Li adds depth from Berkeley AI Research under Jacob Steinhardt. Richard Liu completes the founding trio. They’ve recruited operators like Jack Friedson from Datadog, researchers like Nimit Kalra, and early builders like Constantin Weisser and Kerem Kazan. Every hire fits the same pattern: scientific rigor paired with production urgency.

Haize Labs is hiring across research and engineering. If you want to build the systems that make AI trustworthy at scale, this is the room where the real testing begins.

https://www.haizelabs.com/careers

Let’s connect and keep the momentum going across the tech ecosystem. Whether you’re a founder shaping the future, a leader driving change, a VC backing bold ideas, or an investor spotting the next big thing, we’re pushing boundaries together. Proud to be building the future with you.

Let’s connect on LinkedIn and Twitter (X), and keep the conversation going.

Full rundowns live at www.devcuration.com

DevCuration
