Orbit AI Insights

Investment Criteria - May 8, 2025 - 16 min read

How We Evaluate Seed-Stage AI Companies: Our Diligence Framework

By Marcus Chen, Managing Partner


We publish this because we believe founders deserve to understand how investors evaluate them. The venture capital industry benefits from information asymmetry, but that asymmetry does not serve founders or the companies they build. If more founders understand what investors are actually looking for, they will build better pitches, ask better questions, and make better decisions about which investors to target for their Seed round.

Our framework has six dimensions. For each investment decision, we give each dimension a qualitative rating and a written assessment. When we pass on a company, we attempt to explain which dimension drove the pass and why. The framework has evolved over six years and 40+ investments, and it continues to evolve as we learn. This version reflects how we are thinking about seed-stage AI diligence in 2025.

Dimension One: Team Composition and Operator Depth

Team evaluation at the Seed stage is the most important and most subjective dimension of our diligence. Since seed-stage companies do not have the track record data that makes later-stage evaluation more quantitative, the investment is largely a bet on the team's ability to navigate uncertainty, make good decisions under pressure, and attract the people and resources the company will need to grow.

Our team assessment focuses on four qualities that we have found to be consistently predictive across our portfolio:

Domain depth vs. domain breadth. We have a strong preference for founding teams that have exceptionally deep expertise in one specific domain rather than broad general intelligence applied to a market they have studied for six months. The founders who have built the most valuable companies in our portfolio were people for whom the problem they were solving was either personally experienced or the result of years of professional immersion. Generic intelligence is valuable but it is not sufficient for the type of institutional knowledge that enterprise buyers actually pay for.

Technical capability and architectural clarity. For AI application businesses, we want founding teams where at least one person can build the core technical product without external dependencies. We also want evidence that the founding team has a clear and defensible view of their technical architecture -- not because we require founders to have figured everything out, but because architectural conviction signals that the team has thought through the implications of their technical choices rather than defaulting to whatever was most convenient to build initially.

Commercial instincts in technical founders. The most common failure mode we see in technical founding teams is building an impressive product that fails to close the first enterprise contract because the founders have not developed commercial instincts commensurate with their technical sophistication. We assess commercial instincts through specific questions about how founders plan to land their first three enterprise customers: Who is the specific buyer, what does the outreach look like, what is the pilot structure, and what does success look like? Founders who can answer these questions concretely signal that they have done the customer discovery necessary to have a sales strategy.

Founder market fit. This is a concept we use to describe the degree to which a founding team's specific background, relationships, and experience give them structural advantages in the market they are targeting. A former clinical informatics director building AI for emergency medicine has founder market fit: they have the relationships to get into hospitals, the credibility to be taken seriously by physicians, and the domain knowledge to design a product that reflects the reality of clinical workflows. A former consumer app developer building the same product does not have founder market fit, regardless of their general intelligence or technical capability.

Dimension Two: Product and Technical Architecture

Our technical partner Jordan Webb leads product and technical diligence for all investments. His evaluation focuses on whether the product architecture creates genuine competitive advantage or whether it is a thin wrapper around API calls that any well-funded competitor could replicate in 90 days.

The specific questions Jordan asks are:

Does the product do something that was not previously possible without AI, or does it do something that was previously possible, only faster or cheaper? The former category is far more interesting because it suggests a genuinely new value proposition rather than an efficiency improvement whose advantage erodes as model costs decline.

Where does the proprietary value sit in the technical stack? A product where the value is in prompt engineering and a specific system prompt is not defensible. A product where the value is in a proprietary evaluation framework, a fine-tuned model on domain-specific data, or a custom data processing pipeline that creates a unique training corpus has technical depth that is difficult to replicate.

What is the architecture's trajectory as foundation model capabilities improve? Some AI application architectures become more valuable as models improve because better models enable more sophisticated applications of the underlying product concept. Other architectures become less valuable as models improve because the gap between what the foundation model can do natively and what the product adds shrinks over time. We want to invest in the former category.

Dimension Three: Market Size and Structure

We use a deliberate market sizing methodology that differs from the top-down TAM analysis that appears in most pitch decks. The fundamental problem with top-down TAM is that it tells you how much money is spent in a category but says nothing about whether a given product is plausibly positioned to capture a meaningful share of it.

Our preferred market sizing approach starts from the bottom up: identify a specific customer archetype, size the realistic universe of that customer type, estimate the realistic contract value with that customer type, and calculate the achievable revenue at specific penetration rates. Then ask whether that achievable revenue is sufficient to build a venture-scale business, and whether the customer archetype is a realistic starting point for expansion into adjacent segments.
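The bottom-up steps above can be sketched as a simple calculation. All figures here are illustrative assumptions for a hypothetical customer archetype, not data from any real company or from our diligence process:

```python
def achievable_revenue(customer_universe, contract_value, penetration_rate):
    """Revenue at a given penetration of a specific customer archetype."""
    return customer_universe * contract_value * penetration_rate

# Assumed archetype: mid-size hospitals buying a clinical AI product.
universe = 2_000   # realistic count of target customers (assumption)
acv = 150_000      # realistic annual contract value in USD (assumption)

for penetration in (0.05, 0.10, 0.20):
    revenue = achievable_revenue(universe, acv, penetration)
    print(f"{penetration:.0%} penetration -> ${revenue:,.0f} ARR")
# 5% penetration -> $15,000,000 ARR
# 10% penetration -> $30,000,000 ARR
# 20% penetration -> $60,000,000 ARR
```

The useful output is not any single number but the question it forces: is $30M ARR at 10% penetration of this archetype enough to anchor a venture-scale outcome, and does the archetype open adjacent segments?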

For AI application businesses, we also pay close attention to market structure: Is this a market where a small number of dominant incumbents will be difficult to displace, or is it a fragmented market where a new entrant can capture market share without triggering a defensive response from incumbents with significantly more resources? The most attractive AI application markets we have invested in are those where incumbents are structurally unable to respond to AI disruption because their existing products or business models create internal constraints that prevent effective competitive response.

Dimension Four: Business Model and Unit Economics

We evaluate business model quality along three dimensions: revenue predictability, unit economic structure, and alignment between pricing and value delivery.

Revenue predictability matters for early-stage businesses because it determines how much runway a given amount of capital provides. Businesses with high-churn, transaction-based revenue require more capital and generate less investor confidence in future cash flows than businesses with annual recurring revenue and strong gross retention. At the Seed stage, we look for evidence of the business model that the company intends to build, not necessarily the fully developed unit economics that do not exist yet.

Unit economic structure in AI application businesses has a specific challenge: compute costs. AI products with heavy inference requirements need to demonstrate a credible path to positive gross margins as they scale, because the unit economics of AI inference can look very different at 10 customers than at 1,000. We work through unit economics in detail during diligence to understand what the gross margin profile looks like at scale and whether the pricing model is sustainable as the company grows.
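A minimal sketch of why inference costs dominate this analysis, using assumed numbers. The early-stage and at-scale cost figures are hypothetical; the point is that per-unit inference cost often falls with volume discounts or smaller fine-tuned models, which changes the margin profile materially:

```python
def gross_margin(price, inference_cost, other_cogs):
    """Gross margin as a fraction of price, per unit (e.g. per seat)."""
    return (price - inference_cost - other_cogs) / price

# At 10 customers: list-price API inference, no volume leverage (assumption).
early = gross_margin(price=100, inference_cost=55, other_cogs=15)

# At 1,000 customers: volume discounts or a fine-tuned smaller model
# cut inference cost per unit (assumption).
at_scale = gross_margin(price=100, inference_cost=20, other_cogs=10)

print(f"early: {early:.0%}, at scale: {at_scale:.0%}")
# early: 30%, at scale: 70%
```

In diligence, the question is whether the path from the first line to the second is credible, or whether the pricing model only works if inference costs fall on a schedule the company does not control.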

Alignment between pricing and value delivery is the dimension most often overlooked in early business model design. Pricing models that grow naturally with customer value -- usage-based pricing, outcome-based pricing, or seat-based pricing with strong expansion logic -- create the revenue flywheel that enables efficient net revenue retention. Pricing models that are disconnected from value delivery -- flat fees, one-time payments, or structures that do not grow with usage -- tend to create friction in customer expansion and cap the revenue potential of each customer relationship.
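The compounding effect of expansion-aligned pricing can be made concrete with a cohort calculation. The retention figures below are illustrative assumptions, chosen only to show how the gap widens over time:

```python
def cohort_arr(initial_arr, net_revenue_retention, years):
    """ARR of a customer cohort after compounding NRR for a number of years."""
    return initial_arr * net_revenue_retention ** years

# A $1M cohort under expansion-aligned pricing (115% NRR, assumed)
# versus flat pricing with churn (85% NRR, assumed), after five years.
for nrr in (1.15, 0.85):
    arr = cohort_arr(1_000_000, nrr, 5)
    print(f"NRR {nrr:.0%}: $1M cohort -> ${arr:,.0f} after 5 years")
```

The same initial cohort roughly doubles in one scenario and shrinks by more than half in the other, which is why pricing structure is a diligence question and not just a go-to-market detail.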

Dimension Five: Competitive Positioning and Moat Trajectory

As we described in our earlier essay on moats, this dimension evaluates whether the company is building genuine competitive defensibility through proprietary data, distribution advantages, or workflow depth. In our diligence process, we specifically map where the company sits today on the moat-building trajectory and where they expect to be in 24 months.

One important nuance: we evaluate moat trajectory, not current moat strength. A seed-stage company that does not yet have a strong moat but has a clear and plausible theory for how they will build one over the next two years is more interesting to us than a company with a compelling current competitive position that does not have a credible moat-building strategy. The former is executing against a thesis; the latter may be in a more precarious position than the current snapshot suggests.

Dimension Six: Timing and Market Readiness

The most commonly overlooked dimension in seed-stage diligence is timing. The history of technology investing is full of companies that had the right product for a market that was not yet ready, burned through their capital in the early market, and either failed or were acquired before they could benefit from the market adoption wave they anticipated. Being early is sometimes indistinguishable from being wrong.

For AI application businesses, timing analysis requires evaluating both the AI capability trajectory and the enterprise adoption trajectory. The AI capability question is: are the foundation model capabilities required to deliver the product's value proposition available now, or does the product depend on capabilities that are 12 to 24 months away? Companies that are building against capabilities that do not yet reliably exist are taking an additional risk layer that most investors do not adequately price.

The enterprise adoption question is: have enterprise buyers in the target market reached the level of AI readiness -- security policies, procurement processes, technical infrastructure -- required to evaluate and adopt the product? Markets where enterprise buyers are sophisticated enough to go through a real procurement process are more attractive than markets where the product would need to educate buyers about AI capabilities before getting to a product evaluation.

How We Weight the Dimensions

We do not apply a fixed weighting to the six dimensions. The appropriate weighting depends on the stage of the company and the type of investment. For companies at the very early stage -- before product and before revenue -- team composition carries more weight because it is the primary available signal. For companies with product and early customer traction, product quality and business model evidence carry more weight because there is real data available.

One consistent pattern: a failing grade on any single dimension is almost always a deal-breaker, even when the other five dimensions are strong. A team with exceptional execution ability cannot overcome a market that is structurally too small. A brilliant product in a large market cannot overcome a business model with negative unit economics at scale. This is why our pass notes typically identify the single dimension that drove the pass rather than averaging across all six.
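The gating pattern described above is a minimum function, not an average. A minimal sketch of that logic, with hypothetical dimension names and a three-level rating scale of our own invention:

```python
RATINGS = {"strong": 3, "adequate": 2, "failing": 1}

def pass_reason(assessment):
    """Return the first failing dimension (the one that drives a pass),
    or None if no single dimension fails."""
    for dimension, rating in assessment.items():
        if RATINGS[rating] == RATINGS["failing"]:
            return dimension
    return None

# Five strong-to-adequate dimensions cannot rescue one failing dimension.
deal = {
    "team": "strong", "product": "strong", "market": "failing",
    "business_model": "adequate", "moat": "strong", "timing": "adequate",
}
print(pass_reason(deal))  # -> market
```

Averaging the six ratings would score this deal well above passing; the gate logic surfaces the market-size problem that actually drives the decision, which is also why our pass notes name a single dimension.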

We are publishing this framework in the interest of transparency, not because we think it is the only valid approach to seed-stage AI diligence. Different investors weight different things, and some of the best seed-stage investments in history would have failed our framework at the time of initial investment. Frameworks are useful tools for organizing thinking, not algorithms for making decisions. The judgment required to apply this framework well comes from the operating experience our partners have accumulated -- not from the framework itself.

Marcus Chen is the Managing Partner of Orbit AI. He previously served as CEO of DataStream (acquired) and has been investing in early-stage AI companies since 2019. This article represents his personal views and should not be construed as investment advice or as a complete description of Orbit AI's investment process.
