// failure intelligence · how it works
A Naive Bayes + retrieval ensemble trained on 4,998 verified autopsies, drawn from a research dataset of 17,044. Top-3 accuracy: 86% when the primary or secondary cause counts, 72% for the primary cause only · largest curated startup failure dataset. CB Insights and PitchBook do not publish predictive precision metrics because they do not have this model.
// data quality
Every case in the database passes a structured review before it is published. The criteria are the same regardless of sector, geography or funding size.
The company must have demonstrably ceased operations, been liquidated, declared bankruptcy, or been acquired under distress. Pivots and rebrands are excluded.
The primary cause of failure must be identifiable from public documentation — founder post-mortems, regulatory filings, press coverage, court records, or investor statements. If the cause cannot be substantiated, it is left unassigned.
Key milestones — founding, peak, first warning signals, shutdown — must be reconstructible from at least two independent sources. Survival months are computed from these dates, not estimated.
Analytical fields — collapse style, hype cycle, moat type, fatal mistake, archetype — are assigned by a human reviewer, not generated automatically. Each field is a considered judgment, not a classification label from a model.
Cases that do not meet all four criteria are either excluded entirely or published with an "unverified" flag. The verified / unverified split is reflected in the precision metrics above: backtesting is always run on verified cases only.
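The four criteria above can be sketched as a simple check. This is a minimal illustration, not the actual review workflow: the `Autopsy` class and all of its field names are hypothetical, and counting whole months between founding and shutdown is one assumed reading of "survival months".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Autopsy:
    # All field names are hypothetical, chosen to mirror the four criteria.
    ceased_operations: bool        # shut down, liquidated, bankrupt, or distressed acquisition
    primary_cause: Optional[str]   # None when the cause cannot be substantiated
    milestone_source_count: int    # independent sources backing the key milestones
    human_reviewed: bool           # analytical fields assigned by a human reviewer
    founded: date
    shutdown: date


def survival_months(case: Autopsy) -> int:
    """Whole months between founding and shutdown (assumed definition)."""
    months = (case.shutdown.year - case.founded.year) * 12
    months += case.shutdown.month - case.founded.month
    if case.shutdown.day < case.founded.day:
        months -= 1  # the last month was not completed
    return months


def is_verified(case: Autopsy) -> bool:
    """True only when all four criteria are met."""
    return (
        case.ceased_operations
        and case.primary_cause is not None
        and case.milestone_source_count >= 2
        and case.human_reviewed
    )


example = Autopsy(True, "unit economics", 2, True, date(2015, 3, 15), date(2019, 1, 10))
print(is_verified(example), survival_months(example))  # True 45
```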
// precision metrics
Validated on a stratified 80/20 split of 17,044 autopsies in the research pool. n=3,421 test entries, ±1.7pp CI.
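The ±1.7pp figure is consistent with a 95% normal-approximation margin of error at n=3,421 in the worst case (p = 0.5). The confidence level and the method are not stated on this page, so both are assumptions in the sketch below.

```python
import math


def margin_of_error_pp(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval, in percentage points.

    z = 1.96 corresponds to a 95% interval; this level is an assumption.
    """
    return 100 * z * math.sqrt(p * (1 - p) / n)


print(round(margin_of_error_pp(0.50, 3421), 1))  # 1.7 (worst case)
print(round(margin_of_error_pp(0.72, 3421), 1))  # 1.5 (at the 72% primary-cause rate)
```

Under these assumptions the published interval is the conservative bound; the interval around the observed rates would be slightly narrower.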
"72 out of 100: exact primary cause in Top-3. 86 out of 100: real cause (primary or secondary) in Top-3."
| Version | Top-3 | MRR | Dataset |
|---|---|---|---|
| V2 | 60.0% | — | ~1,500 |
| V3 | 61.3% | — | 2,366 |
| V4 | 73.0% | — | 5,318 |
| V5 | 80.4% | — | 5,471 |
| V6 | 61.9% | — | 16,112 |
| V8 | 68.1% | — | 8,324 |
| V9 | 86.1%† | — | 17,044 |
| V10 (current) | 72% / 86%* | — | 17,044 |
* V10: 72% primary cause only / 86% primary or secondary cause. NB era + retrieval ensemble α=0.50, n=3,421 ±1.7pp CI.

† V9: multi-label metric only (primary or secondary cause).

V6: dataset tripled (5k→16k) with external data; retraining from broader corpus reduced precision temporarily.
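For reference, the two Top-3 variants and MRR in the table can be computed as follows. This is a generic sketch of the standard metric definitions, not the backtesting code; the toy rankings and cause labels are made up for illustration.

```python
def top_k_accuracy(rankings, accepted_causes, k=3):
    """Share of cases where any accepted cause appears in the first k predictions."""
    hits = sum(
        1
        for ranked, accepted in zip(rankings, accepted_causes)
        if any(cause in accepted for cause in ranked[:k])
    )
    return hits / len(rankings)


def mean_reciprocal_rank(rankings, primary_causes):
    """Average of 1/rank of the primary cause; a cause that is never ranked scores 0."""
    total = 0.0
    for ranked, primary in zip(rankings, primary_causes):
        if primary in ranked:
            total += 1.0 / (ranked.index(primary) + 1)
    return total / len(rankings)


# Hypothetical toy data: four cases, causes labelled a-e and x.
rankings = [["a", "b", "c", "d"], ["b", "c", "d", "a"], ["c", "d", "a", "b"], ["d", "c", "b", "a"]]
primary = ["a", "a", "b", "e"]
secondary = ["x", "c", "d", "d"]

primary_only = top_k_accuracy(rankings, [{p} for p in primary])
primary_or_secondary = top_k_accuracy(rankings, [{p, s} for p, s in zip(primary, secondary)])
print(primary_only, primary_or_secondary, mean_reciprocal_rank(rankings, primary))
# 0.25 1.0 0.375
```

The multi-label variant can never score lower than the primary-only variant on the same predictions, which is why the V9 and V10 figures are not directly comparable without the footnotes.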
// model pipeline
Each analysis runs the same deterministic pipeline. Every score is traceable to concrete signals.
Exact parameters, bonus functions and Bayesian adjustment values are proprietary and not published. The pipeline itself remains auditable.
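Since the actual parameters are not published, the following is only a generic sketch of how a Naive Bayes model and a retrieval step can be blended. It assumes that α=0.50 is a linear blend weight, uses a categorical Naive Bayes with add-one smoothing, and retrieves the k most similar cases by field overlap. All function names, field names and toy cases are hypothetical.

```python
import math
from collections import Counter, defaultdict

ALPHA = 0.50  # blend weight quoted for the ensemble; linear blending is an assumption


def train_nb(cases):
    """cases: list of (features dict, cause). Collects counts for a categorical Naive Bayes."""
    cause_counts = Counter(cause for _, cause in cases)
    value_counts = defaultdict(Counter)  # (cause, field) -> value counts
    vocab = defaultdict(set)             # field -> values seen in training
    for feats, cause in cases:
        for field, value in feats.items():
            value_counts[(cause, field)][value] += 1
            vocab[field].add(value)
    return cause_counts, value_counts, vocab


def nb_posterior(model, feats):
    """Posterior over causes, computed in log space with add-one smoothing."""
    cause_counts, value_counts, vocab = model
    total = sum(cause_counts.values())
    log_scores = {}
    for cause, n in cause_counts.items():
        lp = math.log(n / total)
        for field, value in feats.items():
            seen = value_counts[(cause, field)][value]
            lp += math.log((seen + 1) / (n + len(vocab[field]) + 1))
        log_scores[cause] = lp
    top = max(log_scores.values())
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    norm = sum(exp.values())
    return {c: v / norm for c, v in exp.items()}


def retrieval_votes(cases, feats, k=3):
    """Vote share of each cause among the k cases sharing the most field values."""
    def overlap(other):
        return sum(1 for f, v in feats.items() if other.get(f) == v)
    nearest = sorted(cases, key=lambda c: overlap(c[0]), reverse=True)[:k]
    votes = Counter(cause for _, cause in nearest)
    return {cause: v / len(nearest) for cause, v in votes.items()}


def ensemble_top3(cases, feats, alpha=ALPHA):
    nb = nb_posterior(train_nb(cases), feats)
    rv = retrieval_votes(cases, feats)
    blended = {c: alpha * nb[c] + (1 - alpha) * rv.get(c, 0.0) for c in nb}
    return sorted(blended, key=blended.get, reverse=True)[:3]


# Hypothetical toy pool
cases = [
    ({"sector": "saas", "stage": "seed"}, "no_market_need"),
    ({"sector": "saas", "stage": "seed"}, "no_market_need"),
    ({"sector": "fintech", "stage": "series_a"}, "regulatory"),
    ({"sector": "fintech", "stage": "series_b"}, "regulatory"),
    ({"sector": "marketplace", "stage": "seed"}, "unit_economics"),
]
top3 = ensemble_top3(cases, {"sector": "fintech", "stage": "series_a"})
print(top3[0])  # regulatory
```

The run is deterministic: the same pool and the same input always produce the same ranking, and each blended score can be traced back to its Naive Bayes and retrieval components.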
// model signals
Approximate weights based on backtesting. The model uses 12 signals in total.
Primary pool filter. Highest discriminative power. Model anchor.
Exact or regional match. Captures regulatory and macroeconomic context.
Most predictive field in backtesting: +14.8% accuracy when present.
+9.3% confirmed predictive power. Founder profile correlates with collapse type.
Market cycle at founding. Neutralises temporal penalties.
New in V5. Exact and text-similarity matching of the primary fatal mistake. Includes unit-economics sub-classification (burn rate, CAC/LTV, margin compression).
Exact weights and internal model parameters are proprietary.
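The fatal-mistake signal is described as combining exact and text-similarity matching. The similarity measure itself is not published; the sketch below uses token-set Jaccard overlap purely as a stand-in, and the function names are hypothetical.

```python
import re


def tokens(text: str) -> set:
    """Lowercased alphanumeric tokens, so 'CAC/LTV' yields {'cac', 'ltv'}."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def mistake_similarity(a: str, b: str) -> float:
    """1.0 on an exact (case-insensitive) match, otherwise Jaccard overlap of tokens."""
    if a.strip().lower() == b.strip().lower():
        return 1.0
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(mistake_similarity("Burn rate too high", "burn rate too high"))  # 1.0
print(mistake_similarity("burn rate too high", "high burn rate"))      # 0.75
print(mistake_similarity("CAC above LTV", "margin compression"))       # 0.0
```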
// reading the results
The output is not a grade. Each number has a precise technical meaning that determines how to use it.
A score of 72/100 means the startup shares 72% of structural patterns with companies that failed in this pool, not that it has a 72% chance of failing. The model identifies resemblance, not destiny.
Structural similarity
The model identifies the primary failure cause in the Top-3 in 72% of analyses (86% when secondary cause also counts). Collapse causes overlap: if the cause you suspect appears in the Top-3, the signal is strong.
n=3,421 · ±1.7pp CI
Score ≥ 65: strong match, review the autopsy in detail. Score 40–64: moderate, relevant patterns but structural differences exist. Score 30–39: weak, pool is small, treat with caution. Closer score = more weight on that collapse timeline.
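The score bands translate directly into a lookup. A minimal sketch: the function name is hypothetical, and treating scores under the minimum threshold of 30 as "below threshold" is an assumption about how such matches are handled.

```python
def score_band(score: float) -> str:
    """Map a match score (0-100) to the band described above."""
    if score >= 65:
        return "strong"
    if score >= 40:
        return "moderate"
    if score >= 30:
        return "weak"
    return "below threshold"  # assumed handling for scores under the minimum of 30


print(score_band(72))  # strong
print(score_band(35))  # weak
```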
Min threshold: 30
More reliable: SaaS, Fintech, Marketplace (>100 autopsies), well-defined hype cycle, funding <$200M. Less reliable: Crypto/Web3 (excluded by default), regulatory cause outside US/Europe, funding >$500M, pure timing collapse.
Context-dependent precision
// known limitations
Transparency about areas where precision falls below the model average.
Below-average precision. Few entries with well-documented regulatory cause outside the US and Europe.
Expanded with 10,500+ LatAm/India/Africa/SEA entries
Hard to distinguish from 'market fit' without external signals. Pure timing signals are the least discriminative.
V7: isotonic regression with real feedback
Low internal diversity. Crypto collapse patterns differ from any other sector.
Sector excluded from matching by default
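The limitations above mention isotonic regression with real feedback in V7. How it was applied is not documented here; the sketch below only illustrates the general technique, using the pool-adjacent-violators algorithm to fit a non-decreasing mapping from raw scores to observed outcome rates. The function names and toy feedback data are hypothetical.

```python
import bisect


def isotonic_fit(scores, outcomes):
    """Pool-adjacent-violators: returns (sorted scores, non-decreasing fitted values)."""
    pairs = sorted(zip(scores, outcomes))
    blocks = []  # each block is [sum of outcomes, count]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the current one
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return [x for x, _ in pairs], fitted


def calibrate(xs, fitted, score):
    """Step-function lookup: fitted value of the largest training score <= score."""
    i = bisect.bisect_right(xs, score) - 1
    return fitted[max(i, 0)]


# Hypothetical feedback: raw score and whether the predicted cause was confirmed (1) or not (0)
xs, fitted = isotonic_fit([10, 20, 30, 40, 50], [0, 1, 0, 1, 1])
print(fitted)                     # [0.0, 0.5, 0.5, 1.0, 1.0]
print(calibrate(xs, fitted, 35))  # 0.5
```

The out-of-order pair at scores 20 and 30 is pooled into a single level of 0.5, which keeps the calibrated output monotonic in the raw score.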
// responsible use
Risk scores and pattern matches are structured intelligence signals derived from real historical cases. They are designed to surface patterns worth investigating — not to replace primary research, financial modelling, or professional judgement.
Use UnicornBurn the way you would use any professional intelligence platform: as a focused starting point for deeper analysis. A high score means the structural profile closely resembles companies that failed; it does not mean the company will fail. A low score does not mean the company is safe. Always conduct independent due diligence. For the full scope of how outputs should and should not be used, see our Terms of Service, section 8.