Phase 0 · Conor's v0 — Flat Spec Baseline
21-module flat specification authored by Conor Maguire (Pineal Director of Research). Consultant-style brief — comprehensive on domain but no chaining, no abort gates, no adapter routing, no reproducibility framework.
- 21 analytical modules (statements, ratios, peers, forecasts) as parallel sections
- No sequential dependency graph — all modules at same layer
- No coverage gate — scoring proceeds regardless of data depth
- No sector adapter framework — default treatment of every ticker
- No methodology hash / audit trail → reproducibility not guaranteed
Cross-check: Perplexity Max caught hallucination surface + reproducibility gap. Became the problem statement for v1.0.
Phase 1 · v1.0 — Chain Architecture Introduced
Claude primary architect draft. First structured chain with behavioural rules, execution limits, data tiers, and weighted scoring.
- 6 sub-agents: Ingest → Reconstruct → Returns/DuPont → Adapter/Peers → Forecast/Variant → Scoring/Report
- 10 behavioural rules including no-fabrication, source log mandatory, sector adapter declared, forecast-vs-consensus delta as alpha signal
- 6 execution rules (E1–E5): 25-min sub-agent cap, schema validation, missing-input → N/A, ambiguity flag, disagreement halt
- 4 data tiers (Bloomberg/Koyfin/EDGAR/WebSearch)
- 7 sector adapters (default, reit, hotels, industrial, saas, banks, consumer)
- 6-category scoring (growth 15, profitability 20, cash flow 20, returns 20, balance 15, earnings 10)
#1 Ingest
#2 Reconstruct
#3 Returns
#4 Adapter
#5 Forecast
#6 Scoring
Phase 2 · v1.1 — Claude Refinement (Pre-Sandbox)
Expanded to 8 sub-agents, added Rule #11 computational protocol (Python computes, LLM interprets), integrated Perplexity v0 critique.
- Added #3B WACC computation — per-sector, quarterly-cached, decoupled from ticker-run
- Added #7 Change Detection — version-over-version delta tracking
- Rule #11: all arithmetic in Python; LLM prose cannot compute
- Hindsight-bias discipline: variant views written from stated vantage date only
#1 Ingest
#2 Reconstruct
#3 Returns
#3B WACC
#4 Adapter
#5 Forecast
#6 Scoring
#7 Change Det.
Verdict from Sandbox Pass 1: REVISE — not PROCEED. 5 production-blocking holes caught (coverage abort, sanity check, EQ floor, dual-tier harness, cold-start protocol).
Phase 3 · v1.2 — Post Sandbox Pass 1 (REVISE)
Five production-gate fixes accepted in full from the GCCP Sandbox. Chain expanded to 9 sub-agents; Earnings Quality weight raised at Growth's expense.
- Rule #12 Coverage Abort Gate: <60% field coverage → INSUFFICIENT_DATA, no score
- Rule #13 EQ Hard Floor: EQ <4/10 caps total at 60 (Band C ceiling)
- Rule #14 Dual-Tier Shadow Harness: two tiers per ticker, >5% divergence = DATA_TIER_DISPUTE
- Rule #15 Cold-Start Protocol: FIRST_RUN / METHODOLOGY_SHIFT / SCHEMA_SHIFT tags
- New #3C Sanity Check Gate between #3 and #4 — balance sheet articulation, OCF→cash bridge, equity rollforward, debt reconciliation
- Weight shift: Growth 15 → 10 · Earnings Quality 10 → 15
#1#2#3
#3B#3C Sanity
#4#5#6#7
Dalata dry-run rescored: Band C (v1.1) → Band D capped → INSUFFICIENT_DATA (coverage 30% fails 60% gate). Correct answer: do not score at Tier 4. Exactly the behaviour Rule #12 is designed to enforce.
Phase 4 · v1.3 — Post Sandbox Pass 2 (PROCEED)
Sandbox returned PROCEED with 8 spec-clarity notes (none architectural). First version to survive all three Octave legs — Claude, Perplexity, Sandbox ×2 including compressed Pass 3.
- Weighted coverage gate — critical 7 fields × 3 weight, secondary × 1. Smarter than flat 60%.
- Tier-4 explicit production ban — WebSearch only legal for back-testing, never for live output
- Provider-health gate at #1 entry — degraded tier poisons chain; check before proceeding
- Balance gate single-ownership in #3C — halt UX = RECONCILIATION_FAILED structured output
- N/A vs cap ambiguity resolved — N/A suppresses whole score; cap applies only to computed-poor categories
- Per-field primary-tier registry — Koyfin wins on ratios, FMP on raw P&L, Bloomberg on SBC. Kills phantom divergence.
- SECTOR_RECLASSIFIED tag added to cold-start protocol
- Integration architecture: single
FPRModuleclass with 9 methods, mirrorspanel_sentinel.pypattern — NOT 9 subprocess-chained files - JSON envelope:
<<<FPR_JSON>>>...<<<END_FPR_JSON>>>parallel to Pineal's parser - Bloomberg row removed from priority table until compliance question resolved
Pass 3 verdict: PROCEED. No new architectural holes. Seven Pass-2 hole closures mechanically sound. Ship to Conor; v1.3.1 patches in parallel.
Phase 5 · v1.3.1 — Eight Spec-Clarity Patches
Eight non-blocking spec-clarity notes from Pass 3, being addressed in parallel with Conor packaging. TDOC validation ran against the v1.3.1 draft — five-of-five gates passed.
- 1. SubAgentResult return type replaces exception flow (halt-vs-proceed becomes testable)
- 2. Explicit state threading — pure functions on
self; tests no longer order-dependent - 3. Per-method unit-test fixtures — especially #3C Sanity Check (highest priority)
- 4+5. Subprocess invocation confirmed — Pineal convention is subprocess from skill handler;
_fpr_runner.pyshim redundant - 6. Methodology hash must include constructor params + registry version (silent Rule #7 failure otherwise)
- 7. Primary-tier registry →
schemas/primary_tier.json - 8. learnings.md split →
learnings/fpr.md+learnings/panel.md(concurrent append risk) - 9. Halt taxonomy unified table — 5 halt states + tags consolidated into single reference
TDOC validation artefact: spec-interpretive run (no compiled class yet). Coverage 52% < 60% threshold → INSUFFICIENT_DATA suppressed the score. Correct behaviour.
| Gate | Verdict | Note |
|---|---|---|
| (a) Rule #13 EQ hard floor | PASS | Cap inert — raw 33 already below 60 ceiling |
| (b) #3C catches Livongo $13.4bn impairment | PASS | Equity rollforward articulates cleanly |
| (c) Accrual ratio fires on FY22 (~89%) | PASS | Triggered by impairment, not operating accruals |
| (d) Variant view hindsight-clean | PASS | Three falsifiable paths, no named outcome |
| (e) UNIVERSE_MISMATCH flagged | PASS | US-listed, outside Pineal's EU mid-cap scope |
Phase 6 · v1.3.2 — TDOC-Surfaced Gaps
Two methodology gaps surfaced by the TDOC run — neither blocking Conor packaging, both legitimate refinements flagged before production.
- Rule #13 cap inertness test: TDOC failed on mechanics (33 raw) before EQ cap needed to cut. To prove Rule #13 actually bites, build a reference-portfolio test ticker where ex-ante score is ~70 (Band B) but EQ <4/10 should cap it to 60. Candidate: Palantir (PLTR) — strong top-line + FCF + balance sheet, SBC ~30% of revenue.
- Accrual ratio decomposition: split total accruals into operating (ΔWC-driven) and non-operating (impairments, restructuring, gains/losses). Apply the >10% red flag to operating only. Flag non-operating separately as
IMPAIRMENT_EVENT— informational, not a red flag. TDOC FY22 89% currently fires the red flag because of the Livongo writedown — different diagnosis, same signal. Consequential: every post-impairment distressed-value name in Pineal's hunting ground generates a false positive without the split. - Reference portfolio for scoring calibration — median = 50, includes Palantir for Band-B-EQ-fails coverage and TDOC for distressed-coverage test
Phase 7 · Live Agent — Compiled FPRModule
The transition from spec to running code. Single Python class, subprocess-invoked by Skill #10, outputs land in Pineal's research tree.
- FPRModule class — 9 methods mirroring panel_sentinel.py shape
- 6 JSON schemas — output contract for each sub-agent (Ingest, Reconstruct, Returns, Adapter, Forecast, Scoring)
- Skill #10 markdown (
.claude/skills/fpr-module/SKILL.md) — slash-command UX layer - Outputs directory:
research/fpr/· consensus line appended tocontext/learnings/fpr.md - EU filing regime adapters — IFRS, French URD, German Annual Report
- Reuters ticker suffix handling (
.L,.PA,.DE,.IR) - Cowork plugin packaging — after parity run with sandbox
- Production data tier live: Koyfin + FMP shadow harness; Bloomberg Phase 3 pending BLPAPI decision
Definition of "live": TDOC can be run via
/fpr TDOC in a Claude Code session with the plugin installed, and produce a compliant, reproducible, schema-validated output artefact without human stitching.