V4.3 Shadow Model Note
Status: V4.3-shadow task-weighted shadow model
Promoted live
This page explains release impact and readiness. It does not replace the formula spec on Methodology, the threshold detail on Appendix, the schema/download contract on Data, or the citation layer on Research.
Comparison baseline: V4.2 (pre-promotion live release used for comparison)
Required inputs ready: 4/4 (present locally for shadow scoring)
Published task-native occupations: 485 (occupations currently scored with the task-native shadow model)
Validation comparison: 2/3 (current match-or-improve gates passing)
Published shadow artifacts
The shadow layer is now auditable as data, not just as a readiness note.
Shadow scores
Per-occupation task-adjusted scores and fallback status.
Comparison summary
Score deltas, band flips, and anchor-review counts versus V4.2.
Validation comparison
BLS, family, and cluster comparisons against the live baseline.
Current shadow validation deltas: cluster -0.6667, BLS -0.0471, family -0.2037.
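The deltas above are consistent with subtracting the V4.2 baseline value from the shadow value for each check, using the figures reported in the promotion gate table. A minimal sketch of that arithmetic, assuming this subtraction order; the dictionary layout and key names are illustrative and not part of the published schema:

```python
# Shadow and baseline values as reported in the promotion gate table.
# Assumption: delta = shadow - baseline; the dict layout is illustrative only.
results = {
    "cluster": (0.3333, 1.0),     # directional accuracy (shadow, baseline)
    "bls": (-0.1908, -0.1437),    # rho (shadow, baseline)
    "family": (-0.4457, -0.242),  # rho (shadow, baseline)
}

# Round to four decimals to match the precision published on this page.
deltas = {
    name: round(shadow - baseline, 4)
    for name, (shadow, baseline) in results.items()
}
print(deltas)
```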
What Changes Already Affect Users
Now live in V4.3
- Bootstrap uncertainty intervals are published on occupations today.
- Structural risk and near-term risk are separated in the forecast layer.
- Task-primitives fields now publish weighted evidence where normalized O*NET task matches exist; sparse occupations remain explicitly null.
- The release and governance surfaces now expose shadow-model readiness instead of hiding it.
- 485 occupations now have published task-native shadow scores for comparison against the live baseline.
What the audit trail still preserves
- The V4.3 shadow model has already been promoted into the live structural release.
Remaining Input Gaps
- All required local shadow-model inputs are now present.
Input Readiness
anthropic task penetration
data/raw/external/anthropic_task_penetration.csv
onet task statements
data/raw/external/onet/Task_Statements.txt
onet task ratings
data/raw/external/onet/Task_Ratings.txt
empirical mobility
data/raw/external/sg_empirical_mobility.json
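A readiness count such as "4/4" can be reproduced by checking the four paths listed above. The sketch below assumes that "present locally" simply means the file exists under a project root; the function name `input_readiness` and the `root` parameter are hypothetical and not taken from the release tooling:

```python
from pathlib import Path

# Required input paths exactly as listed on this page.
REQUIRED_INPUTS = {
    "anthropic task penetration": "data/raw/external/anthropic_task_penetration.csv",
    "onet task statements": "data/raw/external/onet/Task_Statements.txt",
    "onet task ratings": "data/raw/external/onet/Task_Ratings.txt",
    "empirical mobility": "data/raw/external/sg_empirical_mobility.json",
}

def input_readiness(root="."):
    """Return (ready_count, total, missing_labels) for the inputs under root."""
    # Assumption: an input counts as ready when its file exists on disk.
    missing = [
        label
        for label, rel_path in REQUIRED_INPUTS.items()
        if not (Path(root) / rel_path).is_file()
    ]
    total = len(REQUIRED_INPUTS)
    return total - len(missing), total, missing

ready, total, missing = input_readiness()
print(f"Required inputs ready: {ready}/{total}; missing: {missing}")
```

Returning the missing labels alongside the count keeps a "Remaining Input Gaps" list and the readiness figure derived from the same check.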
Coverage Snapshot
Occupations: 562 (current published universe)
Direct mapped: 521 (eligible for the direct coverage gate)
Median direct matched task share: 100% (current direct-coverage gate basis)
Task-weighted share: 86% (headline score still untouched)
Promotion Gates
| Gate | Threshold | Actual | State |
|---|---|---|---|
| Median matched task weight share across direct-mapped occupations. This gate prevents a sparse task layer from directly changing the headline score before task matching is broadly comparable. | >= 0.6 | 1 | pass |
| Experimental task-adjusted score matches or improves current validation diagnostics. Requires at least 2 of 3 external checks to match or improve baseline. Current results: cluster directional accuracy 0.3333 vs 1; BLS rho -0.1908 vs -0.1437; family rho -0.4457 vs -0.242. | at_least_2_of_3 | 2/3 | pass |
| No implausible anchor label flips without written rationale. 8/8 anchors screened; 0 candidates still need editorial sign-off. | zero_unexplained_flips | 0 | pass |
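The three thresholds in the table can be read as simple predicates. The sketch below is one way to express them, assuming promotion requires every gate to pass. The function name `evaluate_gates`, the result keys, and the sample share list are hypothetical. How each external check is judged to "match or improve" is not specified here, so the count of passing checks is taken as an input:

```python
from statistics import median

def evaluate_gates(matched_shares, checks_passing, unexplained_flips):
    """Apply the three thresholds from the gate table; returns gate name -> bool."""
    return {
        # Median matched task weight share across direct-mapped occupations >= 0.6
        "median_matched_share": median(matched_shares) >= 0.6,
        # At least 2 of the 3 external validation checks match or improve baseline
        "validation_at_least_2_of_3": checks_passing >= 2,
        # No anchor label flips left without written rationale
        "zero_unexplained_flips": unexplained_flips == 0,
    }

# Hypothetical per-occupation shares; only their median (1.0) mirrors this page.
gates = evaluate_gates(
    [1.0, 1.0, 0.8, 1.0, 0.4], checks_passing=2, unexplained_flips=0
)
print(gates, "all gates pass:", all(gates.values()))
```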
What V4.3 Proved
V4.3 proved that task evidence works best as a disciplined exposure upgrade, not as a wholesale rewrite of the structural formula. The candidate task-native formulas remain published as research scaffolding for V5, not as live rules.
effective_coverage = Σ_t w_it · exposure_t · success_t
net_risk = automation_pressure_i · (1 - λ · concentration_i) · market_modifier_i
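To make the two candidate formulas concrete, the sketch below evaluates them for a single occupation. All input values are hypothetical, the tuple layout `(weight, exposure, success)` is an assumption, and the page does not state how `automation_pressure` is derived, so it is passed in as an independent number:

```python
def effective_coverage(tasks):
    """Sum of w_it * exposure_t * success_t over one occupation's tasks."""
    return sum(weight * exposure * success for weight, exposure, success in tasks)

def net_risk(automation_pressure, concentration, market_modifier, lam):
    """automation_pressure_i * (1 - lambda * concentration_i) * market_modifier_i."""
    return automation_pressure * (1 - lam * concentration) * market_modifier

# Hypothetical occupation with three tasks: (weight, exposure, success).
tasks = [(0.5, 0.8, 0.5), (0.25, 0.4, 1.0), (0.25, 0.0, 1.0)]
coverage = effective_coverage(tasks)

# Hypothetical inputs; lam stands in for the lambda parameter in the formula.
risk = net_risk(
    automation_pressure=0.6, concentration=0.5, market_modifier=1.0, lam=0.5
)
print(f"effective_coverage={coverage:.2f}, net_risk={risk:.2f}")
```

In this form a higher `concentration` lowers `net_risk` for a fixed `automation_pressure` whenever `lam` is positive, which is the dampening behaviour the second formula encodes.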
What Must Happen Next
- Keep the full V4.2 vs V4.3 comparison published so the promotion remains auditable.
- Start V5 as sidecar workstreams: augmentation heterogeneity, empirical mobility, posterior uncertainty, and realized-risk forecasting.
- Do not absorb multiple new constructs into the live score without separate sidecar validation first.
- Treat the current empirical mobility prior as supporting evidence until a higher-granularity Singapore transition dataset exists.