<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Prediction-Market Bot Postmortem | Zion Boggan</title>
<meta name="description" content="A Kalshi weather-trading bot taken from edge hypothesis to a documented, honest negative result: the evaluation harness, the era-split P&amp;L, and the payout math proving the market had no edge to find.">
<link rel="icon" href="data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 32 32'%3E%3Crect width='32' height='32' rx='6' fill='%230c0e12'/%3E%3Ctext x='16' y='22' font-family='monospace' font-size='15' fill='%236cc7b8' text-anchor='middle'%3Ezb%3C/text%3E%3C/svg%3E">
<style>
  :root{
    --bg:#0c0e12; --bg2:#0f1217; --panel:#14181f; --panel2:#171c24;
    --line:#222936; --line2:#2c3543;
    --ink:#e8eaed; --soft:#c3cad4; --muted:#8a94a3; --faint:#5d6675;
    --accent:#6cc7b8; --accent-dim:#274b47;
    --maxw:1020px;
  }
  *{box-sizing:border-box;}
  html{scroll-behavior:smooth;}
  body{margin:0;background:var(--bg);color:var(--ink);
    font-family:-apple-system,BlinkMacSystemFont,"Segoe UI",Roboto,Helvetica,Arial,sans-serif;
    font-size:16px;line-height:1.65;-webkit-font-smoothing:antialiased;}
  .mono{font-family:ui-monospace,SFMono-Regular,"SF Mono",Menlo,Consolas,monospace;}
  a{color:var(--accent);text-decoration:none;}
  a:hover{color:#8fe0d2;}
  .wrap{max-width:var(--maxw);margin:0 auto;padding:0 24px;}
 
  /* nav */
  nav{position:sticky;top:0;z-index:20;background:rgba(12,14,18,.82);
    backdrop-filter:blur(10px);border-bottom:1px solid var(--line);}
  nav .wrap{display:flex;align-items:center;justify-content:space-between;height:58px;}
  nav .brand{font-weight:600;letter-spacing:.2px;}
  nav .brand .dot{color:var(--accent);}
  nav .links{display:flex;gap:26px;font-size:13.5px;}
  nav .links a{color:var(--muted);}
  nav .links a:hover{color:var(--ink);}
  @media(max-width:680px){nav .links{display:none;}}
 
  /* hero */
  header.hero{padding:74px 0 54px;border-bottom:1px solid var(--line);
    background:radial-gradient(900px 380px at 78% -10%, #11201e 0%, transparent 60%);}
  .avail{font-size:12.5px;letter-spacing:1.5px;text-transform:uppercase;color:var(--accent);
    display:flex;align-items:center;gap:9px;margin-bottom:20px;}
  .avail .pulse{width:7px;height:7px;border-radius:50%;background:var(--accent);
    box-shadow:0 0 0 0 rgba(108,199,184,.5);animation:p 2.4s infinite;}
  @keyframes p{0%{box-shadow:0 0 0 0 rgba(108,199,184,.45)}70%{box-shadow:0 0 0 8px rgba(108,199,184,0)}100%{box-shadow:0 0 0 0 rgba(108,199,184,0)}}
  h1{font-size:clamp(34px,6vw,52px);line-height:1.05;margin:0 0 8px;letter-spacing:-1px;font-weight:680;}
  .hero .sub{font-size:clamp(16px,2.4vw,20px);color:var(--soft);margin:0 0 24px;font-weight:500;}
  .hero .lede{max-width:660px;color:var(--soft);font-size:17px;margin:0 0 28px;}
  .hero .lede b{color:var(--ink);font-weight:600;}
  .cta{display:flex;flex-wrap:wrap;gap:12px;align-items:center;}
  .btn{display:inline-flex;align-items:center;gap:8px;padding:10px 18px;border-radius:8px;
    font-size:14.5px;font-weight:550;border:1px solid var(--line2);color:var(--ink);background:var(--panel);}
  .btn:hover{border-color:var(--accent-dim);background:var(--panel2);color:var(--ink);}
  .btn.primary{background:var(--accent);color:#06231f;border-color:var(--accent);font-weight:650;}
  .btn.primary:hover{background:#8fe0d2;color:#06231f;}
  .meta{margin-top:26px;display:flex;flex-wrap:wrap;gap:8px 22px;font-size:13px;color:var(--muted);}
  .meta .mono{color:var(--faint);}
 
  /* sections */
  section{padding:64px 0;border-bottom:1px solid var(--line);}
  .shead{display:flex;align-items:baseline;gap:14px;margin-bottom:30px;}
  .shead .idx{font-size:13px;color:var(--accent);letter-spacing:1px;}
  .shead h2{font-size:14px;letter-spacing:2px;text-transform:uppercase;color:var(--muted);margin:0;font-weight:600;}
  .shead .rule{flex:1;height:1px;background:var(--line);}
 
  /* flagship */
  .flag{background:linear-gradient(180deg,var(--panel) 0%,var(--bg2) 100%);
    border:1px solid var(--line2);border-radius:14px;overflow:hidden;}
  .flag .top{padding:30px 32px 8px;}
  .flag .tag{font-size:12px;letter-spacing:1.5px;text-transform:uppercase;color:var(--accent);margin-bottom:12px;}
  .flag h3{font-size:27px;margin:0 0 6px;letter-spacing:-.4px;}
  .flag h3 .v{font-size:13px;color:var(--muted);font-weight:500;margin-left:8px;letter-spacing:0;}
  .flag .grid{display:grid;grid-template-columns:1.25fr 1fr;gap:30px;padding:14px 32px 30px;}
  .flag p{color:var(--soft);margin:0 0 16px;}
  .flag .stats{display:grid;grid-template-columns:1fr 1fr;gap:12px;margin-top:6px;}
  .stat{background:var(--bg);border:1px solid var(--line);border-radius:9px;padding:13px 15px;}
  .stat .n{font-size:21px;font-weight:680;color:var(--ink);}
  .stat .k{font-size:12px;color:var(--muted);margin-top:2px;}
  .spec{background:var(--bg);border:1px solid var(--line);border-radius:10px;padding:18px 18px;}
  .spec .sk{font-size:11px;letter-spacing:1.5px;text-transform:uppercase;color:var(--faint);margin-bottom:10px;}
  .spec ul{margin:0;padding:0;list-style:none;font-size:13.5px;}
  .spec li{padding:6px 0;border-top:1px solid var(--line);color:var(--soft);display:flex;justify-content:space-between;gap:14px;}
  .spec li:first-child{border-top:none;}
  .spec li span{color:var(--muted);}
  .flag .foot{padding:0 32px 28px;display:flex;gap:18px;flex-wrap:wrap;font-size:14px;}
  @media(max-width:720px){.flag .grid{grid-template-columns:1fr;}}
 
  /* lab cards */
  .cards{display:grid;grid-template-columns:1fr 1fr;gap:20px;}
  @media(max-width:680px){.cards{grid-template-columns:1fr;}}
  .card{border:1px solid var(--line);border-radius:12px;overflow:hidden;background:var(--panel);
    display:flex;flex-direction:column;transition:border-color .15s,transform .15s;}
  .card:hover{border-color:var(--accent-dim);transform:translateY(-2px);}
  .card .thumb{height:172px;overflow:hidden;border-bottom:1px solid var(--line);background:#fff;}
  .card .thumb img{width:100%;height:100%;object-fit:cover;object-position:top left;display:block;}
  .card .body{padding:18px 20px 20px;display:flex;flex-direction:column;flex:1;}
  .card h3{margin:0 0 9px;font-size:17px;}
  .card p{margin:0 0 14px;font-size:14px;color:var(--soft);flex:1;}
  .tags{display:flex;flex-wrap:wrap;gap:6px;margin-bottom:14px;}
  .tags span{font-size:11.5px;color:var(--muted);background:var(--bg);border:1px solid var(--line);
    border-radius:5px;padding:3px 8px;}
  .card .lnk{font-size:13.5px;font-family:ui-monospace,Menlo,monospace;}
  .card .lnk::after{content:" →";}
 
  /* research */
  .rlede{color:var(--soft);max-width:680px;margin:-6px 0 26px;}
  .research{display:flex;flex-direction:column;gap:0;border:1px solid var(--line);border-radius:12px;overflow:hidden;}
  .ritem{display:grid;grid-template-columns:120px 1fr auto;gap:18px;align-items:center;
    padding:18px 22px;border-top:1px solid var(--line);}
  .ritem:first-child{border-top:none;}
  .ritem:hover{background:var(--panel);}
  .ritem .cls{font-size:11px;letter-spacing:.5px;text-transform:uppercase;color:var(--accent);}
  .ritem h3{margin:0 0 3px;font-size:16px;}
  .ritem p{margin:0;font-size:13.5px;color:var(--muted);}
  .ritem .go{font-family:ui-monospace,Menlo,monospace;font-size:13px;white-space:nowrap;}
  @media(max-width:680px){.ritem{grid-template-columns:1fr;gap:6px;}.ritem .go{margin-top:4px;}}
  .progs{margin-top:22px;}
  .progs .sk{font-size:11px;letter-spacing:1.5px;text-transform:uppercase;color:var(--faint);margin-bottom:11px;}
  .progs .row{display:flex;flex-wrap:wrap;gap:7px;}
  .progs .row span{font-size:12.5px;color:var(--soft);background:var(--panel);border:1px solid var(--line);
    border-radius:6px;padding:4px 10px;}
 
  /* credentials */
  .cred{display:grid;grid-template-columns:1.1fr 1fr;gap:28px;}
  @media(max-width:680px){.cred{grid-template-columns:1fr;}}
  .cred p{color:var(--soft);margin:0 0 14px;}
  .cred .role{font-size:14px;color:var(--muted);}
  .cred .role b{color:var(--ink);font-weight:600;}
  .certs{list-style:none;margin:0;padding:0;}
  .certs li{padding:9px 0;border-top:1px solid var(--line);font-size:14px;color:var(--soft);
    display:flex;gap:10px;align-items:baseline;}
  .certs li:first-child{border-top:none;}
  .certs li .c{color:var(--accent);font-family:ui-monospace,Menlo,monospace;font-size:12px;}
 
  footer{padding:46px 0 64px;}
  footer .row{display:flex;flex-wrap:wrap;justify-content:space-between;gap:18px;align-items:center;}
  footer .links a{color:var(--soft);margin-right:20px;font-size:14px;}
  footer .note{color:var(--faint);font-size:12.5px;max-width:520px;}
 
  /* detail pages */
  .detail-hero{padding:40px 0 28px;}
  .back{display:inline-block;font-size:13px;color:var(--muted);margin-bottom:22px;font-family:ui-monospace,Menlo,monospace;}
  .back:hover{color:var(--ink);}
  .kicker{font-size:12px;letter-spacing:2px;text-transform:uppercase;color:var(--accent);margin-bottom:13px;font-family:ui-monospace,Menlo,monospace;}
  .detail-hero h1{font-size:clamp(28px,5vw,42px);margin:0 0 12px;letter-spacing:-.6px;}
  .detail-hero .tagline{font-size:clamp(16px,2.2vw,19px);color:var(--soft);max-width:780px;margin:0 0 18px;}
  .facts{display:grid;grid-template-columns:repeat(auto-fit,minmax(148px,1fr));gap:12px;margin-top:24px;}
  figure{margin:0;}
  .shot{border:1px solid var(--line2);border-radius:12px;overflow:hidden;background:#fff;margin:30px 0 6px;}
  .shot img,.shot video{display:block;width:100%;height:auto;}
  figcaption{font-size:13px;color:var(--muted);margin:11px 2px 0;}
  .content{padding:6px 0 0;}
  .content h2{font-size:13px;letter-spacing:2px;text-transform:uppercase;color:var(--muted);margin:44px 0 16px;font-weight:600;border-top:1px solid var(--line);padding-top:30px;}
  .content h2.first{border-top:none;padding-top:6px;margin-top:18px;}
  .content p{color:var(--soft);margin:0 0 16px;}
  .content ul,.content ol{color:var(--soft);margin:0 0 16px;padding-left:22px;}
  .content li{margin:6px 0;}
  .content strong{color:var(--ink);font-weight:600;}
  .content code{font-family:ui-monospace,Menlo,monospace;font-size:13px;background:var(--panel2);border:1px solid var(--line);border-radius:4px;padding:1px 5px;color:var(--soft);}
  .content pre{background:var(--bg2);border:1px solid var(--line2);border-radius:10px;padding:15px 18px;overflow-x:auto;margin:0 0 18px;}
  .content pre code{background:none;border:none;padding:0;font-size:12.5px;color:var(--soft);line-height:1.62;}
  .content table{width:100%;border-collapse:collapse;margin:2px 0 20px;font-size:13.5px;}
  .content th{text-align:left;color:var(--muted);font-weight:600;border-bottom:1px solid var(--line2);padding:9px 12px;font-size:11px;letter-spacing:.6px;text-transform:uppercase;}
  .content td{color:var(--soft);border-bottom:1px solid var(--line);padding:9px 12px;vertical-align:top;}
  .content td code{font-size:12px;}
  .gallery{margin-top:8px;}
  .repo-line{margin:42px 0 0;color:var(--faint);font-size:12.5px;font-family:ui-monospace,Menlo,monospace;}
</style>
<link rel="canonical" href="https://zionboggan.com/prediction-market-bot-postmortem/">
<meta name="author" content="Zion Boggan">
<meta name="robots" content="index, follow, max-image-preview:large">
<meta property="og:type" content="article">
<meta property="og:site_name" content="Zion Boggan">
<meta property="og:title" content="Prediction-Market Bot Postmortem | Zion Boggan">
<meta property="og:description" content="A Kalshi weather-trading bot taken from edge hypothesis to a documented, honest negative result: the evaluation harness, the era-split P&amp;L, and the payout math proving the market had no edge to find.">
<meta property="og:url" content="https://zionboggan.com/prediction-market-bot-postmortem/">
<meta property="og:image" content="https://zionboggan.com/assets/og-default.png">
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Prediction-Market Bot Postmortem | Zion Boggan">
<meta name="twitter:description" content="A Kalshi weather-trading bot taken from edge hypothesis to a documented, honest negative result: the evaluation harness, the era-split P&amp;L, and the payout math proving the market had no edge to find.">
<meta name="twitter:image" content="https://zionboggan.com/assets/og-default.png">
<script type="application/ld+json">{"@context":"https://schema.org","@type":"TechArticle","headline":"Prediction-Market Bot Postmortem","description":"A Kalshi weather-trading bot taken from edge hypothesis to a documented, honest negative result: the evaluation harness, the era-split P&L, and the payout math proving the market had no edge to find.","url":"https://zionboggan.com/prediction-market-bot-postmortem/","image":"https://zionboggan.com/assets/og-default.png","author":{"@type":"Person","name":"Zion Boggan","url":"https://zionboggan.com"},"publisher":{"@type":"Person","name":"Zion Boggan"}}</script>
</head>
<body>
<nav><div class="wrap">
  <a class="brand mono" href="/" style="color:var(--ink)">zion_boggan<span class="dot">.</span></a>
  <span class="links">
    <a href="/#oversight">Oversight</a>
    <a href="/#labs">Labs</a>
    <a href="/#research">Research</a>
    <a href="/#background">Background</a>
    <a href="/">Home</a>
  </span>
</div></nav>
<header class="hero detail-hero"><div class="wrap">
  <a class="back" href="/#labs">&larr; All work</a>
  <div class="kicker">MARKETS / QUANT</div>
  <h1>Prediction-Market Bot Postmortem</h1>
  <p class="tagline">A Kalshi weather-trading bot taken from edge hypothesis to a documented, honest negative result: the evaluation harness, the era-split P&amp;L, and the payout math proving the market had no edge to find.</p>
  <div class="tags"><span>Quant</span><span>Backtesting</span><span>Evaluation harness</span><span>Honest negative result</span><span>Brier score</span><span>Expected value</span><span>Kalshi</span><span>Market microstructure</span><span>Walk-forward</span></div>
  <div class="facts"><div class="stat"><div class="n">138</div><div class="k">settled live trades audited</div></div><div class="stat"><div class="n">44.9%</div><div class="k">actual bracket hit rate (62/138)</div></div><div class="stat"><div class="n">66.2%</div><div class="k">break-even win rate required</div></div><div class="stat"><div class="n">0.51</div><div class="k">realized reward:risk ratio</div></div><div class="stat"><div class="n">−$160.72</div><div class="k">pre-fix-era P&amp;L over 97 trades</div></div><div class="stat"><div class="n">0.37</div><div class="k">Gaussian Brier (0.25 = coin flip)</div></div></div>
  <div class="cta" style="margin-top:24px"></div>
</div></header>
<section><div class="wrap">
  
  <div class="content">
  <h2 class="first">The hypothesis</h2>
<p>The premise was that a 31-member GFS ensemble plus the NWS point forecast could out-predict retail traders on Kalshi weather contracts, and that a Gaussian probability model fed by that ensemble would find mispriced single-degree temperature brackets to bet NO on. The earlier research notes pushed this hard: a variable Kalshi fee model (<code>ceil(0.07 × contracts × price × (1−price))</code> instead of a flat $0.05), extremized log-odds aggregation of ensemble + NWS + base rates, GFS run-timing awareness (data lands ~3.5h after 00/06/12/18Z), and explicit longshot-bias avoidance, all aimed at squeezing a 7-9% edge past the fee threshold.</p><p>The forecast pipeline was not the problem: live testing confirmed the ensemble ran correctly, with 31 members and a ~1.7°F spread. The problem was the market the model was pointed at. The post-2024 Kalshi regime change is unforgiving: after quarterly volume exploded from $30M to $820M, professional market makers entered and takers now lose on average. Any edge had to come from genuinely better information, and on single-degree brackets there was none to be had.</p>
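<p>As a rough illustration of the variable fee formula quoted above, here is a minimal Python sketch. It assumes the <code>ceil</code> rounds up to the next whole cent and that prices are in dollars; the name <code>variable_taker_fee</code> is hypothetical and is not the repository's <code>kalshi_taker_fee</code>.</p>

```python
import math

def variable_taker_fee(price, contracts=1, rate=0.07):
    """Fee in dollars: rate * contracts * price * (1 - price), rounded up to a cent.

    Assumption: the ceil in the quoted formula applies at cent granularity.
    """
    raw_cents = rate * contracts * price * (1.0 - price) * 100.0
    # Round to 9 places first so float noise does not push the ceil up by a cent.
    return math.ceil(round(raw_cents, 9)) / 100.0

for px in (0.10, 0.50, 0.90):
    print(px, variable_taker_fee(px))
```

<p>Under that rounding assumption a single contract costs $0.02 at a $0.50 price and $0.01 at $0.10, so the fee peaks mid-book and shrinks toward the tails.</p>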
<h2>The evaluation harness</h2>
<p>The repository ships the part worth keeping. The walk-forward backtester (<code>empirical_analysis.py</code>, standard library only) replays the committed trade dataset chronologically: for trade <em>i</em> it trains a Laplace-smoothed empirical P(hit) model on trades 0..i−1 only, compares it against the live Gaussian, and reports Brier, win rate, EV, and total P&amp;L at four edge thresholds. The strict no-lookahead split is the whole point: it is what separates a real backtest from curve-fitting.</p><pre>import math

def gaussian_p_hit(nws, lo, hi, mae=DEFAULT_MAE_FALLBACK):
    # P(temperature lands in [lo, hi]) under a normal centred on the NWS forecast.
    # DEFAULT_MAE_FALLBACK is a module-level constant defined elsewhere in the script.
    sigma = mae * math.sqrt(math.pi / 2.0)  # a normal's MAE is sigma * sqrt(2/pi)
    z_lo = (lo - nws) / sigma
    z_hi = (hi - nws) / sigma
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2)))
    return max(0.0, cdf(z_hi) - cdf(z_lo))
 
def brier(pred, outcome):
    return (pred - outcome) ** 2  # squared error: 0 is perfect, 0.25 is a constant 50% guess</pre><p>The decision-gate evaluator (<code>c4_eval.py</code>) runs unattended on cron, with no Claude session in the loop: it pulls shadow predictions, backfills outcomes from the Kalshi API, applies a hard liquidity filter, and scores Brier plus post-fee EV against criteria written down in advance. The EV and verdict logic is verbatim:</p><pre>fee = main.kalshi_taker_fee(px)
won = (outcome == 1) if side_yes else (outcome == 0)
ev = ((1.0 - px) - fee) if won else (-(px + fee))
evs.append(ev)
...
cell_ok = bool(best_cell and best_cell[1] >= 10 and best_cell[3] > 0)
passed = (ev_mean &gt; 0) and (brier &lt; BRIER_GATE) and cell_ok</pre>
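<p>The no-lookahead rule can be sketched in a few lines. This is a minimal illustration, not the code in <code>empirical_analysis.py</code>: the single pooled bucket, the <code>alpha=1</code> smoothing constant, and the names <code>laplace_p_hit</code> and <code>walk_forward_brier</code> are all assumptions made for the example.</p>

```python
def laplace_p_hit(hits, n, alpha=1.0):
    """Laplace-smoothed hit rate: (hits + alpha) / (n + 2 * alpha)."""
    return (hits + alpha) / (n + 2.0 * alpha)

def walk_forward_brier(outcomes):
    """Score each trade with a model fit only on strictly earlier trades.

    outcomes: chronological list of 0/1 bracket results.
    Returns the mean Brier score of the walk-forward predictions.
    """
    hits = 0
    total = 0.0
    for i, outcome in enumerate(outcomes):
        pred = laplace_p_hit(hits, i)      # fit on trades 0..i-1 only
        total += (pred - outcome) ** 2     # Brier contribution of trade i
        hits += outcome                    # reveal the outcome only after scoring
    return total / len(outcomes)

print(walk_forward_brier([1, 0, 0, 1, 1, 0]))
```

<p>The ordering inside the loop is the design choice that matters: the prediction for trade <em>i</em> is made before its outcome is added to the counts, so no trade ever informs its own score.</p>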
<h2>What the backtest showed</h2>
<p>Two cuts settle the case. First, the calibration table: empirical P(hit) binned by distance of the NWS forecast from the bracket midpoint, against what the Gaussian predicted. The Gaussian is wrong in the same direction everywhere, badly underestimating the hit rate by 0.24 to 0.65:</p><table><thead><tr><th>|forecast−mid| &lt;</th><th>n</th><th>emp P(hit)</th><th>Gaussian P(hit)</th><th>gap</th></tr></thead><tbody><tr><td>1.5°</td><td>4</td><td>0.750</td><td>0.105</td><td>+0.645</td></tr><tr><td>3.5°</td><td>60</td><td>0.350</td><td>0.085</td><td>+0.265</td></tr><tr><td>5.0°</td><td>28</td><td>0.429</td><td>0.062</td><td>+0.367</td></tr><tr><td>8.0°</td><td>12</td><td>0.667</td><td>0.029</td><td>+0.638</td></tr></tbody></table><p>The Gaussian Brier across all resolved trades is <strong>0.3705</strong>, worse than the 0.25 a blind coin flip would score. Second, the walk-forward P&amp;L at four edge thresholds: the Gaussian loses at every band, and the per-trade loss only deepens as you demand more edge (because the model's confidence is anti-correlated with reality):</p><table><thead><tr><th>min_edge</th><th>trades</th><th>WR</th><th>P&amp;L</th><th>EV/trade</th><th>Brier</th></tr></thead><tbody><tr><td>0.45</td><td>5</td><td>20.0%</td><td>−$8.05</td><td>−$1.610</td><td>0.7345</td></tr><tr><td>0.35</td><td>20</td><td>35.0%</td><td>−$29.32</td><td>−$1.466</td><td>0.5785</td></tr><tr><td>0.25</td><td>75</td><td>52.0%</td><td>−$104.15</td><td>−$1.389</td><td>0.4259</td></tr><tr><td>0.15</td><td>97</td><td>56.7%</td><td>−$76.81</td><td>−$0.792</td><td>0.3825</td></tr></tbody></table><p>The walk-forward loss of −$104.15 at <code>min_edge=0.25</code> approximately reproduces the bot's live −$94 in that band (the drift is data-join and unclamped-era accounting). The empirical model is better calibrated (Brier 0.2554 vs 0.3748 on all resolved) but it almost never finds a tradeable edge, which is itself the finding: there is nothing to trade.</p>
<h2>Why the edge died</h2>
<p>An era-split of the P&amp;L was the decisive cut. Nearly 100% of the lifetime loss happened before the 2026-04-21 overconfidence-clamp patch; afterwards the bot was essentially break-even, not profitable. The −$157 total and 48% drawdown on the cumulative chart were old damage, not fresh losses.</p><table><thead><tr><th>Era</th><th>Trades</th><th>Avg ensemble prob</th><th>P&amp;L</th><th>EV/trade</th></tr></thead><tbody><tr><td>Pre-Apr-21 (unclamped Gaussian)</td><td>97</td><td>0.01-0.14</td><td>−$160.72</td><td>−$1.66</td></tr><tr><td>Post-Apr-21 (MAE-σ floor active)</td><td>41</td><td>~0.10</td><td>+$3.06</td><td>≈ $0.00</td></tr></tbody></table><p>The payout math explains why break-even was the ceiling. The narrow brackets hit <strong>62/138 = 44.9%</strong> of the time, near coin flips. The realized reward:risk was <strong>0.51</strong> (average win +$3.32, average loss −$6.46), which demands a break-even win rate of <code>1 / (1 + 0.51) ≈ 66.2%</code>. The bot's actual win rate was <strong>54.3%</strong>. You cannot make money betting NO on near-coin-flip events when the payout structure requires a 66% win rate. The Apr-21 MAE-σ floor crudely clamped the model's hit probability up to ~10%, which capped the catastrophic overconfidence but could never manufacture an edge that the market does not contain. Single-degree brackets sit below NWS/ensemble forecast resolution, and Kalshi prices them efficiently.</p>
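<p>The break-even figure follows directly from the reward:risk ratio. A short check using only the averages quoted above (the function names are illustrative, not taken from the repository):</p>

```python
def break_even_win_rate(reward_risk):
    """Win rate p at which p * reward equals (1 - p) * risk."""
    return 1.0 / (1.0 + reward_risk)

def ev_per_trade(win_rate, avg_win, avg_loss):
    """Expected P&L per trade at a given win rate; avg_loss is a positive dollar amount."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss

p_star = break_even_win_rate(0.51)           # reward:risk quoted in the text
print(round(p_star, 3))
print(round(ev_per_trade(0.543, 3.32, 6.46), 2))
```

<p>With the quoted inputs the required win rate lands at about 66%, and the expectation at a 54.3% win rate is negative: the gap between the two rates is the whole story.</p>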
<h2>The fixes that did and didn&#x27;t work</h2>
<p>Several fixes were tried across the bot's life; the honest accounting is mixed, and one fix was deployed against a bug that never existed.</p><ul><li><strong>Worked:</strong> the MAE-σ floor. It stopped the pre-Apr-21 catastrophic bleed by clamping the Gaussian's hit-probability floor. Do not remove it: removing it reproduces the −$160 era. But it produced break-even, not profit.</li><li><strong>Worked:</strong> the variable Kalshi fee formula. The flat $0.05 estimate was 2.5-5× too high on mid-priced contracts, which had been silently rejecting trades with 7-9% true edge. Correcting the fee math is real, but it only matters if an edge exists to clear it.</li><li><strong>Worked, narrowly:</strong> the hybrid bracket probability <code>max(Gaussian, raw_count±0.5°F)</code>, which caught converged-ensemble cases the Gaussian smeared out (Chicago: Gaussian said 31%, raw count showed 74%).</li><li><strong>Didn't work / rejected:</strong> a ±2°F NWS-distance guard blocked 7 winners and 0 losses for −$16.19 net; NWS distance is not a predictor of bracket failure. A METAR entry filter was dead code: trades are placed 12-30h before observations become informative.</li><li><strong>The fix against a non-problem:</strong> the 2026-04-27 audit blamed a dead <code>OPENMETEO_PROXY</code> node and fixed it. The ensemble pipeline was never dead. That memory entry is now flagged invalid.</li></ul><p>The root cause of the misdiagnosis loop was a logging gap. The <code>INSERT INTO trades</code> statement omitted three diagnostic columns (<code>raw_ensemble_probability</code>, <code>model_count</code>, <code>models_used</code>), so every row showed <code>model_count = 1</code> and a NULL ensemble probability. Three separate audits (an earlier one and the first two passes of this one) read that and concluded the 31-member ensemble was dead. It was not; the columns were simply never written. The bug never cost a cent of P&amp;L, but it cost three audit cycles and one deployed fix on a non-problem. The oscillation is preserved in the writeup rather than smoothed over.</p>
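<p>The hybrid bracket probability from the list above can be sketched as follows. This is one plausible reading, not the bot's code: it assumes <code>raw_count±0.5°F</code> means the share of ensemble members inside the bracket widened by 0.5°F on each side, and the member values, <code>sigma</code>, and all function names are hypothetical.</p>

```python
import math

def gaussian_bracket_prob(mean, sigma, lo, hi):
    """Normal-CDF mass between lo and hi."""
    cdf = lambda x: 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))
    return max(0.0, cdf(hi) - cdf(lo))

def raw_count_prob(members, lo, hi, pad=0.5):
    """Share of ensemble members inside the bracket widened by pad on each side."""
    inside = sum(1 for m in members if hi + pad >= m >= lo - pad)
    return inside / len(members)

def hybrid_bracket_prob(members, sigma, lo, hi):
    """Take the larger of the smooth Gaussian estimate and the raw member count."""
    mean = sum(members) / len(members)
    return max(gaussian_bracket_prob(mean, sigma, lo, hi),
               raw_count_prob(members, lo, hi))

# Hypothetical converged ensemble: most members sit in or near a 71-72°F bracket.
members = [71.2, 71.4, 71.5, 71.6, 71.8, 71.9, 72.3, 73.0]
print(hybrid_bracket_prob(members, sigma=2.0, lo=71.0, hi=72.0))
```

<p>When the members cluster tightly, a wide <code>sigma</code> spreads the Gaussian mass far outside the bracket, and the raw count is the term that wins the <code>max</code>.</p>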
<h2>What I&#x27;d do differently</h2>
<p>The cheapest thing you can do before shipping a bot is build the evaluator first, in shadow mode, with the gate criteria written down <em>before</em> you look at the numbers, then build the strategy. The unbuilt pivot spec encodes that discipline. The first shadow scans immediately exposed why a naive restart would just repeat the bleed: most logged markets were px = $0.01 with the model claiming 0.18-0.64 edge. These are deep-longshot illiquid contracts where the huge edges are model-overconfidence artifacts, not alpha. Hence the hard liquidity filter (px ≥ $0.10, volume ≥ 20) before any EV is computed at all.</p><ul><li><strong>Select the market first.</strong> Verify a payout structure can clear a defensible win rate before tuning any model. The pivot kills narrow brackets entirely (<code>MIN_BRACKET_WIDTH = 5.0°F</code>) and only trades threshold markets when <code>|forecast − threshold| ≥ 1.5 × city_MAE</code>, the zones where NWS genuinely beats retail.</li><li><strong>Gate restart on a no-capital evaluation.</strong> A pre-committed rule: ≥30 resolved liquid shadow predictions, Brier &lt; 0.25, clearly positive post-fee EV, holding in the highest-volume city/market-type cell (not one lucky cluster). A structural <code>kalshi_place_order()</code> no-op under <code>SHADOW_MODE</code> makes risking capital impossible by construction, not by a single boolean.</li><li><strong>Treat a no-edge market as a stop signal.</strong> For a strategy with no edge, not trading is the correct play. There is zero historical data on the wider/threshold markets, so the pivot cannot be backtested; it requires a shadow data-collection window before any capital. With that appetite absent, the bot was retired. That is the right answer, and the framework is what made it defensible.</li></ul>
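<p>The liquidity filter and the pre-committed gate can be expressed compactly. The sketch below is illustrative only: the record fields, the function names, and the reading of "clearly positive" as strictly greater than zero are assumptions, and the per-cell check is left out. The thresholds (px ≥ $0.10, volume ≥ 20, ≥30 predictions, Brier &lt; 0.25) are the ones stated above.</p>

```python
MIN_PRICE = 0.10      # dollars
MIN_VOLUME = 20       # contracts
MIN_RESOLVED = 30     # resolved liquid shadow predictions required
BRIER_GATE = 0.25

def is_liquid(px, volume):
    """Hard filter applied before any EV is computed."""
    return px >= MIN_PRICE and volume >= MIN_VOLUME

def gate_passes(records):
    """records: dicts with px, volume, pred, outcome (0/1) and ev (post-fee dollars)."""
    liquid = [r for r in records if is_liquid(r["px"], r["volume"])]
    if MIN_RESOLVED > len(liquid):
        return False  # not enough evidence yet; keep collecting in shadow mode
    brier = sum((r["pred"] - r["outcome"]) ** 2 for r in liquid) / len(liquid)
    ev_mean = sum(r["ev"] for r in liquid) / len(liquid)
    return BRIER_GATE > brier and ev_mean > 0

# Hypothetical shadow log: 30 liquid, well-calibrated, positive-EV predictions.
good = [{"px": 0.5, "volume": 50, "pred": 0.9, "outcome": 1, "ev": 0.1}] * 30
print(gate_passes(good))
```

<p>Filtering before scoring matters: a penny-priced longshot with a large claimed edge never reaches the Brier or EV sums, so it cannot flatter the verdict.</p>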
  </div>
  
  <p class="repo-line">Repository &middot; github.com/zionboggan/prediction-market-bot-postmortem</p>
</div></section>
<footer><div class="wrap row">
  <div class="links">
    <a href="/">Portfolio</a>
    <a href="https://www.linkedin.com/in/zion-boggan">LinkedIn</a>
    <a href="https://oversightprotocol.dev/">Oversight</a>
    <a href="mailto:zionboggan0@gmail.com">Email</a>
  </div>
  <div class="note">Built and deployed on a self-hosted Proxmox homelab. This page mirrors the
  project's documentation and results so the work is fully viewable here.</div>
</div></footer>
</body>
</html>