Day 17 — JSON Schemas: Pydantic Models for Trade Records

Mar 15, 2026

What You Will Build Today

By the end of this lesson you will have a working system that takes raw, unpredictable JSON from a real broker API and turns it into clean, validated, strictly typed Python objects, the kind of foundation that every serious trading system depends on. You will also understand why getting this wrong destroys accounts, and exactly how to get it right.

Learning Goals

Understand why raw dictionaries are dangerous in financial systems
Build a TradeRecord model using Pydantic v2 with strict financial invariants
Implement a thread-safe, memory-bounded TradeLog
Connect your schema to real Alpaca Paper Trading API responses
Measure validation performance and set meaningful production alerts

Part 1 — The “Dict Soup” Trap

Let’s start with what a beginner — or honestly, a lot of intermediate programmers — would write when they first get fill notifications from a broker API.

import json

def process_fill(raw: str) -> dict:
    record = json.loads(raw)
    pnl = (record["filled_avg_price"] - record["cost_basis"]) * record["filled_qty"]
    return {"symbol": record["symbol"], "pnl": pnl}

That code looks completely fine. It is clean, readable, and short. It passes unit tests on synthetic data. It clears the backtest. Then it hits a real Friday afternoon near market close during an earnings release, and the payloads stop looking like your test data:

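filled_qty comes back as "10" (a string, not a number). If the price difference is a float, a TypeError fires. If it happens to be a whole number, Python silently repeats the string instead of multiplying (2 * "10" is "1010"). The P&L is garbage, and nothing crashes to warn you.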
cost_basis is null on a short position. A TypeError fires. The process dies mid-trade.
filled_avg_price is 0.0 on a partial fill still pending settlement. A stop-loss triggers on a phantom price.
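
The first two failure modes are easy to reproduce. Here is a minimal sketch that reuses process_fill from above with hand-written payloads. The symbols and prices are made up for the demo, not captured from a real broker response, and the outcome helper is a hypothetical convenience for showing the error text.

```python
import json

def process_fill(raw: str) -> dict:
    record = json.loads(raw)
    pnl = (record["filled_avg_price"] - record["cost_basis"]) * record["filled_qty"]
    return {"symbol": record["symbol"], "pnl": pnl}

def outcome(raw: str):
    """Hypothetical helper: return the result, or the TypeError message as text."""
    try:
        return process_fill(raw)
    except TypeError as exc:
        return f"TypeError: {exc}"

# Illustrative payloads, hand-written for this demo.
string_qty_int_prices = '{"symbol": "AAPL", "filled_avg_price": 102, "cost_basis": 100, "filled_qty": "10"}'
string_qty_float_prices = '{"symbol": "AAPL", "filled_avg_price": 101.5, "cost_basis": 100.0, "filled_qty": "10"}'
null_cost_basis = '{"symbol": "AAPL", "filled_avg_price": 101.5, "cost_basis": null, "filled_qty": 10}'

# int * str is string repetition: the "P&L" is the text "1010", and nothing raises.
silent = outcome(string_qty_int_prices)

# float * str and float - None both raise TypeError.
loud_qty = outcome(string_qty_float_prices)
loud_basis = outcome(null_cost_basis)

print(silent)
print(loud_qty)
print(loud_basis)
```

The silent case is the dangerous one: the correct P&L is 20, the function returns the string "1010", and the caller has no reason to suspect anything until that value reaches arithmetic somewhere else.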

None of these are bugs in your code in the traditional sense. There is no syntax error, and the arithmetic is what you intended. The problem is that you let unvalidated data, sent by a third party over a network and subject to undocumented changes, flow directly into financial math. That is the bug.

Part 2 — Why This Breaks: The Missing Validation Boundary

Here is the deeper technical problem. Alpaca’s v2/orders endpoint has historically returned:

filled_qty as both string and float depending on API version and order type
null for filled_avg_price on GTD orders that have not triggered yet
ISO 8601 timestamps with and without millisecond precision depending on the event source

A plain Python dict enforces nothing. By the time bad data surfaces five call frames deep, the original payload is gone and the stack trace tells you nothing useful. You are debugging a symptom, not the cause.
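
To make the idea of a validation boundary concrete before we bring in Pydantic, here is a standard-library-only sketch. Everything named in it is an assumption for illustration: Fill and parse_fill are hypothetical names (this is not the TradeRecord model this lesson builds), and the payload is reduced to four fields. The point is only that every payload is normalized or rejected in one place, while the raw text is still available for the error message.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Fill:  # hypothetical name, not the lesson's TradeRecord
    symbol: str
    filled_qty: Decimal
    filled_avg_price: Optional[Decimal]  # None when the payload carries null
    filled_at: datetime


def parse_fill(raw: str) -> Fill:
    """Normalize one payload at the boundary, or raise ValueError quoting the payload."""
    try:
        data = json.loads(raw)
        price = data["filled_avg_price"]
        return Fill(
            symbol=str(data["symbol"]),
            # str() first, so "10", 10 and 10.0 all go through the same Decimal parse
            filled_qty=Decimal(str(data["filled_qty"])),
            filled_avg_price=None if price is None else Decimal(str(price)),
            # datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+
            filled_at=datetime.fromisoformat(data["filled_at"].replace("Z", "+00:00")),
        )
    except (KeyError, TypeError, AttributeError, ValueError, ArithmeticError) as exc:
        # The raw payload travels with the error, so the cause is visible at the boundary.
        raise ValueError(f"rejected fill payload {raw!r}: {exc!r}") from exc


# The same quantity as a string and as a float, timestamps with and without milliseconds.
a = parse_fill('{"symbol": "AAPL", "filled_qty": "10", "filled_avg_price": null, '
               '"filled_at": "2026-03-13T19:59:58Z"}')
b = parse_fill('{"symbol": "AAPL", "filled_qty": 10.0, "filled_avg_price": 187.25, '
               '"filled_at": "2026-03-13T19:59:58.123Z"}')
print(a.filled_qty == b.filled_qty)  # both quantities normalize to the same Decimal
```

A payload that cannot be normalized, such as a filled_qty of "ten", fails right here with the original text in the message, instead of five call frames later.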

The second problem is numeric precision. When JSON numbers become Python floats:

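>>> 0.1 + 0.2
0.30000000000000004
>>> total = 0.0
>>> for _ in range(10_000):   # explicit loop: sum() compensates for float error on Python 3.12+
...     total += 0.1
...
>>> total == 1000.0
False   # the running total has drifted away from 1000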

At ten thousand fills per day, this is not a theoretical concern. It is a measurable P&L discrepancy that will appear in end-of-day reconciliation with no obvious cause — because the drift accumulated one fill at a time, invisibly, all day long.
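
One common way to avoid the drift is to keep monetary values out of binary floats entirely. The sketch below uses only the standard library: json.loads accepts a parse_float hook, and passing Decimal makes every JSON float arrive as an exact decimal. The payload is a made-up two-field example, not a real broker response.

```python
import json
from decimal import Decimal

# parse_float is called with the original text of each JSON float ("0.1"),
# so the value never passes through a binary float.
record = json.loads('{"filled_avg_price": 0.1, "filled_qty": 3}', parse_float=Decimal)
price = record["filled_avg_price"]

# Ten thousand additions of exactly 0.1, with no accumulated rounding error.
total = sum(price for _ in range(10_000))

print(type(price).__name__, total == Decimal("1000"))
```

Note that parse_float only touches JSON floats. Integers still arrive as int, and numbers that the API sends as strings still arrive as str, so this hook complements a schema rather than replacing it.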
