Case Study
Executing a Large SPY Sell Order
A step-by-step walkthrough of how our system detects market conditions, makes execution decisions, compares against benchmarks, and explains every choice — all on real market data.
Benchmark results (mean IS)
Higher = better average sell price vs arrival — we are closing a position
Key strategies at $5 M notional
| Strategy | Mean IS (USD) | Std (USD) |
|---|---|---|
| RL | $36,645 (73.3 bps) | ±$87,707 |
| TWAP | $26,721 (53.4 bps) | ±$66,712 |
| Immediate | $3,191 (6.4 bps) | ±$37,309 |
What this means in USD
At $5,000,000 notional, RL is $9,925 better than TWAP and $33,455 better than Immediate execution.
Scale (illustrative): imagine trading this notional every hour. Six such trades (the multiple these figures imply) would be ~$59,547 better than TWAP and ~$200,727 better than Immediate.
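The dollar figures above follow from the basis-point results by simple arithmetic. A minimal sketch, assuming the two-decimal bps values reported in the benchmark comparison on this page; the helper name `bps_to_usd` is ours, for illustration only:

```python
NOTIONAL_USD = 5_000_000  # notional used in the table above


def bps_to_usd(bps: float, notional: float = NOTIONAL_USD) -> float:
    """Convert basis points to dollars: 1 bp = 0.01% = 1/10,000 of notional."""
    return notional * bps / 10_000


# Mean IS per strategy in bps, as reported on this page.
mean_is_bps = {"RL": 73.29, "TWAP": 53.44, "Immediate": 6.38}
mean_is_usd = {name: bps_to_usd(bps) for name, bps in mean_is_bps.items()}

edge_vs_twap = mean_is_usd["RL"] - mean_is_usd["TWAP"]
edge_vs_immediate = mean_is_usd["RL"] - mean_is_usd["Immediate"]

print(f"RL vs TWAP:      ${edge_vs_twap:,.0f}")
print(f"RL vs Immediate: ${edge_vs_immediate:,.0f}")
```

Because the inputs are rounded to two decimals, the results can differ from the page's figures by a dollar or so.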
Scenario
An institutional portfolio manager needs to liquidate a large SPY position over 10 trading days. The goal is to minimise execution cost, measured as implementation shortfall (IS): the gap between the decision price and the actual average execution price, expressed in basis points (bps). One basis point equals 0.01% of the trade value. On this page IS is signed so that a higher value means a better average sell price relative to arrival.
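As a rough illustration of the definition, the sketch below computes IS for a sell order from a list of fills. It assumes the sign convention used on this page (positive means the average sell price beat the arrival price) and takes the arrival price as the decision price. The function name and the sample fills are hypothetical, not taken from the system.

```python
def implementation_shortfall_bps(arrival_price, fills):
    """IS for a SELL order, in basis points.

    fills: list of (price, quantity) pairs.
    Positive result = average sell price above arrival (better for the seller).
    """
    total_qty = sum(qty for _, qty in fills)
    if total_qty <= 0:
        raise ValueError("fills must contain a positive total quantity")
    avg_price = sum(price * qty for price, qty in fills) / total_qty
    return (avg_price - arrival_price) / arrival_price * 10_000


# Made-up fills for illustration only.
example_is = implementation_shortfall_bps(
    arrival_price=100.0,
    fills=[(100.5, 600), (101.0, 400)],
)
print(f"IS = {example_is:.1f} bps")
```

Here the quantity-weighted average price is 100.70 against an arrival price of 100.00, which is 0.70% or 70 bps.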
Reading the Market
Regime Detection with Hidden Markov Models
Before making any trading decisions, we need to understand the current market environment. Our Hidden Markov Model (HMM) analyses historical volatility and order-book imbalance to classify each trading day into one of two regimes:
- calm (220 days, avg vol: 11.0%): Low volatility, stable conditions. The agent can afford to trade more slowly, patiently working the order to minimise market impact.
- elevated volatility (32 days, avg vol: 18.7%): Higher uncertainty and price swings. The agent accelerates execution to reduce the risk of adverse price moves while the order remains open.
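To make the regime classification concrete, here is a minimal sketch of decoding a two-state Gaussian HMM with the Viterbi algorithm. It is not the production model: it uses volatility as the only feature (the system described here also uses order-book imbalance), and the start probabilities, transition matrix, and within-regime standard deviations are assumed values chosen for illustration. Only the two regime means (11.0% and 18.7%) come from the figures above.

```python
import math

STATES = ("calm", "elevated volatility")

# Assumed (hypothetical) parameters; a real model would fit these to data.
START_P = (0.5, 0.5)
TRANS_P = ((0.95, 0.05),   # calm -> calm, calm -> elevated
           (0.10, 0.90))   # elevated -> calm, elevated -> elevated
MEAN_VOL = (11.0, 18.7)    # regime average volatility (%), from the page
STD_VOL = (2.0, 4.0)       # assumed spread within each regime (%)


def log_gauss(x, mean, std):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi(vols):
    """Most likely regime sequence for a series of daily volatility readings (%)."""
    if not vols:
        return []
    n = len(STATES)
    score = [math.log(START_P[s]) + log_gauss(vols[0], MEAN_VOL[s], STD_VOL[s])
             for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in vols[1:]:
        prev, score, pointers = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + math.log(TRANS_P[r][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(TRANS_P[best][s])
                         + log_gauss(x, MEAN_VOL[s], STD_VOL[s]))
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


daily_vol = [10.5, 11.2, 12.0, 19.5, 21.0, 18.0, 11.5, 10.8]  # made-up readings
regimes = viterbi(daily_vol)
for vol, regime in zip(daily_vol, regimes):
    print(f"{vol:5.1f}%  {regime}")
```

The high self-transition probabilities encode the assumption that regimes are persistent, so a single noisy day does not flip the label.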
The Agent Executes
PPO Reinforcement Learning Policy
Our trained RL agent observes market conditions at each step — current inventory, remaining time, price movement, volatility, liquidity, and the detected regime — then decides what fraction of the remaining order to execute. The chart below shows one execution episode:
| IS (bps) | Completed | Steps | Arrival Price |
|---|---|---|---|
| 333.20 | Yes | 10 | $482.88 |
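The decision loop described above, where the agent chooses a fraction of the remaining order at each step, can be sketched as follows. The trained PPO policy is not shown on this page, so `even_pace_policy` is a hypothetical stand-in, and forcing the last step to sell whatever remains is our assumption about how completion is guaranteed.

```python
def run_episode(total_shares, policy, n_steps=10):
    """Run one sell episode; returns (quantities sold per step, shares left)."""
    remaining = float(total_shares)
    sold = []
    for step in range(n_steps):
        steps_left = n_steps - step
        if steps_left == 1:
            fraction = 1.0  # assumption: the final step clears the position
        else:
            fraction = policy(remaining / total_shares, steps_left)
        fraction = min(max(fraction, 0.0), 1.0)  # keep the action in [0, 1]
        qty = remaining * fraction
        sold.append(qty)
        remaining -= qty
    return sold, remaining


def even_pace_policy(inventory_fraction, steps_left):
    """Hypothetical stand-in policy: sell 1/steps_left of what remains."""
    return 1.0 / steps_left


sold, remaining = run_episode(10_000, even_pace_policy, n_steps=10)
print([round(q) for q in sold], remaining)
```

A learned policy would replace `even_pace_policy` and would take the full observation (price movement, volatility, liquidity, detected regime) rather than just inventory and time.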
Benchmark Comparison
How does our agent compare to classical strategies?
We measure performance using Implementation Shortfall (IS) in basis points. A higher IS means the trader received better average prices relative to the arrival price (for a sell order). We compare our RL agent against four classical benchmarks:
- TWAP — Time-Weighted Average Price: sells equal amounts at each time step.
- VWAP — Volume-Weighted Average Price: sells proportional to market volume.
- Almgren–Chriss — academic optimal schedule balancing urgency and market impact.
- Immediate — sells everything at once in the first bar (maximum market impact).
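The four benchmark schedules listed above can be sketched as per-step sell quantities. This is an illustrative sketch, not the benchmark code behind the results: the function names are ours, the volume profile is made up, and `kappa` is a hypothetical urgency parameter. In the Almgren–Chriss model `kappa` is derived from risk aversion, volatility, and impact parameters, none of which are stated on this page.

```python
import math


def twap_schedule(total, n_steps):
    """Equal quantity at every step."""
    return [total / n_steps] * n_steps


def vwap_schedule(total, volume_profile):
    """Quantity proportional to the (expected) market volume of each step."""
    total_volume = sum(volume_profile)
    return [total * v / total_volume for v in volume_profile]


def immediate_schedule(total, n_steps):
    """Everything in the first bar."""
    return [float(total)] + [0.0] * (n_steps - 1)


def almgren_chriss_schedule(total, n_steps, kappa):
    """Holdings x_k = X * sinh(kappa * (N - k)) / sinh(kappa * N), with kappa > 0.

    Trades are the differences between consecutive holdings.
    """
    holdings = [total * math.sinh(kappa * (n_steps - k)) / math.sinh(kappa * n_steps)
                for k in range(n_steps + 1)]
    return [holdings[k] - holdings[k + 1] for k in range(n_steps)]


TOTAL, N = 10_000, 10
volume_profile = [9, 7, 5, 4, 3, 3, 4, 5, 7, 9]  # made-up U-shaped profile

twap = twap_schedule(TOTAL, N)
vwap = vwap_schedule(TOTAL, volume_profile)
immediate = immediate_schedule(TOTAL, N)
ac = almgren_chriss_schedule(TOTAL, N, kappa=0.3)

for name, sched in [("TWAP", twap), ("VWAP", vwap), ("AC", ac), ("Immediate", immediate)]:
    print(f"{name:10s}", [round(q) for q in sched])
```

Larger `kappa` front-loads the Almgren–Chriss schedule more heavily; as `kappa` approaches zero it flattens toward TWAP.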
Each cell is colour-coded from red (worse) to green (better): higher mean IS is better for a sell, and higher completion is better. Values are averaged over many execution windows.
| Strategy | Mean IS (bps) | Std IS (bps) | Completion Rate |
|---|---|---|---|
| RL | 73.29 | 175.41 | 100% |
| TWAP | 53.44 | 133.42 | 100% |
| VWAP | 44.39 | 139.65 | 100% |
| Almgren–Chriss | 52.28 | 131.28 | 100% |
| Immediate | 6.38 | 74.62 | 100% |
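The columns of this table are aggregates over many execution windows. A minimal sketch of that aggregation, assuming one IS value and one completion flag per window; the per-window numbers below are made up, and whether the page reports a sample or population standard deviation is not stated, so the sample version is used here as an assumption.

```python
from statistics import mean, stdev


def summarise(windows):
    """windows: list of (is_bps, completed) pairs, one per execution window."""
    is_values = [is_bps for is_bps, _ in windows]
    return {
        "mean_is_bps": mean(is_values),
        "std_is_bps": stdev(is_values),  # sample standard deviation (assumed)
        "completion_rate": sum(1 for _, done in windows if done) / len(windows),
    }


# Made-up per-window results for illustration only.
windows = [(120.0, True), (-40.0, True), (80.0, True), (0.0, True)]
summary = summarise(windows)
print(summary)
```

Note that in the table every strategy's standard deviation is larger than its mean IS, so individual windows can differ widely from the average.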
AI Governance
Plain-English Explanation by Claude
For every execution decision, our system generates a human-readable explanation suitable for a portfolio manager or compliance officer. This is the output for the scenario above:
Key Takeaway
Our regime-aware RL agent adapts its execution pace to market conditions. In the benchmark above it achieved the highest mean IS (73.29 bps versus 53.44 bps for TWAP), though with a larger standard deviation (175.41 bps versus 133.42 bps), while providing full transparency through LLM-powered governance. Every decision is explainable, and every trade is accountable.