Backtest Contract
Every run should declare the market family, symbol, data family, time window, route or channel, cursor behavior, freshness gate, and output schema before it starts. That turns a notebook into an executable research artifact instead of a one-off query.

| Backtest requirement | 0xArchive surface | What to record |
|---|---|---|
| Bounded records | REST history routes | Route, query parameters, cursor chain, request IDs |
| Event order | WebSocket replay | Channel, speed, start/end, gap events, replay controls |
| Venue taxonomy | Venue coverage docs | Family, symbol style, route prefix |
| Freshness and gaps | Data-quality routes | Status, incident overlap, stale windows |
| Application code | SDKs or generated clients | Package version, OpenAPI version, response schema |
| Shell jobs | CLI JSON output | Command, env key source, stdout file, exit code |
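
The declaration step can be made concrete with a small manifest object that is filled in before any request is sent. The following is a minimal sketch, not an 0xArchive schema: the class name `RunManifest`, its field names, and all sample values (including the placeholder route string and dates) are assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunManifest:
    """Hypothetical run declaration; field names are illustrative only."""
    venue_family: str
    symbol: str
    data_family: str
    start: str
    end: str
    route_or_channel: str
    cursor_behavior: str
    freshness_gate: str
    output_schema: str
    # Filled in while the run executes, then stored with the output.
    request_ids: list = field(default_factory=list)
    cursor_chain: list = field(default_factory=list)

    def missing_fields(self) -> list:
        """Return the names of declared string fields that are still empty."""
        return [
            name
            for name, value in asdict(self).items()
            if isinstance(value, str) and not value.strip()
        ]


# Sample values are placeholders, not real routes or schema names.
manifest = RunManifest(
    venue_family="hyperliquid-core-perps",
    symbol="BTC",
    data_family="trades",
    start="2024-01-01T00:00:00+00:00",
    end="2024-01-01T01:00:00+00:00",
    route_or_channel="REST history route (placeholder)",
    cursor_behavior="follow meta.next_cursor until absent",
    freshness_gate="/v1/data-quality/status checked before pull",
    output_schema="raw API response plus manifest",
)

# Refuse to start a run whose declaration is incomplete.
if manifest.missing_fields():
    raise ValueError(f"manifest incomplete: {manifest.missing_fields()}")

# Serialize with sorted keys so two manifests can be diffed line by line.
manifest_json = json.dumps(asdict(manifest), indent=2, sort_keys=True)
```

Writing `manifest_json` next to the output file keeps the request context with the rows it produced.
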
Build The First Run
Choose one venue family
Start with the exact family: Hyperliquid core perps, Hyperliquid Spot, HIP-3 builder perps, HIP-4 outcome markets, or Lighter. Do not normalize symbols before you know which namespace they belong to.
Select the data family
Trades, order books, candles, funding, open interest, liquidations, L3/L4, and replay answer different research questions. Pick the narrowest one that can test the hypothesis.
Check quality before the pull
Call /v1/data-quality/status and any route-specific freshness endpoint before using the window. Store that result with the run.
Pull a bounded window
Use the start, end, and limit parameters with explicit cursor handling. Avoid unbounded loops. Save request IDs and pagination state.
REST Window Example
Check success, data, and meta.request_id in each response, then decide whether the route returns meta.next_cursor for pagination. Store the exact query string. Backtests often become impossible to debug because the dataset kept the rows but not the request that produced them.
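
A bounded pull along these lines might look like the sketch below. It assumes a response envelope with `success`, `data`, `meta.request_id`, and `meta.next_cursor`, and it assumes that a missing or null `next_cursor` means the last page; confirm both against the route you use. `pull_window`, `fake_fetch_page`, and `MAX_PAGES` are hypothetical names, and the fake fetcher stands in for a real HTTP call so the example runs offline.

```python
from typing import Callable, Optional

MAX_PAGES = 50  # hard page cap (assumed value) so the loop always terminates


def pull_window(fetch_page: Callable[[dict], dict], params: dict,
                max_pages: int = MAX_PAGES) -> dict:
    """Page through one bounded window, recording request IDs and cursors."""
    rows, request_ids, cursor_chain = [], [], []
    cursor: Optional[str] = None
    complete = False
    for _ in range(max_pages):
        query = dict(params)
        if cursor is not None:
            query["cursor"] = cursor
        body = fetch_page(query)
        if not body.get("success"):
            raise RuntimeError(f"request failed for query {query}")
        meta = body.get("meta", {})
        request_ids.append(meta.get("request_id"))
        rows.extend(body.get("data", []))
        cursor = meta.get("next_cursor")
        if cursor is None:  # assumption: no next cursor means last page
            complete = True
            break
        cursor_chain.append(cursor)
    return {
        "params": params,  # the exact query that produced the rows
        "rows": rows,
        "request_ids": request_ids,
        "cursor_chain": cursor_chain,
        "complete": complete,  # False if the page cap was hit first
    }


def fake_fetch_page(query: dict) -> dict:
    """Offline stand-in for an HTTP call: serves five rows in pages."""
    all_rows = [{"t": i} for i in range(5)]
    offset = int(query.get("cursor", 0))
    next_offset = offset + query["limit"]
    has_more = next_offset < len(all_rows)
    return {
        "success": True,
        "data": all_rows[offset:next_offset],
        "meta": {
            "request_id": f"req-{offset}",
            "next_cursor": str(next_offset) if has_more else None,
        },
    }


result = pull_window(fake_fetch_page, {"start": 0, "end": 5, "limit": 2})
```

The returned `complete` flag distinguishes a window that ended naturally from one that stopped at the page cap, which matters when the result is compared against a later rerun.
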
Replay Example
Use WebSocket replay when the order of messages is part of the strategy. REST history can answer “what trades happened in this window?” Replay is better for “what would the book handler have seen as events arrived?”
Treat gap_detected messages as part of the backtest result. A gap is not just a transport event; it changes the meaning of the derived metrics. When one appears, mark the run incomplete, narrow the window, or rebuild from a checkpoint.
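
One way to keep gaps inside the result is to fold them into the run summary rather than filtering them out. The sketch below works on an in-memory list instead of a live socket. The message shape (a `type` key, plus `seq`, `from_seq`, and `to_seq`) is an assumption for illustration; only the `gap_detected` name comes from the text above.

```python
def consume_replay(messages):
    """Fold replay messages into a result, recording gaps instead of dropping them."""
    events, gaps = [], []
    for msg in messages:
        if msg.get("type") == "gap_detected":
            gaps.append(msg)  # keep the gap so derived metrics can be qualified
        else:
            events.append(msg)
    # A run with any gap is marked incomplete rather than silently accepted.
    return {"events": events, "gaps": gaps, "complete": not gaps}


# Hypothetical message stream; field names are placeholders.
sample = [
    {"type": "trade", "seq": 1},
    {"type": "trade", "seq": 2},
    {"type": "gap_detected", "from_seq": 3, "to_seq": 5},
    {"type": "trade", "seq": 6},
]
outcome = consume_replay(sample)
```

A caller can then decide, based on `outcome["complete"]`, whether to narrow the window or rebuild from a checkpoint.
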
Point-In-Time Rules
Do not mix current metadata into historical evaluation unless the model is explicitly allowed to know it. Keep symbol lookup, market family, instrument metadata, and data-quality state tied to the run time where possible. If a route returns current metadata for convenience, store that fact in the manifest so a reviewer understands the boundary.
Do not fill missing market data with synthetic records inside the raw dataset. If a strategy layer interpolates or imputes, keep that as a separate derived artifact and label it. The raw pull should stay as close as possible to the API response plus reproducibility metadata.
Backtest Generation Checklist
Generated backtest examples should start from a bounded checklist, not from a broad loop.

| Field | Capture |
|---|---|
| Venue family and symbol | Hyperliquid core BTC, Spot pair, HIP-3 builder symbol, HIP-4 outcome, or Lighter symbol |
| Data family | Trades, order books, candles, funding, OI, replay, L3, L4, or reconstruction |
| Window | Explicit start, end, limit, cursor behavior, and replay speed when applicable |
| Quality check | /v1/data-quality/status, freshness, coverage, incident overlap, or replay gap handling |
| Trace fields | Route, channel, request IDs, cursor chain, code version, OpenAPI version, and output path |
| Stop rule | Do not generate unbounded loops; widen only after the first bounded run is inspectable |
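
The stop rule can be enforced with a small pre-flight check that rejects a run specification before any request is made. This is a sketch under stated assumptions: `check_bounded`, the spec keys, and the caps `MAX_WINDOW` and `MAX_LIMIT` are hypothetical and should be replaced with limits that fit the route and data family.

```python
from datetime import datetime, timedelta

MAX_WINDOW = timedelta(hours=24)  # illustrative cap for a first run
MAX_LIMIT = 1000                  # illustrative per-request row cap


def check_bounded(spec: dict) -> list:
    """Return a list of problems; an empty list means the spec is bounded."""
    problems = [f"missing {key}" for key in ("start", "end", "limit")
                if spec.get(key) is None]
    if problems:
        return problems
    start = datetime.fromisoformat(spec["start"])
    end = datetime.fromisoformat(spec["end"])
    if end <= start:
        problems.append("end must be after start")
    elif end - start > MAX_WINDOW:
        problems.append("window exceeds first-run cap")
    if not 0 < spec["limit"] <= MAX_LIMIT:
        problems.append("limit out of range")
    return problems


ok_spec = {
    "start": "2024-01-01T00:00:00+00:00",
    "end": "2024-01-01T01:00:00+00:00",
    "limit": 500,
}
ok_problems = check_bounded(ok_spec)
```

Widening the window then becomes an explicit change to the caps rather than an accidental side effect of a loop.
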