The Catalog · The Deep Dive

Everything in the LORE library.
Every category, every product.

The full catalog — eight categories across three layers, organized so an AI agent can navigate from raw market activity all the way through to research-ready dossiers without losing causal safety.

Products: 10,352
Families: 53
Categories: 8
Exchanges: 9
History: 14y
Layer 1 / Foundation

Market Activity

The raw, sequenced record of trades, price levels, and book events across nine major crypto venues, with book depth limited to what each venue exposes. Normalized to a common schema, replayable by timestamp, and aligned across exchanges so your AI sees the market the way the matching engine did.

This is the foundation everything else in the library is built on. If a regime tag, a setup label, or a research dossier exists downstream, it traces back to a specific row here.

What's inside
Print tape
Every executed trade with native venue timestamp and monotonic sequence ID.
81.7M perp bar rows
9 timeframes
Level-2 order books
Multi-level book snapshots and updates where the venue exposes them. Hyperliquid BTC active, others queued.
9,162 depth rows
1m / 5m / 1h
Kraken BTC true-L3 queue lifecycle
Message-level order book lifecycle on Kraken BTC — queue position, add/cancel/replace events, true-L3 reconstruction. The only true-L3 product in the library. Everything else stays explicitly L2 / depth / proxy.
message-level
Kraken BTC only
Mark, index, and premium
Reference prices and perp premium series, time-aligned to the print tape.
82.1M rows
Coverage
Venues: Binance USD-M, Binance Spot, Hyperliquid, Kraken, Coinbase, OKX, Bybit, Deribit, dYdX
Assets: 150 USD-M perpetual futures · 65+ spot
Timeframes: 1m · 3m · 5m · 10m · 15m · 30m · 1h · 4h · 1d
History: 2012 → today (varies per asset)
Example use case
Agent query
"Pull every BTC trade from 14:22:00 to 14:25:00 UTC on March 12, 2024 on Binance USD-M and Hyperliquid."
→ Returns aligned event streams from both venues with sequence IDs, sides, sizes, prices, and venue timestamps. Replayable end-to-end.
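One way an agent might replay two venue streams in a single order is sketched below. This is a minimal illustration, not the library's API: the `Trade` record, its field names, the millisecond timestamp unit, and the `merge_streams` helper are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass

@dataclass(frozen=True)
class Trade:
    ts_ms: int    # venue timestamp in milliseconds (assumed unit)
    venue: str    # hypothetical venue key
    seq: int      # monotonic per-venue sequence ID
    side: str
    price: float
    size: float

def merge_streams(*streams):
    """Merge per-venue streams, each already sorted by (ts_ms, seq), into one replay order.

    Ties on timestamp are broken by venue name and then sequence ID so the
    replay is deterministic.
    """
    return list(heapq.merge(*streams, key=lambda t: (t.ts_ms, t.venue, t.seq)))

binance = [
    Trade(1000, "binance_usdm", 1, "buy", 72000.0, 0.5),
    Trade(1005, "binance_usdm", 2, "sell", 71999.5, 0.2),
]
hyperliquid = [Trade(1002, "hyperliquid", 7, "buy", 72001.0, 1.0)]

replay = merge_streams(binance, hyperliquid)
```

Because each input is already ordered, the merge is a streaming operation and never needs to sort the combined tape.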
Layer 1 / Foundation

Derivatives

Funding rates, open interest, basis spreads, and forced-close events for the perpetual futures universe. The forces moving the market that don't show up in the candle alone.

An AI agent looking only at price misses why the move happened. This category gives it the funding pressure, leverage state, and liquidation context that explain whether a move was real or mechanically forced.

What's inside
Funding rate history
Realized funding per perp, with rolling z-scores and percentile ranks for context.
813K rows
Funding & carry feature pack
Curated funding-rate features built for carry, basis, and rate-curve research — historical + forward, source-neutral.
historical + forward
Open interest series
OI history, OI delta, OI as percent of supply where available.
66.8M rows
Mark / index / premium
Reference + premium time series for every perp.
82.1M rows
Liquidation & deleveraging
Forced-close events plus deleveraging-cascade context. Live forward collectors for binance_usdm_liquidations and bybit_liquidations; historical events tagged with size, side, asset, venue, and surrounding cascade markers.
live + historical
Binance USD-M, Bybit forward
Perp / spot basis
Cross-venue basis spreads, computed against spot reference.
all 150 perps
Coverage
Venues: Binance USD-M, Hyperliquid, OKX, Bybit, Deribit, dYdX (perp universe)
Assets: 150 USD-M perpetual contracts
Timeframes: 1m through 1d, plus per-funding-cycle for funding
History: 2020 → today (perp data depth varies; 81 perps with 5y+, 142 with 3y+)
Example use case
Agent query
"Show me every BTC long liquidation cluster over $5M since 2024 where funding flipped negative within 30 minutes."
→ Returns event-grouped liquidations joined to funding rate transitions, with timestamps, sizes, and the surrounding price context.
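The rolling z-scores and percentile ranks attached to funding history can be pictured with a short sketch. It assumes a trailing window that excludes the current observation and a population standard deviation; the window length, the function name `rolling_context`, and the sample rates are illustrative, not taken from the library.

```python
from statistics import mean, pstdev

def rolling_context(rates, window):
    """Compare each funding rate with the `window` rates that came before it.

    Returns (index, z_score, percentile_rank) tuples. Only past values are
    used, so the result for row i never changes when later rows arrive.
    """
    out = []
    for i in range(window, len(rates)):
        hist = rates[i - window:i]
        mu, sigma = mean(hist), pstdev(hist)
        z = (rates[i] - mu) / sigma if sigma > 0 else 0.0
        pct = sum(1 for r in hist if r <= rates[i]) / window
        out.append((i, z, pct))
    return out

# Hypothetical realized funding per cycle
funding = [0.0001, 0.0001, 0.0002, 0.0001, 0.0003, -0.0002]
ctx = rolling_context(funding, window=4)
```

A funding flip to negative shows up here as a low percentile rank and a negative z-score against the recent window.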
Layer 1 / Foundation

Positioning & Flow

Where traders are crowded, who's trapped, and how aggressive the flow is. The data layer that separates a real conviction move from a forced one.

An AI agent that only sees candles can't tell when shorts are about to be squeezed or when longs have been silently leveraging up for hours. This category surfaces those positioning dynamics directly.

What's inside
Long / short ratios
Aggregate trader positioning across exchanges where reported.
9 venues
OI delta + crowding indicators
Open-interest changes, crowding scores, leverage state.
66.8M rows
Trapped longs / trapped shorts
Heuristic flags for one-sided pressure that hasn't unwound yet.
all perps
Buy / sell aggression
Net taker flow, aggression ratios, side-imbalance tracking.
118.9M rows
Absorption + large-trade activity
Absorption signals, large-trade markers, block-print detection.
118.9M rows
Coverage
Venues: Binance USD-M (primary), with cross-venue aggregation where positioning data is exposed
Assets: 150 perps, all timeframes
Timeframes: 1m · 5m · 15m · 1h · 4h
History: 2021 → today (rolling 4-year window for some derived flow products)
Example use case
Agent query
"Find SOL setups where buy aggression was elevated for 30+ minutes while OI was falling. Was the move durable?"
→ Returns matching windows with paired flow + OI panels and the post-window outcome distribution.
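The query above combines two conditions: sustained buy aggression and falling open interest. A rough sketch of that scan follows, assuming one bar per minute with hypothetical `buy_vol`, `sell_vol`, and `oi` fields and an arbitrary 0.6 aggression threshold; none of these names or values come from the library.

```python
def aggression_ratio(buy_vol, sell_vol):
    """Share of taker volume that was buy-initiated (0.5 when there is no volume)."""
    total = buy_vol + sell_vol
    return buy_vol / total if total > 0 else 0.5

def find_windows(bars, min_len=30, threshold=0.6):
    """Return inclusive (start, end) index pairs for runs of at least `min_len`
    bars where buy aggression exceeds `threshold` and OI is below the prior bar."""
    windows, start = [], None
    for i in range(1, len(bars)):
        bar = bars[i]
        hit = (
            aggression_ratio(bar["buy_vol"], bar["sell_vol"]) > threshold
            and bar["oi"] < bars[i - 1]["oi"]
        )
        if hit and start is None:
            start = i
        elif not hit and start is not None:
            if i - start >= min_len:
                windows.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the data
    if start is not None and len(bars) - start >= min_len:
        windows.append((start, len(bars) - 1))
    return windows

# Toy data: aggression stays high while OI falls for three bars, then OI ticks up
bars = [{"buy_vol": 70.0, "sell_vol": 30.0, "oi": oi} for oi in (100, 99, 98, 97, 98)]
windows = find_windows(bars, min_len=3)
```

Each returned window would then be joined to its post-window outcomes to answer the durability question.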
Layer 2 / Intelligence

Cross-Exchange

Multi-venue confirmation, lead/lag dynamics, and dispersion signals — so an AI can tell whether a price move is real coordination across exchanges or one venue acting alone.

Relying on single-venue data is a common AI failure mode in crypto: a move on Binance gets treated as truth without checking whether spot and other perp venues confirmed it. This category closes that blind spot.

What's inside
Venue confirmation ratio
For each move, how many other venues confirmed within the same window.
10.6M rows
Lead / lag strength
Which venue moved first, and by how much, across rolling windows.
74 products
Cross-exchange dispersion
Standard deviation of price across venues at each timestamp.
9 venues
Isolated-move flags
Boolean tags for moves that didn't propagate across venues, likely caused by thin liquidity or venue-specific factors.
live + historical
Listing & launch dynamics
New-asset listing timeline across venues — which exchange listed first, launch-window price discovery, cross-venue dispersion at debut, early-volume profile. Historical + forward.
historical + forward
Reference + pair gaps
Inter-venue gap series, including spot-vs-perp and perp-vs-perp pairs.
all major pairs
Coverage
Venues: All 9 (Binance USD-M, Binance Spot, Hyperliquid, Kraken, Coinbase, OKX, Bybit, Deribit, dYdX)
Assets: BTC, ETH, SOL, and other top liquidity pairs across all venues; partial coverage extending to 80+ assets
Timeframes: 1m · 5m · 15m · 1h · 4h
History: 2020 → today
Example use case
Agent query
"Was the BTC breakout on Hyperliquid at 09:14 UTC confirmed across spot and other perp venues, or isolated?"
→ Returns a venue-by-venue grid showing which exchanges moved with Hyperliquid and which didn't, plus dispersion at the moment of the move.
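Cross-exchange dispersion and the venue confirmation ratio are both simple to state in code. The sketch below assumes dispersion is a population standard deviation across venues at one timestamp, and that a venue "confirms" when its return over the same window has the lead venue's sign and clears a minimum size. The `min_move` value, venue keys, and function names are assumptions for illustration.

```python
from statistics import pstdev

def dispersion(prices):
    """Population standard deviation of venue prices at one timestamp."""
    return pstdev(list(prices.values()))

def confirmation_ratio(returns, lead_venue, min_move=0.002):
    """Share of the other venues that moved with the lead venue.

    A venue confirms when its return has the same sign as the lead venue's
    and an absolute size of at least `min_move`.
    """
    lead = returns[lead_venue]
    others = [r for v, r in returns.items() if v != lead_venue]
    if not others or lead == 0:
        return 0.0
    confirmed = sum(1 for r in others if r * lead > 0 and abs(r) >= min_move)
    return confirmed / len(others)

# Hypothetical window returns around a Hyperliquid breakout
returns = {
    "hyperliquid": 0.0040,
    "binance_usdm": 0.0035,
    "coinbase": 0.0005,
    "okx": 0.0030,
}
ratio = confirmation_ratio(returns, "hyperliquid")
spread = dispersion({"hyperliquid": 72020.0, "binance_usdm": 72010.0, "coinbase": 72000.0})
```

A low ratio combined with high dispersion is the pattern an isolated-move flag would be expected to capture.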
Layer 2 / Intelligence

Chart Cognition

What a trained human trader sees on a chart — encoded so an AI agent can read it directly. Support, resistance, trend channels, volume profile, named setups, and natural-language captions for every window.

This is the largest single category in the library by row count. Without it, an AI working on raw OHLC has to invent its own technical analysis. With it, the chart arrives pre-read.

What's inside
Support / resistance state
Active levels, level age, recent touches, broken-and-reclaimed flags.
119M chart-state rows
Trend & channel structure
Trend direction, channel slope, channel position, channel age.
all 150 perps
Volume profile
Volume-by-price profiles with point of control and value area markers.
9 timeframes
Setup labels
Named pattern labels (breakout, reclaim, rejection, squeeze, etc.) tagged at occurrence.
22.8M setup labels
Plain-English captions
Generated short captions describing each window — readable directly by an LLM.
3,637 products
Coverage
Assets: All 150 perps + top spot pairs
Timeframes: 1m · 5m · 15m · 1h · 4h · 1d
History: 2020 → today
Update cadence: Each completed bar across all timeframes
Example use case
Agent query
"Find every BTC 1h chart over the last 90 days where price reclaimed prior resistance with rising volume profile inside a trending channel."
→ Returns matching windows with chart-state, captions, setup labels, and links to the underlying tape rows.
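For readers unfamiliar with volume profiles, the point of control is the price bin with the most traded volume, and the value area is the price range holding a chosen share of it. The sketch below is a simplified version under stated assumptions: fixed-width bins, a 70% value area, and a greedy selection of the highest-volume bins (the classic method expands outward from the point of control instead). Function names and data are illustrative only.

```python
from collections import defaultdict

def volume_profile(trades, bin_size):
    """Sum traded size into fixed-width price bins keyed by the bin's lower edge."""
    profile = defaultdict(float)
    for price, size in trades:
        profile[(price // bin_size) * bin_size] += size
    return dict(profile)

def point_of_control(profile):
    """Price bin with the highest traded volume."""
    return max(profile, key=profile.get)

def value_area(profile, share=0.70):
    """Low and high bin edges covering at least `share` of total volume,
    taking bins in descending volume order (a simplification)."""
    total = sum(profile.values())
    chosen, covered = [], 0.0
    for level in sorted(profile, key=profile.get, reverse=True):
        chosen.append(level)
        covered += profile[level]
        if covered >= share * total:
            break
    return min(chosen), max(chosen)

# Hypothetical (price, size) prints
trades = [(100.2, 1.0), (100.7, 3.0), (101.1, 5.0), (102.4, 1.0)]
profile = volume_profile(trades, bin_size=1.0)
poc = point_of_control(profile)
area = value_area(profile)
```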
Layer 2 / Intelligence

Regimes & Context

The market state your AI is reasoning in. A pattern that works in a quiet compression regime can fail in a liquidation cascade. This category gives every market row a causal regime context: trend state, volatility state, liquidity state, crowding state, cluster age, recent regime shifts, and where each asset stands relative to its peers.

This is also where the Relative Strength Atlas lives — peer rankings across the universe by momentum, funding, volume, volatility, liquidity, OI, crowding, and regime behavior. Most crypto edges are not "BTC went up 2%" — they're "SOL is leading the universe right now while ETH is lagging." This category makes that explicit.

What's inside
Relative Strength Atlas
Peer ranks across the universe by momentum, funding, volume, volatility, liquidity, OI, crowding, and regime behavior.
1.99M rows
Regime state & cluster IDs
Causal regime tag per row: regime ID, when it started, age so far (live-safe — no future leakage), recent-change flag, cluster family ID.
1.99M rows
Trend / chop / vol regime tags
Categorical market state tags across multiple horizons.
546 regime products
Tradability Atlas
Execution context around each market setup — liquidity, spread, depth, slippage risk, funding drag, crowding, venue coverage.
8.3M tradability scores
Microstructure Execution
Source-neutral execution microstructure: spread series, passive-maker execution pack, and fill-quality signals built on 2.16B inventoried quote, book, and trade rows. Forward labels for passive fills, markouts, and adverse selection are tagged pending_source_coverage until forward trade-print and depth feeds land.
9 products · 313K feature rows
BTC / ETH / SOL · 1m + 5m
Breadth · dispersion · liquidity context
Cross-asset breadth, dispersion, and liquidity state to frame any single-asset move.
all 150 perps
Coverage
Assets: 150 perps + top spot pairs
Timeframes: 1m · 5m · 15m · 1h · 4h · 1d
History: 2020 → today
Live safety: All regime + cluster columns are causal; live rows know age-so-far, never final regime length
Example use case
Agent query
"For every period where SOL was top-3 in momentum rank but bottom-half in funding rank, what's the next-7-day return distribution?"
→ Returns matching cross-sectional windows joined to outcomes, conditioned on regime state.
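The "age so far" guarantee on regime columns is worth making concrete. In the sketch below, each row's age depends only on rows up to and including itself, so truncating the series never changes earlier values. The regime IDs and the function name `causal_regime_age` are hypothetical.

```python
def causal_regime_age(regime_ids):
    """Bars elapsed since the current regime began, computed row by row.

    Each value depends only on earlier rows, so a live row knows its age so
    far but never the regime's final length.
    """
    ages, prev, age = [], None, 0
    for rid in regime_ids:
        age = age + 1 if rid == prev else 0
        ages.append(age)
        prev = rid
    return ages

regimes = ["A", "A", "B", "B", "B", "A"]
ages = causal_regime_age(regimes)  # [0, 1, 0, 1, 2, 0]
```

A leaky alternative would be a "total regime length" column, which requires knowing when the regime ends and so cannot exist on a live row.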
Layer 3 / Memory

Edge Atlas + Research

The strategy-discovery memory of the library. Named market episodes, outcome tables, replays of similar historical setups, base-rate comparisons, and ranked research lead cards your AI can drill into.

This is also where the Outcome Atlas lives: forward returns, maximum favorable and adverse excursion (MFE/MAE), and barrier-touch order. These are labels, kept clearly separated from feature-safe inputs so an agent can study what happened next without accidentally training on the answer.

What's inside
Edge Atlas
Named market episodes (breakouts, reclaims, squeezes, capitulations, etc.) with structured metadata.
748 strategy maps
8.2M episode rows
Research Lead Cards
Ranked research opportunities, each with regime context, sample size, source trust, tradability, and a prepared agent prompt.
644 ranked leads
Pattern Memory Replay
Compact before/during/after windows for every setup, plus historical analogs for direct comparison.
644 replays
Matched Control Atlas
Same-regime, no-setup baseline comparisons. Tells your AI whether a setup actually beat its base rate.
644 leads
173 beating controls
Research Evidence Graph
Knowledge-graph traversal from a lead through its supporting evidence — products, regimes, controls, source trust.
1,368 nodes
Outcome Atlas
Forward returns, MFE/MAE, barrier-touch order. For training and evaluation, never as a feature input.
8.3M outcome rows
Coverage
Episodes: Built across all 150 perps where chart cognition + regime data exists
Lead ranking: By regime, sample size, base-rate excess, tradability, source trust
History: 2020 → today
Outcome separation: Outcome columns are tagged via the Column Safety Manifest, so agents cannot use them as features by design
Example use case
Agent query
"Pull the top 10 ranked breakout leads from 2024 that beat their matched control, and replay each one with its 18 closest historical analogs."
→ Returns 10 lead cards plus 180 analog windows, joined to regime context and outcomes (clearly labeled as outcomes, not features).
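The outcome labels named in this category have standard definitions that a short sketch can pin down. This version assumes a long-side view, returns measured against the entry price, and symmetric-style percentage barriers; the function name, argument names, and prices are illustrative rather than the library's schema.

```python
def outcome_labels(entry, future_prices, up_barrier, down_barrier):
    """Long-side outcome labels over a forward window.

    forward_return: return at the end of the window
    mfe / mae:      best and worst return seen along the way (floored/capped at 0)
    first_touch:    which barrier was reached first: "up", "down", or "none"
    """
    rets = [(p - entry) / entry for p in future_prices]
    first_touch = "none"
    for r in rets:
        if r >= up_barrier:
            first_touch = "up"
            break
        if r <= -down_barrier:
            first_touch = "down"
            break
    return {
        "forward_return": rets[-1],
        "mfe": max(max(rets), 0.0),
        "mae": min(min(rets), 0.0),
        "first_touch": first_touch,
    }

# Hypothetical prices after an entry at 100
labels = outcome_labels(100.0, [101.0, 99.0, 103.0, 102.0], up_barrier=0.02, down_barrier=0.02)
```

Every value here is computed from prices after the entry, which is exactly why such columns belong on the label side of the feature/outcome boundary.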
Layer 3 / Memory

Trust + Workbench

The reliability scoring, validation gates, and leakage-safe pipeline that turn the rest of the library into something an AI agent can train and test against without accidentally cheating.

If the rest of the catalog is the data, this category is the rules of the game. It's what makes the difference between an AI that learns a real pattern and one that quietly trains on the future.

What's inside
Source reliability scores
Every dataset and source tagged with one of four reliability buckets: high-trust, usable-with-checks, limited-coverage, research-only.
9 sources scored
Validation gates
18 readiness gates and 48 structural validation checks. Every product has to clear them before it's published.
18 gates · 48 checks · 0 failed
Column Safety Manifest
Every column tagged with its role: safe-input, outcome-label, metadata, forward-pending-label, quality-warning, execution-context, or do-not-use-for-training. Schema-level enforcement against leakage.
all products
Model Split Manifests
Pre-built causal train / validation / test / walk-forward / holdout splits. No future-looking data in any past window.
all timeframes
Model-Ready Feature Packs
Curated bundles of causal-only features per use case (momentum, breakout, mean reversion, funding carry, etc.). Outcomes excluded by construction.
5,520 marts
Agent Navigation + Workbench guidance
Start-here paths, complexity tiers, column-by-column guidance for which fields are safe to feed an agent.
4 tiers · 6 paths
Coverage
Reliability buckets: high-trust · usable-with-checks · limited-coverage · research-only
Column roles: safe-input · outcome-label · metadata · forward-pending-label · quality-warning · execution-context · do-not-use-for-training
Split types: train · validation · test · walk-forward · holdout
Audit cadence: Validation gates run on every release
Example use case
Agent query
"Pull a Model-Ready Feature Pack for a 1h breakout strategy, with the matching walk-forward split manifest. I want every column to be flagged safe-input."
→ Returns the curated pack with the column safety manifest attached, the split manifest, and a one-line confirmation that no outcome columns are present.
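Two ideas from this category, role-based column filtering and walk-forward splitting, can be sketched in a few lines. The manifest is modeled here as a plain column-to-role mapping and the splits as index ranges; the column names, sizes, and helper names are assumptions for illustration, not the published manifest format.

```python
def safe_inputs(manifest):
    """Columns whose role in the manifest is exactly 'safe-input'."""
    return [col for col, role in manifest.items() if role == "safe-input"]

def walk_forward_splits(n_rows, train_size, test_size):
    """Rolling (train, test) index ranges where each test block sits strictly
    after its training block, stepping forward by one test block each time."""
    splits, start = [], 0
    while start + train_size + test_size <= n_rows:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        splits.append((train, test))
        start += test_size
    return splits

# Hypothetical manifest for a small feature pack
manifest = {
    "close": "safe-input",
    "oi_delta": "safe-input",
    "fwd_ret_1h": "outcome-label",
    "venue": "metadata",
}
features = safe_inputs(manifest)
splits = walk_forward_splits(n_rows=10, train_size=4, test_size=2)
```

Filtering by role before training and checking that every training index precedes every test index are the two invariants the manifests are meant to guarantee.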

Ready to use this?

All eight categories are accessible from day one on every plan. The only things that scale between Indie, Startup, and Pro are commercial scope and rate limits.