GaiaEx Academy
KDB+/Q: The Language of Tick Databases

How Wall Street stores and queries billions of market data points

KDB+ and Q in one minute

KDB+ is a column-oriented time-series database; Q is its vector language. Sell-side and buy-side firms adopted it for tick storage, joins, and rolling analytics where scanning columns beats row-by-row loops.

Performance comes from columnar layout, aggressive use of memory mapping, and vector primitives implemented close to the metal. It is not a general OLTP replacement; it is tuned for append-heavy time series.

Columnar layout (conceptual)
sym    A   A   B
time   t₁  t₂  t₃
price  p₁  p₂  p₃
Vector ops touch contiguous arrays → cache-friendly scans.
Schema and types still matter: bad sym lists cost space.
Tables are column vectors; analytics filter and aggregate along columns and time.
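
As a minimal sketch of the layout in the figure, the snippet below builds the same three-row table in Q and shows that a column is an ordinary vector. The table name `trades` and all values are illustrative assumptions, not data from a real feed.

```q
/ Illustrative three-row table matching the figure; the values are made up
trades:([] sym:`A`A`B; time:09:30:00.000 09:30:01.000 09:30:02.000; price:100.5 101.2 99.8)

/ Indexing by column name returns that column as a plain float vector
trades`price

/ The aggregate runs over the price vector, grouped by sym
select avg price by sym from trades
```

The query never touches the time column, which is the practical payoff of storing each column separately.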

Q basics

Q is terse and evaluates expressions right to left, with no operator precedence. Vectors are native; loops exist but hot paths avoid them. A table is a flipped dictionary of equal-length column vectors, which matches how tick data is stored.

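prices: 100.5 101.2 99.8   / a float vector
avg prices                 / arithmetic mean of the vector
deltas prices              / consecutive differences; the first item is returned unchanged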

Readable Q comes from small functions and comments—density alone is not a virtue in shared codebases.
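
The points in this section can be sketched in a few lines. The names `d`, `t`, and `mid` are hypothetical and chosen only for illustration.

```q
/ Right-to-left evaluation: 3+4 is computed first, then multiplied by 2
2*3+4          / 14

/ A table is a flipped dictionary of equal-length column vectors
d:`sym`price!(`A`A`B;100.5 101.2 99.8)
t:flip d

/ A small, named, commented function instead of an inlined one-liner
/ mid: midpoint of bid and ask; works on atoms and on whole vectors
mid:{[bid;ask] 0.5*bid+ask}
mid[100 101f;102 103f]
```

Because `mid` is built from vector primitives, the same definition serves a single quote or a full column without an explicit loop.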

Tickerplant, RDB, HDB

A common pattern: a tickerplant ingests feeds and fans out to subscribers; a real-time database (RDB) holds the current session in memory; a historical database (HDB) stores partitioned history on disk. An end-of-day process writes the RDB's contents into a new HDB partition and clears the RDB for the next session.

select last price by sym from trades where time > .z.t - 00:05

Crypto never closes, so “session” boundaries are operational choices, not exchange bells.
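
A minimal single-process sketch of the end-of-day roll follows, assuming a hypothetical `trades` table and an `hdb` directory under the current working directory. It uses the built-in `.Q.dpft` utility and is not the stock kdb+tick end-of-day script, which coordinates the roll across processes.

```q
/ Hypothetical intraday table held in the RDB; schema and values are illustrative
trades:([] time:09:30:00.000 09:30:01.000; sym:`BTCUSD`ETHUSD; price:100.5 101.2)

/ Operational choice for a 24/7 market: roll on the UTC date
rollDate:.z.d

/ .Q.dpft[directory;partition;parted field;table name] enumerates symbol columns,
/ sorts by sym with the parted attribute, and writes the table splayed
/ under hdb/<rollDate>/trades
.Q.dpft[`:hdb;rollDate;`sym;`trades]

/ Empty the in-memory table but keep its schema for the next session
delete from `trades
```

Whatever boundary is chosen, it should be applied consistently, since the partition date becomes part of every historical query.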

Tick pipeline (classic three-piece)
Feed handlers: normalize symbols; timestamp at boundary.
Tickerplant: log + pub/sub; no long-term store.
RDB + HDB: intraday vs history; partition by date.
Gap recovery uses logs; verify feed gaps and late corrections.
Crypto feeds can burst: size buffers and back-pressure.
Separation of real-time and historical keeps hot paths small and queries predictable.
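
Log-based recovery can be sketched as follows. The file name `tplog`, the `upd` function, and the `trades` schema are assumptions for illustration; a production tickerplant adds subscriber management, batching, and error handling that are omitted here.

```q
/ Function that replayed messages will call: append a record to the named table
upd:{[t;x] t insert x}

/ Empty table with a hypothetical schema
trades:([] time:`time$(); sym:`symbol$(); price:`float$())

/ Create an empty log file and open a handle to it
logFile:`:tplog
logFile set ()
h:hopen logFile

/ The tickerplant side appends each message to the log as (function;table;data)
h enlist (`upd;`trades;(09:30:00.000;`BTCUSD;100.5))
hclose h

/ A restarted subscriber replays the log; each message is evaluated, calling upd
-11!logFile
```

After the replay, `trades` holds the logged record again, which is how a restarted RDB can catch up before resubscribing to live updates.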

Where it shows up

Firms use KDB+ for surveillance dashboards, quote analytics, and research datasets. Names and deployments vary; the pattern is fast slice-and-dice over ticks and orders.

Crypto venues generate continuous data—plan retention, replay, and compliance export up front.

License cost and alternatives

Commercial licensing and specialized hiring are real costs. Open alternatives (ClickHouse, Timescale, QuestDB, DuckDB over Parquet) trade ecosystem fit for price. Benchmark on your query mix, not vendor slides.

Crypto tick stacks

Many teams pair Kafka or Redpanda with columnar storage and SQL engines. The invariant from KDB+ still applies: partition by time, keep schemas strict, and measure end-to-end lag from exchange timestamp to query result.
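
One way to measure lag, sketched under the assumption that each tick carries an exchange timestamp (`exchTime`) and a receive timestamp stamped at ingest (`recvTime`). Both column names and all values are hypothetical.

```q
/ Illustrative exchange timestamps and the delay observed at ingest
exch:2024.01.01D00:00:00 2024.01.01D00:00:01 2024.01.01D00:00:02
recv:exch+0D00:00:00.012 0D00:00:00.030 0D00:00:00.008
ticks:([] sym:`BTCUSD`BTCUSD`ETHUSD; exchTime:exch; recvTime:recv)

/ Subtracting two timestamps yields a timespan; report the worst lag per symbol
select worstLag:max recvTime-exchTime by sym from ticks
```

This covers only the exchange-to-ingest leg. Extending it to the query result would mean also recording the time at which the query returns, for example with `.z.p`.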

GaiaEx API consumers should log server timestamps and sequence identifiers if exposed—correlation beats guessing.
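
If sequence identifiers are available, gaps can be found with a few vector operations. The sketch below assumes integer identifiers that increase by one per message; whether a given API exposes such a field is an assumption to check against its documentation.

```q
/ Hypothetical sequence ids as logged; 4, 7 and 8 are missing
seq:1 2 3 5 6 9

/ Step between consecutive ids; the first item has no predecessor, so drop it
steps:1 _ deltas seq

/ Ids that arrived immediately after a gap
(1 _ seq) where steps>1      / 5 9
```

Logging the server timestamp alongside each flagged identifier makes it possible to line up a gap with a reconnect or a burst on the feed.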