Feature Extraction

The nano-lob crate provides comprehensive feature extraction from order book snapshots for machine learning models. These features capture market microstructure signals used to predict short-term price movements.

LobFeatureExtractor

LobFeatureExtractor is the core feature extraction engine. It derives 44+ features from the order book state, 44 of which to_array exports as a flat vector for ML input.

Initialization

use nano_lob::features::LobFeatureExtractor;

// Default: tick_size=0.25, qty_scale=100.0
let extractor = LobFeatureExtractor::new();

// Custom parameters for different instruments
let extractor = LobFeatureExtractor::with_params(
    0.25,   // tick_size: ES/NQ futures
    100.0,  // qty_scale: normalize quantities
);

Source: nano-lob/src/features.rs:58-74

Feature Structure

pub struct LobFeatures {
    // Price-based features
    pub microprice: f64,           // Volume-weighted mid price
    pub weighted_mid: f64,         // Depth-weighted mid price
    pub spread: f64,               // Bid-ask spread in ticks
    pub mid_price: f64,            // Simple mid price
    pub best_bid: f64,
    pub best_ask: f64,
    
    // Imbalance features
    pub imbalance_l1: f64,         // Level 1 imbalance (-1 to +1)
    pub imbalance_total: f64,      // Total depth imbalance
    
    // Depth features
    pub bid_depth: f64,            // Total bid quantity (normalized)
    pub ask_depth: f64,            // Total ask quantity (normalized)
    pub bid_levels: [f64; 10],     // Bid quantity at each level
    pub ask_levels: [f64; 10],     // Ask quantity at each level
    pub bid_cumulative: [f64; 10], // Cumulative bid depth
    pub ask_cumulative: [f64; 10], // Cumulative ask depth
}

Source: nano-lob/src/features.rs:10-40
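The relationship between the per-level and cumulative arrays can be sketched as a running sum. This assumes bid_cumulative[i] is the sum of bid_levels[0..=i], which the field comments suggest but the excerpt does not confirm; the helper name cumulative is hypothetical.

```rust
const FEATURE_LEVELS: usize = 10;

/// Running sum of per-level quantities: out[i] = levels[0] + ... + levels[i].
/// Hypothetical helper, not part of the nano-lob API.
fn cumulative(levels: &[f64; FEATURE_LEVELS]) -> [f64; FEATURE_LEVELS] {
    let mut out = [0.0; FEATURE_LEVELS];
    let mut running = 0.0;
    for (slot, &qty) in out.iter_mut().zip(levels.iter()) {
        running += qty;
        *slot = running;
    }
    out
}
```

Empty levels contribute zero, so the cumulative value simply stays flat past the last populated level.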

Core Features

Microprice

The microprice is a volume-weighted mid price that accounts for the liquidity at the best bid and ask:

microprice = (bid * ask_qty + ask * bid_qty) / (bid_qty + ask_qty)

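Because each price is weighted by the quantity on the opposite side, the microprice leans toward the side with less resting liquidity. This usually makes it a better estimate of the “fair” price than the simple mid: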

let microprice = extractor.microprice(&book).unwrap();

// Example calculation:
// bid = 5000.00, bid_qty = 100
// ask = 5000.25, ask_qty = 50
// microprice = (5000.00 * 50 + 5000.25 * 100) / 150
//            = (250000 + 500025) / 150
//            = 5000.17

Implementation:

let total_bbo_qty = bid_q + ask_q;
if total_bbo_qty > 0.0 {
    features.microprice = (bid * ask_q + ask * bid_q) / total_bbo_qty;
} else {
    features.microprice = features.mid_price;
}

Source: nano-lob/src/features.rs:96-102, 158-174
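As a self-contained check of the formula, the sketch below recomputes the worked example from plain f64 inputs. The free function and its signature are illustrative stand-ins; the crate's own method takes an OrderBook.

```rust
/// Microprice from best bid/ask prices and quantities.
/// Hypothetical helper for illustration. Falls back to the simple mid
/// when both quantities are zero, mirroring the implementation excerpt.
fn microprice(bid: f64, bid_qty: f64, ask: f64, ask_qty: f64) -> f64 {
    let total = bid_qty + ask_qty;
    if total > 0.0 {
        // Each price is weighted by the quantity on the opposite side.
        (bid * ask_qty + ask * bid_qty) / total
    } else {
        (bid + ask) / 2.0
    }
}
```

With bid = 5000.00 (100 lots) and ask = 5000.25 (50 lots), the result is about 5000.17, which sits closer to the ask than the simple mid of 5000.125 because the bid side is heavier.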

Weighted Mid Price

Averages prices across multiple levels, weighting each level by its quantity and by the inverse of its rank in the book:

let weighted_mid = extractor.weighted_mid(&book, 5).unwrap();

// weight = 1 / (i + 1), where i is the zero-based level index
// Level 1 (i = 0): weight = 1.0
// Level 2 (i = 1): weight = 0.5
// Level 3 (i = 2): weight ≈ 0.33
// ...

Implementation:

for i in 0..levels {
    let weight = 1.0 / (i as f64 + 1.0);
    
    if let Some(level) = book.bid_level(i) {
        let qty = f64::from(level.quantity.value());
        bid_sum += level.price.as_f64() * qty * weight;
        bid_weight_sum += qty * weight;
    }
    // ... same for ask side
}

// Result: (bid_sum + ask_sum) / (bid_weight_sum + ask_weight_sum)

Source: nano-lob/src/features.rs:176-206
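The excerpt above omits the ask side and the final division. A minimal standalone sketch of the whole calculation follows, assuming each side is given as a best-first slice of (price, quantity) pairs and that an empty book yields None. Both the slice-based signature and the None behaviour are assumptions, not the crate's API.

```rust
/// Depth-weighted mid over the top `levels` of each side.
/// `bids` and `asks` are (price, quantity) pairs ordered best-first.
/// Hypothetical signature for illustration only.
fn weighted_mid(bids: &[(f64, f64)], asks: &[(f64, f64)], levels: usize) -> Option<f64> {
    let mut price_sum = 0.0;
    let mut weight_sum = 0.0;
    for side in [bids, asks] {
        for (i, &(price, qty)) in side.iter().take(levels).enumerate() {
            // Quantity scaled by rank weight: 1.0 at the best level, 0.5 next, ...
            let w = qty / (i as f64 + 1.0);
            price_sum += price * w;
            weight_sum += w;
        }
    }
    if weight_sum > 0.0 {
        Some(price_sum / weight_sum)
    } else {
        None
    }
}
```

For a symmetric book the result equals the simple mid; extra quantity on one side pulls the value toward that side's prices.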

Book Imbalance

Measures the imbalance between bid and ask liquidity:

imbalance = (bid_qty - ask_qty) / (bid_qty + ask_qty)

Values range from -1 (all asks) to +1 (all bids):

// Level 1 imbalance only
let imb_l1 = extractor.book_imbalance(&book, 1);

// Multi-level imbalance (e.g., top 5 levels)
let imb_5 = extractor.book_imbalance(&book, 5);

// Example:
// bid_qty = 100, ask_qty = 50
// imbalance = (100 - 50) / (100 + 50) = 0.333
// Positive imbalance suggests upward price pressure

Implementation:

pub fn book_imbalance(&self, book: &OrderBook, levels: usize) -> f64 {
    let bid_qty = f64::from(book.total_bid_quantity(levels).value());
    let ask_qty = f64::from(book.total_ask_quantity(levels).value());

    let total = bid_qty + ask_qty;
    if total > 0.0 {
        (bid_qty - ask_qty) / total
    } else {
        0.0
    }
}

Source: nano-lob/src/features.rs:104-106, 208-220
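Level 1 and multi-level imbalance can disagree in sign when deeper levels are lopsided. The sketch below illustrates this with plain quantity slices standing in for the OrderBook; the slice-based signature is an assumption made to keep the example self-contained.

```rust
/// Imbalance over the top `levels` quantities of each side (best-first).
/// Slice inputs are a stand-in for the crate's OrderBook.
fn book_imbalance(bid_qtys: &[f64], ask_qtys: &[f64], levels: usize) -> f64 {
    let bid: f64 = bid_qtys.iter().take(levels).sum();
    let ask: f64 = ask_qtys.iter().take(levels).sum();
    let total = bid + ask;
    if total > 0.0 {
        (bid - ask) / total
    } else {
        0.0 // empty book: report a neutral imbalance
    }
}
```

With bids of [100, 20, 10] and asks of [50, 150, 100], the level 1 imbalance is positive (about 0.33) while the 3-level imbalance is negative (about -0.40), so the choice of depth matters when the value is used as a signal.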

Advanced Features

Order Flow Imbalance (OFI)

OFI measures the net change in resting liquidity at the best bid and ask between two consecutive book states:

let prev_book = /* previous book snapshot */;
let curr_book = /* current book snapshot */;

let ofi = extractor.order_flow_imbalance(&prev_book, &curr_book);

OFI Calculation Logic:

Bid side:
- If bid price improved (higher): +new_bid_qty
- If bid price worsened (lower): -old_bid_qty
- If same price: +delta_qty, where delta_qty = new_bid_qty - old_bid_qty

Ask side:
- If ask price improved (lower): -new_ask_qty
- If ask price worsened (higher): +old_ask_qty
- If same price: -delta_qty, where delta_qty = new_ask_qty - old_ask_qty

Implementation:

pub fn order_flow_imbalance(
    &self,
    prev_book: &OrderBook,
    curr_book: &OrderBook,
) -> f64 {
    let mut ofi = 0.0;

    // Bid side OFI
    if let (Some((prev_bp, prev_bq)), Some((curr_bp, curr_bq))) = 
        (prev_book.best_bid(), curr_book.best_bid()) 
    {
        if curr_bp > prev_bp {
            ofi += f64::from(curr_bq.value());
        } else if curr_bp < prev_bp {
            ofi -= f64::from(prev_bq.value());
        } else {
            ofi += (i64::from(curr_bq.value()) - i64::from(prev_bq.value())) as f64;
        }
    }

    // Ask side OFI (similar logic)
    // ...

    ofi / self.qty_scale
}

Source: nano-lob/src/features.rs:222-261
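Since the ask side is elided in the excerpt, here is a standalone sketch of both sides that follows the calculation logic listed earlier. It uses plain (price in ticks, quantity) tuples instead of OrderBook and skips the qty_scale division; the Quote alias and the signature are hypothetical, and the ask branch is inferred from the listed rules rather than copied from the crate.

```rust
/// Best-level quote as (price in ticks, quantity). Integer ticks avoid
/// float equality comparisons; this representation is an assumption.
type Quote = (i64, i64);

/// Unscaled order flow imbalance between two consecutive best quotes.
fn order_flow_imbalance(prev_bid: Quote, curr_bid: Quote, prev_ask: Quote, curr_ask: Quote) -> i64 {
    let bid_flow = if curr_bid.0 > prev_bid.0 {
        curr_bid.1 // bid improved: count the new quantity
    } else if curr_bid.0 < prev_bid.0 {
        -prev_bid.1 // bid worsened: the old quantity is gone
    } else {
        curr_bid.1 - prev_bid.1 // same price: change in quantity
    };
    let ask_flow = if curr_ask.0 < prev_ask.0 {
        -curr_ask.1 // ask improved (lower): new selling interest
    } else if curr_ask.0 > prev_ask.0 {
        prev_ask.1 // ask worsened (higher): old offers were removed
    } else {
        -(curr_ask.1 - prev_ask.1) // same price: negated change in quantity
    };
    bid_flow + ask_flow
}
```

For example, with unchanged prices, a bid growing from 100 to 120 and an ask shrinking from 50 to 40 gives 20 + 10 = 30, a net shift toward the buy side.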

VPIN (Volume-Synchronized Probability of Informed Trading)

VPIN estimates the probability of informed trading from the buy/sell volume imbalance across fixed-size volume buckets:

use nano_lob::features::VpinCalculator;

let mut vpin = VpinCalculator::new(
    1000,  // bucket_size: complete bucket after 1000 contracts traded
    50,    // num_buckets: use last 50 buckets for calculation
);

// Add trades as they occur
vpin.add_trade(Quantity::new(10), true);   // buy
vpin.add_trade(Quantity::new(5), false);   // sell

// Calculate VPIN (0 to 1)
let vpin_value = vpin.calculate();

// High VPIN (>0.7) suggests high probability of informed trading
// Low VPIN (<0.3) suggests more random trading

VPIN Formula:

VPIN = Σ|buy_volume - sell_volume| / Σ(buy_volume + sell_volume)

Implementation:

pub fn calculate(&self) -> f64 {
    if self.buckets.is_empty() {
        return 0.0;
    }

    let mut abs_imbalance_sum = 0.0;
    let mut total_volume = 0.0;

    for (buy, sell) in &self.buckets {
        let buy_f = f64::from(*buy);
        let sell_f = f64::from(*sell);
        abs_imbalance_sum += (buy_f - sell_f).abs();
        total_volume += buy_f + sell_f;
    }

    if total_volume > 0.0 {
        abs_imbalance_sum / total_volume
    } else {
        0.0
    }
}

Source: nano-lob/src/features.rs:298-388
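The excerpt shows only calculate. How trades fill buckets might look like the sketch below, which assumes a trade that overflows the current bucket is split across buckets and that the oldest bucket is evicted once num_buckets are held. The type VpinSketch, its fields, and the splitting rule are all assumptions, not the crate's implementation.

```rust
use std::collections::VecDeque;

/// Minimal VPIN sketch. Field names and overflow handling are assumptions.
struct VpinSketch {
    bucket_size: u32,
    num_buckets: usize,
    buckets: VecDeque<(u32, u32)>, // completed (buy, sell) volumes
    cur: (u32, u32),               // bucket currently being filled
}

impl VpinSketch {
    fn new(bucket_size: u32, num_buckets: usize) -> Self {
        assert!(bucket_size > 0 && num_buckets > 0);
        Self { bucket_size, num_buckets, buckets: VecDeque::new(), cur: (0, 0) }
    }

    fn add_trade(&mut self, mut qty: u32, is_buy: bool) {
        while qty > 0 {
            // Fill the current bucket up to its remaining capacity.
            let room = self.bucket_size - (self.cur.0 + self.cur.1);
            let fill = qty.min(room);
            if is_buy { self.cur.0 += fill } else { self.cur.1 += fill }
            qty -= fill;
            if self.cur.0 + self.cur.1 == self.bucket_size {
                // Bucket complete: evict the oldest if the window is full.
                if self.buckets.len() == self.num_buckets {
                    self.buckets.pop_front();
                }
                self.buckets.push_back(self.cur);
                self.cur = (0, 0);
            }
        }
    }

    fn calculate(&self) -> f64 {
        let (mut imbalance, mut volume) = (0.0, 0.0);
        for &(buy, sell) in &self.buckets {
            imbalance += (f64::from(buy) - f64::from(sell)).abs();
            volume += f64::from(buy) + f64::from(sell);
        }
        if volume > 0.0 { imbalance / volume } else { 0.0 }
    }
}
```

Only completed buckets enter the calculation in this sketch, so the value stays at 0.0 until the first bucket_size contracts have traded.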

Trade Flow Tracking

Track cumulative trade flow over time:

use nano_lob::features::TradeFlowTracker;

let mut tracker = TradeFlowTracker::new();

// Record trades
tracker.record_trade(Quantity::new(100), true, timestamp);  // buy
tracker.record_trade(Quantity::new(50), false, timestamp);  // sell

// Get net flow
let net = tracker.net_flow();  // 100 - 50 = 50

// Get flow imbalance (-1 to 1)
let imbalance = tracker.flow_imbalance();  // (100-50)/(100+50) = 0.333

Source: nano-lob/src/features.rs:390-451
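The arithmetic behind net_flow and flow_imbalance can be sketched with two running totals. This is a simplified stand-in that ignores timestamps and any windowing the real TradeFlowTracker may apply; the type FlowSketch and its fields are hypothetical.

```rust
/// Cumulative buy/sell volume. Hypothetical stand-in for illustration.
#[derive(Default)]
struct FlowSketch {
    buy_volume: u64,
    sell_volume: u64,
}

impl FlowSketch {
    fn record_trade(&mut self, qty: u64, is_buy: bool) {
        if is_buy { self.buy_volume += qty } else { self.sell_volume += qty }
    }

    /// Buy volume minus sell volume.
    fn net_flow(&self) -> i64 {
        self.buy_volume as i64 - self.sell_volume as i64
    }

    /// Net flow divided by total volume, in [-1, 1]; 0.0 before any trade.
    fn flow_imbalance(&self) -> f64 {
        let total = self.buy_volume + self.sell_volume;
        if total == 0 {
            return 0.0;
        }
        self.net_flow() as f64 / total as f64
    }
}
```

Recording a 100-lot buy and a 50-lot sell gives a net flow of 50 and an imbalance of 50 / 150, matching the comments in the usage snippet.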

ML Feature Vector

Convert all features to a flat array for ML model input:

let features_array: [f64; 44] = extractor.to_array(&book);

// Feature layout:
// [0]      microprice
// [1]      weighted_mid
// [2]      spread
// [3]      imbalance_l1
// [4-13]   bid_levels (10 levels)
// [14-23]  ask_levels (10 levels)
// [24-33]  bid_cumulative (10 levels)
// [34-43]  ask_cumulative (10 levels)

Implementation:

pub fn to_array(&self, book: &OrderBook) -> [f64; 44] {
    let features = self.extract(book);
    let mut arr = [0.0; 44];

    arr[0] = features.microprice;
    arr[1] = features.weighted_mid;
    arr[2] = features.spread;
    arr[3] = features.imbalance_l1;

    // Bid levels (10 levels)
    for i in 0..FEATURE_LEVELS {
        arr[4 + i] = features.bid_levels[i];
    }

    // Ask levels (10 levels)
    for i in 0..FEATURE_LEVELS {
        arr[14 + i] = features.ask_levels[i];
    }

    // Bid cumulative
    for i in 0..FEATURE_LEVELS {
        arr[24 + i] = features.bid_cumulative[i];
    }

    // Ask cumulative
    for i in 0..FEATURE_LEVELS {
        arr[34 + i] = features.ask_cumulative[i];
    }

    arr
}

Source: nano-lob/src/features.rs:264-295
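When consuming the flat vector downstream, named offsets are less error-prone than hard-coded indices. The sketch below derives the four per-level block offsets from the layout above; FEATURE_LEVELS appears in the source, while SCALARS, FEATURE_COUNT, block_start, and block are hypothetical names introduced here.

```rust
const FEATURE_LEVELS: usize = 10;
const SCALARS: usize = 4; // microprice, weighted_mid, spread, imbalance_l1
const FEATURE_COUNT: usize = SCALARS + 4 * FEATURE_LEVELS; // 44

/// Start offset of the k-th per-level block:
/// 0 = bid_levels, 1 = ask_levels, 2 = bid_cumulative, 3 = ask_cumulative.
const fn block_start(k: usize) -> usize {
    SCALARS + k * FEATURE_LEVELS
}

/// Borrows the k-th per-level block from a feature array.
/// Panics if k > 3, since the slice would run past the array.
fn block(arr: &[f64; FEATURE_COUNT], k: usize) -> &[f64] {
    let start = block_start(k);
    &arr[start..start + FEATURE_LEVELS]
}
```

For instance, block(&arr, 1) returns the ten ask level quantities stored at indices 14 to 23.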

Usage Example

Real-Time Feature Extraction

use nano_lob::{OrderBook, LobFeatureExtractor};
use nano_feed::parser::MdpParser;

let mut parser = MdpParser::new();
let mut book = OrderBook::new(1);
let extractor = LobFeatureExtractor::new();

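// Process market data stream.
// `buffer` (raw feed bytes) and the `MdpMessage` import are assumed to be in
// scope; refilling the buffer from `remaining` is omitted for brevity.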
loop {
    let (message, remaining) = parser.parse(&buffer)?;
    
    if let MdpMessage::BookUpdate(update) = message {
        book.apply_book_update(&update);
        
        // Extract features
        let features = extractor.extract(&book);
        
        println!("Microprice: {:.2}", features.microprice);
        println!("Spread: {:.2} ticks", features.spread);
        println!("Imbalance L1: {:.3}", features.imbalance_l1);
        println!("Total bid depth: {:.0}", features.bid_depth);
        println!("Total ask depth: {:.0}", features.ask_depth);
        
        // Convert to ML input
        let ml_input: [f64; 44] = extractor.to_array(&book);
        // Feed to model...
    }
}

Temporal Features with History

use nano_lob::snapshot::SnapshotRingBuffer;

let mut history = SnapshotRingBuffer::new(100);  // Keep last 100 snapshots
let extractor = LobFeatureExtractor::new();

// After each book update
let snapshot = book.to_snapshot(timestamp);
history.push(snapshot);

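// Calculate OFI between the two most recent snapshots.
// checked_sub avoids a usize underflow when fewer than two snapshots exist.
let prev = history.len().checked_sub(2).and_then(|i| history.get(i));
if let (Some(prev), Some(curr)) = (prev, history.latest()) {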
    let prev_book = OrderBook::from_snapshot(prev);
    let curr_book = OrderBook::from_snapshot(curr);
    
    let ofi = extractor.order_flow_imbalance(&prev_book, &curr_book);
    println!("OFI: {:.3}", ofi);
}
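A single OFI value is noisy, so per-update values are often summed over a short window. The sketch below keeps the last N values in a VecDeque; RollingOfi is a hypothetical helper, not part of nano-lob, and summation is only one possible aggregation.

```rust
use std::collections::VecDeque;

/// Sum of the most recent `window` OFI values. Hypothetical helper.
struct RollingOfi {
    window: usize,
    values: VecDeque<f64>,
}

impl RollingOfi {
    fn new(window: usize) -> Self {
        assert!(window > 0, "window must be positive");
        Self { window, values: VecDeque::with_capacity(window) }
    }

    /// Adds one per-update OFI value and returns the sum over the window.
    fn push(&mut self, ofi: f64) -> f64 {
        if self.values.len() == self.window {
            self.values.pop_front(); // drop the oldest value
        }
        self.values.push_back(ofi);
        // Re-summing avoids floating-point drift from incremental updates.
        self.values.iter().sum()
    }
}
```

Calling push with each new OFI value after a book update yields the windowed total, which can be used as an additional temporal feature.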

Multi-Instrument Tracking

use std::collections::HashMap;

let mut books = HashMap::new();
let mut extractors = HashMap::new();

// ES futures: tick_size = 0.25
books.insert("ES", OrderBook::new(1));
extractors.insert("ES", LobFeatureExtractor::with_params(0.25, 100.0));

// NQ futures: tick_size = 0.25  
books.insert("NQ", OrderBook::new(2));
extractors.insert("NQ", LobFeatureExtractor::with_params(0.25, 50.0));

// Extract features for each instrument
for (symbol, book) in &books {
    let extractor = &extractors[symbol];
    let features = extractor.extract(book);
    println!("{} microprice: {:.2}", symbol, features.microprice);
}

Performance Characteristics

Extraction Latency

Full feature extraction: ~500-800ns
Microprice only: ~50-100ns
Book imbalance: ~100-200ns
OFI calculation: ~200-400ns
VPIN update: ~50-100ns

These figures are approximate and hardware-dependent. Benchmark on your own system:

cd crates/nano-lob
cargo bench --bench features

Memory Footprint

LobFeatures: ~400 bytes for the fields shown under Feature Structure (50 × f64 at 8 bytes each); the 44-element feature array is 352 bytes
VpinCalculator: ~1KB (depends on num_buckets)
TradeFlowTracker: 32 bytes
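Struct sizes can be checked with std::mem::size_of rather than estimated. The sketch below mirrors the field counts listed under Feature Structure (10 scalar f64 fields plus four 10-element arrays); LobFeaturesMirror is a hypothetical stand-in, and the real struct may carry additional fields.

```rust
use std::mem::size_of;

/// Mirror of the field counts in the Feature Structure listing.
/// The real struct may differ; this only checks the arithmetic.
#[allow(dead_code)]
struct LobFeaturesMirror {
    scalars: [f64; 10],        // microprice through ask_depth
    per_level: [[f64; 10]; 4], // bid/ask levels and cumulative depth
}

/// Size in bytes of the mirrored feature struct.
fn mirror_size() -> usize {
    size_of::<LobFeaturesMirror>()
}

/// Size in bytes of the flat ML feature array.
fn feature_array_size() -> usize {
    size_of::<[f64; 44]>()
}
```

Under these assumptions the mirror occupies 50 × 8 = 400 bytes and the flat array 44 × 8 = 352 bytes. Running the same check against the real types gives the authoritative numbers.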

Feature Interpretation

Microprice vs Mid Price

Mid price: Simple average, ignores liquidity
Microprice: Weighted by BBO liquidity, better fair value estimate
Use microprice for:
- Order placement decisions
- Fair value estimation
- Spread crossing decisions

Imbalance Signals

Positive imbalance (>0.3): More bid liquidity → upward pressure
Negative imbalance (<-0.3): More ask liquidity → downward pressure
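Neutral (-0.2 to 0.2): Balanced book
Values between 0.2 and 0.3 in magnitude are a weak signal; treat all of these cutoffs as rules of thumb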

OFI Interpretation

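Positive OFI: Net buying pressure at the best quotes (bid liquidity added or ask liquidity removed) → potential price increase
Negative OFI: Net selling pressure at the best quotes (ask liquidity added or bid liquidity removed) → potential price decrease
OFI is often used as a short-horizon leading indicator; it suggests, but does not guarantee, the direction of the next price move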

VPIN Thresholds

VPIN < 0.3: Low informed trading, safer to provide liquidity
VPIN 0.3-0.7: Moderate informed trading
VPIN > 0.7: High informed trading, higher adverse selection risk
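These bands can be encoded as a small classifier. The sketch below uses the thresholds above and treats the boundaries 0.3 and 0.7 as part of the moderate band; the Toxicity enum and classify_vpin are hypothetical names, and the cutoffs should be read as rules of thumb.

```rust
/// Coarse VPIN regime. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
enum Toxicity {
    Low,
    Moderate,
    High,
}

/// Maps a VPIN value to a regime using the 0.3 and 0.7 cutoffs.
/// A NaN input fails both comparisons and is classified as High,
/// which is the conservative choice for a liquidity provider.
fn classify_vpin(vpin: f64) -> Toxicity {
    if vpin < 0.3 {
        Toxicity::Low
    } else if vpin <= 0.7 {
        Toxicity::Moderate
    } else {
        Toxicity::High
    }
}
```

A strategy might, for example, widen quotes or reduce size when the regime is High.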

Next Steps

ML Integration: Feed features into ML models
Order Book: Learn about the OrderBook structure
Market Data: Ingest market data feeds
Strategy Development: Build trading strategies
