Autonomous LLM agents for trade intelligence, sub-millisecond risk, and 14-state order execution — running live on a dual-node NVIDIA Grace-Blackwell GB10 cluster.
Not a chatbot bolted onto a dashboard. Seven production services, on-prem GPUs, 200 Gbps RDMA fabric, and a CME-calibrated margin engine — all speaking open protocols (llms.txt, MCP, x402).
Every service below is live in production and exposed through a Cloudflare Tunnel. Click any hostname to hit the real endpoint.
/v1/market-insights, /v1/portfolio-recommendations
scipy.optimize calibration
Machine-readable, agent-native, and pay-per-call: the platform is built for the protocols the next generation of financial AI agents will speak.
Every service exposes /llms.txt and /llms-full.txt — machine-readable descriptions of endpoints, schemas, and capabilities for AI agent discovery.
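As a rough illustration of what agent-side discovery could look like, the sketch below parses an llms.txt document into sections of links. It assumes the file follows the common llms.txt convention of markdown link items (`- [title](url): description`) under `## ` headings; the sample text and the `service.example` hostname are hypothetical and are not taken from any live service.

```python
import re
from typing import NamedTuple


class Link(NamedTuple):
    title: str
    url: str
    description: str


# Matches list items of the form "- [title](url): optional description".
_LINK_RE = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*(.*))?$")


def parse_llms_txt(text: str) -> dict:
    """Group link entries under the most recent '## ' section heading."""
    sections = {}
    current = "(top)"
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            continue
        m = _LINK_RE.match(line)
        if m:
            desc = (m.group(3) or "").strip()
            sections.setdefault(current, []).append(
                Link(m.group(1), m.group(2), desc)
            )
    return sections


# Hypothetical sample document, for illustration only.
SAMPLE = """\
# Example Service
> Hypothetical summary line.

## Endpoints
- [Market insights](https://service.example/v1/market-insights): JSON insights feed
- [Full reference](https://service.example/llms-full.txt)
"""

sections = parse_llms_txt(SAMPLE)
for name, links in sections.items():
    print(name, [link.title for link in links])
```

In practice an agent would fetch `/llms.txt` first and fall back to `/llms-full.txt` only when it needs the complete schema detail.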
Model Context Protocol endpoints let AI agents discover and invoke tools programmatically, with typed schemas and structured results.
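To make the discover-then-invoke flow concrete, here is a minimal sketch of the JSON-RPC 2.0 messages an MCP client sends for `tools/list` and `tools/call`. The tool name `get_margin_estimate`, its arguments, and its schema are hypothetical placeholders, not tools confirmed to exist on this platform; transport and session setup are omitted.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # simple monotonically increasing request ids


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope, as used by MCP."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def missing_required(tool: dict, arguments: dict) -> list:
    """Return required argument names (per the tool's inputSchema) not supplied."""
    required = tool.get("inputSchema", {}).get("required", [])
    return [name for name in required if name not in arguments]


# Step 1: ask the server which tools it offers.
list_req = jsonrpc_request("tools/list")

# Hypothetical tool entry, shaped like one item of a tools/list result.
tool = {
    "name": "get_margin_estimate",
    "inputSchema": {
        "type": "object",
        "properties": {"symbol": {"type": "string"}, "quantity": {"type": "integer"}},
        "required": ["symbol", "quantity"],
    },
}

# Step 2: check arguments against the schema, then invoke the tool by name.
args = {"symbol": "ESZ5", "quantity": 2}
assert not missing_required(tool, args)
call_req = jsonrpc_request("tools/call", {"name": tool["name"], "arguments": args})

print(json.dumps(call_req, indent=2))
```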
Spark · Mercurius · SPAN 2
Pay-per-call API access gated by x402 on Base Sepolia. No API keys, just on-chain USDC. Verified with live on-chain transactions.
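The client side of a pay-per-call flow can be sketched as follows: the server answers an unpaid request with HTTP 402 and a list of acceptable payment options, and the client picks one that fits its budget before retrying with a signed payment. The field names used here (`accepts`, `network`, `maxAmountRequired`) reflect one reading of the public x402 specification and should be checked against it; the sample response, addresses, and price are hypothetical, and payment signing and the retry request are deliberately left out.

```python
from decimal import Decimal
from typing import Optional

USDC_DECIMALS = 6  # USDC amounts are expressed in units of 10**-6


def pick_requirement(body: dict, network: str, max_usdc: Decimal) -> Optional[dict]:
    """Return the first payment option on `network` priced within budget."""
    budget_atomic = int(max_usdc * 10**USDC_DECIMALS)
    for req in body.get("accepts", []):
        if req.get("network") != network:
            continue
        if int(req["maxAmountRequired"]) <= budget_atomic:
            return req
    return None


# Hypothetical body of a 402 response (assumed shape, placeholder addresses).
SAMPLE_402 = {
    "x402Version": 1,
    "accepts": [
        {
            "scheme": "exact",
            "network": "base-sepolia",
            "maxAmountRequired": "10000",  # 0.01 USDC in atomic units
            "payTo": "0x0000000000000000000000000000000000000000",
            "asset": "0x0000000000000000000000000000000000000000",
        }
    ],
}

chosen = pick_requirement(SAMPLE_402, "base-sepolia", Decimal("0.05"))
too_cheap = pick_requirement(SAMPLE_402, "base-sepolia", Decimal("0.005"))
print(chosen is not None, too_cheap is None)
```

Capping spend per call on the client side is a design choice worth keeping even on a testnet, since an autonomous agent would otherwise pay whatever price a server quotes.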
Spark · Mercurius
Services operate as autonomous agents: Spark generates, Mercurius validates, OMS executes. HMAC-signed ledgers make every step cryptographically auditable.
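One way an HMAC-signed ledger can be built is as a chain in which each entry's signature covers both its payload and the previous entry's signature, so that editing or reordering any step invalidates everything after it. The sketch below is an illustration under that assumption, not the platform's actual ledger format; the entry fields (`agent`, `action`) and the demo key are hypothetical.

```python
import hashlib
import hmac
import json


def sign_entry(key: bytes, prev_sig: str, payload: dict) -> dict:
    """Create a ledger entry whose HMAC covers the payload and the prior signature."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    message = (prev_sig + "|" + body).encode("utf-8")
    sig = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"prev": prev_sig, "payload": payload, "sig": sig}


def verify_ledger(key: bytes, entries: list) -> bool:
    """Recompute every signature in order and check the chain links."""
    prev = ""
    for entry in entries:
        if entry["prev"] != prev:
            return False
        expected = sign_entry(key, prev, entry["payload"])["sig"]
        if not hmac.compare_digest(expected, entry["sig"]):
            return False
        prev = entry["sig"]
    return True


KEY = b"demo-key-not-for-production"  # hypothetical key for illustration

# Record the generate -> validate -> execute hand-off as three chained entries.
ledger = []
prev_sig = ""
for step in (
    {"agent": "Spark", "action": "generate"},
    {"agent": "Mercurius", "action": "validate"},
    {"agent": "OMS", "action": "execute"},
):
    entry = sign_entry(KEY, prev_sig, step)
    ledger.append(entry)
    prev_sig = entry["sig"]

print(verify_ledger(KEY, ledger))
```

Because HMAC is symmetric, anyone who can verify the ledger can also forge entries, so this design suits audits by parties that already hold the key.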
platform-wide
Dual-node NVIDIA Grace-Blackwell GB10 cluster. TensorRT-LLM + vLLM. NVFP4 / FP8 quantization. 235B-parameter models in production.
LocalInfra
major-broker TraderAPI + IBKR (ib-insync with fully unattended IBC v3.23). Single broker-abstraction layer across the stack.
Spark · OMS · SSETF
I'm a senior engineering leader with 20+ years at the intersection of institutional trading infrastructure and frontier AI systems. I build the thing end-to-end: the LLM agents, the risk math, and the GPU metal they run on.
Currently leading the AI-native modernization of Organization's Risk Monitor (P0 initiative — 10× analyst triage speedup), while independently architecting UnifiedQuant, Mercurius (sub-ms agentic risk firewall), Agentic OMS, UnifiedRBM (cross-asset RBM), and LocalInfra (dual Grace-Blackwell cluster running 235B-parameter inference at 15.4 tok/s).
Fluent in CME SPAN 2 mathematics, agentic LLM orchestration (DSPy DAGs, MCP, x402), Blackwell-class GPU optimization (NVFP4, TensorRT-LLM, RDMA), and — most importantly — the strategy / factory / visitor patterns that keep a JVM-era trading core safe during Python translation.