| Category | Count | Details |
|---|---|---|
| NVIDIA Libraries | 15+ | CUDA 12.8 full suite + cuQuantum + NCCL + Triton |
| LLM Providers | 4 | Anthropic · DeepSeek · NVIDIA NIM · Novita (proxy) |
| LLM Models Available | 13+ | Claude Sonnet 4.6 / Haiku 4.5 / Opus 4.7 · Nemotron 49B (self-hosted)/70B · Llama 405B/3.3-70B · Maverick 17B · DeepSeek V3/V4 · FinBERT |
| GPU Frameworks | 8 | JAX/XLA · PennyLane · cuPy · PyTorch · WebGPU |
| ML Libraries | 8 | scikit-learn · SciPy · NumPy · Numba · SymPy · QuantEcon · Transformers · CCXT |
| Market Data Sources | 9 | Rithmic · Databento · Finnhub · OANDA · FRED · CME · FIGI · CCXT · TV Scanner |
| Storage Layers | 3 | Redis 7.4 · QuestDB 9.3.4 · SQLite (quantumx.db / correlations.db / atallum.db) |
| Active Microservices | 10+ | FastAPI · Django · TankFarm · Celery · Rithmic · ExecBridge · QuantumTunnel · nginx |
| GPU Servers | 7 | SaladCloud JAX (Tier 1 ML) · SaladCloud Nemotron 49B (deploying) · H200 (on-demand) · Main server CPU |
| Observability Tools | 4 | W&B Weave · wandb · Sentry · OpenTelemetry |
| Cloud Providers | 3 | DigitalOcean · Google Cloud · SaladCloud |
| Total Python Packages | 172+ | Base set plus esig, roughpy, pennylane-lightning, lineax, jaxtyping (added May 2026) |

| Rank | GPU | VRAM | Architecture | Memory BW | Location / Use | Tier |
|---|---|---|---|---|---|---|
| #1 ◆ FLAGSHIP | NVIDIA H200 SXM | 141 GB HBM3e | Hopper GH100 · 16,896 CUDA cores · 528 Tensor cores | 4.8 TB/s | Dedicated: MERQUAN COMMAND · ssh h200 · Tailscale 100.117.88.107 · HPC SDK 26.3 · cuQuantum · RAPIDS · cuOpt | DEDICATED |
| #2 | NVIDIA A100 SXM × 2 | 80 GB HBM2e × 2 = 160 GB total | Ampere GA100 · 6,912 CUDA · 432 Tensor · 108 SMs | 2 TB/s × 2 | MERQUAN-R1 Training / NeMo Blueprint / data-flywheel | TRAINING |
| #3 | NVIDIA H100 SXM5 (NIM cloud) | 80 GB HBM3 | Hopper GH100 · 16,896 CUDA cores · 528 Tensor cores | 3.35 TB/s | NVIDIA NIM Cloud · integrate.api.nvidia.com/v1 · per-token via Enterprise Account | NIM CLOUD |
| #4 FALLBACK | NVIDIA L40 | 48 GB GDDR6 | Ada Lovelace AD102 · 18,176 CUDA · 568 Tensor | 864 GB/s | Previous dedicated GPU server · fallback / overflow compute | FALLBACK |
| #5 | NVIDIA RTX 5090 | 32 GB GDDR7 | Blackwell GB202 · 21,760 CUDA · 680 Tensor | 1,792 GB/s | SaladCloud pool (best node) | DEDICATED |
| #6 | NVIDIA RTX 4090 | 24 GB GDDR6X | Ada Lovelace AD102 · 16,384 CUDA · 512 Tensor | 1,008 GB/s | SaladCloud pool | DEDICATED |
| #7 | NVIDIA RTX 3090 Ti | 24 GB GDDR6X | Ampere GA102 · 10,752 CUDA · 336 Tensor | 1,008 GB/s | SaladCloud pool | DEDICATED |
| #8 | NVIDIA RTX A5000 | 24 GB GDDR6 | Ampere GA102 · 8,192 CUDA · 256 Tensor | 768 GB/s | SaladCloud pool | DEDICATED |
| #9 | NVIDIA RTX 3090 | 24 GB GDDR6X | Ampere GA102 · 10,496 CUDA · 328 Tensor | 936 GB/s | SaladCloud pool | DEDICATED |

▌ CLOUD CONTAINERS

| Container | GPU Allocation | CPU / RAM / Disk | Model / Engine | Endpoint | Pool | Status |
|---|---|---|---|---|---|---|
| merquanjaxsalad (JAX Tier 1 ML) | Best available node from pool: RTX 5090 → 4090 → 3090Ti → A5000 → 3090 | 32 vCPU · 60 GB RAM · 250 GB SSD · Priority: High · Replicas: 1 | JAX v2, 9 engines: Black-76 · HAR-RV/GARCH · VaR · Path Signatures · Rough Vol · Mamba SSM · Neural SDE · Deep Hedging · Quantum VQC | durian-alfalfa-sz0nun7i8h0cf8ud.salad.cloud:8888 | High | ACTIVE |
| NIM Nemotron 49B (LLM Inference) | RTX 5090 (32 GB), required for 49B int4; fails over if unavailable | 32 vCPU · 60 GB RAM · 250 GB SSD · Priority: High · Replicas: 1 | nvidia/llama-3.3-nemotron-super-49b-v1.5 · self-hosted, replaces cloud NIM API | 195.181.163.241:20608 (SSH) · Gateway URL: TBD on deploy | High | ACTIVE |
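
The 32 GB requirement for the 49B int4 container can be checked with simple arithmetic: 49 billion weights at 4 bits each come to about 24.5 GB before any KV cache or runtime overhead. The sketch below is a hypothetical helper, not the scheduler SaladCloud actually runs. The names `weight_gb` and `pick_gpu`, the 4 GB headroom, and the VRAM figures (public spec-sheet values for stock cards) are all assumptions made for illustration.

```python
# Preference order taken from the container table above.
POOL_ORDER = ["RTX 5090", "RTX 4090", "RTX 3090 Ti", "RTX A5000", "RTX 3090"]

# Assumed VRAM per card in GB (public spec-sheet values, stock cards).
VRAM_GB = {
    "RTX 5090": 32,
    "RTX 4090": 24,
    "RTX 3090 Ti": 24,
    "RTX A5000": 24,
    "RTX 3090": 24,
}


def weight_gb(params_billion, bits_per_weight):
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def pick_gpu(required_gb, available, order=POOL_ORDER):
    """Return the first GPU in preference order that is available and large enough."""
    for name in order:
        if name in available and VRAM_GB[name] >= required_gb:
            return name
    return None  # nothing suitable: the caller has to fail over


weights = weight_gb(49, 4)      # 24.5 GB of int4 weights
headroom_gb = 4.0               # assumed allowance for KV cache and runtime
needed = weights + headroom_gb  # 28.5 GB

print(pick_gpu(needed, set(POOL_ORDER)))            # RTX 5090
print(pick_gpu(needed, {"RTX 4090", "RTX 3090"}))   # None
print(pick_gpu(0.0, {"RTX 4090", "RTX 3090"}))      # RTX 4090
```

Under these assumptions only the RTX 5090 clears the bar for the LLM container, while the JAX container can fall through the whole preference list.
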

▌ MAIN SERVERS & SERVICES

| Service | Stack / Version | CPU / RAM | Role | Address / Port | Uptime | Status |
|---|---|---|---|---|---|---|
| MERQUAN Main Server | DigitalOcean VPS, Ubuntu 24 | 8 vCPU · 16 GB | All backend services + nginx + SSL | 161.35.43.80 / merquan.com | 24/7 | ACTIVE |
| FastAPI: merquan_fastapi | uvicorn + uvloop · Python 3.12 | 1 worker | NAWA 14-engine, all trading APIs, SAFA | 127.0.0.1:5001 | 24/7 | ACTIVE |
| Django: merquanglobal | gunicorn · 8w × 4t · Python 3.11 | 8 vCPU share | API router, Sanctum admin, CLAW, Celery | 127.0.0.1:8000 | 24/7 | ACTIVE |
| Tank Farm: merqintel | FastAPI · Python 3.11 | — | Candle normaliser, 6 data tanks, QuestDB proxy | 127.0.0.1:5003 | 24/7 | ACTIVE |
| Redis | Redis 7.4.0 | In-memory | Tick cache, Rithmic DOM store, Celery broker | 127.0.0.1:6379 | 24/7 | ACTIVE |
| QuestDB | QuestDB 9.3.4 | Disk: 1M+ bars | Time-series OHLCV — 16yr GC + Rithmic 327 contracts | 127.0.0.1:9000 | 24/7 | ACTIVE |
| Rithmic R\|Protocol | Protobuf/gRPC WSS | 50+ proto defs | Live CME futures ticks, DOM, BBO depth | merquan-rithmic.service | 24/7 | ACTIVE |
| Exec Bridge | FastAPI · Python | — | OANDA demo order execution | 127.0.0.1:5004 | 24/7 | ACTIVE |
| Celery + Beat | Redis broker · 4 workers | — | Background tasks, rvol refresh, QuestDB backfill | merquan-celery.service | 24/7 | ACTIVE |
| DO Agent Server | DigitalOcean, London | 4 vCPU · 8 GB | OpenClaw agent, @yasin_scholar_bot, skill sync 30min | 161.35.43.80 | 24/7 | ACTIVE |
| nginx | Reverse proxy · SSL | — | All services + static + SSL termination | merquan.com / merqintel.com | 24/7 | ACTIVE |
| Databento | databento 0.75.0 | 10M 1min bars | 16yr historical GC — Parquet → H200 pipeline | api.databento.com | On-demand | ACTIVE |

| Library | Version | Used For | Status |
|---|---|---|---|
| CUDA Toolkit | 12.8 | Low-level GPU compute foundation | Active |
| cuDNN | 9.10.2.21 | Neural network layer acceleration | Active |
| cuBLAS | 12.8 | GPU linear algebra (matrix multiply) | Active |
| cuFFT | 12.8 | Fast Fourier transforms on GPU | Active |
| cuSOLVER | 12.8 | GPU-side linear solvers / eigenvalue | Active |
| cuSPARSE | 12.8 | Sparse matrix operations | Active |
| cuRAND | 12.8 | GPU random number generation (Monte Carlo) | Active |
| cuFile | 12.8 | GPU Direct Storage I/O | Active |
| nvJitLink | 12.8 | JIT kernel linking | Active |
| NVTX | 12.8 | GPU profiling markers | Active |
| cuQuantum | 26.01.0 | Quantum circuit simulation + HMM regime detection | Active |
| NCCL | 2.27.5 | Multi-GPU communication | Active |
| Triton (OpenAI kernel compiler) | 3.6.0 | GPU kernel compilation + PyTorch JIT fusion | Active |
| NVIDIA NIM Cloud API | — | Hosted LLM inference (Nemotron, Llama, DeepSeek) | Active |
| NIM Self-Hosted (new) | — | llama-3.3-nemotron-super-49b-v1.5 on SaladCloud | Deploying |

| Type | Official Model Name | GPU Spec | Container / Server | Purpose | Status |
|---|---|---|---|---|---|
| Cloud NIM API (current) | meta/llama-4-maverick-17b-128e-instruct | NVIDIA Cloud (hosted) | integrate.api.nvidia.com/v1 | Trading bias, CLAW fallback | ACTIVE |
| Cloud NIM API | nvidia/llama-3.1-nemotron-70b-instruct | NVIDIA Cloud (hosted) | integrate.api.nvidia.com/v1 | CLAW institutional inference | ACTIVE |
| Cloud NIM API | meta/llama-3.3-70b-instruct | NVIDIA Cloud (hosted) | integrate.api.nvidia.com/v1 | General inference via CLAW | ACTIVE |
| Cloud NIM API | meta/llama-3.1-405b-instruct | NVIDIA Cloud (hosted) | integrate.api.nvidia.com/v1 | Heavy reasoning tasks | ACTIVE |
| Self-Hosted NIM (NEW) | nvidia/llama-3.3-nemotron-super-49b-v1.5 | SaladCloud GPU pool · 32 GB VRAM · 32 vCPU · 60 GB RAM | 195.181.163.241:20608 · SaladCloud container | Self-hosted 49B reasoning; replaces all cloud NIM calls | ACTIVE |

| Engine | Academic Source | What It Does | JAX Stack | Endpoint | File | Status |
|---|---|---|---|---|---|---|
| Black-76 Batch Pricer | Black (1976) — Futures Options | Full strike chain pricing + exact Greeks (Delta/Vega/Theta) in one GPU call | jax.vmap + jax.grad + jax.jit | /api/jax/options | engines/options.py | LIVE ✓ |
| HAR-RV + GARCH | Corsi (2009) + Bollerslev (1986) | Heterogeneous autoregressive realised volatility + GARCH(1,1) batch across instruments | jax.lax.scan + jax.vmap | /api/jax/har_rv | engines/har_rv.py | LIVE ✓ |
| Monte Carlo VaR | Basel II/III | 10K bootstrap simulations, JIT-compiled, configurable confidence level | jax.jit + jax.vmap | /api/jax/var | engines/har_rv.py | LIVE ✓ |
| Path Signatures | Lyons (2014), Oxford Mathematical Institute | Signature features of price paths. Linear functionals of the signature can approximate continuous path-dependent functionals (universal approximation) | esig 1.0.0 + numpy | /api/jax/v2/signatures | engines/path_signatures.py | LIVE ✓ |
| Rough Volatility (RFSV) | Gatheral/Jaisson/Rosenbaum (2018) | Rough fractional stochastic volatility with Hurst exponent H≈0.1. rBergomi simulation + implied vol inversion | Custom JAX + jnp.linalg.lstsq | /api/jax/v2/rough_vol | engines/rough_vol.py | LIVE ✓ |
| Mamba SSM | Gu & Dao (2023), Mamba paper | O(n) selective state space model, linear in sequence length where Transformer attention is quadratic. 5-regime detection + direction | Equinox 0.13.8 + jax.lax.scan | /api/jax/v2/regime | engines/mamba.py | LIVE ✓ |
| Neural SDE (Latent) | Kidger et al. (2021) — ICLR | Drift + diffusion as neural networks. Full price path distribution (not point estimate). ELBO training | Diffrax 0.7.2 + lineax + Equinox | /api/jax/v2/nsde_paths | engines/neural_sde.py | LIVE ✓ |
| Deep Hedging | Buehler et al. (2019) — J.P.Morgan | End-to-end RL hedging strategy. CVaR loss. Handles transaction costs. Replaces Black-Scholes delta | Equinox GRU + jax.lax.scan | /api/jax/v2/deep_hedge | engines/deep_hedging.py | LIVE ✓ |
| Quantum-Classical VQC | Farhi et al. (2014) QAOA + PennyLane | VQC portfolio optimisation. QAOA ansatz, parameter-shift gradients. Explores Hilbert space for weight optima | PennyLane 0.44.1 + Optax Adam | /api/jax/v2/quantum_portfolio | engines/quantum_hybrid.py | LIVE ✓ |
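
To make the first engine row concrete, here is a minimal sketch of Black-76 call pricing across a strike chain. It is not the code in `engines/options.py`: the table says the engine uses `jax.vmap`, `jax.grad` and `jax.jit`, whereas this stand-in uses only the Python standard library, a list comprehension in place of `vmap`, and the closed-form delta in place of `jax.grad`. The function names and the sample inputs are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def _d1_d2(F, K, sigma, T):
    """Black-76 d1 and d2 for futures price F, strike K, vol sigma, expiry T (years)."""
    vol_sqrt_t = sigma * math.sqrt(T)
    d1 = (math.log(F / K) + 0.5 * sigma * sigma * T) / vol_sqrt_t
    return d1, d1 - vol_sqrt_t


def black76_call(F, K, sigma, T, r):
    """Discounted Black-76 call price on a futures contract."""
    d1, d2 = _d1_d2(F, K, sigma, T)
    return math.exp(-r * T) * (F * norm_cdf(d1) - K * norm_cdf(d2))


def black76_call_delta(F, K, sigma, T, r):
    """Closed-form delta with respect to the futures price."""
    d1, _ = _d1_d2(F, K, sigma, T)
    return math.exp(-r * T) * norm_cdf(d1)


def price_chain(F, strikes, sigma, T, r):
    """Price every strike in the chain (the role jax.vmap plays on the GPU)."""
    return [black76_call(F, K, sigma, T, r) for K in strikes]


# Hypothetical inputs: a gold future at 2000, 20% vol, six months, 3% rate.
chain = price_chain(2000.0, [1900.0, 2000.0, 2100.0], 0.2, 0.5, 0.03)
delta = black76_call_delta(2000.0, 2000.0, 0.2, 0.5, 0.03)
```

An autodiff delta and the closed-form delta should agree to numerical precision, which is a useful sanity check when porting a sketch like this to `jax.grad`.
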

| Library | Version | Algorithms / Features Used | Used In |
|---|---|---|---|
| scikit-learn | 1.8.0 | RandomForest, KNN, LogisticRegression, GaussianNB, SVM, PCA, StandardScaler | merquan_algo.py, merquan_arb_engine.py, merquan_ml_engine.py |
| SciPy | 1.17.1 | stats (norm, entropy, linregress), optimize (minimize, linprog), signal (savgol_filter), cluster.hierarchy | Physics engines, portfolio optimisation |
| NumPy | 2.4.3 | Core arrays, orjson NumPy serialisation (OPT_SERIALIZE_NUMPY) | Throughout, all engines |
| Numba | 0.65.0 | @njit JIT compilation for tight-loop indicators (moving averages) | merquan_engines_fast.py |
| SymPy | 1.14.0 | Symbolic mathematics in physics engines | Physics engine modules |
| QuantEcon | 0.11.2 | Economic models (Markov, LQ control, etc.) | Engine research modules |
| HuggingFace Transformers | latest | FinBERT pipeline for financial NLP sentiment | merquan_nawa_v2.py (E7 lazy load) |
| CCXT | 4.5.46 | Unified async crypto exchange API — 5 exchanges | main.py crypto price loop |

| Provider | Model(s) | Endpoint / Base | Role | Fallback Order |
|---|---|---|---|---|
| Anthropic Claude | claude-sonnet-4-6 / haiku-4-5 / opus-4-7 | api.anthropic.com | Commentary, CLAW assistant, trading bias | 1 — Primary |
| DeepSeek Official | deepseek-v3, deepseek-v4-flash, v4-pro | api.deepseek.com | Fast reasoning, analysis, scalp commentary | 2 — Fast LLM |
| Novita AI | deepseek-v3 (proxy) | api.novita.ai/openai | DeepSeek relay when official API unavailable | 3 — Relay |
| NVIDIA NIM Cloud | llama-4-maverick-17b, nemotron-70b, llama-3.3-70b, llama-3.1-405b | integrate.api.nvidia.com/v1 | Institutional inference, bias gen, CLAW models | 4 — Fallback |
| NVIDIA NIM Local | llama-3.3-nemotron-super-49b-v1.5 | SaladCloud (deploying) | Self-hosted 49B reasoning model | New — Deploying |
| FinBERT | ProsusAI/finbert | HuggingFace / local load | NLP sentiment on Finnhub headlines (NAWA E7) | Always-on (lazy load) |
| Merai | Merqintel V1.0 | GitHub / local load | Commentary, CLAW assistant, trading bias | Always-on |

| Framework | Feature | Used For | Where | Status |
|---|---|---|---|---|
| JAX + XLA | jax.jit | JIT-compile options pricing + volatility models | merquanjaxsalad / SaladCloud | Active |
| JAX | jax.vmap | Vectorise full strike chains in single GPU call | engines/options.py | Active |
| JAX | jax.lax.scan | GARCH(1,1) loop — GPU-native, zero Python overhead | engines/har_rv.py | Active |
| JAX | jax.grad | Exact Black-76 Greeks (delta/vega/theta) | engines/options.py | Active |
| Equinox | — | JAX-native neural net layers (GRU, Linear, etc.) | merquanjaxsalad image | In Image |
| Optax | — | JAX gradient optimisers (Adam, etc.) | merquanjaxsalad image | In Image |
| BlackJAX | — | Bayesian inference on GPU | merquanjaxsalad image | In Image |
| Diffrax | — | Differentiable ODEs/SDEs on GPU | merquanjaxsalad image | In Image |
| PennyLane GPU | lightning.gpu | Quantum ML circuits (NAWA quantum engines) | merquan_quantum_stack.py | Conditional |
| cuPy | — | GPU array ops + GARCH volatility | merquan_quantum_stack.py | Conditional |
| cuQuantum | — | Quantum circuit simulation, HMM regimes | merquan_quantum_stack.py | Conditional |
| PyTorch 2.10 | torch.jit | Deep learning backbone + Triton kernel fusion | merqintel engines | Active |
| WebGPU / WebGL2 | — | Browser-side heatmap + footprint rendering | qce_engine.html / heatmaptest.html | Active |
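
The `jax.lax.scan` row refers to the GARCH(1,1) recursion, where each conditional variance depends on the previous one: sigma²(t) = omega + alpha · r(t-1)² + beta · sigma²(t-1). The sketch below shows the carry-and-emit structure in plain Python. The local `scan` helper is a stand-in that mimics the `(carry, x) -> (carry, y)` contract of `jax.lax.scan`, and it is not the code in `engines/har_rv.py`. The parameter values are hypothetical.

```python
def scan(step, init, xs):
    """Pure-Python stand-in for jax.lax.scan: thread a carry through xs, collect outputs."""
    carry, ys = init, []
    for x in xs:
        carry, y = step(carry, x)
        ys.append(y)
    return carry, ys


def garch11_path(returns, omega, alpha, beta, var0):
    """Return (variance path, next-step forecast) for a GARCH(1,1) model."""

    def step(var, r):
        # Emit the variance in force for this return, carry the updated one forward.
        next_var = omega + alpha * r * r + beta * var
        return next_var, var

    forecast, path = scan(step, var0, returns)
    return path, forecast


# Hypothetical daily returns and parameters.
path, forecast = garch11_path([0.01, -0.02], omega=1e-6, alpha=0.1, beta=0.85, var0=1e-4)
```

Because the step function is a pure function of the carry and the current return, the same body can be handed to `jax.lax.scan` and batched across instruments with `jax.vmap`.
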

| Source | Data Type | Instruments | Key / Credential | Rate Limit / Notes |
|---|---|---|---|---|
| Rithmic R\|Protocol | Live ticks, DOM, BBO, depth-by-order, EOD | 327 CME/COMEX contracts (GCM6, 6E, 6J etc.) | /etc/merquan/rithmic.env | Protobuf/gRPC WSS · merquan-rithmic.service |
| Databento | 16yr historical OHLCV | GC (gold futures) — 10M 1min bars | databento 0.75.0 installed | 466K bars in QuestDB, 549K 1h bars |
| Finnhub | Quotes, headlines, sentiment, tick data | Stocks, ETFs, FX, metals | 3 keys: paid/ticks/free | 150 req/min — FinnhubLimiter token bucket |
| OANDA Streaming | Live FX + metals tick stream | XAU, XAG, XPT, EUR, JPY, GBP + 40+ pairs | OANDA_API_KEY (env) | tick_ingestor.py persistent stream |
| FRED | Macroeconomic — Fed Funds rate | FEDFUNDS series | FRED_API_KEY (env) | 24h cache |
| CME Group | Futures options chains, market data | 6E (EUR), 6J (JPY), GC (Gold) options | /etc/merquan/cme_merquan.env | OAuth2 client_credentials |
| Bloomberg FIGI | Instrument identifier mapping | Global securities | figi_instruments.py | REST API |
| Binance / OKX / Bybit / Kraken / Coinbase | Crypto prices + arb spreads | 40+ tokens | CCXT async | 15s refresh, cross-exchange arb detection |
| TradingView Scanner | CME/COMEX/NYMEX futures OHLCV | All futures contracts | Public API (no auth) | scanner.tradingview.com — 15s cache |
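
The Finnhub row mentions a token bucket capping traffic at 150 requests per minute. A minimal sketch of that technique follows. It is not the project's `FinnhubLimiter`: the class name `TokenBucket`, the `try_acquire` method and the injectable clock are hypothetical choices made so the example can be exercised without waiting in real time.

```python
import time


class TokenBucket:
    """Token bucket: refill at a steady rate up to a cap, spend one token per request."""

    def __init__(self, rate_per_min, capacity=None, clock=time.monotonic):
        self.rate = rate_per_min / 60.0  # tokens added per second
        self.capacity = float(rate_per_min if capacity is None else capacity)
        self.tokens = self.capacity      # start full
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n=1.0):
        """Refill for the elapsed time, then take n tokens if they are available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


# Fake clock so the demo is deterministic.
now = [0.0]
bucket = TokenBucket(rate_per_min=150, capacity=2, clock=lambda: now[0])

first = bucket.try_acquire()   # True: bucket starts full
second = bucket.try_acquire()  # True: second token
third = bucket.try_acquire()   # False: bucket is empty
now[0] += 1.0                  # one second later, 2.5 tokens refill (capped at 2)
fourth = bucket.try_acquire()  # True again
```

A small capacity limits bursts, while the rate bounds the long-run average; a caller that gets `False` can sleep briefly and retry.
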

| Tool | Version | What It Tracks | Alert Thresholds | Status |
|---|---|---|---|---|
| W&B Weave | 0.52.37 | All LLM calls (Claude/DeepSeek/NIM) — cost + latency + error rate | >5% error in 5min / >$5/hr spend | Active — 41+ run logs |
| wandb | 0.26.1 | ML experiment tracking, model runs | wandb.alert() on anomalies | Active |
| Sentry SDK | 2.58.0 | Application error tracking + crash reporting | — | Wired (verify active) |
| OpenTelemetry | 1.40.0 | Distributed tracing across services | — | Packages present |
| journalctl/systemd | — | Service health logs for all 10+ systemd units | — | Active |
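
The W&B Weave thresholds (more than 5% errors in 5 minutes, more than $5 per hour of spend) amount to two sliding-window checks. The sketch below shows one way such checks could be evaluated locally. The class `LlmCallMonitor` and its methods are hypothetical and are not part of the Weave or wandb APIs; delivering the resulting alert is left out.

```python
from collections import deque


class LlmCallMonitor:
    """Sliding-window error-rate and spend checks over recorded LLM calls."""

    def __init__(self, error_window_s=300.0, max_error_rate=0.05,
                 spend_window_s=3600.0, max_spend_usd=5.0):
        self.error_window_s = error_window_s
        self.max_error_rate = max_error_rate
        self.spend_window_s = spend_window_s
        self.max_spend_usd = max_spend_usd
        self.calls = deque()  # (timestamp_s, is_error, cost_usd), oldest first

    def record(self, ts, is_error, cost_usd):
        """Store one call and drop anything older than the longest window."""
        self.calls.append((ts, is_error, cost_usd))
        longest = max(self.error_window_s, self.spend_window_s)
        while self.calls and ts - self.calls[0][0] > longest:
            self.calls.popleft()

    def alerts(self, now):
        """Return the names of the thresholds currently exceeded."""
        recent = [c for c in self.calls if now - c[0] <= self.error_window_s]
        errors = sum(1 for c in recent if c[1])
        error_rate = errors / len(recent) if recent else 0.0
        spend = sum(c[2] for c in self.calls if now - c[0] <= self.spend_window_s)
        fired = []
        if error_rate > self.max_error_rate:
            fired.append("error_rate")
        if spend > self.max_spend_usd:
            fired.append("spend")
        return fired


monitor = LlmCallMonitor()
for i in range(20):
    # One failure in twenty calls is exactly 5%, which does not exceed the threshold.
    monitor.record(ts=float(i), is_error=(i == 0), cost_usd=0.1)
quiet = monitor.alerts(now=20.0)

monitor.record(ts=21.0, is_error=True, cost_usd=0.0)
monitor.record(ts=22.0, is_error=True, cost_usd=3.5)
noisy = monitor.alerts(now=22.0)
```

Both thresholds are strict inequalities here, matching the ">" in the table.
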