Compare S, M, L and XL

Which size for which organisation?

All boxes share the same stack, the same behaviour, the same sovereignty guarantees. What changes: capacity, supported models, price.

LMbox S

For teams of 20-40 people

Compact, silent, plugs into a standard wall outlet.

Concurrent users up to 40

Avg throughput ≈ 28 tok/s

Time-to-first-token ≈ 800 ms

Power · noise 65 W · 22 dB

12 000 €

+ 4 800 € / yr

Request a quote

★ Most popular

LMbox M

For mid-market organisations of 50–150 people

Reference build: enough horsepower to run Gemma 4 31B comfortably, no exotic hardware required.

Concurrent users up to 150

Avg throughput ≈ 52 tok/s

Time-to-first-token ≈ 450 ms

Power · noise 180 W · 32 dB

25 000 €

+ 9 600 € / yr

Request a quote

LMbox L

For 150+ users or multi-site

Maxed-out Apple Silicon: 192 GB unified RAM for heavy models or hundreds of concurrent users.

Concurrent users up to 400

Avg throughput ≈ 85 tok/s

Time-to-first-token ≈ 280 ms

Power · noise 350 W · 40 dB

38 000 €

+ 14 400 € / yr

Request a quote

LMbox XL

Datacenter build — frontier MoE models, locally

4U Linux rack with 4–8× NVIDIA H100/H200 — the only way to run open-weights frontier models (Kimi K2.6, DeepSeek V4 Pro) fully on-prem.

Concurrent users up to 1500

Avg throughput ≈ 220 tok/s

Time-to-first-token ≈ 150 ms

Power · noise 3500 W · 65 dB

95 000 €

+ 28 800 € / yr

Request a quote

Full specifications

Attribute	LMbox S	LMbox M ★	LMbox L	LMbox XL
Hardware
Form factor	Compact desktop · 5 × 19.7 × 19.7 cm	Compact Mac Studio · 9.5 × 19.7 × 19.7 cm	Mac Studio M3 Ultra MAX or Linux 1U rack-mount	Linux 4U 19" rack-mount · multi-GPU NVIDIA
Processor	Apple M4 Pro · 14 CPU cores · 20 GPU cores	Apple M3 Ultra · 32 CPU cores · 80 GPU cores	Apple M3 Ultra max or AMD EPYC 7763 (x86 variant)	AMD EPYC 9754 128-core or Intel Xeon Platinum 8592+
Unified RAM	64 GB	96 GB	192 GB	1024 GB
Storage	2 TB	4 TB	8 TB	16 TB
AI accelerator	16-core Apple Neural Engine (38 TOPS)	32-core Apple Neural Engine (60 TOPS) · 80-core GPU	32-core Apple Neural Engine (Mac) or 2 × NVIDIA L40S (x86 variant)	4 × NVIDIA H100 80 GB (base config) or 8 × H200 141 GB (max config)
Network	1 × 10 GbE · Wi-Fi 6E · Thunderbolt 4	2 × 10 GbE · Wi-Fi 6E · 6 × Thunderbolt 4	2 × 10 GbE · 1 × 25 GbE optional · IPMI (x86 variant)	2 × 25 GbE · 1 × 100 GbE InfiniBand optional · IPMI 2.0 · BMC
Physical
Dimensions (h × w × d)	5 × 19.7 × 19.7 cm	9.5 × 19.7 × 19.7 cm	Compact Mac Studio or 1U 19" rack	4U 19" rack · 175 × 482 × 712 mm
Weight	1.4 kg	3.6 kg	3.6 kg	32.0 kg
Power draw	65 W	180 W	350 W	3500 W
Noise level	22 dB	32 dB	40 dB	65 dB
Performance
Concurrent users	40 users	150 users	400 users	1500 users
Avg throughput	≈ 28 tok/s	≈ 52 tok/s	≈ 85 tok/s	≈ 220 tok/s
Time-to-first-token	≈ 800 ms	≈ 450 ms	≈ 280 ms	≈ 150 ms
Models
Supported models	Gemma 4 9B · Mistral Small 3 · Llama 3 8B · Qwen 2.5 7B	Gemma 4 31B · Mistral Large 2 · Llama 3 70B (q4) · Qwen 2.5 32B · Codestral 22B	Gemma 4 31B fp16 · Llama 3 70B · Qwen 3.6 32B · GLM 5.1 · Mistral Large 2 · Mixtral 8x22B	Kimi K2.6 · DeepSeek V4 Pro · MiMo V2.5 Pro (1M ctx) · Qwen 3.6 Max · Llama 3 70B fp16 · Mistral Large 2 fp16
Pricing
CAPEX (one-off)	12 000 €	25 000 €	38 000 €	95 000 €
Annual support	4 800 € / yr	9 600 € / yr	14 400 € / yr	28 800 € / yr
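
To compare sizes over a planning horizon, the one-off price and the annual support fee can be combined into a rough total cost. The sketch below uses the prices and user limits from the table above. The function names are illustrative, and it assumes support is billed at a flat rate for every year of the horizon, including the first, and that the box runs at its maximum listed user count. Energy, hosting, and taxes are not included.

```python
# Prices (EUR) and user limits copied from the specification table.
PRICING = {
    "S": {"capex": 12_000, "support_per_year": 4_800, "max_users": 40},
    "M": {"capex": 25_000, "support_per_year": 9_600, "max_users": 150},
    "L": {"capex": 38_000, "support_per_year": 14_400, "max_users": 400},
    "XL": {"capex": 95_000, "support_per_year": 28_800, "max_users": 1500},
}


def total_cost(box: str, years: int) -> int:
    """One-off price plus support for `years` years.

    Assumption: support is charged every year at the listed flat rate.
    """
    p = PRICING[box]
    return p["capex"] + years * p["support_per_year"]


def cost_per_user_per_month(box: str, years: int) -> float:
    """Total cost spread over the maximum listed user count (an optimistic bound)."""
    p = PRICING[box]
    return total_cost(box, years) / (p["max_users"] * years * 12)


if __name__ == "__main__":
    for name in PRICING:
        print(name, total_cost(name, 3), round(cost_per_user_per_month(name, 3), 2))
```

Under these assumptions, an LMbox M over three years comes to 25 000 € + 3 × 9 600 € = 53 800 €, or just under 10 € per user per month at 150 users.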

Decision helper

How to choose?

LMbox S

Starting or piloting

6-month PoC, pilot team, 30-person subsidiary, or consulting firm with occasional needs. The S ticks all the boxes at a reasonable price.

LMbox M

Deploying at scale

Our default recommendation: mid-market organisations of 50–150 people, 30B models at full precision, low latency, no exotic hardware needed.

LMbox L

You have hard constraints

Multi-site, 70B+ models, > 80 tok/s, 1U datacenter rack (Linux x86), or heavy concurrent loads.

LMbox XL

Frontier models, fully on-prem

Datacenter build with multiple NVIDIA H100 GPUs, for running Kimi K2.6, DeepSeek V4 Pro or MiMo (1M context) with zero cloud calls. Suited to mid-market organisations with 200+ developers, or critical-infrastructure operators with maximum sovereignty requirements.
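
As a first pass, the guidance above can be reduced to a small lookup. This is only a sketch: the user limits come from the "Concurrent users" row of the specification table, but the selection rule and the `suggest_box` name are assumptions made for illustration, and they ignore constraints such as rack form factor, multi-site setups, or specific model requirements below the frontier tier.

```python
from typing import Optional

# Maximum concurrent users per size, from the specification table.
BOXES = [("S", 40), ("M", 150), ("L", 400), ("XL", 1500)]


def suggest_box(concurrent_users: int, needs_frontier_models: bool = False) -> Optional[str]:
    """Return the smallest size that fits, or None if no single box does.

    Assumption: frontier models (Kimi K2.6, DeepSeek V4 Pro, ...) are only
    available on the XL, as the supported-models row indicates.
    """
    if needs_frontier_models:
        return "XL" if concurrent_users <= 1500 else None
    for name, max_users in BOXES:
        if concurrent_users <= max_users:
            return name
    return None


if __name__ == "__main__":
    print(suggest_box(30))         # pilot team
    print(suggest_box(120))        # mid-market deployment
    print(suggest_box(120, True))  # same headcount, frontier models required
```

A result of `None` means the load exceeds a single XL, which is a case this page does not cover.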

Talk to an expert