The best linear models
without the cost

Leman.zero delivers transformer-quality results with linear complexity. Memory efficient, blazing fast, and easy to integrate.

Start Building

Outperforming the competition
with linear efficiency.

Leman.zero achieves state-of-the-art results among linear models while using a fraction of the memory and compute of traditional transformers.

#1

Best accuracy among linear models

1B

Tokens to train 14B model

Only 3,000 training steps

~5%

Gap to Qwen-3 14B accuracy

With far less memory usage

14B Scale

Best among linear models

Arch C benchmark at 14B scale. Leman.zero leads all linear models and closes the gap with full transformers.

Model                 Type                    Score (%)
Mamba Codestral       Linear                  47
Mamba Falcon          Linear                  70
Leman.zero (Ours)     Linear                  77.8
Qwen-3 14B            Transformer baseline    82

Only ~5% gap to full transformer at a fraction of the cost

Arch C benchmark (14B parameter scale). Score % — higher is better.

Long Context

Near-perfect recall at long context lengths

Needle-in-a-Haystack accuracy across context lengths. Leman.zero stays near-perfect where others collapse.

[Chart: NIAH-1 accuracy (0% to 100%) against context length (1k, 2k, 4k, 8k, 16k, 32k) for Leman.zero (Ours), GDN, and Mamba2]

91.4% accuracy at 32k context — 4x better than the next linear model

NIAH-1 accuracy at 500M parameter scale. Higher is better.

Benchmarks

Leman.zero vs. the competition

1B model performance on standard benchmarks. Higher is better.

Benchmark    Leman.zero    GDN    Mamba2
SWDE         70            68     65
FDA          63            57     58
SQuAD        46            40     33

Results from 1B parameter models. See our documentation for full benchmark details.

Why Leman.zero

The best of both worlds

Transformer-quality results with linear efficiency. No compromises.

Linear Complexity

The O(n) attention mechanism scales linearly with sequence length, unlike the quadratic O(n²) attention in standard transformers.

Memory Efficient

Process longer contexts with less GPU memory. Run larger models on smaller hardware.

Long Context Support

Generalizes beyond training length for very long sequences. No context window limitations.

OpenAI-Compatible API

Drop-in replacement for existing integrations. Switch with a single line change.

Efficient Training

Train competitive models with far fewer tokens and steps. Reduce your compute costs dramatically.

State-of-the-Art Quality

Best-in-class results among linear models on the SWDE, FDA, and SQuAD benchmarks.
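
To put the linear-complexity claim above in rough numbers, here is a minimal sketch of how cost grows with sequence length. It is an illustrative cost model only, not a measurement of Leman.zero: the function names are hypothetical, and the model counts token interactions while ignoring constants, hidden size, and hardware effects.

```typescript
// Illustrative cost model (assumption): counts token interactions only.

// Standard attention: every token interacts with every other token.
function quadraticCost(n: number): number {
  return n * n;
}

// Linear model: one fixed-size update per token.
function linearCost(n: number): number {
  return n;
}

// How many times more work the quadratic model does at length n.
function costRatio(n: number): number {
  return quadraticCost(n) / linearCost(n);
}

for (const n of [1000, 8000, 32000]) {
  console.log(`n=${n}: quadratic/linear = ${costRatio(n)}x`);
}
```

Under this simplified model, doubling the context doubles the linear cost but quadruples the quadratic one, so the difference widens as sequences get longer.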

example.ts
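import OpenAI from 'openai';

// Try it for free - no API key needed
const client = new OpenAI({
  baseURL: 'https://carloshurtadocomin--lemanlabs-openai-api-fastapi-app.modal.run/v1',
  // The SDK throws if apiKey is unset and OPENAI_API_KEY is missing, so pass a placeholder
  apiKey: 'not-needed',
});

const response = await client.chat.completions.create({
  model: 'leman-zero',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

// Print the assistant's reply
console.log(response.choices[0].message.content);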

Simple Integration

One line of code to switch

Our API is fully compatible with OpenAI's SDK. Just change the base URL and you're ready. No rewrites, no migrations, no headaches.

OpenAI SDK compatible
Streaming support
Function calling
JSON mode
View Documentation
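
Since streaming is listed as supported, here is a minimal sketch of collecting a streamed reply into one string. It assumes the endpoint emits chunks in the same shape as OpenAI's chat completion stream (`choices[].delta.content`). `ChunkLike` and `collectStream` are hypothetical names introduced for illustration and are not part of any SDK.

```typescript
// Minimal structural type for the fields read below. It mirrors the shape of
// OpenAI chat completion stream chunks (assumed to apply to this endpoint).
interface ChunkLike {
  choices: Array<{ delta: { content?: string | null } }>;
}

// Concatenate the text deltas from a streamed chat completion.
async function collectStream(chunks: AsyncIterable<ChunkLike>): Promise<string> {
  let text = '';
  for await (const chunk of chunks) {
    // Some chunks (role announcements, finish markers) carry no content.
    text += chunk.choices[0]?.delta?.content ?? '';
  }
  return text;
}

// Hypothetical usage with the client from example.ts (needs network access):
//   const stream = await client.chat.completions.create({
//     model: 'leman-zero',
//     messages: [{ role: 'user', content: 'Hello!' }],
//     stream: true,
//   });
//   console.log(await collectStream(stream));
```

The helper accepts any async iterable of chunk-shaped objects, so it can be exercised with a fake stream before pointing it at the live API.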

Ready to build with
Leman.zero?

Get started in minutes. Experience state-of-the-art linear models with an API you already know.

Start Building
OpenAI-compatible API
No credit card required