The best linear models
—without the cost

Leman.zero delivers transformer-quality results with linear complexity. Memory efficient, blazing fast, and easy to integrate.

Start Building

Outperforming the competition
with linear efficiency.

Leman.zero achieves state-of-the-art results among linear models while using a fraction of the memory and compute of traditional transformers.

#1

Best accuracy among linear models

1B

Tokens to train a 14B model

Only 3,000 training steps

~5%

Gap to Qwen-3 14B accuracy

With far less memory usage

14B Scale

Best among linear models

Arch C benchmark at 14B scale. Leman.zero leads all linear models and closes the gap with full transformers.

Chart: linear models (Mamba Codestral, Mamba Falcon, Leman.zero (ours) at 77.8) against the transformer baseline (Qwen-3 14B).

Only a ~5% gap to the full transformer, at a fraction of the cost

Arch C benchmark (14B parameter scale). Score % — higher is better.

Long Context

Near-perfect recall at long context lengths

Needle-in-a-Haystack accuracy across context lengths. Leman.zero stays near-perfect where others collapse.

Chart series: Leman.zero (ours), GDN, Mamba2.

91.4% accuracy at 32k context, 4x the accuracy of the next-best linear model

NIAH-1 accuracy at 500M parameter scale. Higher is better.

Benchmarks

Leman.zero vs. the competition

1B model performance on standard benchmarks. Higher is better.

Table: SWDE, FDA, and SQuAD scores for Leman.zero, GDN, and Mamba2.

Results from 1B parameter models. See our documentation for full benchmark details.

Why Leman.zero

The best of both worlds

Transformer-quality results with linear efficiency. No compromises.

Linear Complexity

O(n) attention mechanism scales efficiently with sequence length, unlike quadratic transformers.

Memory Efficient

Process longer contexts with less GPU memory. Run larger models on smaller hardware.

Long Context Support

Generalizes beyond training length for very long sequences. No context window limitations.

OpenAI-Compatible API

Drop-in replacement for existing integrations. Switch with a single-line change.

Efficient Training

Train competitive models with far fewer tokens and steps. Reduce your compute costs dramatically.

State-of-the-Art Quality

Best-in-class results among linear models on the SWDE, FDA, and SQuAD benchmarks.
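To make the Linear Complexity point above concrete, here is a minimal sketch of generic kernelized linear attention. Leman.zero's own architecture is not described on this page, so treat this as an illustration of the O(n) idea in general rather than the model's actual mechanism. The names `phi`, `linearAttention`, and `quadraticReference` are hypothetical, and the feature map elu(x) + 1 is one common choice assumed for the example.

```typescript
type Vec = number[];

// Assumed feature map: phi(x) = elu(x) + 1, which keeps every entry positive.
const phi = (x: Vec): Vec => x.map((v) => (v > 0 ? v + 1 : Math.exp(v)));

const dot = (a: Vec, b: Vec): number => a.reduce((s, v, i) => s + v * b[i], 0);

// Causal linear attention in a single pass over the sequence.
// The running state (S: d x dv, z: d) has a fixed size, so time grows
// linearly with sequence length and memory for the state stays constant.
function linearAttention(q: Vec[], k: Vec[], v: Vec[]): Vec[] {
  if (q.length === 0) return [];
  const d = q[0].length;
  const dv = v[0].length;
  const S: number[][] = Array.from({ length: d }, () => new Array<number>(dv).fill(0));
  const z: Vec = new Array<number>(d).fill(0);
  const out: Vec[] = [];
  for (let t = 0; t < q.length; t++) {
    const fk = phi(k[t]);
    const fq = phi(q[t]);
    // Fold the current key/value pair into the running state.
    for (let i = 0; i < d; i++) {
      z[i] += fk[i];
      for (let j = 0; j < dv; j++) S[i][j] += fk[i] * v[t][j];
    }
    // Read out: o_t = (phi(q_t)^T S) / (phi(q_t)^T z).
    const norm = dot(fq, z);
    const o: Vec = new Array<number>(dv).fill(0);
    for (let j = 0; j < dv; j++) {
      for (let i = 0; i < d; i++) o[j] += fq[i] * S[i][j];
      o[j] /= norm;
    }
    out.push(o);
  }
  return out;
}

// Reference: the same weights computed pairwise, which costs O(n^2).
function quadraticReference(q: Vec[], k: Vec[], v: Vec[]): Vec[] {
  return q.map((qt, t) => {
    const fq = phi(qt);
    const w = k.slice(0, t + 1).map((ks) => dot(fq, phi(ks)));
    const norm = w.reduce((a, b) => a + b, 0);
    return v[0].map((_, j) => w.reduce((s, ws, idx) => s + ws * v[idx][j], 0) / norm);
  });
}
```

Both functions return the same outputs; the difference is that `linearAttention` never revisits earlier tokens, which is where the linear scaling comes from.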

example.ts

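import OpenAI from 'openai';

// Try it for free - no API key needed
const client = new OpenAI({
  baseURL: 'https://carloshurtadocomin--lemanlabs-openai-api-fastapi-app.modal.run/v1',
  // The SDK throws if apiKey is unset and OPENAI_API_KEY is missing, so pass a placeholder.
  apiKey: 'not-needed',
});

const response = await client.chat.completions.create({
  model: 'leman-zero',
  messages: [
    { role: 'user', content: 'Hello!' }
  ],
});

// Print the assistant's reply.
console.log(response.choices[0].message.content);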

Simple Integration

One line of code to switch

Our API is fully compatible with OpenAI's SDK. Just change the base URL and you're ready. No rewrites, no migrations, no headaches.

OpenAI SDK compatible

Streaming support

Function calling

JSON mode
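The features listed above map onto standard fields of an OpenAI-style chat completions request. The sketch below shows one way to build such a request with streaming or JSON mode switched on. `buildChatRequest` and `parseJsonReply` are hypothetical helpers written for this example, and it assumes the Leman.zero endpoint honors the standard `stream` and `response_format` fields in the way the feature list suggests.

```typescript
// Minimal structural types so the sketch compiles without the `openai` package.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
  response_format?: { type: 'json_object' };
}

// Hypothetical helper: builds the body of a chat completions request.
function buildChatRequest(
  prompt: string,
  opts: { stream?: boolean; jsonMode?: boolean } = {},
): ChatRequest {
  const messages: ChatMessage[] = [];
  if (opts.jsonMode) {
    // OpenAI's JSON mode expects the messages to mention JSON explicitly.
    messages.push({ role: 'system', content: 'Reply with a single JSON object.' });
  }
  messages.push({ role: 'user', content: prompt });

  const request: ChatRequest = { model: 'leman-zero', messages };
  if (opts.stream) request.stream = true;
  if (opts.jsonMode) request.response_format = { type: 'json_object' };
  return request;
}

// Hypothetical helper: parses a JSON-mode reply, returning null instead of
// throwing when the content is missing or is not valid JSON.
function parseJsonReply(content: string | null): unknown {
  if (content === null) return null;
  try {
    return JSON.parse(content);
  } catch {
    return null;
  }
}
```

The object returned by `buildChatRequest` can be passed to `client.chat.completions.create(...)` from the OpenAI SDK, or sent as the JSON body of a POST to the `/chat/completions` path under the base URL.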

View Documentation

Ready to build with
Leman.zero?

Get started in minutes. Experience state-of-the-art linear models with an API you already know.

Start Building

OpenAI-compatible API

No credit card required
