Products

Inference built for scale

Doubleword Inference Stack

High performance inference stack

Use Cases

Long running background agents

Synthetic Data Generation

Generate high volumes of data for fine-tuning

Data Processing

Apply intelligence to large volumes of data

Resources

Technical docs and API reference

Ready-to-run examples

Seen in the Wild

Community content and projects

Resource Centre

All our blogs and guides

Our blog on building inference systems

Key AI terms explained

Savings Calculator

See how much you save with Doubleword

Solutions

By Deployment Option

On-premise

Cloud

Hybrid

By Team

AI, ML & Data Science

Platform, DevOps & IT

Compliance & Cyber

Stay Updated

Resource Center

More articles:

Customer Stories

Categories

Press

Technical Guide

News

Blog

Video

Webinar

Tutorial

Search

Themes

Artificial Intelligence

Batch inference

Enterprise AI

Fast LLMs

Fine-Tuning

Future of AI

Hardware

Inference Lab

Inference Optimization

Medium

MLOps

Model Serving

NLP Models

Quantization

Rust

Self-Hosted Architecture

Speculative Decoding

Titan Takeoff Inference Server

Healthcare

The swarm that designs itself

Inference Lab

Blog

•

June 30, 2026

AI Tokenomics: How to tokenmin while ROImaxxing

Inference Optimization

Press

•

June 30, 2026

What happens when you run a CUDA kernel

Inference Lab

Blog

•

June 29, 2026

Why You Should Use Open Source Models

Blog

•

June 22, 2026

A Frontier Open Source LLM Will Be Released On 3rd December 2026

Inference Lab

Blog

•

June 20, 2026

InfiniBand, RoCE, and all that

Inference Lab

Blog

•

June 19, 2026

UCCL-EP: An expert parallel communications kernel without owning the NIC

Inference Lab

Blog

•

June 12, 2026

Anatomy of a high-performance EP kernel

Inference Lab

Blog

•

June 10, 2026

Doubleword CEO Meryem Arik Talks Scaling and AI Investment | Bloomberg Talks

Press

•

June 10, 2026

Deloitte launches 'Adopt 100' programme with NVIDIA to accelerate AI adoption for businesses

Press

•

June 9, 2026

How the UK Is Turning Sovereign AI Ambition Into Action With NVIDIA Technologies

Press

•

June 9, 2026

The economics of speculative decoding

Inference Lab

Blog

•

June 8, 2026

Pushing memory bound kernels beyond the speed of light with lossless decompression

Inference Lab

Blog

•

May 26, 2026

MoE expert co-activations: Reordering inputs yields easy throughput gains.

Inference Lab

Blog

•

May 15, 2026

Meryem Arik Sifted Interview

Future of AI

Press

https://sifted.eu/

•

May 15, 2026

Speculative KV coding: losslessly compressing KV cache by up to ~4× using a predictor model

Inference Lab

Blog

•

May 8, 2026

Tensor Network Attention

Inference Lab

Blog

•

May 6, 2026

In search of wasted bits: how much information do LLM weights carry?

Inference Lab

Blog

•

May 5, 2026

Inference when no one is waiting

Blog

•

May 5, 2026

tANS: precomputing rANS

Inference Lab

Blog

•

April 27, 2026

Also-rANS: Asymmetric Numeral Systems for entropy coding

Inference Lab

Blog

•

April 21, 2026

15/4 Weekly Update: HumanX and the Gemma 4 release

Blog

•

April 15, 2026

Doubleword & .txt partner to provide structured generation outputs natively through Doubleword

Blog

•

April 15, 2026

70x faster cold(ish) starts for SGLang

Inference Lab

Blog

•

April 6, 2026

Introducing dw - the Doubleword CLI

Technical Guide

•

April 2, 2026

27/3 Weekly Update: Doubleword CLI and OCR model release

Blog

•

March 27, 2026

Doubleword for OpenClaw - Your OpenClaw Agent Is Probably Burning Money It Doesn't Need To

Blog

•

March 25, 2026

OCR and the Bitter Lesson

Inference Lab

Blog

•

March 23, 2026

20/3 Weekly Update: New Models, Free Nemotron, and Organizations

Blog

•

March 20, 2026

13/3 Weekly Update: Async Pipeline Generator

Blog

•

March 13, 2026

6/3 Weekly Update: Qwen3.5-9B + Auto Top-Up

Blog

•

March 6, 2026

27/2 Weekly Update: Qwen3.5-35B-A3B (Higher Quality, Lower Cost)

Blog

•

February 27, 2026

20/2 Weekly Update: New Qwen Models, GPT-OSS 20B & Webhooks

Blog

•

February 20, 2026

Scaling Curation with LLM Comparisons

Inference Lab

Blog

•

February 6, 2026

LLM powered data structures: A concurrent, lock-free binary search tree

Technical Guide

•

February 3, 2026

ZeroDP: Just-In-Time Weight Offloading over NVLink for Data Parallelism

Inference Lab

Blog

•

January 30, 2026

Large-Scale Semantic Search Without Embeddings

Inference Lab

Blog

•

January 27, 2026

QueueSpec: Drafting While You Wait

Inference Lab

Blog

•

January 22, 2026

Parallel Primitives for Multi-Agent Workflows

Inference Lab

Blog

•

January 22, 2026

Real-Time vs Batch Inference for LLMs: Use Cases, Costs, Workflow

Batch inference

Blog

•

January 19, 2026

Behind the Stack, Ep 13 - Faster Inference: Speculative Decoding for Batched Workloads

Inference Optimization

Technical Guide

•

December 3, 2025

Costco of Inference: Introducing Doubleword Batched, the Inference Provider Built for Batched Workloads

Inference Optimization

Blog

•

December 2, 2025

Behind the Stack Ep. 12 - Understanding Model Parallelism

Inference Optimization

Technical Guide

•

November 19, 2025

Behind the Stack, Ep. 11 - How Speculative Decoding Speeds Up Language Models

Self-Hosted Architecture

Technical Guide

•

November 5, 2025

Doubleword Open Sources the World’s Fastest AI Gateway

Artificial Intelligence

News

•

October 21, 2025

Chasing Cheap Tokens: 2x Cheaper Tokens Than H100s with Consumer Cards

Blog

•

October 13, 2025

Should GPUs make Free Trade Agreements?

Blog

•

September 19, 2025

Behind the Stack, Ep 10 - Batched Endpoints

Self-Hosted Architecture

Technical Guide

•

September 10, 2025

What is InferenceOps? Defining the Function Behind Scalable AI

Enterprise AI

Blog

•

September 5, 2025

Scaling AI Requires InferenceOps, Not MLOps

Enterprise AI

Blog

•

September 4, 2025

GTC Europe 2025: ASAS AI & Doubleword Announce Strategic Partnership to Deliver Sovereign, Enterprise-Grade AI Solutions in Saudi Arabia and the Middle East

Press

•

June 16, 2025

Doubleword doubles down on NVIDIA collaboration to give enterprises control over their AI with NVIDIA NIM microservices integration

Press

•

June 11, 2025

Doubleword Launches Self-Hosted Inference Platform On Snowflake Marketplace

Press

PR Newswire

•

June 3, 2025

Doubleword Launches Self-Hosted Inference Platform on Snowflake Marketplace

Blog

•

June 3, 2025

AI-Powered Performance: How Digits Built Specialized Models for Accounting

Artificial Intelligence

•

May 13, 2025

Doubleword raises $12M Series A to make self-hosted AI inference effortless

Press

Startups Magazine

•

May 9, 2025

Doubleword raises $12M Series A led by Dawn Capital to make self-hosted AI inference effortless for enterprises

News

•

May 8, 2025

AI Startup Doubleword Raises £9M Series A Led by Dawn Capital

Press

Just AI News

•

May 8, 2025

Doubleword secures £9 million Series A Investment led by Dawn Capital

Press

Deal Lite

•

May 8, 2025

UK’s Doubleword secures €10.6M to help businesses escape AI infrastructure overload: Here’s how

Press

Silicon Canals

•

May 8, 2025

Doubleword raises £9m Series A led by Dawn Capital to make self-hosted AI inference effortless for enterprises

Press

Soapbox

•

May 8, 2025

Doubleword’s $12M fuels mission to bring easy, secure self-hosted AI to enterprises

Press

Tech Funding News

•

May 8, 2025

AI self-hosting start-up Doubleword finds new Dawn with £9m funding boost

Press

Sky News

•

May 7, 2025

Announcing Doubleword: New Name, Same Team, Same Mission

Blog

•

May 7, 2025

© 2026 Doubleword. All rights reserved.
