Edgee is an Agent Gateway: the infrastructure layer between coding agents (Claude Code, Codex, Copilot, OpenCode, Cursor) and LLM provider APIs (Anthropic, OpenAI, GLM). On every request it does three things: compresses tokens, routes intelligently across providers with automatic fallback, and meters the session. The result is up to 50% token cost reduction.

Your coding agent calls Edgee instead of the provider directly. Edgee compresses tool results and prompts at the edge to reduce input tokens, optionally trims output verbosity, routes to the right provider with automatic fallback on errors or plan caps, and reports every saved token in your dashboard.

Can I use my own provider API keys?

Yes. You can use Edgee’s unified access with a single Edgee API key, or bring your own provider keys for direct billing and custom models.

What do I get with Edgee?

Up to 50% token cost reduction through two layers of compression (Input and Output), automatic provider fallback on 5xx errors and plan caps, BYOK (bring your own keys) support, real-time savings tracking, and per-session metering across every coding agent you use.

Does Edgee work with Claude Code?

Yes. Install the Edgee CLI and run `edgee launch claude` to start Claude Code with transparent token compression. No code changes required — Edgee acts as a proxy, compressing prompts before they reach Anthropic. Most users save 20–50% on token costs immediately.

Which coding agents does Edgee support?

Edgee supports Claude Code, Codex, Copilot, OpenCode, and Cursor. The Edgee CLI wraps your agent transparently — compression happens at the network layer.

The Agent Gateway. Compress, route, observe.

Name: Edgee AI Gateway
Author: Edgee

Edgee sits between your agents and the LLM provider.
Same code, fewer tokens, lower bills.

Up to 50% cost reduction

+30% longer coding sessions

< 1 min to install

How to use Edgee

Whether you’re using a coding agent or building an app, you can have Edgee compressing your LLM traffic within minutes.

For coding agents

Start saving tokens in 1 minute

Install Edgee CLI and connect it to your coding agent. No code changes required.

No code changes: works as a transparent proxy for your agent
Instant savings: token compression kicks in on the first request
Works with any agent: Claude Code, Codex, Copilot, OpenCode, Cursor and more

Configure your coding agent

Connect Edgee to your AI coding assistant and start saving tokens in 1 minute.

1. Choose your coding agent

2. Install Edgee CLI

curl -fsSL https://edgee.ai/install.sh | bash

3. Start saving tokens

edgee launch claude

Why Edgee Agent Gateway?

An edge intelligence layer for your coding agents

Edgee sits between your coding agents and LLM providers. It applies three pillars to every request: Compress (input + output), Route (with automatic fallback), and Observe (per session, per team). You cut token costs and extend context windows without changing a line of application code.

Token compression

Compression works in two layers: Layer 1 (Input) trims tool results, and Layer 2 (Output) trims response verbosity. Tool-result payloads are cut by 60–90% at the edge, and the compression is semantically lossless for coding tasks. Same model output, fewer tokens billed.
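To see how a cut in tool-result payloads relates to the overall input-token bill, here is a back-of-the-envelope sketch. It assumes only tool results are compressed and that the rest of the prompt is billed unchanged. The function name and the example figures are hypothetical, chosen for illustration, and are not Edgee measurements.

```python
def input_token_reduction(tool_result_share: float, tool_result_cut: float) -> float:
    """Fraction of all input tokens removed when only tool results are compressed.

    tool_result_share: fraction of input tokens that are tool results (0 to 1).
    tool_result_cut:   fraction of those tool-result tokens removed (0 to 1).
    """
    if not (0.0 <= tool_result_share <= 1.0 and 0.0 <= tool_result_cut <= 1.0):
        raise ValueError("both arguments must be fractions between 0 and 1")
    return tool_result_share * tool_result_cut


# Hypothetical session: half of the input tokens are tool results,
# and compression removes 80% of those tokens.
reduction = input_token_reduction(0.5, 0.8)
print(f"{reduction:.0%} fewer input tokens")
```

Under these assumed figures the overall input reduction is 0.5 × 0.8 = 0.4, that is, 40%. The saving you actually see depends on how much of your agent's traffic consists of tool results.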

Team Management

Get full visibility into how your team uses coding agents. Track cost per repo and PR, manage team seats, and keep your team unblocked with automatic model fallback.

Turbo Models

Turbo Models are a set of pre-trained models optimized for specific tasks. Use them to get the best performance and cost savings.

Fallback models

When a provider request fails, Edgee automatically retries and falls back to the next available provider, transparently, without any changes to your code.
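The general pattern of retry-then-fallback can be sketched in a few lines. This is a minimal illustration, not Edgee's implementation. `Provider`, `ProviderError`, `complete_with_fallback`, and the choice of HTTP 429 as a stand-in for a plan cap are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Hypothetical error raised by a provider call, carrying an HTTP-like status."""

    def __init__(self, status: int):
        super().__init__(f"provider returned {status}")
        self.status = status


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # takes a prompt, returns a completion


def is_retryable(err: ProviderError) -> bool:
    # Assumption: 5xx is a provider-side failure; 429 stands in for a plan cap.
    return 500 <= err.status < 600 or err.status == 429


def complete_with_fallback(
    prompt: str, providers: Sequence[Provider], attempts: int = 2
) -> tuple[str, str]:
    """Try each provider in order, retrying it up to `attempts` times.

    Returns (provider_name, completion) from the first call that succeeds.
    """
    last_error: Optional[ProviderError] = None
    for provider in providers:
        for _ in range(attempts):
            try:
                return provider.name, provider.call(prompt)
            except ProviderError as err:
                if not is_retryable(err):
                    raise  # e.g. a 400: falling back would not help
                last_error = err
    raise RuntimeError("all providers failed") from last_error
```

The caller passes providers in priority order and receives the name of whichever one answered, which is the kind of information a gateway could surface in its metering.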

Bring Your Own Keys

Use Edgee’s keys for convenience, or plug in your own provider keys for billing control and custom models.

Observability

Monitor latency, errors, usage, and cost per model, per app, and per environment.

The vision behind Edgee

Every technological shift creates a new foundation: the web had bandwidth, the cloud had compute, and AI has tokens. In a world powered by models, intelligence has a cost: tokens flow through every interaction, decision, and response.

At Edgee, we believe intelligence should move efficiently, closer to users, intent, and action. It should be compressed, routed, and optimized so decisions happen instantly. Hear from Sacha, Edgee’s co-founder, on how AI scales by mastering how intelligence moves.