LLM Router: 78% Cost Savings, 30+ Models

Smart LLM router that saves up to 78% on inference costs by automatically selecting the most cost-effective model for each request. 30+ models, single wallet, local routing logic.

GitHub by BlockRunAI

Purpose and Significance

ClawRouter is an LLM router that automatically selects the most cost-effective model for each request, reducing inference costs by up to 78%. It matches requests to 30+ models from OpenAI, Anthropic, DeepSeek, Google, and other providers, all paid for through a single wallet using x402 micropayments, so there is no need to manage multiple API keys. For developers and organizations running high-volume inference workloads, this changes how LLM costs can be managed.

Key Features

  • Smart 15-Dimensional Routing: Scores each prompt on complexity, reasoning requirements, output format specifications, domain context, and signs of agentic tasks, then selects a model locally in under 500ms
  • 30+ Integrated Models: Supports DeepSeek, GPT-4o, Claude 3.5, Gemini 2.5, Grok, Kimi K2.5, and more, automatically filtered by context length requirements
  • Single Wallet Architecture: Unified USDC micropayments via the x402 protocol; no need to manage multiple API keys or balances across services
  • Agentic Auto-Detection: Automatically identifies multi-step tasks (file operations, testing, deployment) and routes to specialized agentic models like Kimi K2.5
  • Tool-Aware Selection: When function calling is required, routes exclusively to models with proven tool-use reliability
  • Multilingual Support: Works seamlessly with English, Chinese, Japanese, Russian, German, and mixed-language prompts
  • Context-Length Awareness: Automatically filters out models that can't handle your token budget, preventing "context too long" failures
  • 100% Local Routing Logic: Weighted scoring runs entirely on your machine, with no external calls, so prompts are not sent to a third party for routing
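
The context-length and tool-aware filters described above can be pictured as a plain local filter over a model catalog. The following is a minimal sketch, not ClawRouter's actual code: the ModelInfo and RequestNeeds shapes, the eligibleModels function, and the three catalog entries with their numbers are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical catalog entry; the field names are illustrative, not ClawRouter's schema.
interface ModelInfo {
  id: string;
  contextWindow: number;   // maximum tokens the model accepts
  supportsTools: boolean;  // whether function calling is considered reliable
  pricePerMTokens: number; // USD per million tokens
}

// Hypothetical description of what a single request requires.
interface RequestNeeds {
  estimatedTokens: number;
  needsTools: boolean;
}

// Drop models that cannot fit the request or cannot call tools when tools are
// required, then order the remaining candidates from cheapest to most expensive.
function eligibleModels(catalog: ModelInfo[], needs: RequestNeeds): ModelInfo[] {
  return catalog
    .filter((m) => m.contextWindow >= needs.estimatedTokens)
    .filter((m) => !needs.needsTools || m.supportsTools)
    .sort((a, b) => a.pricePerMTokens - b.pricePerMTokens);
}

// Placeholder catalog: names and numbers are made up for the example.
const catalog: ModelInfo[] = [
  { id: "small-model", contextWindow: 32000, supportsTools: false, pricePerMTokens: 0.5 },
  { id: "mid-model", contextWindow: 128000, supportsTools: true, pricePerMTokens: 2 },
  { id: "large-model", contextWindow: 1000000, supportsTools: true, pricePerMTokens: 10 },
];

const longToolRequest = eligibleModels(catalog, { estimatedTokens: 200000, needsTools: true });
const shortPlainRequest = eligibleModels(catalog, { estimatedTokens: 1000, needsTools: false });

console.log(longToolRequest.map((m) => m.id));
console.log(shortPlainRequest.map((m) => m.id));
```

Because the filter uses only data already on the machine, it is consistent with the claim that routing needs no external calls; how ClawRouter actually stores model metadata is not stated in this article.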

How It Works

ClawRouter assigns each request to one of four tiers based on weighted scoring across multiple dimensions:

  • SIMPLE: Basic factual queries (Gemini 2.5 Flash, ~$0.60/M)
  • MEDIUM: Code generation and structured tasks (Grok Code Fast, ~$1.50/M)
  • COMPLEX: Advanced reasoning and nuanced requirements (Gemini 2.5 Pro, ~$10.00/M)
  • REASONING: Theorem proofs and chain-of-thought heavy tasks (DeepSeek-R, ~$0.42/M)
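
To see how tier prices translate into savings, the arithmetic can be sketched as a blended price. The per-tier prices below are the approximate figures from the list above, read here as USD per million tokens (an assumption about what "/M" means). The traffic mix is entirely hypothetical, so the result illustrates the calculation and is not a reproduction of the 78% figure.

```typescript
type Tier = "SIMPLE" | "MEDIUM" | "COMPLEX" | "REASONING";

// Approximate tier prices quoted in the article, assumed to be USD per million tokens.
const pricePerMillion: Record<Tier, number> = {
  SIMPLE: 0.6,
  MEDIUM: 1.5,
  COMPLEX: 10.0,
  REASONING: 0.42,
};

// HYPOTHETICAL share of tokens routed to each tier; real workloads will differ.
const trafficShare: Record<Tier, number> = {
  SIMPLE: 0.4,
  MEDIUM: 0.3,
  COMPLEX: 0.2,
  REASONING: 0.1,
};

// Weighted average price per million tokens across all tiers.
function blendedPrice(prices: Record<Tier, number>, shares: Record<Tier, number>): number {
  return (Object.keys(prices) as Tier[]).reduce(
    (sum, tier) => sum + prices[tier] * shares[tier],
    0,
  );
}

// Fraction saved relative to sending every request to one baseline price.
function savingsVersus(baselinePrice: number, blended: number): number {
  return 1 - blended / baselinePrice;
}

const blended = blendedPrice(pricePerMillion, trafficShare);
const savings = savingsVersus(pricePerMillion.COMPLEX, blended);

console.log(`blended price: $${blended.toFixed(3)} per million tokens`);
console.log(`savings vs. COMPLEX-only: ${(savings * 100).toFixed(1)}%`);
```

The savings figure is sensitive to the mix: the larger the share of traffic that genuinely needs the COMPLEX tier, the smaller the gain from routing.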

Each prompt is evaluated across 15 markers, including creative language use, question complexity, constraint counts, imperative verbs, output format specifications, domain specificity, and negation patterns. A sigmoid confidence calibration converts the resulting weighted sum into a tier assignment, so that a request that only barely fits a lower tier is escalated to the next tier rather than risking lower-quality output.
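
One way to picture the weighted sum and sigmoid calibration is the sketch below. It is an illustration under stated assumptions, not ClawRouter's implementation: the dimension names echo the article, but the weights, tier boundaries, steepness, and confidence threshold are hypothetical placeholders, and the escalation rule is one plausible reading of "barely fits a lower tier".

```typescript
type Tier = "SIMPLE" | "MEDIUM" | "COMPLEX" | "REASONING";
const TIERS: Tier[] = ["SIMPLE", "MEDIUM", "COMPLEX", "REASONING"];

// HYPOTHETICAL upper boundaries on the weighted-score axis:
// score < 0.0 is SIMPLE, < 0.3 is MEDIUM, < 0.6 is COMPLEX, otherwise REASONING.
const BOUNDARIES = [0.0, 0.3, 0.6];
const STEEPNESS = 12;       // hypothetical sigmoid steepness
const MIN_CONFIDENCE = 0.7; // hypothetical threshold below which a request escalates

function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// Weighted sum over dimensions; a dimension with no score contributes zero.
function weightedScore(
  scores: Record<string, number>,
  weights: Record<string, number>,
): number {
  let total = 0;
  for (const [dimension, weight] of Object.entries(weights)) {
    total += weight * (scores[dimension] ?? 0);
  }
  return total;
}

// Pick the tier whose range contains the score, then escalate one tier when the
// score sits too close to the tier's upper boundary (low confidence).
function assignTier(score: number): { tier: Tier; confidence: number } {
  let index = BOUNDARIES.findIndex((boundary) => score < boundary);
  if (index === -1) {
    // Above every boundary: already in the top tier, nothing to escalate to.
    return { tier: TIERS[TIERS.length - 1], confidence: 1 };
  }
  const margin = BOUNDARIES[index] - score; // distance below the upper boundary
  const confidence = sigmoid(STEEPNESS * margin);
  if (confidence < MIN_CONFIDENCE) {
    index += 1;
  }
  return { tier: TIERS[index], confidence };
}

// Placeholder weights and scores for three of the 15 dimensions.
const weights = { reasoningRequirements: 0.4, outputFormat: 0.3, constraintCount: 0.3 };
const scores = { reasoningRequirements: 0.2, outputFormat: 1.0 };

const score = weightedScore(scores, weights);
console.log(score.toFixed(2), assignTier(score));
console.log(assignTier(0.27)); // close to the MEDIUM/COMPLEX boundary
```

With these placeholder values, a score well inside a tier's range stays in that tier, while a score just under a boundary is pushed up one tier. That is the behaviour the article describes, although the real thresholds and weights are not given here.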

Getting Started

Install via npm and configure your USDC wallet and OpenClaw credentials in your environment. Requests to the blockrun/auto model trigger automatic routing; you can also override tier selection manually via openclaw.yaml. The plugin integrates directly with OpenClaw SDKs and supports standard LLM API patterns, making adoption as simple as changing your model parameter.
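
As a rough illustration of "changing your model parameter", the sketch below swaps a pinned model for the blockrun/auto identifier mentioned above. The ChatRequest shape is an assumption modelled on the common chat-completion request format, and withAutoRouting and the "some-fixed-model" name are hypothetical; check the project's README for the real client setup and endpoint.

```typescript
// Assumed request shape, modelled on the widely used chat-completion format.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
}

// Model identifier named in the article as the trigger for automatic routing.
const AUTO_MODEL = "blockrun/auto";

// Return a copy of the request with only the model field changed.
function withAutoRouting(request: ChatRequest): ChatRequest {
  return { ...request, model: AUTO_MODEL };
}

// "some-fixed-model" is a placeholder for whatever model was pinned before.
const pinned: ChatRequest = {
  model: "some-fixed-model",
  messages: [{ role: "user", content: "Summarize this changelog in two sentences." }],
};

const routed = withAutoRouting(pinned);
console.log(JSON.stringify(routed, null, 2));
```

Everything else in the request stays the same, which is the point of the adoption claim: the router decides which underlying model serves the call.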

Who It's For

  • Cost-Conscious Teams: Organizations processing thousands of inference requests a day that don't want to pay $0.15 per request when simple tasks can be served for about $0.0001
  • Multi-Model Workflows: Development teams currently juggling separate API keys and balance management across OpenAI, Anthropic, and other providers
  • Agentic Builders: Teams running autonomous agents requiring reliable tool use and extended reasoning across heterogeneous models
  • High-Context Applications: Systems handling document analysis, code review, and long-form content where context length becomes a practical constraint

Resources

Built by: BlockRunAI | License: MIT | Latest Release: February 2026

llm-orchestration model-selection cost-optimization

Original Source

https://github.com/BlockRunAI/ClawRouter
