build report — 2026-03-28

AaaS API Gateway

A centralized API worksuite that combines SkillBoss-style multi-vendor aggregation with AaaS's unique foundry composition engine. One API key, one credit balance, 460+ endpoints planned. 22 live today.

22 Live Endpoints · 17 LLM Models · 63 Foundry Skills · 6 Foundries Ready · 460+ Endpoints Planned

What Is This? (For Humans)

The AaaS API Gateway is a single API that gives developers access to everything they need to build AI-powered products — language models, image/video generation, web scraping, social media data, email sending, payment processing, and agent orchestration — all through one API key and one credit balance.

Think of it as your team's AI department in an API. Instead of signing up for 16 different services, managing 16 API keys, and paying 16 separate bills, you top up AaaS credits once and use any service instantly.

How It Works

1. Sign up at agents-as-a-service.com (Google or email login)
2. Create an API key via POST /v1/keys/create
3. Top up credits via POST /v1/billing/checkout ($5 minimum)
4. Use any endpoint with Authorization: Bearer aaas_xxx
5. Credits are deducted automatically per request, based on actual usage
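
Steps 2 to 4 above come down to attaching the key as a Bearer token on every call. The following is a minimal TypeScript sketch of that, not an official client: `buildRequest` and `GatewayRequest` are hypothetical names introduced here, and the base URL is the one listed under Technical Specification. It only builds the request, so it runs without network access.

```typescript
// Assumption: base URL copied from the "Base URL" section of this report.
const BASE_URL = "https://apigateway-q66ryynraa-uc.a.run.app";

// Hypothetical shape for a prepared (not yet sent) request.
interface GatewayRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Hypothetical helper: attaches the Bearer key and JSON-encodes the payload.
function buildRequest(
  apiKey: string,
  method: "GET" | "POST",
  path: string,
  payload?: unknown,
): GatewayRequest {
  const req: GatewayRequest = {
    url: `${BASE_URL}${path}`,
    method,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
  if (payload !== undefined) {
    req.headers["Content-Type"] = "application/json";
    req.body = JSON.stringify(payload);
  }
  return req;
}

// "aaas_xxx" is the placeholder key used throughout this report.
const balanceReq = buildRequest("aaas_xxx", "GET", "/v1/billing/balance");
const chatReq = buildRequest("aaas_xxx", "POST", "/v1/chat/completions", {
  model: "google/gemini-2.5-flash",
  messages: [{ role: "user", content: "What is AaaS?" }],
});
```

To actually send a prepared request, its fields can be passed to `fetch`, for example `fetch(chatReq.url, { method: chatReq.method, headers: chatReq.headers, body: chatReq.body })`.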

What Makes This Different From OpenRouter

OpenRouter is an LLM router — it routes chat requests to language models. That's one category.

AaaS is designed as a full worksuite. Beyond LLM routing, its scope covers web scraping, image generation, email, and payments (several of these are still in later roadmap phases; see the table below) and, crucially, multi-agent workflow orchestration via the Foundry Compose API. Neither OpenRouter nor SkillBoss offers agent orchestration.

Capability	OpenRouter	SkillBoss	AaaS
LLM routing	Yes	Yes	Yes
Image/video generation	No	Yes	Phase 3
Web/social scraping	No	Yes (250+)	Phase 4
Email/SMS/payments	No	Yes	Email live, rest Phase 4
Agent orchestration	No	No	Yes — exclusive
Multi-agent chains	No	No	Yes — 7 chains live

Technical Specification (For Agents)

Base URL

https://apigateway-q66ryynraa-uc.a.run.app

Authentication

Authorization: Bearer aaas_<64-char-hex>

Keys are SHA-256 hashed before storage. Lookup via Firestore index.
Rate limits enforced per key: RPM (requests/minute) + RPD (requests/day).
Default: 60 RPM, 10,000 RPD. Configurable per key.
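
As an illustration of the key handling described above, here is a minimal sketch of generating a key in the `aaas_<64-char-hex>` format, storing only its SHA-256 hash, and resolving a Bearer header back to a user. The in-memory `keyStore` is a hypothetical stand-in for the Firestore lookup, and all function names are assumptions made for this example rather than the contents of auth.ts.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface KeyRecord {
  userId: string;
  name: string;
}

// Hypothetical stand-in for the Firestore api_keys collection, indexed by hash.
const keyStore = new Map<string, KeyRecord>();

// 32 random bytes encode to 64 hex characters: aaas_<64-char-hex>.
function generateApiKey(): string {
  return "aaas_" + randomBytes(32).toString("hex");
}

// Only this digest is stored; the raw key cannot be recovered from it.
function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

// Creates a key, stores its hash, and returns the raw key (shown once).
function createKey(userId: string, name: string): string {
  const raw = generateApiKey();
  keyStore.set(hashApiKey(raw), { userId, name });
  return raw;
}

// Resolves an Authorization header to a key record, or null if invalid.
function authenticate(authorizationHeader: string): KeyRecord | null {
  const match = /^Bearer (aaas_[0-9a-f]{64})$/.exec(authorizationHeader);
  const raw = match ? match[1] : undefined;
  if (!raw) return null;
  return keyStore.get(hashApiKey(raw)) ?? null;
}
```

Because lookup goes through the hash, a leaked copy of the store would not reveal usable keys, which is presumably why the raw key is shown only once at creation.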

Credit System

Unit: microcents (1 credit = 100 microcents = $0.01 USD)
Storage: Firestore collection "credits", doc ID = userId
Guard: Pre-execution check → 402 if insufficient
Deduction: Firestore transaction (atomic, no race conditions)
Negative balance: Mathematically impossible
  - balance >= estimatedCost enforced BEFORE execution
  - actual cost deducted AFTER execution via atomic transaction
Top-up: Stripe Checkout → webhook → addCredits() transaction

All 22 Live Endpoints

Method	Path	Auth	Status	Description
GET	/v1/status	None	Live	Health check — returns version + timestamp
POST	/v1/billing/webhook	Stripe sig	Live	Stripe payment confirmation → auto-add credits
GET	/v1/models	Bearer	Live	List 17 available LLM models (OpenAI format)
POST	/v1/chat/completions	Bearer	Code ready	Multi-provider LLM routing — Anthropic, OpenAI, Google, OpenRouter, Perplexity
GET	/v1/billing/balance	Bearer	Live	Current credit balance in microcents + USD
GET	/v1/billing/packages	Bearer	Live	Available top-up packages ($5, $25, $50, $100)
GET	/v1/billing/usage	Bearer	Live	Usage history with cost per request (last N days)
POST	/v1/billing/checkout	Bearer	Needs Stripe	Create Stripe Checkout session for credit top-up
GET	/v1/keys	Bearer	Live	List user's active API keys (max 5)
POST	/v1/keys/create	Bearer	Live	Create new API key — shown once, stored hashed
POST	/v1/keys/revoke	Bearer	Live	Revoke an API key by ID
GET	/v1/compose/chains	Bearer	Live	List 7 foundry chains with slot configurations
GET	/v1/compose/foundries	Bearer	Live	List 6 foundries with slot/use-case counts
GET	/v1/compose/skills	Bearer	Live	List 63 foundry skills with capabilities
POST	/v1/compose/mega-grind	Bearer	Live	Execute full foundry pipeline (seed→scan→tournament→compose→test)
POST	/v1/scrape/company	Bearer	Live	Extract company data from domain (homepage, about, products)
POST	/v1/scrape/classify	Bearer	Live	Classify industry via Vertex AI (industry, segment, ICP, tech stack)
POST	/v1/skills/match	Bearer	Live	Match skills against industry classification (top 10)
POST	/v1/agents/suggest	Bearer	Live	Suggest agents with sample tasks based on classification

Request/Response Format

# Chat Completion (OpenAI-compatible)
POST /v1/chat/completions
{
  "model": "google/gemini-2.5-flash",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is AaaS?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000
}

# Response (OpenAI format)
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "model": "google/gemini-2.5-flash",
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "..."}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175}
}

# Error format (also OpenAI-compatible)
{
  "error": {
    "message": "Insufficient credits. Current balance: 0 microcents.",
    "type": "billing_error",
    "code": "insufficient_credits"
  }
}
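
For client code, the three JSON shapes above can be captured as TypeScript types. This is a sketch derived only from the fields shown in the examples; the real API may carry additional fields, and `isGatewayError` and `extractText` are hypothetical helpers introduced here.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  max_tokens?: number;
}

interface ChatCompletionResponse {
  id: string;
  object: "chat.completion";
  model: string;
  choices: { index: number; message: ChatMessage; finish_reason: string }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

interface GatewayError {
  error: { message: string; type: string; code: string };
}

// Type guard: true when a parsed body has the error envelope shown above.
function isGatewayError(body: unknown): body is GatewayError {
  if (typeof body !== "object" || body === null) return false;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  const e = err as Record<string, unknown>;
  return (
    typeof e["message"] === "string" &&
    typeof e["type"] === "string" &&
    typeof e["code"] === "string"
  );
}

// Returns the first choice's text, or throws with the gateway's error code.
function extractText(body: ChatCompletionResponse | GatewayError): string {
  if (isGatewayError(body)) {
    throw new Error(`${body.error.code}: ${body.error.message}`);
  }
  const first = body.choices[0];
  return first ? first.message.content : "";
}
```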

LLM Model Registry (17 Models)

Model ID	Provider	Input $/1M	Output $/1M	Context
anthropic/claude-4.5-sonnet	Anthropic	$3.00	$15.00	200K
anthropic/claude-4-sonnet	Anthropic	$3.00	$15.00	200K
anthropic/claude-4.5-haiku	Anthropic	$0.80	$4.00	200K
openai/gpt-5	OpenAI	$2.50	$10.00	128K
openai/gpt-5-mini	OpenAI	$0.15	$0.60	128K
openai/gpt-4.1	OpenAI	$2.00	$8.00	1M
openai/gpt-4.1-mini	OpenAI	$0.40	$1.60	1M
openai/o4-mini	OpenAI	$1.10	$4.40	200K
google/gemini-2.5-flash	Vertex AI	$0.15	$0.60	1M
google/gemini-2.5-pro	Vertex AI	$1.25	$5.00	1M
deepseek/deepseek-r1	OpenRouter	$0.55	$2.19	128K
deepseek/deepseek-v3	OpenRouter	$0.27	$1.10	128K
meta/llama-3.3-70b	OpenRouter	$0.40	$0.40	128K
mistral/mistral-large	OpenRouter	$2.00	$6.00	128K
qwen/qwen3-coder-plus	OpenRouter	$0.16	$0.64	128K
perplexity/sonar-pro	Perplexity	$3.00	$15.00	200K
perplexity/sonar	Perplexity	$1.00	$1.00	128K
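
To make the per-token pricing concrete, the sketch below computes the list-price cost of a request from its token counts, using three rows of the table above. It is a worked example only: `costUsd` is a hypothetical helper, the result is in USD rather than gateway credits, and any markup the gateway applies on top of these prices is not described in this report.

```typescript
interface ModelPricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Prices copied from the registry table (USD per 1M tokens).
const PRICING: Record<string, ModelPricing> = {
  "google/gemini-2.5-flash": { inputPerMillionUsd: 0.15, outputPerMillionUsd: 0.6 },
  "anthropic/claude-4.5-sonnet": { inputPerMillionUsd: 3.0, outputPerMillionUsd: 15.0 },
  "openai/gpt-5-mini": { inputPerMillionUsd: 0.15, outputPerMillionUsd: 0.6 },
};

// cost = (prompt * inputPrice + completion * outputPrice) / 1,000,000
function costUsd(modelId: string, promptTokens: number, completionTokens: number): number {
  const p = PRICING[modelId];
  if (!p) throw new Error(`Unknown model: ${modelId}`);
  return (
    (promptTokens * p.inputPerMillionUsd + completionTokens * p.outputPerMillionUsd) / 1e6
  );
}

// The usage block from the sample response: 25 prompt + 150 completion tokens.
const flashCost = costUsd("google/gemini-2.5-flash", 25, 150); // about 0.00009375 USD
const sonnetCost = costUsd("anthropic/claude-4.5-sonnet", 25, 150); // about 0.002325 USD
```

For the same 175-token exchange, the Sonnet-class model costs roughly 25 times more than Gemini Flash at these list prices, which is the kind of difference a pre-execution estimate has to account for.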

Foundry Compose API (AaaS-Exclusive)

# List available chains
GET /v1/compose/chains
→ 7 chains: devops-full, devops-incident, revenue-full,
  customer-success-full, people-full, finance-full, legal-full

# Run a foundry mega-grind
POST /v1/compose/mega-grind
{ "foundry": "F1", "dryRun": true }
→ Executes: seed slots → scan vault (63 skills) → tournament →
  compose chain → deploy → test

# List skills available for composition
GET /v1/compose/skills
→ 63 skills across 6 categories (devops, revenue, CS, people, finance, legal)
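
A client might validate chain names and build the mega-grind body before calling the API. The sketch below uses only the chain IDs and the two request fields (`foundry`, `dryRun`) shown above; the helper names are hypothetical, and the response shape is left out because this report does not document it.

```typescript
// The seven chain IDs listed by GET /v1/compose/chains in this report.
const CHAINS = [
  "devops-full",
  "devops-incident",
  "revenue-full",
  "customer-success-full",
  "people-full",
  "finance-full",
  "legal-full",
] as const;

type ChainId = (typeof CHAINS)[number];

// Type guard for user-supplied chain names.
function isKnownChain(id: string): id is ChainId {
  return (CHAINS as readonly string[]).indexOf(id) !== -1;
}

interface MegaGrindRequest {
  foundry: string; // e.g. "F1", as in the example above
  dryRun: boolean;
}

// Defaults to a dry run (an assumption of this sketch, chosen as the safer default).
function megaGrindBody(foundry: string, dryRun: boolean = true): string {
  const body: MegaGrindRequest = { foundry, dryRun };
  return JSON.stringify(body);
}
```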

Architecture

┌──────────────────────────────────────────────────────────┐
│                   AaaS API Gateway                        │
│            apigateway-q66ryynraa-uc.a.run.app             │
├──────────┬──────────┬──────────┬──────────┬──────────────┤
│ LLM      │ Compose  │ Scraping │ Billing  │ Key Mgmt     │
│ Router   │ Router   │ Router   │ Router   │ Router       │
├──────────┴──────────┴──────────┴──────────┴──────────────┤
│              Middleware Pipeline                           │
│  Bearer Auth → Rate Limiter → Credit Guard → Usage Meter  │
├──────────────────────────────────────────────────────────┤
│              Provider Layer                                │
│  Anthropic · OpenAI · Vertex AI · OpenRouter · Perplexity │
│  Foundry Engine · Scraper · Classifier · Skill Matcher    │
├──────────────────────────────────────────────────────────┤
│              Data Layer (Firestore)                        │
│  api_keys · credits · usage_logs · rate_limits            │
│  skills · agents · foundry_chains · foundry_reports       │
└──────────────────────────────────────────────────────────┘
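
The middleware row of the diagram can be read as a chain of checks that each either reject the request or pass it on. Below is a minimal synchronous sketch of that idea under stated assumptions: the types and function names are hypothetical, HTTP 402 comes from this report, and the 401 and 429 statuses and the error codes other than `insufficient_credits` are assumed for illustration.

```typescript
interface RequestContext {
  apiKey: string | null;
  userId?: string;
  estimatedCost: number;
}

interface Rejection {
  status: number;
  code: string;
}

// A middleware returns a Rejection to stop the pipeline, or null to continue.
type Middleware = (ctx: RequestContext) => Rejection | null;

const bearerAuth = (knownKeys: Map<string, string>): Middleware => (ctx) => {
  const userId = ctx.apiKey ? knownKeys.get(ctx.apiKey) : undefined;
  if (!userId) return { status: 401, code: "invalid_api_key" }; // assumed status
  ctx.userId = userId;
  return null;
};

const rateLimiter = (allow: (userId: string) => boolean): Middleware => (ctx) =>
  allow(ctx.userId ?? "") ? null : { status: 429, code: "rate_limited" }; // assumed

const creditGuard = (balanceOf: (userId: string) => number): Middleware => (ctx) =>
  balanceOf(ctx.userId ?? "") >= ctx.estimatedCost
    ? null
    : { status: 402, code: "insufficient_credits" };

// Runs the checks in order, executes the call, then records usage (the meter).
function handle(
  ctx: RequestContext,
  middlewares: Middleware[],
  execute: () => number, // returns the actual cost of the upstream call
  usageLog: { userId: string; cost: number }[],
): number {
  for (const mw of middlewares) {
    const rejection = mw(ctx);
    if (rejection) return rejection.status; // upstream provider never called
  }
  const actualCost = execute();
  usageLog.push({ userId: ctx.userId ?? "", cost: actualCost });
  return 200;
}
```

Ordering matters here: authentication runs first so that the later checks can key off `userId`, and the credit check runs before `execute` so a rejected request incurs no upstream cost.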

User Lifecycle

Signup (Firebase Auth: Google/Email)
  → Account created in Firestore (tier: free, credits: 0)
  → Context pipeline auto-runs (scrape, classify, match skills, suggest agents)

Top Up Credits
  → POST /v1/billing/checkout { packageId: "credits_2500" }
  → Stripe Checkout → payment confirmed → webhook → credits added atomically

Create API Key
  → POST /v1/keys/create { name: "production" }
  → Returns raw key (shown once) — stored as SHA-256 hash

Use Any Endpoint
  → Authorization: Bearer aaas_xxx
  → Auth → Rate Limit → Credit Check (402 if insufficient) → Execute → Deduct → Log

Monitor Usage
  → GET /v1/billing/balance → current balance in microcents + USD
  → GET /v1/billing/usage?days=30 → per-request cost breakdown
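
The Rate Limit step in the request flow above, with the default limits of 60 RPM and 10,000 RPD, could be implemented as a sliding-window counter. The following is an in-memory sketch of that technique, not the contents of rate-limiter.ts: the class name is hypothetical, timestamps are passed in explicitly so the behavior is easy to test, and a production version would presumably persist its state (the diagram lists a rate_limits collection).

```typescript
class SlidingWindowLimiter {
  // Timestamps (ms) of accepted requests per key, oldest first.
  private readonly hits = new Map<string, number[]>();
  private readonly rpm: number;
  private readonly rpd: number;

  constructor(rpm: number = 60, rpd: number = 10000) {
    this.rpm = rpm;
    this.rpd = rpd;
  }

  // Returns true and records the request if both windows have room.
  allow(keyId: string, nowMs: number): boolean {
    const dayAgo = nowMs - 86_400_000;
    const minuteAgo = nowMs - 60_000;

    // Drop entries older than the largest window (one day).
    const recent = (this.hits.get(keyId) ?? []).filter((t) => t > dayAgo);
    const lastMinute = recent.filter((t) => t > minuteAgo).length;

    const allowed = lastMinute < this.rpm && recent.length < this.rpd;
    if (allowed) recent.push(nowMs); // rejected requests are not counted
    this.hits.set(keyId, recent);
    return allowed;
  }
}
```

Not counting rejected requests is a design choice of this sketch: a client that keeps retrying while throttled does not extend its own lockout.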

Credit Guard — How Overspend Is Prevented

The credit system makes negative balances mathematically impossible.

Three layers of protection:

Layer 1: Pre-execution estimate. Before any API call executes, the gateway estimates the cost based on model pricing and input size. If balance < estimatedCost, the request is rejected with HTTP 402 — the API call never reaches the upstream provider.

Layer 2: Atomic deduction. After successful execution, the actual cost (based on real token counts from the provider) is deducted via a Firestore transaction. This is atomic — if two requests arrive simultaneously, they can't both read the same balance and deduct, because transactions serialize writes.

Layer 3: Credit balance is server-side only. Firestore security rules grant clients no access to the credits collection, so only Cloud Functions can read or write it. Users cannot manipulate their balance from the browser.

// gateway.ts line 212 — the guard
const creditCheck = await hasSufficientCredits(ctx.userId, estimated);
if (!creditCheck.sufficient) {
  errors.billingError(res, creditCheck.balance);
  return; // ← EXECUTION BLOCKED, upstream provider never called
}

// After execution succeeds:
await deductCredits(ctx.userId, actualCost, description);
// ↑ Firestore transaction: read balance → subtract → write atomically
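
To show the check, execute, deduct sequence end to end, here is a self-contained simulation with an in-memory store standing in for Firestore. It is a sketch under stated assumptions, not the code in credits.ts: `InMemoryCreditStore` and `guardedCall` are hypothetical, amounts are plain integers in the gateway's credit unit, and clamping at zero when the actual cost exceeds the remaining balance is this sketch's own assumption, since the report does not say how that case is handled. In the real system the read, subtract, write step would run inside a Firestore transaction.

```typescript
class InMemoryCreditStore {
  private readonly balances = new Map<string, number>();

  addCredits(userId: string, amount: number): void {
    this.balances.set(userId, this.balanceOf(userId) + amount);
  }

  balanceOf(userId: string): number {
    return this.balances.get(userId) ?? 0;
  }

  // Layer 1: pre-execution check against the estimated cost.
  hasSufficientCredits(userId: string, estimated: number): { sufficient: boolean; balance: number } {
    const balance = this.balanceOf(userId);
    return { sufficient: balance >= estimated, balance };
  }

  // Layer 2: read, subtract, write as one step. Clamped at zero (assumption).
  deductCredits(userId: string, actualCost: number): number {
    const next = Math.max(0, this.balanceOf(userId) - actualCost);
    this.balances.set(userId, next);
    return next;
  }
}

// Mirrors the guard above: reject with 402 before execute() is ever called.
function guardedCall(
  store: InMemoryCreditStore,
  userId: string,
  estimated: number,
  execute: () => number, // performs the upstream call, returns the actual cost
): { status: 200 | 402; balance: number } {
  const check = store.hasSufficientCredits(userId, estimated);
  if (!check.sufficient) return { status: 402, balance: check.balance };
  const actualCost = execute();
  return { status: 200, balance: store.deductCredits(userId, actualCost) };
}
```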

Expansion Roadmap — From 22 to 460+ Endpoints

Phase	Category	Endpoints	Upstream Vendors	Status
1 (Now)	Gateway + LLM + Billing + Compose	22	Vertex AI (Anthropic, OpenAI, OpenRouter, Perplexity code-ready, awaiting keys)	Live
2	Audio (TTS + STT + Music + SFX)	+30	ElevenLabs, OpenAI, Replicate, Google TTS	Needs keys
3	Image + Video Generation	+78	Replicate (FLUX, Kling, Sora), FAL, Vertex (Veo)	Needs keys
4	Web/Social Scraping + Search	+250	Apify (130), Scrapecreators (108), Firecrawl, EnsembleData	Planned
5	Business Services	+8	Stripe, Prelude (SMS), Reducto (docs), Gamma (slides)	Planned

Human Action Required (16 API Keys)

The gateway architecture is deployed. Each upstream provider needs an AaaS business account created (one-time) and the API key stored in GCP Secret Manager. Many offer free tiers.

#	Provider	Services	Free Tier?	Priority
1	Stripe	Credit top-ups, billing	Yes	Critical
2	OpenRouter	DeepSeek, Llama, Mistral, Qwen (30+ models)	Pay-as-you-go	Critical
3	Anthropic	Claude 4.5 Sonnet, Claude 4 Sonnet, Haiku	$5 free	Critical
4	OpenAI	GPT-5, GPT-4.1, o4-mini, TTS, Whisper, Embeddings	$5 free	Critical
5	Perplexity	Sonar Pro, Sonar (search-augmented)	Free tier	High
6	Replicate	FLUX, SD, Ideogram, Kling, Sora, Kokoro, MusicGen	Pay-as-you-go	High
7	ElevenLabs	Premium TTS, music, sound FX	10K chars/mo	High
8	FAL	Image upscale, img2img, video	No	Medium
9	Apify	130 social media automation actors	100 runs/mo	Medium
10	Scrapecreators	108 social media data endpoints	Pay-as-you-go	Medium
11	Firecrawl	JS-rendered web scraping	500 pages/mo	Medium
12-16	EnsembleData, Linkup, Prelude, Reducto, Gamma	Social analytics, search, SMS, docs, slides	Varies	Lower

Full details with signup URLs and gcloud secrets create commands: open-issues.md (items 8-10)

Source Files Reference

functions/src/api/
├── gateway.ts              — Main HTTP handler (22 routes)
├── types.ts                — All type definitions
├── errors.ts               — OpenAI-compatible error responses
├── model-registry.ts       — 17 LLM models with pricing
├── middleware/
│   ├── auth.ts             — API key auth (SHA-256, Firestore)
│   ├── rate-limiter.ts     — Sliding window RPM/RPD
│   ├── credits.ts          — Atomic credit guard + deduction
│   └── usage-meter.ts      — Per-request logging
├── routers/
│   ├── llm-router.ts       — Multi-provider LLM dispatch
│   └── billing-router.ts   — Stripe, balance, keys, usage
├── providers/
│   ├── anthropic.ts        — Claude (Messages API → OpenAI format)
│   ├── openai.ts           — GPT (pass-through)
│   ├── google.ts           — Gemini (Vertex AI SDK)
│   ├── openrouter.ts       — DeepSeek, Llama, Mistral, Qwen
│   └── perplexity.ts       — Sonar search models

functions/src/lib/foundry/   — Composition engine (8 files)
├── slot-registry.ts        — 27 slots across 6 foundries
├── vault-scanner.ts        — Skill matching (category-filtered)
├── tournament-engine.ts    — Karpathy-style competition
├── chain-composer.ts       — Champion assembly into chains
├── chain-executor.ts       — Chain execution (dry-run + live)
├── mega-grind.ts           — Full pipeline orchestrator
└── use-case-registry.ts    — 7 registered use cases

Planning documents:
├── docs/superpowers/plans/2026-03-28-aaas-api-gateway-plan.md
├── docs/superpowers/plans/2026-03-28-component-readiness-matrix.md
└── docs/superpowers/plans/2026-03-28-api-category-buildplans.md