build report — 2026-03-28

AaaS API Gateway

A centralized API worksuite that combines SkillBoss-style multi-vendor aggregation with AaaS's unique foundry composition engine. One API key, one credit balance, 460+ endpoints planned. 22 live today.

Live Endpoints: 22
LLM Models: 17
Foundry Skills: 63
Foundries Ready: 6
Endpoints Planned: 460+

What Is This? (For Humans)

The AaaS API Gateway is a single API that gives developers access to everything they need to build AI-powered products — language models, image/video generation, web scraping, social media data, email sending, payment processing, and agent orchestration — all through one API key and one credit balance.

Think of it as your team's AI department in an API. Instead of signing up for 16 different services, managing 16 API keys, and paying 16 separate bills, you top up AaaS credits once and use any service instantly.

How It Works

1. Sign up at agents-as-a-service.com (Google or email login)
2. Create an API key via POST /v1/keys/create
3. Top up credits via POST /v1/billing/checkout ($5 minimum)
4. Use any endpoint with Authorization: Bearer aaas_xxx
5. Credits deducted automatically per request based on actual usage
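
As a rough illustration of step 4, the sketch below assembles an authenticated request for the chat endpoint without sending it. The helper name buildChatRequest, the GatewayRequest shape, and the Content-Type header are assumptions made for this example, not part of the gateway code. Note that /v1/chat/completions is listed as "Code ready" rather than Live in the endpoint table.

```typescript
// Sketch only: buildChatRequest and GatewayRequest are hypothetical names.
const BASE_URL = "https://apigateway-q66ryynraa-uc.a.run.app";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request pieces; the caller decides how to send them.
function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): GatewayRequest {
  return {
    url: `${BASE_URL}/v1/chat/completions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json", // assumed; standard for JSON bodies
    },
    body: JSON.stringify({ model, messages }),
  };
}

const req = buildChatRequest("aaas_xxx", "google/gemini-2.5-flash", [
  { role: "user", content: "What is AaaS?" },
]);
```

With a real key, the request could then be sent with fetch(req.url, { method: req.method, headers: req.headers, body: req.body }).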

What Makes This Different From OpenRouter

OpenRouter is an LLM router — it routes chat requests to language models. That's one category.

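AaaS is a full worksuite. In addition to routing to LLMs, it is designed to scrape websites, generate images, send emails, process payments, and, crucially, orchestrate multi-agent workflows via the Foundry Compose API. Several of these categories are still on the roadmap, as the table below shows; of the three platforms compared, only AaaS offers agent orchestration.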

Capability             | OpenRouter | SkillBoss   | AaaS
LLM routing            | Yes        | Yes         | Yes
Image/video generation | No         | Yes         | Phase 3
Web/social scraping    | No         | Yes (250+)  | Phase 4
Email/SMS/payments     | No         | Yes         | Email live, rest Phase 4
Agent orchestration    | No         | No          | Yes (exclusive)
Multi-agent chains     | No         | No          | Yes (7 chains live)

Technical Specification (For Agents)

Base URL

https://apigateway-q66ryynraa-uc.a.run.app

Authentication

Authorization: Bearer aaas_<64-char-hex>

Keys are SHA-256 hashed before storage. Lookup via Firestore index.
Rate limits enforced per key: RPM (requests/minute) + RPD (requests/day).
Default: 60 RPM, 10,000 RPD. Configurable per key.
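
The per-key limits above can be pictured with a small in-memory sliding-window limiter. This is a sketch under stated assumptions: the class name SlidingWindowLimiter and its storage are hypothetical, and the deployed rate-limiter.ts keeps its state in Firestore and may count windows differently.

```typescript
// Sketch only: a hypothetical in-memory stand-in, not the deployed rate limiter.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly rpm: number;
  private readonly rpd: number;

  // Defaults mirror the documented 60 RPM / 10,000 RPD.
  constructor(rpm: number = 60, rpd: number = 10000) {
    this.rpm = rpm;
    this.rpd = rpd;
  }

  // Returns true and records the request if the key is under both limits at nowMs.
  allow(keyId: string, nowMs: number): boolean {
    const dayAgo = nowMs - 24 * 60 * 60 * 1000;
    const minuteAgo = nowMs - 60 * 1000;
    // Keep only timestamps inside the longest (daily) window.
    const recent = (this.hits.get(keyId) ?? []).filter((t) => t > dayAgo);
    const lastMinute = recent.filter((t) => t > minuteAgo).length;
    if (lastMinute >= this.rpm || recent.length >= this.rpd) {
      this.hits.set(keyId, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(keyId, recent);
    return true;
  }
}

// Tiny limits (2 per minute, 3 per day) make the behaviour easy to follow.
const limiter = new SlidingWindowLimiter(2, 3);
const t0 = 1000000;
const results = [
  limiter.allow("key_a", t0),         // allowed
  limiter.allow("key_a", t0 + 1000),  // allowed
  limiter.allow("key_a", t0 + 2000),  // rejected: minute window is full
  limiter.allow("key_a", t0 + 61500), // allowed: earlier hits left the minute window
  limiter.allow("key_a", t0 + 62000), // rejected: daily limit reached
];
```

Passing the clock in as nowMs keeps the sketch deterministic; a real limiter would read the current time itself.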

Credit System

Unit: microcents (1 credit = 100 microcents = $0.01 USD)
Storage: Firestore collection "credits", doc ID = userId
Guard: Pre-execution check → 402 if insufficient
Deduction: Firestore transaction (atomic, no race conditions)
Negative balance: prevented by design
  - balance >= estimatedCost enforced BEFORE execution
  - actual cost deducted AFTER execution via atomic transaction
Top-up: Stripe Checkout → webhook → addCredits() transaction

All 22 Live Endpoints

Method | Path                   | Auth       | Status       | Description
GET    | /v1/status             | None       | Live         | Health check: returns version + timestamp
POST   | /v1/billing/webhook    | Stripe sig | Live         | Stripe payment confirmation → auto-add credits
GET    | /v1/models             | Bearer     | Live         | List 17 available LLM models (OpenAI format)
POST   | /v1/chat/completions   | Bearer     | Code ready   | Multi-provider LLM routing: Anthropic, OpenAI, Google, OpenRouter, Perplexity
GET    | /v1/billing/balance    | Bearer     | Live         | Current credit balance in microcents + USD
GET    | /v1/billing/packages   | Bearer     | Live         | Available top-up packages ($5, $25, $50, $100)
GET    | /v1/billing/usage      | Bearer     | Live         | Usage history with cost per request (last N days)
POST   | /v1/billing/checkout   | Bearer     | Needs Stripe | Create Stripe Checkout session for credit top-up
GET    | /v1/keys               | Bearer     | Live         | List user's active API keys (max 5)
POST   | /v1/keys/create        | Bearer     | Live         | Create new API key: shown once, stored hashed
POST   | /v1/keys/revoke        | Bearer     | Live         | Revoke an API key by ID
GET    | /v1/compose/chains     | Bearer     | Live         | List 7 foundry chains with slot configurations
GET    | /v1/compose/foundries  | Bearer     | Live         | List 6 foundries with slot/use-case counts
GET    | /v1/compose/skills     | Bearer     | Live         | List 63 foundry skills with capabilities
POST   | /v1/compose/mega-grind | Bearer     | Live         | Execute full foundry pipeline (seed→scan→tournament→compose→test)
POST   | /v1/scrape/company     | Bearer     | Live         | Extract company data from domain (homepage, about, products)
POST   | /v1/scrape/classify    | Bearer     | Live         | Classify industry via Vertex AI (industry, segment, ICP, tech stack)
POST   | /v1/skills/match       | Bearer     | Live         | Match skills against industry classification (top 10)
POST   | /v1/agents/suggest     | Bearer     | Live         | Suggest agents with sample tasks based on classification

Request/Response Format

# Chat Completion (OpenAI-compatible)
POST /v1/chat/completions
{
  "model": "google/gemini-2.5-flash",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is AaaS?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1000
}

# Response (OpenAI format)
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "model": "google/gemini-2.5-flash",
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "..."}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175}
}

# Error format (also OpenAI-compatible)
{
  "error": {
    "message": "Insufficient credits. Current balance: 0 microcents.",
    "type": "billing_error",
    "code": "insufficient_credits"
  }
}
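
Because success and error bodies share one JSON channel, a client has to tell them apart before reading fields. The sketch below shows one way to do that with a type guard over the two shapes shown above; the names GatewayError, ChatCompletion, isGatewayError and describeResponse are hypothetical and chosen only for this example.

```typescript
// Sketch only: type and function names are illustrative, not taken from types.ts.
interface GatewayError {
  error: { message: string; type: string; code: string };
}

interface ChatCompletion {
  id: string;
  object: "chat.completion";
  model: string;
  choices: {
    index: number;
    message: { role: string; content: string };
    finish_reason: string;
  }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Narrows an unknown JSON body to the error shape when it has error.code as a string.
function isGatewayError(body: unknown): body is GatewayError {
  if (typeof body !== "object" || body === null) return false;
  const err = (body as { error?: unknown }).error;
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { code?: unknown }).code === "string"
  );
}

// Summarises a raw response body as "error:<code>" or "ok:<total tokens>".
function describeResponse(raw: string): string {
  const body: unknown = JSON.parse(raw);
  if (isGatewayError(body)) {
    return `error:${body.error.code}`;
  }
  const completion = body as ChatCompletion;
  return `ok:${completion.usage.total_tokens}`;
}

const errorSummary = describeResponse(
  '{"error":{"message":"Insufficient credits. Current balance: 0 microcents.",' +
    '"type":"billing_error","code":"insufficient_credits"}}',
);
const okSummary = describeResponse(
  '{"id":"chatcmpl-xxx","object":"chat.completion","model":"google/gemini-2.5-flash",' +
    '"choices":[{"index":0,"message":{"role":"assistant","content":"..."},"finish_reason":"stop"}],' +
    '"usage":{"prompt_tokens":25,"completion_tokens":150,"total_tokens":175}}',
);
```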

LLM Model Registry (17 Models)

Model ID                    | Provider   | Input $/1M | Output $/1M | Context
anthropic/claude-4.5-sonnet | Anthropic  | $3.00      | $15.00      | 200K
anthropic/claude-4-sonnet   | Anthropic  | $3.00      | $15.00      | 200K
anthropic/claude-4.5-haiku  | Anthropic  | $0.80      | $4.00       | 200K
openai/gpt-5                | OpenAI     | $2.50      | $10.00      | 128K
openai/gpt-5-mini           | OpenAI     | $0.15      | $0.60       | 128K
openai/gpt-4.1              | OpenAI     | $2.00      | $8.00       | 1M
openai/gpt-4.1-mini         | OpenAI     | $0.40      | $1.60       | 1M
openai/o4-mini              | OpenAI     | $1.10      | $4.40       | 200K
google/gemini-2.5-flash     | Vertex AI  | $0.15      | $0.60       | 1M
google/gemini-2.5-pro       | Vertex AI  | $1.25      | $5.00       | 1M
deepseek/deepseek-r1        | OpenRouter | $0.55      | $2.19       | 128K
deepseek/deepseek-v3        | OpenRouter | $0.27      | $1.10       | 128K
meta/llama-3.3-70b          | OpenRouter | $0.40      | $0.40       | 128K
mistral/mistral-large       | OpenRouter | $2.00      | $6.00       | 128K
qwen/qwen3-coder-plus       | OpenRouter | $0.16      | $0.64       | 128K
perplexity/sonar-pro        | Perplexity | $3.00      | $15.00      | 200K
perplexity/sonar            | Perplexity | $1.00      | $1.00       | 128K
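
To make the per-million pricing concrete, the sketch below computes the list-price cost of one request from its token counts, using two rows of the registry and the token usage from the sample response earlier (25 prompt, 150 completion). The names PRICING and requestCostUsd are hypothetical, and the sketch assumes no markup; the report does not say whether the gateway adds one or how it rounds to microcents.

```typescript
// Sketch only: PRICING and requestCostUsd are illustrative, not the model-registry.ts API.
interface ModelPricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Two rows copied from the registry table above.
const PRICING: Record<string, ModelPricing> = {
  "google/gemini-2.5-flash": { inputPerMillionUsd: 0.15, outputPerMillionUsd: 0.6 },
  "anthropic/claude-4.5-sonnet": { inputPerMillionUsd: 3.0, outputPerMillionUsd: 15.0 },
};

// cost = (prompt tokens * input price + completion tokens * output price) / 1,000,000
function requestCostUsd(
  model: string,
  promptTokens: number,
  completionTokens: number,
): number {
  const pricing = PRICING[model];
  if (pricing === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return (
    (promptTokens * pricing.inputPerMillionUsd +
      completionTokens * pricing.outputPerMillionUsd) /
    1000000
  );
}

// (25 * 0.15 + 150 * 0.60) / 1e6, roughly $0.00009375
const flashCost = requestCostUsd("google/gemini-2.5-flash", 25, 150);
// (25 * 3.00 + 150 * 15.00) / 1e6, roughly $0.002325
const sonnetCost = requestCostUsd("anthropic/claude-4.5-sonnet", 25, 150);
```

The same 175-token exchange is about 25 times more expensive on the Sonnet row than on Gemini 2.5 Flash, which is why the pre-execution estimate depends on the chosen model.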

Foundry Compose API (AaaS-Exclusive)

# List available chains
GET /v1/compose/chains
→ 7 chains: devops-full, devops-incident, revenue-full,
  customer-success-full, people-full, finance-full, legal-full

# Run a foundry mega-grind
POST /v1/compose/mega-grind
{ "foundry": "F1", "dryRun": true }
→ Executes: seed slots → scan vault (63 skills) → tournament →
  compose chain → deploy → test

# List skills available for composition
GET /v1/compose/skills
→ 63 skills across 6 categories (devops, revenue, CS, people, finance, legal)

Architecture

┌──────────────────────────────────────────────────────────┐
│                   AaaS API Gateway                        │
│            apigateway-q66ryynraa-uc.a.run.app             │
├──────────┬──────────┬──────────┬──────────┬──────────────┤
│ LLM      │ Compose  │ Scraping │ Billing  │ Key Mgmt     │
│ Router   │ Router   │ Router   │ Router   │ Router       │
├──────────┴──────────┴──────────┴──────────┴──────────────┤
│              Middleware Pipeline                           │
│  Bearer Auth → Rate Limiter → Credit Guard → Usage Meter  │
├──────────────────────────────────────────────────────────┤
│              Provider Layer                                │
│  Anthropic · OpenAI · Vertex AI · OpenRouter · Perplexity │
│  Foundry Engine · Scraper · Classifier · Skill Matcher    │
├──────────────────────────────────────────────────────────┤
│              Data Layer (Firestore)                        │
│  api_keys · credits · usage_logs · rate_limits            │
│  skills · agents · foundry_chains · foundry_reports       │
└──────────────────────────────────────────────────────────┘
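
The middleware row in the diagram can be read as a chain of stages where the first rejection stops the request. The following is a simplified, synchronous sketch of that idea. The Stage type, the RequestContext fields, and the 401 and 429 status codes are assumptions for illustration; only the 402 for insufficient credits is stated in this report. The usage meter is left out because it logs rather than rejects.

```typescript
// Sketch only: a synchronous toy pipeline, not the gateway.ts implementation.
interface RequestContext {
  apiKeyValid: boolean;
  underRateLimit: boolean;
  balance: number;
  estimatedCost: number;
}

interface Rejection {
  status: number;
  reason: string;
}

// A stage returns null to pass the request on, or a Rejection to stop it.
type Stage = (ctx: RequestContext) => Rejection | null;

const bearerAuth: Stage = (ctx) =>
  ctx.apiKeyValid ? null : { status: 401, reason: "invalid_api_key" }; // 401 is assumed

const rateLimiter: Stage = (ctx) =>
  ctx.underRateLimit ? null : { status: 429, reason: "rate_limited" }; // 429 is assumed

const creditGuard: Stage = (ctx) =>
  ctx.balance >= ctx.estimatedCost
    ? null
    : { status: 402, reason: "insufficient_credits" };

// Runs stages in order and returns the first rejection, or null if all pass.
function runPipeline(stages: Stage[], ctx: RequestContext): Rejection | null {
  for (const stage of stages) {
    const rejection = stage(ctx);
    if (rejection !== null) return rejection;
  }
  return null;
}

const stages: Stage[] = [bearerAuth, rateLimiter, creditGuard];

const accepted = runPipeline(stages, {
  apiKeyValid: true, underRateLimit: true, balance: 500, estimatedCost: 100,
});
const outOfCredits = runPipeline(stages, {
  apiKeyValid: true, underRateLimit: true, balance: 50, estimatedCost: 100,
});
const badKey = runPipeline(stages, {
  apiKeyValid: false, underRateLimit: false, balance: 0, estimatedCost: 100,
});
```

Ordering matters here: an invalid key is rejected at the first stage, before the limiter or credit guard runs for it.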

User Lifecycle

Signup (Firebase Auth: Google/Email)
  → Account created in Firestore (tier: free, credits: 0)
  → Context pipeline auto-runs (scrape, classify, match skills, suggest agents)

Top Up Credits
  → POST /v1/billing/checkout { packageId: "credits_2500" }
  → Stripe Checkout → payment confirmed → webhook → credits added atomically

Create API Key
  → POST /v1/keys/create { name: "production" }
  → Returns raw key (shown once) — stored as SHA-256 hash

Use Any Endpoint
  → Authorization: Bearer aaas_xxx
  → Auth → Rate Limit → Credit Check (402 if insufficient) → Execute → Deduct → Log

Monitor Usage
  → GET /v1/billing/balance → current balance in microcents + USD
  → GET /v1/billing/usage?days=30 → per-request cost breakdown
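
The Create API Key step above returns the raw key once and stores only its SHA-256 hash. A minimal sketch of that pattern with Node's built-in crypto module follows. The function names generateApiKey and hashApiKey are hypothetical, and hashing the full string including the aaas_ prefix is an assumption; the report does not say which bytes auth.ts hashes.

```typescript
// Sketch only: illustrates the hash-before-storage pattern, not the actual auth.ts code.
import { createHash, randomBytes } from "node:crypto";

// 32 random bytes encode to 64 hex characters, matching aaas_<64-char-hex>.
function generateApiKey(): string {
  return `aaas_${randomBytes(32).toString("hex")}`;
}

// Assumption: the whole key string, prefix included, is hashed.
function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

// At creation: show rawKey to the user once, persist only storedHash.
const rawKey = generateApiKey();
const storedHash = hashApiKey(rawKey);

// At request time: hash the presented key and look the hash up.
const lookupHash = hashApiKey(rawKey);
const keyMatches = lookupHash === storedHash;
```

Since only the hash is stored, a leaked datastore record does not reveal a usable key, and a lost key cannot be recovered, only revoked and replaced.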

Credit Guard — How Overspend Is Prevented

The credit system is designed to keep balances from going negative.

Three layers of protection:

Layer 1: Pre-execution estimate. Before any API call executes, the gateway estimates the cost based on model pricing and input size. If balance < estimatedCost, the request is rejected with HTTP 402 — the API call never reaches the upstream provider.

Layer 2: Atomic deduction. After successful execution, the actual cost (based on real token counts from the provider) is deducted via a Firestore transaction. This is atomic — if two requests arrive simultaneously, they can't both read the same balance and deduct, because transactions serialize writes.

Layer 3: Credit balance is server-side only. The credits collection has no client-side Firestore security rules. Only Cloud Functions can read or write it. Users cannot manipulate their balance from the browser.

// gateway.ts line 212 — the guard
const creditCheck = await hasSufficientCredits(ctx.userId, estimated);
if (!creditCheck.sufficient) {
  errors.billingError(res, creditCheck.balance);
  return; // ← EXECUTION BLOCKED, upstream provider never called
}

// After execution succeeds:
await deductCredits(ctx.userId, actualCost, description);
// ↑ Firestore transaction: read balance → subtract → write atomically
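
The check-then-deduct flow can be simulated in memory to see how the guard and the deduction interact. This is a sketch, not the credits.ts implementation: CreditLedger is a hypothetical class, its methods are synchronous simplifications of the calls shown above, and rejecting a deduction that would overdraw the balance is an assumption, since the report does not describe that case.

```typescript
// Sketch only: an in-memory stand-in for the Firestore-backed credit store.
class CreditLedger {
  private readonly balances = new Map<string, number>();

  addCredits(userId: string, amount: number): void {
    this.balances.set(userId, this.getBalance(userId) + amount);
  }

  getBalance(userId: string): number {
    return this.balances.get(userId) ?? 0;
  }

  // Pre-execution guard: compares the balance with the estimated cost.
  hasSufficientCredits(
    userId: string,
    estimated: number,
  ): { sufficient: boolean; balance: number } {
    const balance = this.getBalance(userId);
    return { sufficient: balance >= estimated, balance };
  }

  // Read, check and write in one synchronous step, standing in for a transaction.
  // Assumption: a deduction larger than the balance is refused rather than applied.
  deductCredits(userId: string, actualCost: number): number {
    const balance = this.getBalance(userId);
    if (actualCost > balance) {
      throw new Error("Deduction would overdraw the balance");
    }
    const next = balance - actualCost;
    this.balances.set(userId, next);
    return next;
  }
}

const ledger = new CreditLedger();
ledger.addCredits("user_1", 500);

const firstCheck = ledger.hasSufficientCredits("user_1", 200); // passes the guard
const afterDeduct = ledger.deductCredits("user_1", 180);       // actual cost applied
const secondCheck = ledger.hasSufficientCredits("user_1", 400); // would return 402

let overdrawBlocked = false;
try {
  ledger.deductCredits("user_1", 1000);
} catch {
  overdrawBlocked = true;
}
```

The in-deduction check is what covers the case where the actual cost exceeds the estimate or where several requests pass the guard at the same moment.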

Expansion Roadmap — From 22 to 460+ Endpoints

Phase   | Category                         | Endpoints | Upstream Vendors                                                | Status
1 (Now) | Gateway + LLM + Billing + Compose | 22        | Vertex AI, (Anthropic, OpenAI, OpenRouter, Perplexity)          | Live
2       | Audio (TTS + STT + Music + SFX)  | +30       | ElevenLabs, OpenAI, Replicate, Google TTS                       | Needs keys
3       | Image + Video Generation         | +78       | Replicate (FLUX, Kling, Sora), FAL, Vertex (Veo)                | Needs keys
4       | Web/Social Scraping + Search     | +250      | Apify (130), Scrapecreators (108), Firecrawl, EnsembleData      | Planned
5       | Business Services                | +8        | Stripe, Prelude (SMS), Reducto (docs), Gamma (slides)           | Planned

Human Action Required (16 API Keys)

The gateway architecture is deployed. Each upstream provider needs an AaaS business account created (one-time) and the API key stored in GCP Secret Manager. Many offer free tiers.

#     | Provider                                       | Services                                           | Free Tier?    | Priority
1     | Stripe                                         | Credit top-ups, billing                            | Yes           | Critical
2     | OpenRouter                                     | DeepSeek, Llama, Mistral, Qwen (30+ models)        | Pay-as-you-go | Critical
3     | Anthropic                                      | Claude 4.5 Sonnet, Claude 4 Sonnet, Haiku          | $5 free       | Critical
4     | OpenAI                                         | GPT-5, GPT-4.1, o4-mini, TTS, Whisper, Embeddings  | $5 free       | Critical
5     | Perplexity                                     | Sonar Pro, Sonar (search-augmented)                | Free tier     | High
6     | Replicate                                      | FLUX, SD, Ideogram, Kling, Sora, Kokoro, MusicGen  | Pay-as-you-go | High
7     | ElevenLabs                                     | Premium TTS, music, sound FX                       | 10K chars/mo  | High
8     | FAL                                            | Image upscale, img2img, video                      | No            | Medium
9     | Apify                                          | 130 social media automation actors                 | 100 runs/mo   | Medium
10    | Scrapecreators                                 | 108 social media data endpoints                    | Pay-as-you-go | Medium
11    | Firecrawl                                      | JS-rendered web scraping                           | 500 pages/mo  | Medium
12-16 | EnsembleData, Linkup, Prelude, Reducto, Gamma  | Social analytics, search, SMS, docs, slides        | Varies        | Lower

Full details with signup URLs and gcloud secrets create commands: open-issues.md (items 8-10)

Source Files Reference

functions/src/api/
├── gateway.ts              — Main HTTP handler (22 routes)
├── types.ts                — All type definitions
├── errors.ts               — OpenAI-compatible error responses
├── model-registry.ts       — 17 LLM models with pricing
├── middleware/
│   ├── auth.ts             — API key auth (SHA-256, Firestore)
│   ├── rate-limiter.ts     — Sliding window RPM/RPD
│   ├── credits.ts          — Atomic credit guard + deduction
│   └── usage-meter.ts      — Per-request logging
├── routers/
│   ├── llm-router.ts       — Multi-provider LLM dispatch
│   └── billing-router.ts   — Stripe, balance, keys, usage
├── providers/
│   ├── anthropic.ts        — Claude (Messages API → OpenAI format)
│   ├── openai.ts           — GPT (pass-through)
│   ├── google.ts           — Gemini (Vertex AI SDK)
│   ├── openrouter.ts       — DeepSeek, Llama, Mistral, Qwen
│   └── perplexity.ts       — Sonar search models

functions/src/lib/foundry/   — Composition engine (8 files)
├── slot-registry.ts        — 27 slots across 6 foundries
├── vault-scanner.ts        — Skill matching (category-filtered)
├── tournament-engine.ts    — Karpathy-style competition
├── chain-composer.ts       — Champion assembly into chains
├── chain-executor.ts       — Chain execution (dry-run + live)
├── mega-grind.ts           — Full pipeline orchestrator
└── use-case-registry.ts    — 7 registered use cases

Planning documents:
├── docs/superpowers/plans/2026-03-28-aaas-api-gateway-plan.md
├── docs/superpowers/plans/2026-03-28-component-readiness-matrix.md
└── docs/superpowers/plans/2026-03-28-api-category-buildplans.md