Service Spec: api-gateway

What this service is

The single HTTP entry point for all traffic: browser, mobile, and third-party API clients. It contains no business logic. Its jobs are to identify the calling tenant, verify the caller's identity, apply rate limits, and proxy the request to core-api.


Tech stack

  • Runtime: Node 20
  • Framework: Fastify v4
  • Language: TypeScript (strict mode)
  • Redis client: ioredis
  • JWT: @fastify/jwt
  • Rate limiting: @fastify/rate-limit
  • HTTP proxy: @fastify/http-proxy
  • Validation: zod

Package structure

apps/api-gateway/
├── src/
│   ├── main.ts                    # entry point, build server, listen
│   ├── server.ts                  # build and export Fastify app (for testing)
│   ├── plugins/
│   │   ├── redis.ts               # Redis connection plugin
│   │   ├── rate-limit.ts          # @fastify/rate-limit setup
│   │   └── error-handler.ts       # global error formatter
│   ├── middleware/
│   │   ├── resolve-tenant.ts      # step 1: find tenant from request
│   │   ├── load-tenant-config.ts  # step 2: fetch config from cache/DB
│   │   ├── validate-jwt.ts        # step 3: verify JWT, attach req.user
│   │   ├── validate-api-key.ts    # alternative to JWT for API key callers
│   │   └── inject-headers.ts      # step 4: attach x-tenant-id, x-user-id etc to proxied request
│   ├── routes/
│   │   └── health.ts              # GET /health
│   └── types/
│       └── fastify.d.ts           # augment FastifyRequest with tenant + user
├── Dockerfile
├── tsconfig.json
└── package.json

Tenant resolution — resolve-tenant.ts

Runs on every request, before JWT validation.

// Resolution priority order:
// 1. x-tenant-id header
// 2. Subdomain: ashoka.platform.com → "ashoka"
// 3. x-tenant-id query param
// 4. tenant field in JSON body (for webhook/public API callers)
 
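async function resolveTenant(req: FastifyRequest): Promise<string> {
  // 1. Header. Node delivers repeated headers as an array; take the first value.
  const rawHeader = req.headers['x-tenant-id'];
  const fromHeader = Array.isArray(rawHeader) ? rawHeader[0] : rawHeader;
  if (fromHeader) return fromHeader;
 
  // 2. Subdomain. Strip any port and lowercase before matching known slugs.
  const host = (req.headers.host ?? '').split(':')[0].toLowerCase();
  const subdomain = host.split('.')[0];
  const knownSubdomains = await getTenantSlugs(); // cached set in Redis
  if (knownSubdomains.has(subdomain)) return subdomain;
 
  // 3. Query param. Accept only a non-empty string.
  const fromQuery = (req.query as Record<string, unknown> | undefined)?.['x-tenant-id'];
  if (typeof fromQuery === 'string' && fromQuery) return fromQuery;
 
  // 4. JSON body. Fastify populates req.body only from the preValidation hook
  //    onward, so this step is a no-op if the function runs in onRequest.
  const fromBody = (req.body as { tenant?: unknown } | null | undefined)?.tenant;
  if (typeof fromBody === 'string' && fromBody) return fromBody;
 
  // Nothing matched: surfaces to the client as 400 tenant_required.
  throw new TenantNotFoundError();
}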

If no tenant can be resolved → 400 { error: "tenant_required" }.
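
The subdomain rule (ashoka.platform.com → "ashoka") can be isolated in a small pure helper, which makes the port and casing edge cases easy to unit test. The sketch below is illustrative only: subdomainOf is a hypothetical name, and it assumes the platform's base domain has exactly two labels (as in platform.com), which the spec does not state.

```typescript
// Hypothetical helper: extract the tenant label from a Host header value.
// Assumption: the base domain has exactly two labels (e.g. platform.com).
function subdomainOf(host: string): string | null {
  // Drop an optional port ("ashoka.platform.com:8080") and normalise case.
  const [rawHostname = ''] = host.split(':');
  const hostname = rawHostname.trim().toLowerCase();
  const labels = hostname.split('.').filter((label) => label.length > 0);
  // A bare domain such as "platform.com" or "localhost" carries no tenant label.
  if (labels.length < 3) return null;
  return labels[0] ?? null;
}
```

The result would still need to be checked against the known tenant slugs, since any three-label host yields a candidate.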


Tenant config caching — load-tenant-config.ts

const CACHE_TTL_SECONDS = 300; // 5 minutes
 
async function loadTenantConfig(slug: string): Promise<TenantConfig> {
  const cacheKey = `tenant:config:${slug}`;
  
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached) as TenantConfig;
 
  // Fallback: fetch from core-api internal endpoint.
  // The slug can come straight from a client header, so encode it before
  // interpolating it into the request path.
  const config = await fetchFromCoreApi(`/internal/tenants/${encodeURIComponent(slug)}/config`);
  if (!config) throw new TenantNotFoundError(slug);
  
  await redis.setex(cacheKey, CACHE_TTL_SECONDS, JSON.stringify(config));
  return config;
}

The internal endpoint on core-api is protected by a shared INTERNAL_SECRET header. It returns the full TenantConfig object.

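Cache invalidation: when the tenancy module in core-api updates a tenant config, it publishes to the Redis pub/sub channel tenant:config:invalidate. The gateway subscribes and deletes the relevant key. Redis pub/sub does not redeliver missed messages, so the 5-minute TTL is the upper bound on staleness if a message is lost.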
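
A minimal sketch of the subscriber side follows. It is written against two small interfaces rather than the ioredis types so it stands alone; the method shapes are modelled on ioredis's subscribe(), on('message', ...) and del(). Two things are assumptions, not part of the spec: the published message body is taken to be the tenant slug, and the names watchInvalidations and cacheKeyFor are hypothetical. Note that a Redis connection in subscriber mode cannot issue regular commands, so the subscriber and the cache client must be separate connections.

```typescript
// Minimal slices of the Redis client surface this sketch relies on.
interface Subscriber {
  subscribe(channel: string): Promise<unknown>;
  on(event: 'message', listener: (channel: string, message: string) => void): unknown;
}
interface KeyDeleter {
  del(key: string): Promise<number>;
}

const INVALIDATE_CHANNEL = 'tenant:config:invalidate';

// Same key layout that loadTenantConfig writes to.
function cacheKeyFor(slug: string): string {
  return `tenant:config:${slug}`;
}

// Assumption: the message published on the channel is the tenant slug.
async function watchInvalidations(sub: Subscriber, cache: KeyDeleter): Promise<void> {
  sub.on('message', (channel, slug) => {
    if (channel !== INVALIDATE_CHANNEL || slug.length === 0) return;
    // Fire and forget: a failed delete only means the entry lives until its TTL.
    void cache.del(cacheKeyFor(slug)).catch(() => undefined);
  });
  await sub.subscribe(INVALIDATE_CHANNEL);
}
```

Registering the listener before subscribing avoids dropping a message that arrives between the two calls.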


JWT validation — validate-jwt.ts

// Access token payload shape:
interface JWTPayload {
  sub: string;          // userId
  tenant_id: string;    // must match resolved tenant
  role: UserRole;       // "student" | "admin_l1" | "admin_l2" | "verifier" | "super_admin"
  permissions: string[]; // ["students:read", "jobs:write", ...]
  iat: number;
  exp: number;          // 15 min from issue
}
 
// Validation steps:
// 1. Extract Bearer token from Authorization header
// 2. Verify signature using tenant.jwtSecret (not a global secret)
// 3. Check exp — return 401 with code "token_expired" if stale
// 4. Check payload.tenant_id === req.tenant.id — reject cross-tenant tokens
// 5. Attach to req.user

If token is expired → 401 { error: "token_expired" } (client should use refresh endpoint).
If token is invalid → 401 { error: "invalid_token" }.
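
Steps 3 to 5 can be expressed as a pure function over an already-verified payload, which keeps them testable without signing real tokens. This is a sketch only: signature verification (step 2) is assumed to have been done beforehand by @fastify/jwt, checkClaims is a hypothetical name, and reporting a cross-tenant token as invalid_token is an assumption, since the spec only says such tokens are rejected.

```typescript
type UserRole = 'student' | 'admin_l1' | 'admin_l2' | 'verifier' | 'super_admin';

interface JWTPayload {
  sub: string;
  tenant_id: string;
  role: UserRole;
  permissions: string[];
  iat: number;
  exp: number;
}

type ClaimCheck =
  | { ok: true; user: { id: string; role: UserRole; permissions: string[] } }
  | { ok: false; status: 401; error: 'token_expired' | 'invalid_token' };

// Applies the expiry and tenant checks to a payload whose signature is already verified.
function checkClaims(payload: JWTPayload, tenantId: string, nowSeconds: number): ClaimCheck {
  // A token is no longer valid at or after its exp timestamp.
  if (payload.exp <= nowSeconds) {
    return { ok: false, status: 401, error: 'token_expired' };
  }
  // Assumption: cross-tenant tokens are reported with the invalid_token code.
  if (payload.tenant_id !== tenantId) {
    return { ok: false, status: 401, error: 'invalid_token' };
  }
  return {
    ok: true,
    user: { id: payload.sub, role: payload.role, permissions: payload.permissions },
  };
}
```

Passing the current time in as a parameter, rather than reading the clock inside the function, makes the expiry branch deterministic in tests.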


API key validation — validate-api-key.ts

For external integrations. API keys are passed in either an Authorization: ApiKey <key> header or an x-api-key: <key> header.

// API keys are stored in core-api DB: api_keys table
// Structure: { id, tenant_id, key_hash (bcrypt), permissions[], last_used_at, expires_at }
 
// Validation:
// 1. Extract key from header
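// 2. Look up in Redis cache first: api_key:<digest> → { tenantId, permissions }
//    (the digest must be deterministic, e.g. SHA-256 of the key; a salted
//    bcrypt hash cannot be recomputed from the key for a lookup)
// 3. If not cached, call core-api internal endpoint to validate
// 4. Cache result for 60 seconds (short TTL, so a revoked key stops working within about a minute)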
// 5. Attach as req.user = { id: null, role: "api_key", permissions: [...] }
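
Step 1 has to handle both header forms. The sketch below shows one way to do that; extractApiKey is a hypothetical name, and giving the Authorization header precedence over x-api-key is an assumption the spec does not make. Header names are looked up in lowercase because Node lowercases incoming header names.

```typescript
type HeaderValue = string | string[] | undefined;

// Repeated headers arrive as arrays; use the first occurrence.
function firstValue(value: HeaderValue): string | undefined {
  return Array.isArray(value) ? value[0] : value;
}

// Returns the raw key, or null when neither header carries one.
function extractApiKey(headers: Record<string, HeaderValue>): string | null {
  const auth = firstValue(headers['authorization']);
  if (auth !== undefined) {
    // The scheme match is case-insensitive; the key itself is kept as sent.
    const match = /^ApiKey\s+(\S+)$/i.exec(auth.trim());
    if (match && match[1]) return match[1];
  }
  const direct = firstValue(headers['x-api-key'])?.trim();
  return direct ? direct : null;
}
```

A Bearer token in the Authorization header does not match the ApiKey scheme, so such requests fall through to the JWT path unless x-api-key is also present.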

Forwarded headers to core-api

After all middleware runs, these headers are injected into the proxied request:

x-tenant-id: <slug>
x-tenant-uuid: <uuid>
x-user-id: <userId or empty for api-key>
x-user-role: <role>
x-user-permissions: <comma-separated list>
x-request-id: <generated uuid for tracing>

core-api trusts these headers only from requests originating from api-gateway's internal IP. It does not re-validate JWTs.
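
The header set above can be built by a single pure function, sketched here under a few assumptions: the tenant object exposes slug and id fields (only id is referenced elsewhere in this spec), the request id is generated by the caller, and buildForwardedHeaders is a hypothetical name.

```typescript
// Assumed subset of TenantConfig: only these two fields are needed here.
interface TenantRef {
  slug: string;
  id: string; // tenant UUID
}

interface GatewayUser {
  id: string | null; // null for API-key callers
  role: string;
  permissions: string[];
}

function buildForwardedHeaders(
  tenant: TenantRef,
  user: GatewayUser,
  requestId: string,
): Record<string, string> {
  return {
    'x-tenant-id': tenant.slug,
    'x-tenant-uuid': tenant.id,
    'x-user-id': user.id ?? '', // empty for API-key callers
    'x-user-role': user.role,
    'x-user-permissions': user.permissions.join(','),
    'x-request-id': requestId,
  };
}
```

Because every header is set unconditionally, applying this record as an override when proxying also replaces any value a client sent under the same names. The comma-separated encoding relies on permission strings never containing a comma.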


Rate limiting — rate-limit.ts

// Per-tenant, per-IP rate limiting using @fastify/rate-limit
{
  max: 500,             // 500 requests
  timeWindow: '1 minute',
  keyGenerator: (req) => `${req.tenant?.id}:${req.ip}`,
  errorResponseBuilder: () => ({
    statusCode: 429,    // set explicitly: the plugin throws this object as the error
    error: 'rate_limit_exceeded',
    retryAfter: 60
  })
}
 
// Stricter limit for auth endpoints
// POST /auth/otp/request: 5 per 10 minutes per IP
// POST /auth/otp/verify: 10 per 10 minutes per IP
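
The default and the two stricter limits can be kept in one lookup table, sketched below. This is not wired into @fastify/rate-limit; it only shows the data and the selection rule. The names limitFor, AUTH_LIMITS and DEFAULT_LIMIT are hypothetical, and the paths are matched exactly as written above, since the spec does not say whether the gateway sees them under an /api prefix.

```typescript
interface Limit {
  max: number;
  windowMs: number;
  scope: 'tenant+ip' | 'ip'; // what the counter is keyed on
}

const MINUTE_MS = 60 * 1000;

const DEFAULT_LIMIT: Limit = { max: 500, windowMs: MINUTE_MS, scope: 'tenant+ip' };

// Keyed by "<METHOD> <path>". Assumption: paths appear without an /api prefix.
const AUTH_LIMITS = new Map<string, Limit>([
  ['POST /auth/otp/request', { max: 5, windowMs: 10 * MINUTE_MS, scope: 'ip' }],
  ['POST /auth/otp/verify', { max: 10, windowMs: 10 * MINUTE_MS, scope: 'ip' }],
]);

// Picks the stricter auth limit when one is defined, otherwise the default.
function limitFor(method: string, url: string): Limit {
  // Ignore the query string so "/auth/otp/request?x=1" still matches.
  const [path = url] = url.split('?');
  return AUTH_LIMITS.get(`${method.toUpperCase()} ${path}`) ?? DEFAULT_LIMIT;
}
```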

Routes

GET  /health                     → { status: "ok", version, uptime, redis, coreApi }
*    /api/*                      → proxy to core-api (all business routes)

Everything under /api/* is proxied. The gateway does not define application routes — that is core-api's job.


Error response format (all errors from gateway)

{
  "error": "tenant_required",
  "message": "Could not resolve tenant from request",
  "requestId": "uuid"
}

Standard error codes:

  • tenant_required — no tenant could be resolved
  • tenant_not_found — resolved slug doesn't exist
  • token_expired — JWT exp passed
  • invalid_token — JWT malformed or bad signature
  • unauthorized — valid token but insufficient permissions
  • rate_limit_exceeded — too many requests
  • api_key_invalid — bad or revoked API key
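
One way to keep codes, HTTP statuses, and the response body in sync is a single mapping, sketched below. The statuses for tenant_required, token_expired, invalid_token, rate_limit_exceeded and api_key_invalid follow this spec; 404 for tenant_not_found and 403 for unauthorized are assumptions, because the spec does not give a status for either. toErrorResponse is a hypothetical name.

```typescript
type GatewayErrorCode =
  | 'tenant_required'
  | 'tenant_not_found'
  | 'token_expired'
  | 'invalid_token'
  | 'unauthorized'
  | 'rate_limit_exceeded'
  | 'api_key_invalid';

const STATUS_BY_CODE: Record<GatewayErrorCode, number> = {
  tenant_required: 400,
  tenant_not_found: 404, // assumption: status not stated in the spec
  token_expired: 401,
  invalid_token: 401,
  unauthorized: 403, // assumption: status not stated in the spec
  rate_limit_exceeded: 429,
  api_key_invalid: 401,
};

interface GatewayErrorBody {
  error: GatewayErrorCode;
  message: string;
  requestId: string;
}

// Builds the status and JSON body in the gateway's standard error format.
function toErrorResponse(
  code: GatewayErrorCode,
  message: string,
  requestId: string,
): { status: number; body: GatewayErrorBody } {
  return { status: STATUS_BY_CODE[code], body: { error: code, message, requestId } };
}
```

Typing the table as Record<GatewayErrorCode, number> means adding a new code without a status fails to compile.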

TypeScript augmentation

// src/types/fastify.d.ts
declare module 'fastify' {
  interface FastifyRequest {
    tenant: TenantConfig;
    user: {
      id: string | null;
      role: UserRole | 'api_key'; // API-key callers are attached with role "api_key"
      permissions: string[];
    };
  }
}

Health check

GET /health returns 200 with:

{
  "status": "ok",
  "version": "1.0.0",
  "uptime": 3600,
  "redis": "ok",
  "coreApi": "ok"
}

Performs a Redis PING and a lightweight core-api ping. Used by the Kubernetes liveness and readiness probes.


Tests

Unit tests: src/**/__tests__/*.test.ts
Integration tests: tests/integration/ — spin up the gateway with a mock Redis and a mock core-api (using nock or a real test instance).

Key test cases:

  • Tenant resolved from header / subdomain / query param / JSON body
  • Unresolvable tenant → 400 tenant_required
  • Valid JWT → req.user populated
  • Expired JWT → 401 token_expired
  • Cross-tenant JWT (token tenant_id ≠ resolved tenant) → 401
  • API key valid → req.user populated with api_key role
  • Revoked API key → 401
  • Rate limit exceeded → 429
  • Proxy correctly forwards all injected headers