Guide

Handle rate limits

Lumin meters calls per day, not per second. The challenge is keeping a single multi-call reading inside the daily budget, not throttling burst traffic.

What gets metered

Every successful tool call counts as 1.
Failed calls (429, 5xx, validation errors) do not count.
The meter resets at 00:00 UTC.
Daily limits are scoped: per IP address for anonymous callers, per account for authenticated ones.

Detecting an exhausted budget

On a 429 response, Lumin returns an MCP error with code rate_limited and a human-readable reason stating whether the limit hit was the daily quota or a transient burst. Treat both the same way at runtime: stop the current reading, show the user a friendly message, and either queue the call for later or drop it.
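The handling described above might look like the sketch below. The ToolError class is a stand-in for whatever exception your MCP client raises, and its code and reason attributes are assumptions based on the error fields named in this section. CallResult and run_reading are likewise hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


class ToolError(Exception):
    """Stand-in for the error your MCP client raises (hypothetical)."""

    def __init__(self, code: str, reason: str) -> None:
        super().__init__(reason)
        self.code = code
        self.reason = reason


@dataclass
class CallResult:
    ok: bool
    value: Any = None
    user_message: str = ""


def guarded_call(call: Callable[[], Any]) -> CallResult:
    """Run one tool call and turn rate_limited into a friendly message."""
    try:
        return CallResult(ok=True, value=call())
    except ToolError as err:
        if err.code != "rate_limited":
            raise  # other errors are not this function's concern
        # Daily quota and transient burst are handled identically.
        message = "Lumin is not accepting more calls right now: " + err.reason
        return CallResult(ok=False, user_message=message)


def run_reading(steps: Iterable[Callable[[], Any]]):
    """Run the calls of one reading, stopping at the first rate limit."""
    results = []
    for step in steps:
        outcome = guarded_call(step)
        if not outcome.ok:
            # Stop the reading; the caller decides whether to queue or shed.
            return results, outcome.user_message
        results.append(outcome.value)
    return results, None
```

Returning the partial results together with the message leaves the queue-or-shed decision to the caller, which is usually the layer that knows whether the user is still waiting.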

Backoff that respects multi-call readings

A complete reading uses 25 to 40 calls. If you start one and hit a rate limit at call 30, the 29 calls that succeeded are already spent on a reading you cannot finish. Pre-flight the budget before starting a reading: if fewer than 50 calls remain, warn the user before you begin instead of failing midway. The threshold of 50 leaves headroom above the 40-call worst case.
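A pre-flight check along these lines could be as small as the sketch below. The preflight function is hypothetical, and it assumes you already have some estimate of the remaining budget, for example from a counter your client maintains. This section does not say whether Lumin reports the remaining count itself.

```python
MAX_CALLS_PER_READING = 40  # upper end of the 25 to 40 range
PREFLIGHT_THRESHOLD = 50    # warn when fewer calls than this remain


def preflight(remaining: int, threshold: int = PREFLIGHT_THRESHOLD):
    """Return (ok, warning) for starting a new reading.

    `remaining` is the caller's estimate of calls left today (assumption).
    """
    if remaining >= threshold:
        return True, None
    if remaining >= MAX_CALLS_PER_READING:
        detail = "a reading should still fit, but with little margin"
    else:
        detail = "a full reading may not fit"
    warning = f"Only {remaining} calls remain today; {detail}."
    return False, warning
```

For example, preflight(120) returns (True, None), while preflight(30) returns False with a warning to show the user before the first call of the reading is made.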

Server-side compaction

Lumin compacts large tool responses on the server before returning them. This reduces token cost on your end but does not affect the rate-limit count. One logical tool call still counts as one.
