Guide

Handle rate limits

Lumin meters tool calls per day, not per second. The challenge is fitting a whole multi-call reading inside the remaining daily budget, not throttling burst traffic.

What gets metered

  • Every successful tool call counts as 1.
  • Failed calls (429, 5xx, validation errors) do not count.
  • The meter resets at 00:00 UTC.
  • Daily limits are scoped: per IP address for anonymous callers, per account for authenticated ones.
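
The metering rules above can be mirrored on the client to estimate how much budget is left. The following is a minimal sketch, not a Lumin API: LocalMeter and daily_limit are hypothetical names, and it assumes you already know the limit that applies to your scope. The server's count stays authoritative, and for anonymous callers sharing an IP address the local estimate may be optimistic.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import Optional


@dataclass
class LocalMeter:
    """Client-side estimate of the daily meter. Hypothetical helper, not a Lumin API."""

    daily_limit: int            # assumption: you know the limit for your scope
    used: int = 0
    day: Optional[date] = None  # UTC date the current count belongs to

    def _roll(self, now: datetime) -> None:
        # The meter resets at 00:00 UTC, so key the counter on the UTC date.
        today = now.astimezone(timezone.utc).date()
        if self.day != today:
            self.day = today
            self.used = 0

    def record(self, succeeded: bool, now: Optional[datetime] = None) -> None:
        self._roll(now or datetime.now(timezone.utc))
        # Only successful calls count; 429, 5xx, and validation errors do not.
        if succeeded:
            self.used += 1

    def remaining(self, now: Optional[datetime] = None) -> int:
        self._roll(now or datetime.now(timezone.utc))
        return max(self.daily_limit - self.used, 0)
```

Call record(True) after each successful tool call and record(False) after a failure; remaining() then gives a local estimate that resets itself when the UTC date changes.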

Detecting an exhausted budget

On a 429 response, Lumin returns an MCP error with code rate_limited and a human-readable reason that says whether the limit hit was the daily quota or a transient burst. Treat both the same way at runtime: stop the current reading, surface a friendly message, and queue or shed the call.
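
As a rough illustration of that runtime behavior, the sketch below stops a reading at the first rate_limited error, keeps the calls that were not completed so they can be queued, and prepares a message for the user. ToolError, ReadingResult, and run_reading are hypothetical names; how the error code and reason actually surface depends on your MCP client, so the exception shape here is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


class ToolError(Exception):
    """Stand-in for however your MCP client surfaces errors (assumed shape)."""

    def __init__(self, code: str, reason: str = ""):
        super().__init__(reason)
        self.code = code
        self.reason = reason


@dataclass
class ReadingResult:
    results: List[Any] = field(default_factory=list)
    pending: List[Callable[[], Any]] = field(default_factory=list)  # calls to queue or shed
    message: str = ""  # friendly text for the user, empty when the reading completed


def run_reading(calls: List[Callable[[], Any]]) -> ReadingResult:
    out = ReadingResult()
    for i, call in enumerate(calls):
        try:
            out.results.append(call())
        except ToolError as err:
            if err.code != "rate_limited":
                raise  # other failures are not handled in this sketch
            # Daily quota and transient burst are treated the same: stop here.
            out.pending = list(calls[i:])
            out.message = (
                "This reading is paused because the rate limit was reached. "
                f"Details: {err.reason or 'no reason given'}"
            )
            break
    return out
```

The caller decides what to do with pending: retry those calls later (queue) or discard them (shed).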

Backoff that respects multi-call readings

A complete reading uses 25 to 40 calls. If you start one and hit a rate limit at call 30, the 29 calls that succeeded are already spent and the reading is still incomplete. Pre-flight the budget before starting a reading: if fewer than 50 calls remain, warn the user up front instead of failing midway.
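
The pre-flight check can be sketched as a single function. The 50-call threshold and the 40-call upper bound come from the text above; preflight_warning is a hypothetical name, and the remaining count is assumed to come from whatever bookkeeping you keep on the client.

```python
from typing import Optional

READING_MAX_CALLS = 40    # upper bound for a complete reading, per this guide
PREFLIGHT_THRESHOLD = 50  # warn when fewer calls than this remain, per this guide


def preflight_warning(remaining: int, threshold: int = PREFLIGHT_THRESHOLD) -> Optional[str]:
    """Return a warning to show before starting a reading, or None if the budget looks fine."""
    if remaining >= threshold:
        return None
    return (
        f"Only {remaining} calls remain today, and a complete reading can use "
        f"up to {READING_MAX_CALLS}. The reading may stop partway. "
        "The meter resets at 00:00 UTC."
    )
```

Run the check once per reading, before the first tool call, and let the user choose whether to continue.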

Server-side compaction

Lumin compacts large tool responses on the server before returning them. This reduces token cost on your end but does not affect the rate-limit count: one logical tool call still counts as one.