
General

Limitry is a usage metering and limit enforcement platform for AI products. It helps you track usage, enforce limits, manage prepaid credits, and bill customers accurately.
The core workflow:
  1. Check - Before a request, verify the customer is within limits
  2. Request - If allowed, make your request
  3. Record - After the request, record the actual usage
See Check-Record Pattern for details.
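As a rough illustration of the three steps, here is a minimal sketch that uses an in-memory stand-in instead of the real SDK. `FakeLimitry`, `CheckResult`, and `handle_request` are hypothetical names made up for this example. The point is only the ordering: check first, make the request, then record what was actually used.

```python
from dataclasses import dataclass, field


@dataclass
class CheckResult:
    allowed: bool


@dataclass
class FakeLimitry:
    """In-memory stand-in for a metering client (hypothetical, not the real SDK)."""
    limit: int
    used: dict = field(default_factory=dict)

    def check(self, customer_id: str) -> CheckResult:
        # Allowed while recorded usage is still below the cap.
        return CheckResult(allowed=self.used.get(customer_id, 0) < self.limit)

    def record(self, customer_id: str, tokens: int) -> None:
        self.used[customer_id] = self.used.get(customer_id, 0) + tokens


def handle_request(client, customer_id, call_model):
    # 1. Check: verify the customer is within limits before doing any work.
    if not client.check(customer_id).allowed:
        return None
    # 2. Request: call_model returns (text, tokens_used).
    text, tokens = call_model()
    # 3. Record: store the actual usage, which is only known after the request.
    client.record(customer_id, tokens)
    return text


client = FakeLimitry(limit=100)
first = handle_request(client, "cust_123", lambda: ("hello", 120))
second = handle_request(client, "cust_123", lambda: ("again", 10))
```

In this sketch the first call is allowed and overshoots the cap, because usage is only known after the request; the second call is then refused.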
  • Limits: Usage caps that reset periodically (e.g., 100,000 tokens/month)
  • Balances: Prepaid credits that customers spend down (e.g., buy 10,000 credits)
Use limits for subscription-based caps. Use balances for prepaid/pay-as-you-go.
Meters define how events are aggregated into metrics. For example:
  • A meter that sums the tokens value from llm.completion events
  • A meter that counts image.generation events
Limits are linked to meters — the limit checks the meter’s value.
Yes! Limitry offers a free tier for development and small-scale usage. Check the pricing page for details.

Technical

The Limitry API itself has limits:
  • 1,000 requests per minute
  • 100,000 requests per day
These are separate from the limits you configure for your customers.
Check requests typically complete in under 10ms. We use edge caching and optimized queries to minimize latency on your hot path.
We recommend implementing a fallback strategy:
  • Fail open: Allow requests and record later (for non-critical limits)
  • Fail closed: Block requests (for hard spending caps)
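The two strategies can be sketched as a small wrapper around the check call. This is an assumption-laden illustration: `check_with_fallback` is a hypothetical helper, `check` is any callable returning an object with an `allowed` attribute, and the exception types your client raises on an outage may differ from the built-in ones caught here.

```python
from types import SimpleNamespace


def check_with_fallback(check, customer_id, fail_open):
    """Return True if the request should proceed.

    fail_open=True  -> allow the request when the check cannot be performed
    fail_open=False -> block the request when the check cannot be performed
    """
    try:
        return check(customer_id).allowed
    except (ConnectionError, TimeoutError):
        # The limit service could not be reached: apply the chosen policy.
        return fail_open


def healthy_check(customer_id):
    # Stand-in for a successful check that reports the customer is over limit.
    return SimpleNamespace(allowed=False)


def unavailable_check(customer_id):
    # Stand-in for an outage.
    raise ConnectionError("limit service unreachable")


soft_limit_ok = check_with_fallback(unavailable_check, "cust_123", fail_open=True)
hard_cap_ok = check_with_fallback(unavailable_check, "cust_123", fail_open=False)
```

The fallback only applies when the check itself fails. A successful check that returns `allowed: false` is still honored under either policy.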
Check our Status Page and set up alerts.
Yes! Events support arbitrary values (numeric) and dimensions (categorical):
limitry.events.record(
    customer_id="cust_123",
    event_type="custom.operation",
    values={"units": 50, "cost": 100},
    dimensions={"category": "premium"}
)
Then create meters to aggregate any value you track.
Limits support multiple dimensions via meters:
  • Customer: Per-customer limits
  • Model: Different limits per model (gpt-4 vs gpt-3.5)
  • Feature: Limits per feature in your app
  • Custom: Any dimension you define in events
Yes. All data is encrypted in transit (TLS) and at rest. We don’t store the content of your requests — only usage metadata. See our Security page.

Limits & Meters

  • Hourly - Resets every hour
  • Daily - Resets at midnight UTC
  • Weekly - Resets Monday midnight UTC
  • Monthly - Resets first of month
  • All-time - Never resets (lifetime limits)
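To make the reset boundaries concrete, the sketch below computes the start of the current window for each period. `period_start` is a hypothetical helper, not part of the SDK. Only `"day"` appears as a period value elsewhere on this page; the other period names are assumptions, as is the use of UTC for the hourly and monthly boundaries.

```python
from datetime import datetime, timedelta, timezone


def period_start(now, period):
    """Return the UTC start of the current window, or None for all-time.

    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    if period == "hour":
        return now.replace(minute=0, second=0, microsecond=0)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "day":
        return midnight
    if period == "week":
        # datetime.weekday() is 0 for Monday, so this lands on Monday 00:00.
        return midnight - timedelta(days=now.weekday())
    if period == "month":
        return midnight.replace(day=1)
    if period == "all_time":
        return None  # never resets
    raise ValueError(f"unknown period: {period}")


# 2024-05-15 is a Wednesday.
now = datetime(2024, 5, 15, 13, 45, 30, tzinfo=timezone.utc)
week_window = period_start(now, "week")    # Monday 2024-05-13 00:00 UTC
month_window = period_start(now, "month")  # 2024-05-01 00:00 UTC
```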
Yes! Configure alert thresholds at any percentage:
client.limits.create(
    name="Daily token limit",
    meter_id=meter.id,
    limit_value=100000,
    period="day",
    customer_id="cust_123",
    alert_thresholds=[50, 80, 90]  # Alert at 50%, 80%, 90%
)
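The percentage arithmetic behind thresholds like these can be sketched as follows. This page does not describe how Limitry evaluates thresholds internally, so `crossed_thresholds` is a hypothetical helper that only illustrates the math: a threshold fires when usage moves from below it to at or above it.

```python
def crossed_thresholds(previous, current, limit_value, thresholds):
    """Return the thresholds (in percent) crossed when usage moves
    from `previous` to `current`, in ascending order."""
    return [
        t
        for t in sorted(thresholds)
        # Integer comparison avoids floating-point rounding at the boundary.
        if previous * 100 < t * limit_value <= current * 100
    ]


# Usage jumps from 45,000 to 82,000 against a 100,000 limit.
fired = crossed_thresholds(45_000, 82_000, 100_000, [50, 80, 90])
```

Here `fired` contains the 50% and 80% thresholds, and the 90% threshold has not been reached yet.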
The limits.check call returns allowed: false. You decide what to do:
  • Block the request
  • Allow with a warning
  • Upsell to a higher plan
  • Allow and bill overage
  • sum - Add up all values (total tokens)
  • count - Count events (number of requests)
  • max - Highest value (peak usage)
  • latest - Most recent value (current resource count)
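The four aggregations can be sketched in plain Python over a list of events. This is illustrative only: `aggregate` is a hypothetical helper, the event dictionaries mirror the `event_type` and `values` fields used on this page, and "latest" assumes the events are already in chronological order.

```python
events = [
    {"event_type": "llm.completion", "values": {"tokens": 120}},
    {"event_type": "llm.completion", "values": {"tokens": 300}},
    {"event_type": "image.generation", "values": {}},
    {"event_type": "llm.completion", "values": {"tokens": 80}},
]


def aggregate(events, event_type, aggregation, value_key=None):
    """Aggregate one value across all events of a given type."""
    if aggregation not in {"sum", "count", "max", "latest"}:
        raise ValueError(f"unknown aggregation: {aggregation}")
    matching = [e for e in events if e["event_type"] == event_type]
    if aggregation == "count":
        return len(matching)  # counts events, ignores values
    values = [e["values"][value_key] for e in matching if value_key in e["values"]]
    if aggregation == "sum":
        return sum(values)
    if not values:
        return None  # max/latest are undefined with no data
    if aggregation == "max":
        return max(values)
    return values[-1]  # latest: the most recent value


total_tokens = aggregate(events, "llm.completion", "sum", "tokens")
image_count = aggregate(events, "image.generation", "count")
peak_tokens = aggregate(events, "llm.completion", "max", "tokens")
latest_tokens = aggregate(events, "llm.completion", "latest", "tokens")
```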

Balances

  1. Create a balance for a customer with initial credits
  2. When they use your product, debit the balance
  3. When they purchase more, credit the balance
Debits are atomic — if there aren’t enough credits, the debit fails entirely.
By default, no. Set minimum_balance=0 to prevent overdraft. To allow overdraft, set a negative minimum:
minimum_balance=-500  # Allow 500 credits of debt
Yes! Set a positive minimum balance:
initial_balance=1000,
minimum_balance=100  # 100 credits always reserved
# available_balance = 900
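The interaction between debits and `minimum_balance` can be modeled with a small in-memory class. This is a sketch under assumptions, not the SDK: `Balance` and `InsufficientCredits` are hypothetical names, and it only shows the all-or-nothing behavior in a single thread, not concurrency safety.

```python
from dataclasses import dataclass


class InsufficientCredits(Exception):
    """Raised when a debit would take the balance below its minimum."""


@dataclass
class Balance:
    balance: int
    minimum_balance: int = 0

    @property
    def available_balance(self) -> int:
        # Credits that can be spent before hitting the minimum.
        return self.balance - self.minimum_balance

    def debit(self, amount: int) -> None:
        # All-or-nothing: either the full amount is debited or nothing changes.
        if amount > self.available_balance:
            raise InsufficientCredits(
                f"need {amount}, only {self.available_balance} available"
            )
        self.balance -= amount

    def credit(self, amount: int) -> None:
        self.balance += amount


reserved = Balance(balance=1000, minimum_balance=100)   # 900 spendable
overdraft = Balance(balance=0, minimum_balance=-500)    # may go 500 into debt
overdraft.debit(500)
```

With a positive minimum, `available_balance` is lower than `balance`. With a negative minimum, debits can succeed even when `balance` is zero, until the debt reaches the minimum.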

Integration

Limitry is provider-agnostic. It works with:
  • OpenAI
  • Anthropic
  • Google (Gemini)
  • Azure OpenAI
  • AWS Bedrock
  • Self-hosted models
  • Any API
You call Limitry before/after your requests — it doesn’t proxy them.
Yes! Integrate Limitry in your LangChain pipeline:
from limitry import Limitry

limitry = Limitry()

# In your chain
check = limitry.limits.check(customer_id=user_id)
if check.allowed:
    result = chain.invoke(input)
    limitry.events.record(
        customer_id=user_id,
        event_type="chain.run",
        values={"tokens": result.usage.total_tokens}
    )
Limitry sends webhooks for:
  • Limit threshold alerts
  • Balance low warnings
  • Usage summaries
Configure webhook URLs in the dashboard under Settings > Webhooks.

Billing & Pricing

Limitry charges based on:
  • Events recorded: Number of events.record calls
  • Active customers: Customers with usage in a billing period
Yes! That’s a primary use case. Export usage data via:
  • API (usage summaries, meter values)
  • Dashboard exports (CSV)
  • Webhooks (real-time)
Integrate with Stripe, billing systems, or your own invoicing.
You’ll receive warnings at 80% and 100%. At 100%, you can upgrade; otherwise we’ll throttle non-critical operations (analytics, not real-time checks).

Still have questions?