Claude API Rate Limits Explained: How to Avoid 429 Errors (2026)

The 429 error — “too many requests” — is the most common production issue with the Claude API. It’s also the one that surprises people most, because you can hit it with plenty of unused credit and a perfectly valid API key.

This post explains how Anthropic’s rate limits actually work, why you hit them, and the practical patterns that avoid them in production Make.com scenarios.

What a 429 Means

When you make an API call to Claude, Anthropic’s servers check three things:

Requests per minute (RPM) — how many API calls you’ve made in the last 60 seconds
Input tokens per minute (ITPM) — how much text you’ve sent in the last 60 seconds
Output tokens per minute (OTPM) — how much text Claude has returned in the last 60 seconds

Exceed any of the three, and you get a 429 response:

{
  "type": "error",
  "error": {
    "type": "rate_limit_error",
    "message": "This request would exceed your organization's rate limit..."
  }
}

The API call fails. Nothing was processed. You weren’t charged. But your scenario stopped.

Anthropic’s Tier System

Your rate limits depend on your tier — not your plan, not your credit balance, but your account’s usage history. Higher tiers = higher limits.

As of April 2026, the tiers for Claude Sonnet 4.6 (approximate):

Tier	RPM	ITPM	OTPM	How to reach
Tier 1	~50	~40k	~8k	New account with $5 credit
Tier 2	~1,000	~80k	~16k	Spend $40+ and account is 7+ days old
Tier 3	~2,000	~160k	~32k	Spend $200+ and account is 7+ days old
Tier 4	~4,000	~400k	~80k	Spend $400+ and account is 14+ days old

(Exact limits vary by model and change — check docs.anthropic.com/en/api/rate-limits for current numbers.)

Important: limits are per-model. Haiku has its own limits, Sonnet has its own, Opus has its own. Using multiple models doesn’t share the limit pool.

Tier progression is automatic. You don’t apply. You just use the API, spend money, and your limits rise.

Why You Hit Limits in Make.com Specifically

Make.com scenarios have a specific failure mode: batch processing.

Go Deeper

Full Implementation Blueprint — $29

The Blueprint course walks through production-ready Make.com + Claude + Gemini + Perplexity scenarios end-to-end. Real templates, real error handling, real costs.

Get Blueprint → Try Free First

If your scenario triggers on 100 new emails at once (e.g. after you’ve been offline and they queued up), Make will process them sequentially. By default it might send all 100 to Claude in under a minute. That’s 100 RPM — Tier 1 cuts you off at 50.

Same problem with Iterator modules processing large arrays, or with scenarios that fire on webhooks with high throughput.

Ninety percent of the 429 errors I see are variants of this pattern.

Five Fixes (In Order of Effort)

1. Set Max Tokens on Every Module

This doesn’t prevent RPM errors, but it prevents OTPM errors. Without Max Tokens, Claude can generate 4,000+ output tokens when you needed 200. That’s 20x the output token cost and a much faster path to OTPM limits.

Every Claude module in Make.com has a Max Tokens field. Set it. I use:

256 for classification / extraction
512 for short replies
1024 for medium responses
2048+ only when I explicitly need long output

2. Use a Lighter Model for Bulk Work

Haiku has higher throughput limits than Sonnet or Opus on Tier 1 — more RPM at the same tier. If you’re processing a large batch, routing through Haiku first (for classification/filtering) before hitting Sonnet only on the cases that matter saves both rate limit and cost.

3. Add Delay Between Iterations

In Make.com, if you’re processing a list with an Iterator, insert a Sleep module after the Claude call inside the loop. A 2-second sleep caps you at 30 RPM — comfortably under Tier 1’s 50.

Find Sleep: add module → Tools → Sleep → 2 seconds.

It feels slow, but running reliably at 30 RPM beats failing at 80 RPM.

4. Add Error Handling with Retries

For transient 429s, automatic retry fixes them:

Right-click the Claude module → Add error handler
Pick Break
Configure: Number of attempts: 3, Interval: 30 seconds

When Claude returns 429, Make waits 30 seconds, tries again. Usually works on the second attempt because the 1-minute window has rolled.

Exponential backoff is fancier but Break with fixed interval works fine for most cases.

5. Upgrade Your Anthropic Tier

The real fix for sustained high-volume workflows is just having higher limits. Tier progression happens automatically as you spend, but if you need it faster, Anthropic’s enterprise pricing page has a contact form for custom tier requests.

For most small-to-medium businesses, Tier 2 (reached after spending ~$40) is enough headroom that 429s become rare.

Prompt Caching — The Sneaky Solution

Anthropic offers prompt caching: if you’re using a long system prompt across multiple calls, you can cache it and subsequent calls use the cached version at 10% of the input cost.

This also helps with rate limits: cached input tokens don’t count toward your ITPM limit.

For a customer service bot with a 2,000-token system prompt handling 100 messages/minute:

Without caching: 200,000 ITPM (way over Tier 1’s 40k limit)
With caching: 0 cached + ~200 per user message = 20,000 ITPM (fine on Tier 1)

Enable it by adding "cache_control": {"type": "ephemeral"} to your system prompt in the API call. Make.com’s Claude module exposes this in advanced options as of the 2026 update.

Reading the 429 Response Header

When Claude returns a 429, it includes a retry-after header telling you exactly how many seconds to wait. Well-built retry logic reads this header and waits that long before retrying.

Make.com’s Break error handler doesn’t read headers automatically, but you can build custom retry logic with HTTP modules + Sleep + a Router if you’re hitting lots of 429s. Usually overkill — the fixed 30-second wait is fine.

Monitoring Rate Limit Usage

In the Anthropic Console → Usage tab, there’s a “Rate limits” section showing your current usage vs limits for each model. Check it weekly. If you’re running close to the ceiling regularly, that’s a signal to upgrade your tier or re-architect (route to lighter models, add caching, add batch delays).

What NOT to Do

Don’t spam retry immediately. Some people, seeing a 429, add a loop that retries 20 times with no delay. This makes it worse — you’re now flooding the API with the same requests you couldn’t afford in the first place. Anthropic may temporarily suspend your organisation for abusive patterns.

Don’t rotate API keys. Creating multiple keys under the same Anthropic organisation does not bypass rate limits. Limits are org-wide, not key-wide.

Don’t assume the error is your code. 429s are a signal to architect, not to debug. If you’re hitting them, it means your workflow is running hot — that’s a scaling problem, not a bug.

TL;DR

Set Max Tokens on every Claude module
Use Haiku for bulk work, Sonnet only when quality demands
Add Sleep modules inside loops
Add Break error handlers with 3 retries at 30 seconds
Use prompt caching for long system prompts
Let your tier progress automatically as you spend

Ninety percent of rate limit issues are solved by the first four.

Next Steps

If you want the full error-handling playbook — rate limits, API errors, Claude giving bad output, network timeouts, all of it — the Implementation Blueprint course ($29) covers production error handling in depth, including copy-paste Break configurations for common scenarios.

Last updated: 20 April 2026. Rate limits and tier thresholds change as Anthropic evolves their infrastructure — verify current numbers on Anthropic’s docs before architecting.

Go Deeper

Full Implementation Blueprint — $29

The Blueprint course walks through production-ready Make.com + Claude + Gemini + Perplexity scenarios end-to-end. Real templates, real error handling, real costs.

Get Blueprint → Try Free First

What a 429 Means

Anthropic’s Tier System

Why You Hit Limits in Make.com Specifically

Full Implementation Blueprint — $29

Five Fixes (In Order of Effort)

1. Set Max Tokens on Every Module

2. Use a Lighter Model for Bulk Work

3. Add Delay Between Iterations

4. Add Error Handling with Retries

5. Upgrade Your Anthropic Tier

Prompt Caching — The Sneaky Solution

Reading the 429 Response Header

Monitoring Rate Limit Usage

What NOT to Do

TL;DR

Next Steps

Full Implementation Blueprint — $29

How to Connect Claude API to Make.com: Step-by-Step Guide (2026)

Perplexity API in Make.com: Complete Tutorial with Working Examples (2026)

Make.com Webhook to Claude: Receive, Parse, Respond (Working Example, 2026)