Rate Limiting for Machines
Guardhouse implements rate limiting to prevent abuse and ensure fair usage of API resources. This guide covers rate limits for machine-to-machine (M2M) and AI agent applications.
What is Rate Limiting?
Rate limiting restricts the number of API requests a client can make within a specific time window. This protects:
- ✅ API Infrastructure - Prevents overload and DoS attacks
- ✅ Fair Usage - Ensures equitable resource allocation
- ✅ Cost Control - Helps manage usage-based pricing
- ✅ System Stability - Maintains performance and availability
Rate Limit Tiers
Guardhouse applies different rate limits based on your plan and application type:
| Plan | Requests per Minute | Requests per Hour | Requests per Day | Burst Allowance |
|---|---|---|---|---|
| Free | 60 | 3,600 | 86,400 | 1.5x (90 req/min) |
| Developer | 300 | 18,000 | 432,000 | 2x (600 req/min) |
| Team | 1,000 | 60,000 | 1,440,000 | 3x (3,000 req/min) |
| Enterprise | 3,000 | 180,000 | 4,320,000 | 5x (15,000 req/min) |
Burst Allowance: A temporary, elevated limit for absorbing short traffic spikes. For example, a Team-plan client can briefly spike to 3,000 requests in a minute, but sustained traffic must stay at or below 1,000 requests per minute.
Rate Limit Response Headers
Every API response includes rate limit information:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 950
X-RateLimit-Reset: 1640995200
X-RateLimit-Reset-After: 300
X-RateLimit-Window: 60
Header Descriptions
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the time window |
| X-RateLimit-Remaining | Number of requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| X-RateLimit-Reset-After | Seconds until the window resets |
| X-RateLimit-Window | Length of the time window in seconds |
Rate Limiting Behavior
Windowed Rate Limiting
Rate limits are applied over fixed time windows; the request counter resets when a new window begins:
Time: 00:00 → 01:00 (1 minute window)
Requests allowed: 100
Used: 95
Remaining: 5
Status: ✅ Allowed
Time: 01:00 → 02:00 (new window)
Requests allowed: 100
Used: 0
Remaining: 100
Status: ✅ Allowed
Time: 01:00 → 01:05 (5 seconds into window)
Requests allowed: 100
Used: 100
Remaining: 0
Status: ⏸ Rate Limited (must wait until 02:00)
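As a minimal sketch, a client can use the X-RateLimit-Reset-After header to wait out the current window before retrying (the single retry and the 60-second fallback below are illustrative):

async function fetchWithWindowWait(url) {
  const response = await fetch(url);

  if (response.status === 429) {
    // Fall back to a 60-second wait if the header is missing
    const resetAfter = Number(response.headers.get('X-RateLimit-Reset-After') ?? '60');
    await new Promise(resolve => setTimeout(resolve, resetAfter * 1000));
    return fetch(url); // Single retry once the window has reset
  }

  return response;
}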
Client-Specific Limits
Different clients may have different limits:
| Client Type | Base Limit | Burst Limit | Notes |
|---|---|---|---|
| SPA (User Auth) | 60/min | 90/min | Stricter due to user context |
| Backend (User Auth) | 300/min | 600/min | Higher limits for server-side apps |
| M2M / AI Agent | 1,000/min | 5,000/min | Highest limits for automated services |
| Admin API | 3,000/min | 15,000/min | Special limits for management operations |
Handling Rate Limits
Detecting Rate Limits
When you receive a rate limit response, handle it appropriately. Note that fetch resolves, rather than throwing, on HTTP error statuses, so inspect the status code on the response itself:
async function makeApiRequest(url, options = {}) {
  const response = await fetch(url, {
    ...options,
    headers: {
      ...options.headers,
      'Authorization': `Bearer ${accessToken}`
    }
  });

  if (response.status === 429) {
    // Rate limited
    const rateLimitInfo = parseRateLimitHeaders(response);
    console.log('Rate limit reached:', rateLimitInfo);

    if (options.retryOnRateLimit) {
      // Calculate wait time, then retry once the window resets
      const waitTime = calculateWaitTime(rateLimitInfo);
      await sleep(waitTime);
      return makeApiRequest(url, options);
    }
  }

  return response;
}

function parseRateLimitHeaders(response) {
  return {
    limit: Number(response.headers.get('X-RateLimit-Limit')),
    remaining: Number(response.headers.get('X-RateLimit-Remaining')),
    reset: Number(response.headers.get('X-RateLimit-Reset')),
    resetAfter: Number(response.headers.get('X-RateLimit-Reset-After')),
    window: Number(response.headers.get('X-RateLimit-Window'))
  };
}

function calculateWaitTime(rateLimitInfo) {
  // Wait until the window resets (convert seconds to milliseconds)
  return rateLimitInfo.resetAfter * 1000;
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
Exponential Backoff
Implement exponential backoff for retries:
class RateLimiter {
  constructor(maxRetries = 5, initialDelay = 1000, multiplier = 2) {
    this.maxRetries = maxRetries;
    this.initialDelay = initialDelay;
    this.multiplier = multiplier;
  }

  async execute(request) {
    let delay = this.initialDelay;

    for (let attempt = 1; attempt <= this.maxRetries; attempt++) {
      try {
        return await request(); // Success
      } catch (error) {
        if (error.status === 429 || error.status >= 500) {
          // Rate limited or transient server error - wait and retry with backoff
          console.warn(`Retryable error (${error.status}), attempt ${attempt}/${this.maxRetries}. Waiting ${delay}ms`);
          await this.sleep(delay);
          delay *= this.multiplier;
        } else {
          // Non-retryable error
          throw error;
        }
      }
    }

    throw new Error(`Max retries (${this.maxRetries}) exceeded`);
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
// Usage
const limiter = new RateLimiter();
const users = await limiter.execute(async () => {
  const response = await fetch('https://your_tenant.guardhouse.cloud/api/v2/users');
  if (!response.ok) {
    // Surface HTTP errors as exceptions so the limiter can retry them
    const error = new Error(`HTTP ${response.status}`);
    error.status = response.status;
    throw error;
  }
  return response.json();
});
Request Queuing
For high-volume operations, implement request queuing:
class RequestQueue {
  constructor(maxConcurrent = 10, rateLimit = 100) {
    this.maxConcurrent = maxConcurrent;
    this.rateLimit = rateLimit; // Requests per minute
    this.queue = [];
    this.processing = 0;
    this.windowStart = Date.now();
    this.windowCount = 0;
  }

  add(request) {
    return new Promise((resolve, reject) => {
      this.queue.push({ request, resolve, reject });
      this.processQueue();
    });
  }

  async execute({ request, resolve, reject }) {
    try {
      const result = await request();
      resolve(result);
    } catch (error) {
      reject(error);
    } finally {
      this.processing--;
      this.processQueue();
    }
  }

  processQueue() {
    const now = Date.now();

    // Reset the rate window every minute
    if (now - this.windowStart >= 60000) {
      this.windowStart = now;
      this.windowCount = 0;
    }

    // Start queued requests while under both the concurrency cap
    // and the per-minute rate limit
    while (
      this.queue.length > 0 &&
      this.processing < this.maxConcurrent &&
      this.windowCount < this.rateLimit
    ) {
      const item = this.queue.shift();
      this.processing++;
      this.windowCount++;
      this.execute(item);
    }

    // If requests remain but the rate limit is exhausted,
    // try again when the window resets
    if (this.queue.length > 0 && this.windowCount >= this.rateLimit) {
      setTimeout(() => this.processQueue(), 60000 - (now - this.windowStart));
    }
  }
}
// Usage
const queue = new RequestQueue(5, 1000); // 5 concurrent, 1,000 requests/minute

// Enqueue many requests
for (let i = 0; i < 100; i++) {
  queue.add(() => fetch(`https://your_tenant.guardhouse.cloud/api/v2/users/${i}`));
}
Best Practices
1. Respect Rate Limits
- ✅ Check Headers - Always inspect X-RateLimit-* headers
- ✅ Wait Before Retry - Don't retry immediately when rate limited
- ✅ Use Appropriate Backoff - Implement exponential backoff
- ✅ Queue Requests - Don't overwhelm the server
2. Optimize API Usage
- ✅ Batch Operations - Combine multiple operations into one request
- ✅ Use Pagination - Request only the data you need
- ✅ Caching - Cache frequently accessed resources (see the sketch after this list)
- ✅ Compression - Enable request/response compression
- ✅ Avoid Polling - Use webhooks or event streams instead
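For example, a minimal time-to-live cache can serve repeated GET requests without spending quota (the 30-second TTL and the URL-based cache key are illustrative assumptions):

const cache = new Map();

async function cachedFetch(url, ttlMs = 30000) {
  const entry = cache.get(url);
  if (entry && Date.now() - entry.fetchedAt < ttlMs) {
    return entry.data; // Served from cache - no API call, no quota used
  }

  const response = await fetch(url);
  const data = await response.json();
  cache.set(url, { data, fetchedAt: Date.now() });
  return data;
}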
3. Monitoring
- ✅ Track Usage - Monitor your request rates
- ✅ Set Alerts - Get notified when approaching limits (see the sketch after this list)
- ✅ Analyze Patterns - Identify which operations use the most quota
- ✅ Optimize Queries - Reduce unnecessary API calls
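As a sketch, you can track remaining quota from the headers each response carries and warn when it runs low (the 10% threshold below is an assumption, not a Guardhouse default):

function trackQuota(response) {
  const limit = Number(response.headers.get('X-RateLimit-Limit'));
  const remaining = Number(response.headers.get('X-RateLimit-Remaining'));

  if (limit && remaining / limit < 0.1) {
    console.warn(`Approaching rate limit: ${remaining}/${limit} requests left`);
  }
}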
4. Testing
- ✅ Load Test - Simulate high-traffic scenarios
- ✅ Rate Limit Test - Verify your backoff logic works (see the sketch after this list)
- ✅ Error Handling - Test error recovery
- ✅ Monitoring Test - Ensure alerts fire correctly
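For instance, backoff logic can be exercised against a stubbed request that fails with 429 twice before succeeding (the stub is illustrative and reuses the RateLimiter class from above):

let calls = 0;
const flakyRequest = async () => {
  calls++;
  if (calls <= 2) {
    const error = new Error('Too Many Requests');
    error.status = 429;
    throw error;
  }
  return { ok: true };
};

const testLimiter = new RateLimiter(5, 100, 2); // Short delays for testing
const result = await testLimiter.execute(flakyRequest);
console.log(result.ok, calls); // true 3 - succeeded on the third attempt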
Rate Limit Strategies
Token Bucket Algorithm
Implement a token bucket for smooth rate limiting:
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // Tokens per second
    this.lastRefill = Date.now();
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    // Use fractional tokens so frequent calls don't lose refill time
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }

  consume(tokens = 1) {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens; // Spend the tokens
      return true;
    }
    return false;
  }

  async acquire(tokens = 1, maxWait = 60000) {
    const start = Date.now();
    while (!this.consume(tokens)) {
      if (Date.now() - start > maxWait) {
        throw new Error('Rate limit timeout');
      }
      // Wait briefly for the bucket to refill
      await new Promise(resolve => setTimeout(resolve, 100));
    }
  }
}
// Usage
const bucket = new TokenBucket(100, 1); // Capacity 100, refills 1 token per second
// Acquire 10 tokens (waits until the bucket holds enough)
await bucket.acquire(10);
Sliding Window Algorithm
Implement a sliding window for consistent rate limiting:
class SlidingWindow {
  constructor(windowSize, maxRequests) {
    this.windowSize = windowSize; // Window length in milliseconds
    this.maxRequests = maxRequests;
    this.requests = [];
  }

  recordRequest() {
    const now = Date.now();

    // Drop requests that have aged out of the window
    this.requests = this.requests.filter(
      request => now - request.timestamp < this.windowSize
    );

    if (this.requests.length >= this.maxRequests) {
      return false; // Window is full - reject without recording
    }

    this.requests.push({ timestamp: now });
    return true;
  }
}

// Usage
const limiter = new SlidingWindow(60000, 100); // 1-minute window, 100 requests

if (limiter.recordRequest()) {
  console.log('Request allowed');
} else {
  console.log('Rate limited');
}
Common Patterns
1. Client-side Rate Limiting
class ClientRateLimiter {
  constructor(maxRequests) {
    this.maxRequests = maxRequests; // Requests per minute
    this.requests = [];
  }

  async request(apiCall) {
    // Remove requests older than 1 minute
    const now = Date.now();
    this.requests = this.requests.filter(
      req => now - req.timestamp < 60000
    );

    if (this.requests.length >= this.maxRequests) {
      // Wait until the oldest request ages out of the window
      const waitTime = 60000 - (now - this.requests[0].timestamp);
      console.log(`Rate limited. Wait ${waitTime}ms`);
      await this.sleep(waitTime);
      return this.request(apiCall);
    }

    this.requests.push({ timestamp: now });
    return apiCall();
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
// Usage
const limiter = new ClientRateLimiter(10); // 10 requests per minute
// Make multiple requests
for (let i = 0; i < 15; i++) {
await limiter.request(() => fetch('/api/data'));
}
2. Server-side Rate Limiting
const rateLimits = new Map();
const WINDOW_MS = 60000;
const MAX_REQUESTS = 100;

function checkRateLimit(clientId) {
  const key = `rate_limit:${clientId}`;
  const now = Date.now();
  const data = rateLimits.get(key);

  if (!data || now - data.windowStart >= WINDOW_MS) {
    // First request, or the previous window has expired
    rateLimits.set(key, { count: 1, windowStart: now });
    return true;
  }

  if (data.count >= MAX_REQUESTS) {
    // Rate limited
    return false;
  }

  data.count++;
  return true;
}

// Middleware
function rateLimitMiddleware(req, res, next) {
  const clientId = req.headers['x-client-id'];

  if (!checkRateLimit(clientId)) {
    return res.status(429).json({
      error: 'Rate limit exceeded',
      retryAfter: 60
    });
  }

  const data = rateLimits.get(`rate_limit:${clientId}`);
  res.setHeader('X-RateLimit-Limit', `${MAX_REQUESTS}`);
  res.setHeader('X-RateLimit-Remaining', `${MAX_REQUESTS - data.count}`);
  // Unix timestamp (in seconds) at which the current window resets
  res.setHeader('X-RateLimit-Reset', `${Math.ceil((data.windowStart + WINDOW_MS) / 1000)}`);
  res.setHeader('X-RateLimit-Reset-After', `${Math.ceil((data.windowStart + WINDOW_MS - Date.now()) / 1000)}`);
  res.setHeader('X-RateLimit-Window', '60');
  next();
}
Troubleshooting
Common Issues
Issue: "Unexpected rate limit errors"
- Verify you're reading rate limit headers correctly
- Check for burst allowance
- Ensure your retry logic doesn't exceed limits
- Monitor actual vs. expected request rates
Issue: "Rate limit too strict"
- Contact support to discuss your use case
- Request a higher plan with more generous limits
- Implement request caching to reduce API calls
- Consider using a different client for high-volume operations
Issue: "Backoff not working"
- Ensure the backoff delay isn't reset after each retry
- Verify you're not retrying non-transient errors
- Implement maximum retry count
- Consider using circuit breakers for failing endpoints (see the sketch below)
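If an endpoint keeps failing, a minimal circuit breaker can stop traffic to it for a cooldown period (the failure threshold and cooldown values below are assumptions to adapt to your workload):

class CircuitBreaker {
  constructor(failureThreshold = 5, cooldownMs = 30000) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }

  async call(request) {
    if (this.openedAt !== null) {
      if (Date.now() - this.openedAt < this.cooldownMs) {
        throw new Error('Circuit open - skipping request');
      }
      this.openedAt = null; // Cooldown elapsed - allow a trial request
    }

    try {
      const result = await request();
      this.failures = 0; // Success closes the circuit
      return result;
    } catch (error) {
      if (++this.failures >= this.failureThreshold) {
        this.openedAt = Date.now(); // Trip the circuit
      }
      throw error;
    }
  }
}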
Related Documentation
- Client Credentials Flow - M2M authentication
- Introspection for AI Requests - Token validation
- Error Handling - Common error codes
- Best Practices - SDK recommendations
Support
For issues, questions, or contributions: