Rate Limiting for Machines

Guardhouse implements rate limiting to prevent abuse and ensure fair usage of API resources. This guide covers rate limits for machine-to-machine (M2M) and AI agent applications.

What is Rate Limiting?

Rate limiting restricts the number of API requests a client can make within a specific time window. This protects:

  • API Infrastructure - Prevents overload and DoS attacks
  • Fair Usage - Ensures equitable resource allocation
  • Cost Control - Helps manage usage-based pricing
  • System Stability - Maintains performance and availability

Rate Limit Tiers

Guardhouse applies different rate limits based on your plan and application type:

| Plan       | Requests per Minute | Requests per Hour | Requests per Day | Burst Allowance     |
| ---------- | ------------------- | ----------------- | ---------------- | ------------------- |
| Free       | 60                  | 3,600             | 86,400           | 1.5x (90 req/min)   |
| Developer  | 300                 | 18,000            | 432,000          | 2x (600 req/min)    |
| Team       | 1,000               | 60,000            | 1,440,000        | 3x (3,000 req/min)  |
| Enterprise | 3,000               | 180,000           | 4,320,000        | 5x (15,000 req/min) |

Burst Allowance: A temporary multiplier on your per-minute limit for absorbing short traffic spikes

Rate Limit Response Headers

Every API response includes rate limit information:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 950
X-RateLimit-Reset: 1640995200
X-RateLimit-Reset-After: 300
X-RateLimit-Window: 60

Header Descriptions

| Header                  | Description                                        |
| ----------------------- | -------------------------------------------------- |
| X-RateLimit-Limit       | Maximum requests allowed in the time window        |
| X-RateLimit-Remaining   | Number of requests remaining in the current window |
| X-RateLimit-Reset       | Unix timestamp when the window resets              |
| X-RateLimit-Reset-After | Seconds until the window resets                    |
| X-RateLimit-Window      | Length of the time window in seconds               |
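
As a minimal sketch (assuming the headers above and the global fetch API), a client can throttle itself before ever receiving a 429 by pausing whenever the remaining quota hits zero:

async function throttledFetch(url, options) {
  const response = await fetch(url, options);

  const remaining = Number(response.headers.get('X-RateLimit-Remaining'));
  const resetAfter = Number(response.headers.get('X-RateLimit-Reset-After'));

  if (remaining === 0) {
    // Quota exhausted: pause until the window resets before the next call
    await new Promise(resolve => setTimeout(resolve, resetAfter * 1000));
  }

  return response;
}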

Rate Limiting Behavior

Windowed Rate Limiting

Rate limits are applied over fixed time windows; the example below uses a 1-minute window:

Time: 00:00 → 01:00 (1-minute window)
Requests allowed: 100
Used: 95
Remaining: 5
Status: ✅ Allowed

Time: 01:00 → 02:00 (new window)
Requests allowed: 100
Used: 0
Remaining: 100
Status: ✅ Allowed

Time: 01:05 (5 seconds into the new window)
Requests allowed: 100
Used: 100
Remaining: 0
Status: ⏸ Rate Limited (must wait until 02:00)

Client-Specific Limits

Different clients may have different limits:

| Client Type         | Base Limit | Burst Limit | Notes                                    |
| ------------------- | ---------- | ----------- | ---------------------------------------- |
| SPA (User Auth)     | 60/min     | 90/min      | Stricter due to user context             |
| Backend (User Auth) | 300/min    | 600/min     | Higher limits for server-side apps       |
| M2M / AI Agent      | 1,000/min  | 5,000/min   | Highest limits for automated services    |
| Admin API           | 3,000/min  | 15,000/min  | Special limits for management operations |

Handling Rate Limits

Detecting Rate Limits

When a response comes back with HTTP 429, handle it appropriately. Note that fetch() resolves rather than rejects on HTTP error statuses, so check response.status instead of relying on a catch block:

async function makeApiRequest(url, options = {}) {
  // Assumes `accessToken` is defined in the surrounding scope
  const response = await fetch(url, {
    ...options,
    headers: {
      ...options.headers,
      'Authorization': `Bearer ${accessToken}`
    }
  });

  if (response.status === 429) {
    // Rate limited
    const rateLimitInfo = parseRateLimitHeaders(response);
    console.log('Rate limit reached:', rateLimitInfo);

    // Calculate wait time
    const waitTime = calculateWaitTime(rateLimitInfo);

    if (options.retryOnRateLimit) {
      await sleep(waitTime);
      return makeApiRequest(url, options);
    }
  }

  return response;
}

function parseRateLimitHeaders(response) {
  return {
    limit: Number(response.headers.get('X-RateLimit-Limit')),
    remaining: Number(response.headers.get('X-RateLimit-Remaining')),
    reset: Number(response.headers.get('X-RateLimit-Reset')),
    resetAfter: Number(response.headers.get('X-RateLimit-Reset-After')),
    window: Number(response.headers.get('X-RateLimit-Window'))
  };
}

function calculateWaitTime(rateLimitInfo) {
  // Wait until the window resets (convert seconds to milliseconds)
  return rateLimitInfo.resetAfter * 1000;
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
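
For example (with accessToken already in scope, as in the snippet above):

const response = await makeApiRequest(
  'https://your_tenant.guardhouse.cloud/api/v2/users',
  { retryOnRateLimit: true }
);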

Exponential Backoff

Implement exponential backoff for retries:

class RateLimiter {
  constructor(maxRetries = 5, initialDelay = 1000, multiplier = 2) {
    this.maxRetries = maxRetries;
    this.initialDelay = initialDelay;
    this.multiplier = multiplier;
  }

  async execute(request) {
    let delay = this.initialDelay;

    for (let attempt = 1; attempt <= this.maxRetries; attempt++) {
      try {
        return await request(); // Success
      } catch (error) {
        if (error.status === 429 || error.status >= 500) {
          // Rate limited or transient server error: wait, then retry
          // with an exponentially growing delay
          console.warn(`Retryable error (${error.status}), attempt ${attempt}/${this.maxRetries}. Waiting ${delay}ms`);
          await this.sleep(delay);
          delay *= this.multiplier;
        } else {
          // Non-retryable error
          throw error;
        }
      }
    }

    throw new Error(`Max retries (${this.maxRetries}) exceeded`);
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage: the request callback must throw an error carrying a `status`
// property, since fetch() resolves rather than throws on 429/5xx
const limiter = new RateLimiter();

await limiter.execute(async () => {
  const response = await fetch('https://your_tenant.guardhouse.cloud/api/v2/users');
  if (!response.ok) {
    throw Object.assign(new Error(`HTTP ${response.status}`), { status: response.status });
  }
  return response;
});

Request Queuing

For high-volume operations, implement request queuing:

class RequestQueue {
  constructor(maxConcurrent = 10, rateLimit = 100) {
    this.maxConcurrent = maxConcurrent;
    this.rateLimit = rateLimit; // Requests per minute
    this.queue = [];
    this.processing = 0;
    this.windowStart = Date.now();
    this.windowCount = 0;
  }

  add(request) {
    // Always enqueue; processQueue decides when each request runs
    return new Promise((resolve, reject) => {
      this.queue.push({ request, resolve, reject });
      this.processQueue();
    });
  }

  async execute({ request, resolve, reject }) {
    try {
      resolve(await request());
    } catch (error) {
      reject(error);
    } finally {
      this.processing--;
      this.processQueue();
    }
  }

  processQueue() {
    const now = Date.now();

    // Reset the rate window every minute
    if (now - this.windowStart >= 60000) {
      this.windowStart = now;
      this.windowCount = 0;
    }

    // Start queued requests while under both limits
    while (
      this.queue.length > 0 &&
      this.processing < this.maxConcurrent &&
      this.windowCount < this.rateLimit
    ) {
      this.processing++;
      this.windowCount++;
      this.execute(this.queue.shift());
    }

    // Blocked by the rate limit: try again when the window resets
    if (this.queue.length > 0 && this.windowCount >= this.rateLimit) {
      setTimeout(() => this.processQueue(), 60000 - (now - this.windowStart));
    }
  }
}

// Usage
const queue = new RequestQueue(5, 1000); // 5 concurrent, 1,000 requests/min

// Add many requests
for (let i = 0; i < 100; i++) {
  queue.add(() => fetch(`https://your_tenant.guardhouse.cloud/api/v2/users/${i}`));
}

Best Practices

1. Respect Rate Limits

  • Check Headers - Always inspect X-RateLimit-* headers
  • Wait Before Retry - Don't retry immediately when rate limited
  • Use Appropriate Backoff - Implement exponential backoff
  • Queue Requests - Don't overwhelm the server

2. Optimize API Usage

  • Batch Operations - Combine multiple operations into one request
  • Use Pagination - Request only the data you need
  • Caching - Cache frequently accessed resources (see the sketch after this list)
  • Compression - Enable request/response compression
  • Avoid Polling - Use webhooks or event streams instead
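
As a sketch of the caching point above (the in-memory Map and the 60-second TTL are illustrative assumptions, not Guardhouse requirements):

const cache = new Map();

async function cachedFetch(url, ttlMs = 60000) {
  const entry = cache.get(url);
  if (entry && Date.now() - entry.fetchedAt < ttlMs) {
    return entry.data; // Served from cache: no API call, no quota used
  }

  const response = await fetch(url);
  const data = await response.json();
  cache.set(url, { data, fetchedAt: Date.now() });
  return data;
}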

3. Monitoring

  • Track Usage - Monitor your request rates
  • Set Alerts - Get notified when approaching limits
  • Analyze Patterns - Identify which operations use the most quota
  • Optimize Queries - Reduce unnecessary API calls

4. Testing

  • Load Test - Simulate high-traffic scenarios
  • Rate Limit Test - Verify your backoff logic works (see the sketch after this list)
  • Error Handling - Test error recovery
  • Monitoring Test - Ensure alerts fire correctly
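
One framework-free way to exercise the backoff logic, using the RateLimiter class from the Exponential Backoff section (the flaky stub request is hypothetical): a request that fails twice with 429 and then succeeds should resolve after two waits.

async function testBackoff() {
  const limiter = new RateLimiter(5, 10, 2); // Short delays for a fast test
  let calls = 0;

  // Hypothetical stub: fails twice with HTTP 429, then succeeds
  const flaky = async () => {
    calls++;
    if (calls < 3) {
      throw Object.assign(new Error('Too Many Requests'), { status: 429 });
    }
    return 'ok';
  };

  const result = await limiter.execute(flaky);
  console.assert(result === 'ok' && calls === 3, 'backoff retry failed');
}

testBackoff();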

Rate Limit Strategies

Token Bucket Algorithm

Implement a token bucket for smooth rate limiting:

class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // Tokens per second
    this.lastRefill = Date.now();
  }

  consume(tokens = 1) {
    // Refill based on elapsed time; fractional tokens avoid losing
    // time between frequent calls
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;

    if (this.tokens >= tokens) {
      this.tokens -= tokens; // Deduct the consumed tokens
      return true;
    }
    return false;
  }

  async acquire(tokens = 1, maxWait = 60000) {
    const start = Date.now();

    while (!this.consume(tokens)) {
      if (Date.now() - start > maxWait) {
        throw new Error('Rate limit timeout');
      }

      // Wait for refill
      await new Promise(resolve => setTimeout(resolve, 100));
    }
  }
}

// Usage
const bucket = new TokenBucket(100, 1); // Capacity 100, 1 token per second

// Acquire 10 tokens (waits if the bucket is short)
await bucket.acquire(10);

Sliding Window Algorithm

Implement a sliding window for consistent rate limiting:

class SlidingWindow {
  constructor(windowSize, maxRequests) {
    this.windowSize = windowSize; // Window length in milliseconds
    this.maxRequests = maxRequests;
    this.requests = [];
  }

  recordRequest() {
    const now = Date.now();

    // Remove requests outside the window
    this.requests = this.requests.filter(
      timestamp => now - timestamp < this.windowSize
    );

    if (this.requests.length >= this.maxRequests) {
      return false; // Rate limited
    }

    // Record the new request
    this.requests.push(now);
    return true;
  }
}

// Usage
const limiter = new SlidingWindow(60000, 100); // 1 minute window, 100 requests

if (limiter.recordRequest()) {
  console.log('Request allowed');
} else {
  console.log('Rate limited');
}

Common Patterns

1. Client-side Rate Limiting

class ClientRateLimiter {
  constructor(maxRequests) {
    this.maxRequests = maxRequests; // Requests per minute
    this.requests = [];
  }

  async request(apiCall) {
    // Drop request timestamps older than 1 minute
    const now = Date.now();
    this.requests = this.requests.filter(
      timestamp => now - timestamp < 60000
    );

    if (this.requests.length >= this.maxRequests) {
      // Wait until the oldest request falls out of the window
      const waitTime = 60000 - (now - this.requests[0]);
      console.log(`Rate limited. Wait ${waitTime}ms`);
      await this.sleep(waitTime);
      return this.request(apiCall);
    }

    this.requests.push(now);
    return apiCall();
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage
const limiter = new ClientRateLimiter(10); // 10 requests per minute

// Make multiple requests
for (let i = 0; i < 15; i++) {
  await limiter.request(() => fetch('/api/data'));
}

2. Server-side Rate Limiting

const rateLimits = new Map();
const WINDOW_MS = 60000;
const MAX_REQUESTS = 100;

function checkRateLimit(clientId) {
  const key = `rate_limit:${clientId}`;
  const now = Date.now();
  const data = rateLimits.get(key);

  if (!data || now - data.windowStart >= WINDOW_MS) {
    // New client or expired window: start a fresh window
    rateLimits.set(key, { count: 1, windowStart: now });
    return true;
  }

  if (data.count >= MAX_REQUESTS) {
    // Rate limited
    return false;
  }

  data.count++;
  return true;
}

// Middleware
function rateLimitMiddleware(req, res, next) {
  const clientId = req.headers['x-client-id'];

  if (!checkRateLimit(clientId)) {
    return res.status(429).json({
      error: 'Rate limit exceeded',
      retryAfter: 60
    });
  }

  const data = rateLimits.get(`rate_limit:${clientId}`);
  res.setHeader('X-RateLimit-Limit', `${MAX_REQUESTS}`);
  res.setHeader('X-RateLimit-Remaining', `${MAX_REQUESTS - data.count}`);
  // Unix timestamp (in seconds) when the current window resets
  res.setHeader('X-RateLimit-Reset', `${Math.ceil((data.windowStart + WINDOW_MS) / 1000)}`);
  res.setHeader('X-RateLimit-Window', '60');

  next();
}
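
For example, the middleware can be wired into an Express app (Express itself is an assumption here; any framework with a compatible middleware signature works the same way):

const express = require('express');
const app = express();

// Apply the rate limit check to every route
app.use(rateLimitMiddleware);

app.get('/api/data', (req, res) => res.json({ ok: true }));

app.listen(3000);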

Troubleshooting

Common Issues

Issue: "Unexpected rate limit errors"

  • Verify you're reading rate limit headers correctly
  • Check for burst allowance
  • Ensure your retry logic doesn't exceed limits
  • Monitor actual vs. expected request rates

Issue: "Rate limit too strict"

  • Contact support to discuss your use case
  • Request a higher plan with more generous limits
  • Implement request caching to reduce API calls
  • Consider using a different client for high-volume operations

Issue: "Backoff not working"

  • Ensure you're not resetting the timer after retry
  • Verify you're not retrying non-transient errors
  • Implement maximum retry count
  • Consider using circuit breakers for failing endpoints
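
As a minimal illustration of the circuit-breaker idea (the threshold and cooldown values are illustrative): after a run of consecutive failures the breaker opens and fails fast, giving the endpoint time to recover.

class CircuitBreaker {
  constructor(threshold = 5, cooldownMs = 30000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }

  async call(request) {
    if (this.openedAt !== null) {
      if (Date.now() - this.openedAt < this.cooldownMs) {
        throw new Error('Circuit open - failing fast');
      }
      this.openedAt = null; // Half-open: allow one trial request
    }

    try {
      const result = await request();
      this.failures = 0; // Success closes the circuit
      return result;
    } catch (error) {
      this.failures++;
      if (this.failures >= this.threshold) {
        this.openedAt = Date.now(); // Open the circuit
      }
      throw error;
    }
  }
}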

Support

For issues, questions, or contributions: