Rate Limiting for Machines
Guardhouse implements rate limiting to prevent abuse and ensure fair usage of API resources. This guide covers rate limits for machine-to-machine (M2M) and AI agent applications.
What is Rate Limiting?
Rate limiting restricts the number of API requests a client can make within a specific time window. This protects:
- ✅ API Infrastructure - Prevents overload and DoS attacks
- ✅ Fair Usage - Ensures equitable resource allocation
- ✅ Cost Control - Helps manage usage-based pricing
- ✅ System Stability - Maintains performance and availability
Rate Limit Tiers
Guardhouse applies different rate limits based on your plan and application type:
| Plan | Requests per Minute | Requests per Hour | Requests per Day | Burst Allowance |
|---|---|---|---|---|
| Free | 60 | 3,600 | 86,400 | 1.5x (90 req/min) |
| Developer | 300 | 18,000 | 432,000 | 2x (600 req/min) |
| Team | 1,000 | 60,000 | 1,440,000 | 3x (3,000 req/min) |
| Enterprise | 3,000 | 180,000 | 4,320,000 | 5x (15,000 req/min) |
Burst Allowance: A temporary, elevated limit for absorbing short traffic spikes. For example, a Team-plan client can briefly spike to 3,000 requests in a minute, but sustained traffic must stay at or below 1,000 requests per minute.
Rate Limit Response Headers
Every API response includes rate limit information:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 950
X-RateLimit-Reset: 1640995200
X-RateLimit-Reset-After: 300
X-RateLimit-Window: 60
Header Descriptions
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the time window |
| X-RateLimit-Remaining | Number of requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| X-RateLimit-Reset-After | Seconds until the window resets |
| X-RateLimit-Window | Length of the time window in seconds |
Rate Limiting Behavior
Windowed Rate Limiting
Rate limits are applied over fixed time windows; the request counter resets when a new window begins:
Time: 00:00 → 01:00 (1 minute window)
Requests allowed: 100
Used: 95
Remaining: 5
Status: ✅ Allowed
Time: 01:00 → 02:00 (new window)
Requests allowed: 100
Used: 0
Remaining: 100
Status: ✅ Allowed
Time: 01:00 → 01:05 (5 seconds into window)
Requests allowed: 100
Used: 100
Remaining: 0
Status: ⏸ Rate Limited (must wait until 02:00)
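As a minimal sketch, a client can use the X-RateLimit-Reset-After header to wait out the current window before retrying (the single retry and the 60-second fallback below are illustrative):

async function fetchWithWindowWait(url) {
  const response = await fetch(url);

  if (response.status === 429) {
    // Fall back to a 60-second wait if the header is missing
    const resetAfter = Number(response.headers.get('X-RateLimit-Reset-After') ?? '60');
    await new Promise(resolve => setTimeout(resolve, resetAfter * 1000));
    return fetch(url); // Single retry once the window has reset
  }

  return response;
}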
Client-Specific Limits
Different clients may have different limits:
| Client Type | Base Limit | Burst Limit | Notes |
|---|---|---|---|
| SPA (User Auth) | 60/min | 90/min | Stricter due to user context |
| Backend (User Auth) | 300/min | 600/min | Higher limits for server-side apps |
| M2M / AI Agent | 1,000/min | 5,000/min | Highest limits for automated services |
| Admin API | 3,000/min | 15,000/min | Special limits for management operations |
Handling Rate Limits
Detecting Rate Limits
When you receive a rate limit response, handle it appropriately. Note that fetch resolves, rather than throwing, on HTTP error statuses, so inspect the status code on the response itself:
async function makeApiRequest(url, options = {}) {
  const response = await fetch(url, {
    ...options,
    headers: {
      ...options.headers,
      'Authorization': `Bearer ${accessToken}`
    }
  });

  if (response.status === 429) {
    // Rate limited
    const rateLimitInfo = parseRateLimitHeaders(response);
    console.log('Rate limit reached:', rateLimitInfo);

    if (options.retryOnRateLimit) {
      // Calculate wait time, then retry once the window resets
      const waitTime = calculateWaitTime(rateLimitInfo);
      await sleep(waitTime);
      return makeApiRequest(url, options);
    }
  }

  return response;
}

function parseRateLimitHeaders(response) {
  return {
    limit: Number(response.headers.get('X-RateLimit-Limit')),
    remaining: Number(response.headers.get('X-RateLimit-Remaining')),
    reset: Number(response.headers.get('X-RateLimit-Reset')),
    resetAfter: Number(response.headers.get('X-RateLimit-Reset-After')),
    window: Number(response.headers.get('X-RateLimit-Window'))
  };
}

function calculateWaitTime(rateLimitInfo) {
  // Wait until the window resets (convert seconds to milliseconds)
  return rateLimitInfo.resetAfter * 1000;
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
Exponential Backoff
Implement exponential backoff for retries:
class RateLimiter {
  constructor(maxRetries = 5, initialDelay = 1000, multiplier = 2) {
    this.maxRetries = maxRetries;
    this.initialDelay = initialDelay;
    this.multiplier = multiplier;
  }

  async execute(request) {
    let delay = this.initialDelay;

    for (let attempt = 1; attempt <= this.maxRetries; attempt++) {
      try {
        return await request(); // Success
      } catch (error) {
        if (error.status === 429 || error.status >= 500) {
          // Rate limited or transient server error - wait and retry with backoff
          console.warn(`Retryable error (${error.status}), attempt ${attempt}/${this.maxRetries}. Waiting ${delay}ms`);
          await this.sleep(delay);
          delay *= this.multiplier;
        } else {
          // Non-retryable error
          throw error;
        }
      }
    }

    throw new Error(`Max retries (${this.maxRetries}) exceeded`);
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
// Usage
const limiter = new RateLimiter();
const users = await limiter.execute(async () => {
  const response = await fetch('https://your_tenant.guardhouse.cloud/api/v2/users');
  if (!response.ok) {
    // Surface HTTP errors as exceptions so the limiter can retry them
    const error = new Error(`HTTP ${response.status}`);
    error.status = response.status;
    throw error;
  }
  return response.json();
});
Request Queuing
For high-volume operations, implement request queuing:
class RequestQueue {
  constructor(maxConcurrent = 10, rateLimit = 100) {
    this.maxConcurrent = maxConcurrent;
    this.rateLimit = rateLimit; // Requests per minute
    this.queue = [];
    this.processing = 0;
    this.windowStart = Date.now();
    this.windowCount = 0;
  }

  add(request) {
    return new Promise((resolve, reject) => {
      this.queue.push({ request, resolve, reject });
      this.processQueue();
    });
  }

  async execute({ request, resolve, reject }) {
    try {
      const result = await request();
      resolve(result);
    } catch (error) {
      reject(error);
    } finally {
      this.processing--;
      this.processQueue();
    }
  }

  processQueue() {
    const now = Date.now();

    // Reset the rate window every minute
    if (now - this.windowStart >= 60000) {
      this.windowStart = now;
      this.windowCount = 0;
    }

    // Start queued requests while under both the concurrency cap
    // and the per-minute rate limit
    while (
      this.queue.length > 0 &&
      this.processing < this.maxConcurrent &&
      this.windowCount < this.rateLimit
    ) {
      const item = this.queue.shift();
      this.processing++;
      this.windowCount++;
      this.execute(item);
    }

    // If requests remain but the rate limit is exhausted,
    // try again when the window resets
    if (this.queue.length > 0 && this.windowCount >= this.rateLimit) {
      setTimeout(() => this.processQueue(), 60000 - (now - this.windowStart));
    }
  }
}
// Usage
const queue = new RequestQueue(5, 1000); // 5 concurrent, 1,000 requests/minute

// Enqueue many requests
for (let i = 0; i < 100; i++) {
  queue.add(() => fetch(`https://your_tenant.guardhouse.cloud/api/v2/users/${i}`));
}
Best Practices
1. Respect Rate Limits
- ✅ Check Headers - Always inspect X-RateLimit-* headers
- ✅ Wait Before Retry - Don't retry immediately when rate limited
- ✅ Use Appropriate Backoff - Implement exponential backoff
- ✅ Queue Requests - Don't overwhelm the server
2. Optimize API Usage
- ✅ Batch Operations - Combine multiple operations into one request
- ✅ Use Pagination - Request only the data you need
- ✅ Caching - Cache frequently accessed resources (see the sketch after this list)
- ✅ Compression - Enable request/response compression
- ✅ Avoid Polling - Use webhooks or event streams instead
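For example, a minimal time-to-live cache can serve repeated GET requests without spending quota (the 30-second TTL and the URL-based cache key are illustrative assumptions):

const cache = new Map();

async function cachedFetch(url, ttlMs = 30000) {
  const entry = cache.get(url);
  if (entry && Date.now() - entry.fetchedAt < ttlMs) {
    return entry.data; // Served from cache - no API call, no quota used
  }

  const response = await fetch(url);
  const data = await response.json();
  cache.set(url, { data, fetchedAt: Date.now() });
  return data;
}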
3. Monitoring
- ✅ Track Usage - Monitor your request rates
- ✅ Set Alerts - Get notified when approaching limits (see the sketch after this list)
- ✅ Analyze Patterns - Identify which operations use the most quota
- ✅ Optimize Queries - Reduce unnecessary API calls
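As a sketch, you can track remaining quota from the headers each response carries and warn when it runs low (the 10% threshold below is an assumption, not a Guardhouse default):

function trackQuota(response) {
  const limit = Number(response.headers.get('X-RateLimit-Limit'));
  const remaining = Number(response.headers.get('X-RateLimit-Remaining'));

  if (limit && remaining / limit < 0.1) {
    console.warn(`Approaching rate limit: ${remaining}/${limit} requests left`);
  }
}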
4. Testing
- ✅ Load Test - Simulate high-traffic scenarios
- ✅ Rate Limit Test - Verify your backoff logic works (see the sketch after this list)
- ✅ Error Handling - Test error recovery
- ✅ Monitoring Test - Ensure alerts fire correctly
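For instance, backoff logic can be exercised against a stubbed request that fails with 429 twice before succeeding (the stub is illustrative and reuses the RateLimiter class from above):

let calls = 0;
const flakyRequest = async () => {
  calls++;
  if (calls <= 2) {
    const error = new Error('Too Many Requests');
    error.status = 429;
    throw error;
  }
  return { ok: true };
};

const testLimiter = new RateLimiter(5, 100, 2); // Short delays for testing
const result = await testLimiter.execute(flakyRequest);
console.log(result.ok, calls); // true 3 - succeeded on the third attempt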
Rate Limit Strategies
Token Bucket Algorithm
Implement a token bucket for smooth rate limiting:
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // Tokens per second
    this.lastRefill = Date.now();
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    // Use fractional tokens so frequent calls don't lose refill time
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }

  consume(tokens = 1) {
    this.refill();
    if (this.tokens >= tokens) {
      this.tokens -= tokens; // Spend the tokens
      return true;
    }
    return false;
  }

  async acquire(tokens = 1, maxWait = 60000) {
    const start = Date.now();
    while (!this.consume(tokens)) {
      if (Date.now() - start > maxWait) {
        throw new Error('Rate limit timeout');
      }
      // Wait briefly for the bucket to refill
      await new Promise(resolve => setTimeout(resolve, 100));
    }
  }
}
// Usage
const bucket = new TokenBucket(100, 1); // Capacity 100, refills 1 token per second
// Acquire 10 tokens (waits until the bucket holds enough)
await bucket.acquire(10);
Sliding Window Algorithm
Implement a sliding window for consistent rate limiting:
class SlidingWindow {
  constructor(windowSize, maxRequests) {
    this.windowSize = windowSize; // Window length in milliseconds
    this.maxRequests = maxRequests;
    this.requests = [];
  }

  recordRequest() {
    const now = Date.now();

    // Drop requests that have aged out of the window
    this.requests = this.requests.filter(
      request => now - request.timestamp < this.windowSize
    );

    if (this.requests.length >= this.maxRequests) {
      return false; // Window is full - reject without recording
    }

    this.requests.push({ timestamp: now });
    return true;
  }
}

// Usage
const limiter = new SlidingWindow(60000, 100); // 1-minute window, 100 requests

if (limiter.recordRequest()) {
  console.log('Request allowed');
} else {
  console.log('Rate limited');
}
Common Patterns
1. Client-side Rate Limiting
class ClientRateLimiter {
  constructor(maxRequests) {
    this.maxRequests = maxRequests; // Requests per minute
    this.requests = [];
  }

  async request(apiCall) {
    // Remove requests older than 1 minute
    const now = Date.now();
    this.requests = this.requests.filter(
      req => now - req.timestamp < 60000
    );

    if (this.requests.length >= this.maxRequests) {
      // Wait until the oldest request ages out of the window
      const waitTime = 60000 - (now - this.requests[0].timestamp);
      console.log(`Rate limited. Wait ${waitTime}ms`);
      await this.sleep(waitTime);
      return this.request(apiCall);
    }

    this.requests.push({ timestamp: now });
    return apiCall();
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
// Usage
const limiter = new ClientRateLimiter(10); // 10 requests per minute
// Make multiple requests
for (let i = 0; i < 15; i++) {
await limiter.request(() => fetch('/api/data'));
}
2. Server-side Rate Limiting
const rateLimits = new Map();
const WINDOW_MS = 60000;
const MAX_REQUESTS = 100;

function checkRateLimit(clientId) {
  const key = `rate_limit:${clientId}`;
  const now = Date.now();
  const data = rateLimits.get(key);

  if (!data || now - data.windowStart >= WINDOW_MS) {
    // First request, or the previous window has expired
    rateLimits.set(key, { count: 1, windowStart: now });
    return true;
  }

  if (data.count >= MAX_REQUESTS) {
    // Rate limited
    return false;
  }

  data.count++;
  return true;
}

// Middleware
function rateLimitMiddleware(req, res, next) {
  const clientId = req.headers['x-client-id'];

  if (!checkRateLimit(clientId)) {
    return res.status(429).json({
      error: 'Rate limit exceeded',
      retryAfter: 60
    });
  }

  const data = rateLimits.get(`rate_limit:${clientId}`);
  res.setHeader('X-RateLimit-Limit', `${MAX_REQUESTS}`);
  res.setHeader('X-RateLimit-Remaining', `${MAX_REQUESTS - data.count}`);
  // Unix timestamp (in seconds) at which the current window resets
  res.setHeader('X-RateLimit-Reset', `${Math.ceil((data.windowStart + WINDOW_MS) / 1000)}`);
  res.setHeader('X-RateLimit-Reset-After', `${Math.ceil((data.windowStart + WINDOW_MS - Date.now()) / 1000)}`);
  res.setHeader('X-RateLimit-Window', '60');
  next();
}
Troubleshooting
Common Issues
Issue: "Unexpected rate limit errors"
- Verify you're reading rate limit headers correctly
- Check for burst allowance
- Ensure your retry logic doesn't exceed limits
- Monitor actual vs. expected request rates
Issue: "Rate limit too strict"
- Contact support to discuss your use case
- Request a higher plan with more generous limits
- Implement request caching to reduce API calls
- Consider using a different client for high-volume operations
Issue: "Backoff not working"
- Ensure the backoff delay isn't reset after each retry
- Verify you're not retrying non-transient errors
- Implement maximum retry count
- Consider using circuit breakers for failing endpoints (see the sketch below)
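If an endpoint keeps failing, a minimal circuit breaker can stop traffic to it for a cooldown period (the failure threshold and cooldown values below are assumptions to adapt to your workload):

class CircuitBreaker {
  constructor(failureThreshold = 5, cooldownMs = 30000) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }

  async call(request) {
    if (this.openedAt !== null) {
      if (Date.now() - this.openedAt < this.cooldownMs) {
        throw new Error('Circuit open - skipping request');
      }
      this.openedAt = null; // Cooldown elapsed - allow a trial request
    }

    try {
      const result = await request();
      this.failures = 0; // Success closes the circuit
      return result;
    } catch (error) {
      if (++this.failures >= this.failureThreshold) {
        this.openedAt = Date.now(); // Trip the circuit
      }
      throw error;
    }
  }
}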
Related Documentation
- Client Credentials Flow - M2M authentication
- Introspection for AI Requests - Token validation
- Error Handling - Common error codes
- Best Practices - SDK recommendations
Support
For issues, questions, or contributions: