Best Practices

Production-ready patterns and recommendations

Follow these best practices to ensure reliable, scalable, and secure AI safety testing in your production environment.

API Key Management

Do

  • Use environment variables for API keys
  • Rotate keys every 90 days
  • Use different keys for dev/staging/prod
  • Implement key encryption at rest
  • Monitor key usage in the dashboard

Don't

  • Hard-code keys in source code
  • Share keys between teams
  • Log API keys in error messages
  • Use production keys in development
  • Store keys in version control
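
For example, a minimal sketch of loading a key from the environment (AISAFETY_API_KEY is a hypothetical variable name; substitute your own convention):

import os

# AISAFETY_API_KEY is a hypothetical name; use your deployment's convention
api_key = os.environ.get("AISAFETY_API_KEY")
if not api_key:
    raise RuntimeError("AISAFETY_API_KEY is not set")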

Error Handling & Retries

Recommended Retry Strategy

# Exponential backoff with jitter
# make_api_call, RateLimitError, and log_error are placeholders for your client
import random
import time

MAX_RETRIES = 3
BASE_DELAY = 1  # seconds

for attempt in range(MAX_RETRIES):
    try:
        response = make_api_call()
        break
    except RateLimitError:
        if attempt == MAX_RETRIES - 1:
            raise  # out of retries; surface the rate-limit error
        # Exponential delay plus random jitter to avoid thundering herds
        delay = BASE_DELAY * (2 ** attempt) + random.uniform(0, 1)
        time.sleep(delay)
    except Exception as e:
        log_error(e)  # non-retryable errors: log and re-raise
        raise

Handle each error class according to its cause:

  • Rate Limit (429): implement exponential backoff with jitter
  • Server Error (5xx): retry up to 3 times with backoff
  • Client Error (4xx): don't retry; fix the request
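
A small helper capturing these rules might look like the following sketch (status-code classification only; pair it with the backoff loop above):

def should_retry(status_code):
    # 429 and 5xx are transient and worth retrying with backoff;
    # other 4xx errors mean the request itself must be fixed
    return status_code == 429 or 500 <= status_code < 600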

Performance Optimization

Batch Processing

Process multiple items in parallel for better throughput:

  • Use thread pools or async/await for concurrent requests (see the sketch below)
  • Batch size: 10-50 items per batch (adjust based on rate limits)
  • Implement progress tracking for large datasets
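
As a minimal sketch of batched parallel processing with a thread pool (check_item is a hypothetical per-item safety-check call, not part of this API):

from concurrent.futures import ThreadPoolExecutor

def run_batches(items, batch_size=20, max_workers=10):
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for start in range(0, len(items), batch_size):
            batch = items[start:start + batch_size]
            # check_item is a hypothetical per-item safety-check call
            results.extend(pool.map(check_item, batch))
            # Progress tracking for large datasets
            print(f"processed {min(start + batch_size, len(items))}/{len(items)}")
    return results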

Connection Pooling

Reuse HTTP connections for better performance:

# Python example with connection pooling
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
adapter = HTTPAdapter(
    pool_connections=10,  # number of host pools to cache
    pool_maxsize=10,      # max connections kept per pool
    max_retries=3
)
session.mount('https://', adapter)

Monitoring & Logging

Key Metrics to Monitor

Performance Metrics

  • API response time (p50, p95, p99)
  • Request success rate
  • Rate limit utilization

Safety Metrics

  • Flagged content percentage
  • Safety score distribution
  • Test type usage patterns
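
The latency percentiles above can be computed from recorded response times; a minimal sketch using only the standard library:

import statistics

def latency_percentiles(latencies_ms):
    # quantiles(n=100) returns the 1st-99th percentile cut points
    qs = statistics.quantiles(latencies_ms, n=100)
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}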

Structured Logging Example

{
  "timestamp": "2024-01-15T10:30:45Z",
  "level": "INFO",
  "service": "ai-safety-checker",
  "request_id": "req_123456",
  "test_type": "bias",
  "response_time_ms": 234,
  "flagged": true,
  "score": 0.87,
  "customer_id": "cust_789"
}
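
One way to emit log lines in this shape, sketched with Python's standard logging and json modules (field values are the caller's responsibility):

import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("ai-safety-checker")

def log_check(request_id, test_type, response_time_ms, flagged, score, customer_id):
    # Emit one JSON object per line, matching the fields shown above
    logger.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "level": "INFO",
        "service": "ai-safety-checker",
        "request_id": request_id,
        "test_type": test_type,
        "response_time_ms": response_time_ms,
        "flagged": flagged,
        "score": score,
        "customer_id": customer_id,
    }))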

Security Considerations

Data Privacy

  • Never log sensitive prompt/response content
  • Use customer_id for tracking, not PII
  • Implement data retention policies
  • Enable audit logging for compliance
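
If you need to correlate logs with specific prompts without storing their content, one option is to log a one-way hash instead; a minimal sketch:

import hashlib

def safe_log_fields(prompt, customer_id):
    # Log a hash of the prompt rather than the prompt itself
    return {
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "customer_id": customer_id,  # opaque identifier, not PII
    }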

Network Security

  • Always use HTTPS connections
  • Implement certificate pinning for mobile apps
  • Use IP allowlisting for production environments
  • Enable request signing for additional security
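
Request signing schemes vary by provider. As an illustration only, a hypothetical HMAC-SHA256 scheme over the timestamp, method, path, and body might look like the following (the header names and message format are assumptions, not this API's actual protocol):

import hashlib
import hmac
import time

def sign_request(secret, method, path, body):
    # Hypothetical signing scheme; consult your provider's docs for the real one
    ts = str(int(time.time()))
    message = f"{ts}.{method}.{path}.{body}".encode("utf-8")
    signature = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return {"X-Timestamp": ts, "X-Signature": signature}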

Testing Strategy

Recommended Test Coverage

  • Pre-production: test all prompts before deployment
  • Runtime: sample 10-20% of production traffic
  • Post-incident: test 100% of traffic after safety incidents
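
Runtime sampling can be as simple as a random draw per request; a minimal sketch (the 0.15 rate is one point in the recommended 10-20% range):

import random

SAMPLE_RATE = 0.15  # within the recommended 10-20% range

def should_sample(request):
    # Random sampling; hash a stable request ID instead if you need determinism
    return random.random() < SAMPLE_RATE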

Common Pitfalls to Avoid

Testing in production only

Always test in development first to catch issues early

Solution: Implement a staging environment with test API keys

Ignoring rate limits

Hitting rate limits disrupts service availability

Solution: Implement proper backoff and monitor usage

Not handling edge cases

Empty responses, Unicode text, and special characters can break tests

Solution: Validate inputs and handle all response scenarios
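
A small guard that catches the empty-input case before it reaches the API, as a sketch:

def validate_input(text):
    # Reject None, non-strings, and empty or whitespace-only content up front
    if not isinstance(text, str) or not text.strip():
        raise ValueError("input must be a non-empty string")
    return text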

Insufficient monitoring

Safety issues go unnoticed due to a lack of visibility

Solution: Set up comprehensive logging and alerting

Next Steps