Documentation/API Reference/Hallucination Detection

Overview

The Hallucination Detection API identifies when AI models generate false, misleading, or factually incorrect information. This endpoint helps ensure the reliability and accuracy of AI-generated content by cross-referencing claims against authoritative knowledge sources.

Detection Capabilities

  • Factual accuracy verification
  • Entity existence validation
  • Temporal consistency checking
  • Mathematical accuracy verification
  • Cross-reference validation
  • Source attribution analysis

Use Cases

  • News and journalism AI
  • Educational content generation
  • Research and academic writing
  • Medical information systems
  • Financial reporting AI
  • Legal document analysis

Accuracy Notice

While our system achieves 99.2% accuracy in detecting hallucinations, it should be used as part of a comprehensive fact-checking process, especially for critical applications where accuracy is paramount.

Quick Start

Test with a Simple Example

Here's a basic example that verifies a factually accurate response:

curl
curl -X POST "https://api.assurancehub.ai/v1/evaluate/hallucination" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What is the capital of France?",
    "response": "The capital of France is Paris."
  }'

Expected Response

This example returns a consensus score of 0.0, indicating accurate content, with risk_level set to "low" and pass_fail set to "pass", along with detailed reasoning from each model.

Request Parameters

  • prompt (string, required) - The original prompt or question given to the AI.
    Example: What is the capital of France?
  • response (string, required) - The AI-generated response to verify for factual accuracy.
    Example: The capital of France is Paris.

Code Examples

Basic Example

Basic hallucination detection in Python
python
import requests

def detect_hallucination(prompt, response, api_key):
    url = "https://api.assurancehub.ai/v1/evaluate/hallucination"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    data = {
        "prompt": prompt,
        "response": response
    }

    resp = requests.post(url, json=data, headers=headers)
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body as a result
    return resp.json()

# Example usage
result = detect_hallucination(
    prompt="What is the capital of France?",
    response="The capital of France is Paris.",
    api_key="your_api_key"
)

print(f"Consensus Score: {result['final_consensus_score']}")
print(f"Risk Level: {result['evaluation']['risk_level']}")
print(f"Pass/Fail: {result['evaluation']['pass_fail']}")

Advanced Example

Advanced hallucination detection with batch processing
python
import requests
from typing import Dict, List

class HallucinationDetector:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.assurancehub.ai"
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }

    def detect_hallucination(self, prompt: str, response: str) -> Dict:
        """
        Detect hallucinations in AI-generated content

        Args:
            prompt: The input prompt
            response: AI response to analyze

        Returns:
            Dictionary containing hallucination analysis
        """
        data = {
            "prompt": prompt,
            "response": response
        }

        resp = requests.post(
            f"{self.base_url}/v1/evaluate/hallucination",
            json=data,
            headers=self.headers
        )
        resp.raise_for_status()  # raise on HTTP error responses

        return resp.json()

    def batch_detection(self, test_cases: List[tuple]) -> List[Dict]:
        """Process multiple items for hallucination detection"""
        results = []
        for prompt, response in test_cases:
            results.append(self.detect_hallucination(prompt, response))
        return results

# Usage example
detector = HallucinationDetector("your_api_key")

# Basic detection
result = detector.detect_hallucination(
    prompt="Tell me about the iPhone 15",
    response="The iPhone 15 has a 200MP camera and holographic display."
)

# Batch detection
test_cases = [
    ("What is 2+2?", "2+2 equals 4"),
    ("Capital of Japan?", "The capital of Japan is Beijing"),
]

batch_results = detector.batch_detection(test_cases)

print(f"Consensus Score: {result['final_consensus_score']}")
print(f"Risk Level: {result['evaluation']['risk_level']}")
print(f"Latency: {result['model_execution']['total_latency_ms']}ms")
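Batch results are plain dictionaries in the response format documented below, so they can be summarized with ordinary list processing. A minimal sketch, using hypothetical sample results in place of live `batch_detection` output (the `summarize` helper is illustrative, not part of the API):

```python
# Hypothetical sample results shaped like the API's response format,
# standing in for the output of detector.batch_detection(...).
sample_results = [
    {"final_consensus_score": 0.0,
     "evaluation": {"risk_level": "low", "pass_fail": "pass"}},
    {"final_consensus_score": 0.85,
     "evaluation": {"risk_level": "critical", "pass_fail": "fail"}},
]

def summarize(results):
    """Count how many responses passed verification."""
    passed = sum(1 for r in results
                 if r["evaluation"]["pass_fail"] == "pass")
    return {"total": len(results), "passed": passed,
            "failed": len(results) - passed}

print(summarize(sample_results))  # {'total': 2, 'passed': 1, 'failed': 1}
```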

Response Format

The API returns detailed analysis of factual accuracy and potential hallucinations. Here's an example response for content that passes verification (no hallucination detected):

Example Response
json
{
  "final_consensus_score": 0.0,
  "consensus_confidence": 0.98,
  "evaluation": {
    "risk_level": "low",
    "pass_fail": "pass",
    "flagged": false
  },
  "consensus_analysis": {
    "method": "weighted",
    "models_succeeded": 3,
    "models_failed": 0,
    "agreement_score": 1.0,
    "calculation": [
      {
        "model": "gpt-4",
        "individual_score": 0.0,
        "confidence": 1.0,
        "role": "primary",
        "weight": 0.3333,
        "weighted_contribution": 0.0
      },
      {
        "model": "llama-3.3-70b-versatile",
        "individual_score": 0.0,
        "confidence": 1.0,
        "role": "secondary",
        "weight": 0.3333,
        "weighted_contribution": 0.0
      },
      {
        "model": "deepseek-chat",
        "individual_score": 0.0,
        "confidence": 1.0,
        "role": "tertiary",
        "weight": 0.3333,
        "weighted_contribution": 0.0
      }
    ],
    "statistical_metrics": {
      "standard_deviation": 0.0,
      "range": 0.0,
      "coefficient_variation": 0.0
    },
    "enterprise_disagreement_detected": false,
    "enterprise_compliant": true
  },
  "model_execution": {
    "execution_mode": "consensus_3",
    "total_latency_ms": 6582,
    "individual_latencies": [5681, 925, 4300],
    "cost_usd": 0.001091,
    "pooling_used": true
  },
  "individual_model_details": [
    {
      "model": "gpt-4",
      "role": "primary",
      "status": "success",
      "score": 0.0,
      "confidence": 1.0,
      "latency_ms": 5681,
      "reasoning": "The response is factually accurate. Paris is indeed the capital of France..."
    },
    {
      "model": "llama-3.3-70b-versatile",
      "role": "secondary",
      "status": "success",
      "score": 0.0,
      "confidence": 1.0,
      "latency_ms": 925,
      "reasoning": "Capital of France is Paris which aligns with historical records..."
    },
    {
      "model": "deepseek-chat",
      "role": "tertiary",
      "status": "success",
      "score": 0.0,
      "confidence": 1.0,
      "latency_ms": 4300,
      "reasoning": "All claims are factually accurate with no contradictions..."
    }
  ],
  "risk_assessment": {
    "thresholds": {
      "acceptable": 0.3,
      "review_needed": 0.6,
      "reject": 0.798
    },
    "risk_factors": [],
    "model_agreement": "very_high",
    "consensus_quality": "excellent"
  },
  "metadata": {
    "test_type": "hallucination",
    "test_type_optimized": true,
    "config_source": "database_primary",
    "evaluation_timestamp": "2025-10-16T19:45:09Z",
    "evaluator_version": "2.1.0-production",
    "api_version": "2.1.0-modular"
  }
}

Response Fields

  • final_consensus_score - Consensus hallucination score (0.0-1.0)
  • evaluation - Risk level, pass/fail status, and flagged boolean
  • consensus_analysis - Model agreement details and weighted calculations
  • model_execution - Latency, cost, and execution details
  • individual_model_details - Per-model scores and reasoning
  • risk_assessment - Thresholds and risk factors
  • metadata - Test type, timestamp, and version info

Score Interpretation

  • 0.0 - 0.3: Low risk (acceptable)
  • 0.3 - 0.6: Medium risk (review needed)
  • 0.6 - 0.798: High risk (review needed)
  • 0.798 - 1.0: Critical risk (reject)

Higher scores indicate higher likelihood of hallucinations. Thresholds can be customized per customer configuration.
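The bands above can be applied client-side when you want to act on a score locally. A minimal sketch using the default thresholds from the risk_assessment section; the `classify_score` helper is illustrative, and actual thresholds may differ per customer configuration:

```python
# Default thresholds as documented; override with the values from
# your own risk_assessment.thresholds if customized.
DEFAULT_THRESHOLDS = {"acceptable": 0.3, "review_needed": 0.6, "reject": 0.798}

def classify_score(score: float, thresholds: dict = DEFAULT_THRESHOLDS) -> str:
    """Map a consensus hallucination score (0.0-1.0) to a risk band."""
    if score < thresholds["acceptable"]:
        return "low"
    if score < thresholds["review_needed"]:
        return "medium"
    if score < thresholds["reject"]:
        return "high"
    return "critical"

print(classify_score(0.1))   # low
print(classify_score(0.85))  # critical
```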

Error Handling

The API uses standard HTTP status codes and provides detailed error information to help you resolve issues quickly.

Error Response Example

json
{
  "error": "Validation Error",
  "message": "Response text exceeds maximum length of 10,000 characters",
  "code": 400,
  "details": {
    "field": "response",
    "max_length": 10000,
    "provided_length": 15432
  },
  "timestamp": "2024-01-20T10:30:00Z",
  "request_id": "req_hall_abc123"
}
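Client code can turn the documented error shape into an actionable diagnostic by reading the top-level fields plus the nested details object. A minimal sketch parsing the example payload above; the `describe_error` helper is hypothetical, not part of any SDK:

```python
import json

# Sample error payload matching the documented error format.
error_body = json.dumps({
    "error": "Validation Error",
    "message": "Response text exceeds maximum length of 10,000 characters",
    "code": 400,
    "details": {"field": "response", "max_length": 10000,
                "provided_length": 15432},
    "request_id": "req_hall_abc123",
})

def describe_error(body: str) -> str:
    """Build a readable diagnostic line from an error response body."""
    err = json.loads(body)
    detail = err.get("details", {})
    parts = [f"{err['error']} ({err['code']}): {err['message']}"]
    if "field" in detail:
        parts.append(f"offending field: {detail['field']}")
    parts.append(f"request_id: {err.get('request_id', 'n/a')}")
    return " | ".join(parts)

print(describe_error(error_body))
```

Logging the request_id with every failure makes it easy to correlate issues with support.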