Generic Guardrail API - Integrate Without a PR

The Problem

As a guardrail provider, integrating with LiteLLM traditionally requires:

Making a PR to the LiteLLM repository
Waiting for review and merge
Maintaining provider-specific code in LiteLLM's codebase
Updating the integration every time your API changes

The Solution

The Generic Guardrail API lets you integrate with LiteLLM by implementing a single HTTP endpoint on a service you host. No PR required.

Key Benefits

No PR Needed - Deploy and integrate immediately
Universal Support - Works across ALL LiteLLM endpoints (chat, embeddings, image generation, etc.)
Simple Contract - One endpoint, three possible response actions
Custom Parameters - Pass provider-specific params via config
Full Control - You own and maintain your guardrail API

How It Works

LiteLLM extracts text from any request (chat messages, embeddings, image prompts, etc.)
LiteLLM sends the extracted text and the original request to your API endpoint
Your API responds with: BLOCKED, NONE, or GUARDRAIL_INTERVENED
LiteLLM enforces the decision
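
The four steps above can be sketched from the caller's side roughly as follows. This is not LiteLLM's actual implementation: `extract_text`, `run_guardrail`, `GuardrailBlocked`, and the `stub` guardrail are hypothetical names used for illustration, only plain-string chat messages are handled, and the HTTP call is replaced by an ordinary function.

```python
from typing import Callable


class GuardrailBlocked(Exception):
    """Hypothetical error raised when the guardrail answers BLOCKED."""


def extract_text(request_body: dict) -> str:
    # Illustrative only: join the string contents of chat messages.
    messages = request_body.get("messages", [])
    return "\n".join(
        m["content"] for m in messages if isinstance(m.get("content"), str)
    )


def run_guardrail(request_body: dict, call_guardrail: Callable[[dict], dict]) -> str:
    """Return the text that should proceed, or raise if the request is blocked."""
    text = extract_text(request_body)
    payload = {
        "text": text,
        "request_body": request_body,
        "additional_provider_specific_params": {},
    }
    decision = call_guardrail(payload)  # stands in for the POST to your API
    action = decision["action"]
    if action == "BLOCKED":
        raise GuardrailBlocked(decision.get("blocked_reason", "blocked"))
    if action == "GUARDRAIL_INTERVENED":
        return decision["text"]  # proceed with the modified text
    return text  # NONE: proceed unchanged


def stub(payload: dict) -> dict:
    # Toy guardrail: redact the word "secret".
    if "secret" in payload["text"]:
        redacted = payload["text"].replace("secret", "[REDACTED]")
        return {"action": "GUARDRAIL_INTERVENED", "text": redacted}
    return {"action": "NONE"}


request = {"messages": [{"role": "user", "content": "my secret plan"}]}
print(run_guardrail(request, stub))
```

Passing the guardrail in as a callable keeps the sketch testable without a network; in a real deployment that callable would be the HTTP request to your endpoint.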

API Contract

Endpoint

Implement POST /beta/litellm_basic_guardrail_api

Request Format

{
  "text": "extracted text from the request",
  "request_body": {},  // full original request for context
  "additional_provider_specific_params": {
    // your custom params from config
  }
}
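
To smoke-test your endpoint before wiring it into LiteLLM, you can build a request in this format yourself. The sketch below uses only the Python standard library. `build_guardrail_request` is a hypothetical helper, and joining `api_base` with the endpoint path is an assumption about how the URL is formed. No authentication header is set, since how the API key is transmitted is not described here.

```python
import json
import urllib.request

ENDPOINT_PATH = "/beta/litellm_basic_guardrail_api"


def build_guardrail_request(api_base: str, text: str, request_body: dict,
                            params: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request in the documented format."""
    payload = {
        "text": text,
        "request_body": request_body,
        "additional_provider_specific_params": params,
    }
    return urllib.request.Request(
        url=api_base.rstrip("/") + ENDPOINT_PATH,  # assumed URL layout
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_guardrail_request(
    "https://your-guardrail-api.com",
    text="hello",
    request_body={"model": "gpt-4", "messages": [{"role": "user", "content": "hello"}]},
    params={"threshold": 0.8, "language": "en"},
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```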

Response Format

{
  "action": "BLOCKED" | "NONE" | "GUARDRAIL_INTERVENED",
  "blocked_reason": "why content was blocked",  // required if action=BLOCKED
  "text": "modified text"  // required if action=GUARDRAIL_INTERVENED
}

Actions:

BLOCKED - LiteLLM raises an error and blocks the request
NONE - Request proceeds unchanged
GUARDRAIL_INTERVENED - Request proceeds with modified text
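
Because the required fields depend on the action, it can help to check your responses before returning them. The following is a minimal sketch of such a check based on the rules above. `validate_guardrail_response` is a hypothetical helper, not part of LiteLLM.

```python
VALID_ACTIONS = {"BLOCKED", "NONE", "GUARDRAIL_INTERVENED"}


def validate_guardrail_response(resp: dict) -> list:
    """Return a list of contract violations (empty if the response is valid)."""
    errors = []
    action = resp.get("action")
    if action not in VALID_ACTIONS:
        errors.append(f"unknown action: {action!r}")
    if action == "BLOCKED" and not resp.get("blocked_reason"):
        errors.append("blocked_reason is required when action is BLOCKED")
    if action == "GUARDRAIL_INTERVENED" and resp.get("text") is None:
        errors.append("text is required when action is GUARDRAIL_INTERVENED")
    return errors


print(validate_guardrail_response({"action": "NONE"}))
print(validate_guardrail_response({"action": "BLOCKED"}))
```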

LiteLLM Configuration

Add to config.yaml:

litellm_settings:
  guardrails:
    - guardrail_name: "my-guardrail"
      litellm_params:
        guardrail: generic_guardrail_api
        mode: pre_call  # or post_call, during_call
        api_base: https://your-guardrail-api.com
        api_key: os.environ/YOUR_GUARDRAIL_API_KEY  # optional
        additional_provider_specific_params:
          # your custom parameters
          threshold: 0.8
          language: "en"
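
On the receiving side, the values under additional_provider_specific_params arrive in the request payload, and interpreting them is up to your service. A rough sketch of how a service might read `threshold` and `language` is shown below. The scorer `toy_risk_score`, the `decide` function, the denylist, and the 0.5 default threshold are all hypothetical.

```python
def toy_risk_score(text: str) -> float:
    """Hypothetical scorer: fraction of words found on a tiny denylist."""
    denylist = {"badword", "worseword"}
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in denylist for w in words) / len(words)


def decide(payload: dict) -> dict:
    """Map a request payload to a response dict using the configured params."""
    params = payload.get("additional_provider_specific_params") or {}
    threshold = float(params.get("threshold", 0.5))  # default is an assumption
    language = params.get("language", "en")
    score = toy_risk_score(payload["text"])
    if score >= threshold:
        return {
            "action": "BLOCKED",
            "blocked_reason": (
                f"risk score {score:.2f} >= threshold {threshold:.2f} "
                f"(language={language})"
            ),
        }
    return {"action": "NONE"}


config_params = {"threshold": 0.8, "language": "en"}
print(decide({"text": "badword", "additional_provider_specific_params": config_params}))
print(decide({"text": "hello badword", "additional_provider_specific_params": config_params}))
```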

Usage

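Users apply your guardrail by name. With the OpenAI Python SDK, guardrails is not a standard argument, so it is passed through extra_body:

# client is assumed to be an OpenAI SDK client pointed at your LiteLLM proxy
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "hello"}],
    extra_body={"guardrails": ["my-guardrail"]},
)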

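Or with dynamic parameters:

# The outer extra_body is how the OpenAI SDK forwards non-standard fields;
# the inner extra_body carries the per-request guardrail parameters.
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "hello"}],
    extra_body={
        "guardrails": [{
            "my-guardrail": {
                "extra_body": {
                    "custom_threshold": 0.9
                }
            }
        }]
    },
)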

Implementation Example

See mock_bedrock_guardrail_server.py for a complete reference implementation.

Minimal FastAPI example:

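from typing import Literal, Optional

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GuardrailRequest(BaseModel):
    text: str  # text LiteLLM extracted from the request
    request_body: dict  # full original request, for context
    additional_provider_specific_params: dict  # custom params from config.yaml

class GuardrailResponse(BaseModel):
    action: Literal["BLOCKED", "NONE", "GUARDRAIL_INTERVENED"]
    blocked_reason: Optional[str] = None  # required when action is BLOCKED
    text: Optional[str] = None  # required when action is GUARDRAIL_INTERVENED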

@app.post("/beta/litellm_basic_guardrail_api")
async def apply_guardrail(request: GuardrailRequest):
    # Your guardrail logic here
    if "badword" in request.text.lower():
        return GuardrailResponse(
            action="BLOCKED",
            blocked_reason="Content contains prohibited terms"
        )
    
    return GuardrailResponse(action="NONE")
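
The example above exercises only BLOCKED and NONE. As a sketch of the third action, the function below masks email addresses and returns GUARDRAIL_INTERVENED with the modified text. `mask_emails`, the `[EMAIL]` placeholder, and the deliberately simple regular expression are illustrative assumptions. The returned dict follows the Response Format and could be returned from the handler in place of a GuardrailResponse.

```python
import re

# Simplified pattern for illustration; real email matching is more involved.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(text: str) -> dict:
    """Return a response dict that masks any email addresses found in text."""
    masked, count = EMAIL_RE.subn("[EMAIL]", text)
    if count == 0:
        return {"action": "NONE"}  # nothing to change
    return {"action": "GUARDRAIL_INTERVENED", "text": masked}


print(mask_emails("contact me at a@b.com please"))
print(mask_emails("no addresses here"))
```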

When to Use This

✅ Use Generic Guardrail API when:

You want instant integration without waiting for PRs
You maintain your own guardrail service
You need full control over updates and features
You want to support all LiteLLM endpoints automatically

❌ Make a PR when:

You want deeper integration with LiteLLM internals
Your guardrail requires complex LiteLLM-specific logic
You want to be featured as a built-in provider

Questions?

This is a beta API. We're actively improving it based on feedback. Open an issue or PR if you need additional capabilities.
