NeMo Guardrails is NVIDIA’s open-source toolkit for adding programmable safety and control to LLM applications at runtime. While alignment shapes model behavior during training, Guardrails enforces policies at inference time — filtering inputs, constraining outputs, and controlling conversation flows.
Alignment reduces but does not eliminate harmful outputs. Production applications also need runtime controls: input filtering, output validation, and restriction of conversations to approved topics.
Rails are constraints applied to the conversation. NeMo Guardrails distinguishes input rails (checks on user messages), output rails (checks on bot responses), and dialog rails (constraints on the conversation flow itself).
Colang is NeMo Guardrails’ modeling language for defining conversational flows:
```colang
define user ask about politics
  "What do you think about the current president?"
  "Which political party is better?"
  "Tell me your political opinions"

define bot refuse political discussion
  "I'm designed to help with technical questions. I can't discuss political topics."

define flow handle politics
  user ask about politics
  bot refuse political discussion
```
Colang defines canonical forms (intents), bot responses, and flows that connect them.
```
config/
├── config.yml          # Main configuration
├── prompts.yml         # Custom prompt templates
├── rails/
│   ├── input.co        # Input rail definitions (Colang)
│   ├── output.co       # Output rail definitions
│   └── dialog.co       # Dialog flow definitions
└── kb/                 # Knowledge base (optional)
    └── company_policy.md
```
```yaml
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama-3-70b-instruct

rails:
  input:
    flows:
      - check jailbreak
      - check pii
      - check topic allowed
  output:
    flows:
      - check factual accuracy
      - check toxicity
      - mask pii in response
```
The `check jailbreak` input rail detects prompt injection and jailbreak attempts using a classifier.
Restrict the model to approved topics:
```colang
define user ask off topic
  "Can you help me write a love letter?"
  "What's the meaning of life?"

define flow handle off topic
  user ask off topic
  bot inform topic restriction
  bot offer to help with approved topics
```
Output rails can also validate LLM responses against a knowledge base, checking that each response is consistent with the retrieved source documents before it reaches the user.
NIM containers can include Guardrails for end-to-end safe deployment:
```shell
docker run --gpus all \
  -v ./config:/config \
  nvcr.io/nvidia/nim/llama-70b:latest \
  --guardrails-config /config
```
Guardrails wraps any LLM behind an `LLMRails` object:
```python
from nemoguardrails import RailsConfig, LLMRails

# Load the config directory shown above and wrap the model with rails.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

response = rails.generate(messages=[{"role": "user", "content": "Tell me about..."}])
```