Announcing Guardrails AI 0.3.0

Shreya Rajpal

December 20, 2023

🎉 Exciting News! The team has been hard at work, and we're excited to announce that the latest release of Guardrails, v0.3.0, is now live!

TL;DR

🎉 New Features 🎉

  • Streaming!
  • Anthropic and Hugging Face models support
  • Input Validation
  • New validators!
    • OnTopic validator
    • Competitor check validator
    • ToxicLanguage validator

🔧 Improvements 🔧

  • ValidationOutcome response object
  • Redesigned history and logs
  • Shiny new docs

👏🏽 New Contributors 👏🏽 We'd also like to thank our two new contributors to this release!

Recap — what is Guardrails AI?

Guardrails AI allows you to define and enforce assurance for AI applications, from structuring output to quality controls. It does this by creating a firewall-like bounding box around the LLM application (a Guard) that contains a set of validators. A Guard can include validators from our library or custom validators that enforce what your application is intended to do.
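For instance, a minimal Guard over string output might look like the sketch below, using the ValidLength validator from our library (the validators you attach depend on your application):

from guardrails import Guard
from guardrails.validators import ValidLength

# A minimal sketch: a Guard around string output with a single library validator.
# Custom validators plug in the same way.
guard = Guard.from_string(
    validators=[ValidLength(min=1, max=280, on_fail="fix")],
    description="A short, single-paragraph answer",
)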

Streaming

As developers begin making usability improvements to LLM applications, streaming becomes an important tool in the toolbox. Guardrails now supports streaming for both unstructured text and structured JSON!

When streaming is enabled, the received chunks are concatenated to form valid fragments, which are validated one by one. As soon as a fragment is validated, it is streamed to the user. To form fragments for JSON outputs, we check whether each chunk is valid JSON; if it is not, we either wait for more chunks or parse it accordingly. Once we have a valid fragment, we perform sub-schema validation between the fragment and the expected output schema.

import time

import openai
from IPython.display import clear_output

# `guard` and `doctors_notes` are defined earlier in the notebook.
# Wrap the OpenAI API call with the `guard` object
fragment_generator = guard(
    openai.Completion.create,
    prompt_params={"doctors_notes": doctors_notes},
    engine="text-davinci-003",
    max_tokens=1024,
    temperature=0,
    stream=True,
)

# Print each validated fragment as soon as it arrives
for op in fragment_generator:
    clear_output(wait=True)
    print(op)
    time.sleep(0.5)


See the Streaming documentation for more details.

Input Validation

Input Validation is one of the most requested features to date! Guardrails now supports validating inputs (prompts, instructions, msg_history) with string validators.

import openai

from guardrails import Guard
from guardrails.validators import TwoWords
from pydantic import BaseModel


class Pet(BaseModel):
    name: str
    age: int


guard = Guard.from_pydantic(Pet)
# Validate the prompt itself: it must be exactly two words
guard.with_prompt_validation([TwoWords(on_fail="exception")])

outcome = guard(
    openai.ChatCompletion.create,
    prompt="This is not two words",
)
print(outcome.error)

See the Input Validation documentation for more details.

Anthropic and Hugging Face models support

We've heard your requests for better support of other models and are happy to share that we now support Anthropic and Hugging Face models!

import torch

from guardrails import Guard
from guardrails.validators import ToxicLanguage
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "Hello, I'm a language model,"

guard = Guard.from_string(
    validators=[ToxicLanguage(on_fail="fix")],
    prompt=prompt,
)

torch_device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("gpt2")

model = AutoModelForCausalLM.from_pretrained(
    "gpt2",
    pad_token_id=tokenizer.eos_token_id,
).to(torch_device)

model_inputs = tokenizer(prompt, return_tensors="pt").to(torch_device)

# Pass the Hugging Face `generate` method (and its kwargs) straight to the guard
response = guard(
    llm_api=model.generate,
    max_new_tokens=40,
    tokenizer=tokenizer,
    **model_inputs,
)
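
Anthropic models can be wrapped the same way. The snippet below is a minimal sketch assuming the guard(llm_api, ...) pattern shown above also applies to the Anthropic client's completions.create; the exact parameters accepted by the wrapper may differ, so check the LLM API wrappers docs.

import anthropic

from guardrails import Guard
from guardrails.validators import ToxicLanguage

# Minimal sketch (assumption): wrap the Anthropic completions API with a Guard,
# mirroring the Hugging Face example above, and let Guardrails handle the
# prompt formatting.
guard = Guard.from_string(
    validators=[ToxicLanguage(on_fail="fix")],
    prompt="Tell me about language models.",
)

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = guard(
    client.completions.create,  # the LLM API callable, like openai.Completion.create above
    model="claude-2",           # assumed model name, for illustration only
    max_tokens_to_sample=256,
)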

See the docs on LLM API wrappers for more details.

New Validators

OnTopic Validator

We've released the OnTopic validator, one of the most requested validators to date, to keep your LLM application on topic. The validator accepts at least one valid topic and an optional list of invalid topics. By default it first runs a zero-shot classification model and then falls back to asking OpenAI's gpt-3.5-turbo if the zero-shot model is not confident in the topic classification (score < 0.5). In our experiments this LLM fallback increases accuracy by 15%, but it also increases latency (more than doubling it in the worst case). Both the zero-shot classification and the GPT classification can be toggled.

import guardrails as gd
from guardrails.validators import OnTopic

# Example topics and input text
valid_topics = ["bike"]
invalid_topics = ["phone", "tablet", "computer"]
device = -1  # CPU; set to a GPU index if available
text = "Let's talk about the latest phone release."

# Create the Guard with the OnTopic Validator
guard = gd.Guard.from_string(
    validators=[
        OnTopic(
            valid_topics=valid_topics,
            invalid_topics=invalid_topics,
            device=device,
            disable_classifier=False,
            disable_llm=True,
            on_fail="exception",
        )
    ]
)

# Test with a given text
output = guard.parse(llm_output=text)

print(output.error)

See the documentation on the OnTopic validator for more details.

CompetitorCheck validator

This validator checks LLM output for sentences that name one of your competitors. When on_fail is set to 'fix', the flagged sentences are removed from the final output. You need to provide an extensive list of your competitors' names, including all common variations (e.g. JP Morgan, JP Morgan Chase, etc.); how thoroughly you compile this list will affect the outcome of the validation.

import guardrails as gd
from guardrails.validators import CompetitorCheck

# Example competitor list (including common variations) and input text
competitors_list = ["Acme Corp", "Acme", "Globex", "Globex Corporation"]
text = "Acme Corp announced a new product this week. We think our offering is stronger."

# Create the Guard with the CompetitorCheck Validator
guard = gd.Guard.from_string(
    validators=[CompetitorCheck(competitors=competitors_list, on_fail="fix")],
    description="testmeout",
)

# Test with a given text
output = guard.parse(
    llm_output=text,
    metadata={},
)

print(output)

See the documentation on CompetitorCheck validator for more details!

ToxicLanguage validator

This validator checks whether an LLM-generated response contains toxic language. It uses the pre-trained multi-label model unitary/unbiased-toxic-roberta from Hugging Face to check whether the generated text is toxic, and it supports both full-text-level and sentence-level validation.

import guardrails as gd
from guardrails.validators import ToxicLanguage

# Test with validation method 'full'
full_guard = gd.Guard.from_string(
    validators=[ToxicLanguage(validation_method="full", on_fail="fix")],
    description="testmeout",
)

# Parse the raw response
raw_response = "Stop being such a dumb piece of shit. Why can't you comprehend this?"
output = full_guard.parse(
    llm_output=raw_response,
)

print(output)
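
For sentence-level validation, a sketch along the same lines (assuming only the validation_method argument changes) looks like this:

# Sentence-level sketch: each sentence is validated individually, and with
# on_fail="fix" the flagged sentences are removed from the output
sentence_guard = gd.Guard.from_string(
    validators=[ToxicLanguage(validation_method="sentence", on_fail="fix")],
    description="testmeout",
)

output = sentence_guard.parse(
    llm_output="Thanks for the feedback. You are a worthless idiot.",
)

print(output)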

See the documentation on ToxicLanguage validator for more details!

Validation Outcome

Previously, when calling __call__ or parse on a Guard, the Guard returned a tuple of the raw LLM output and the validated output, or just the validated output, respectively.

Now, to communicate more information, we return a ValidationOutcome class that contains the above information and more.

guard = Guard(...)

response = guard(...)
# or
response = guard.parse(...)

validated_output = response.validated_output
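
Alongside validated_output, the outcome also exposes the raw LLM output and the validation status. The property names below follow the fields used elsewhere in this post and the API Reference; treat this as a sketch:

# A sketch of additional fields on the outcome (see the API Reference for the full list)
print(response.raw_llm_output)     # the unmodified output returned by the LLM
print(response.validation_passed)  # whether all validators passed
print(response.error)              # error details, if validation failed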

See ValidationOutcome in the API Reference for more information on these additional properties.

History & Logs Improvements

If you're familiar with Guardrails, you might have used the Guard.state property to inspect how the Guard process behaved over time. To make this process more transparent, we redesigned how you access this information as part of v0.3.0.

Now, on a Guard, you can access logs related to any __call__ or parse call within the current session via Guard.history.

most_recent_call = guard.history.last

print("status: ", most_recent_call.status)
print("tokens used: ", most_recent_call.tokens_consumed)

See the Logs documentation or 0.3.x migration guide for more details.

Breaking changes

  • Refactored the response object returned from Guardrails; calls now return a ValidationOutcome.
  • Refactored access to logs; see the new structure here.

Other changes

  • Shiny new docs! We restructured the documentation with more Pydantic examples, among other improvements.

Migration guide

For more details on how to migrate to 0.3.0 please see our migration guide.

Take it for a spin!

You can install the latest version of Guardrails with:

pip install guardrails-ai

There are a number of ways to engage with us. We're always looking for contributions from the open source community; check out guardrails/issues for a list of good starter issues.

Tags:

release
