Guardrails

What is the concept? A block of code, framework, or library that wraps around your request/response cycle with an LLM (or a traditional NLP model) to cross-check, verify, validate, or otherwise improve the accuracy and safety of the results.

Guardrails are a way to make your AI a bit more humble.

Examples of types of guardrails (blatantly stolen from the docs for NVIDIA's framework):

  • Topical - keeping the chat or response on topic.
  • Fact checking - making sure the LLM's response is true, according to some external data or service.
  • Moderation - profanity filtering, ethics adjustments, or other techniques to shut down non-useful interactions (i.e., to save you cost!)
  • Jailbreaking attempts - protecting the chat from adversarial interactions (i.e., trying to get your bot to lie, or to do bad or expensive things)
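The wrapping idea can be sketched in a few lines of Python. Everything here is a hypothetical stand-in - `call_llm` for your real model client, and two hard-coded word lists for what would really be classifiers or policy services - but it shows the shape: the guardrail intercepts both the request and the response.

```python
# A minimal sketch of a guardrail wrapping the request/response cycle.
# call_llm and the word lists are hypothetical stand-ins.

BLOCKED_TOPICS = ("politics", "medical advice")
PROFANITY = ("darn",)  # stand-in profanity list

def call_llm(prompt: str) -> str:
    # Hypothetical: in reality this would call OpenAI, a local model, etc.
    return f"Here is a helpful answer about: {prompt}"

def guarded_call(prompt: str) -> str:
    # Input rail: keep the chat on topic (and save yourself the cost of a call).
    lowered = prompt.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return "Sorry, I can't help with that topic."

    response = call_llm(prompt)

    # Output rail: crude profanity filter on the response.
    if any(word in response.lower() for word in PROFANITY):
        return "Sorry, I couldn't produce a suitable answer."
    return response

print(guarded_call("politics in my country"))  # blocked by the input rail
print(guarded_call("best pizza in town"))      # passes through
```

A real input or output rail would be a classifier or another LLM call rather than a substring check, but the control flow is the same.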

Maturity model using an LLM as the foundation:

LLMs are powerful, but they can easily go “off the rails” - and in a customer service scenario, that’s not good for business. The failure could be as benign as a bad restaurant recommendation, or as bad as improper advice on a medical issue.

The following progression is what I’d suggest as a way to get to mature, business-safe usage of an LLM at scale.

  1. custom code -> LLM - this is the POC stage, likely not yet launched to customers.
  2. custom code -> langchain -> LLM - feedback has come in, and we decide to use LangChain to help us structure the prompts more.
  3. custom code -> langchain -> LLM+ - now we’re using data outside the LLM (RAG, etc.).
  4. custom code -> guardrails -> langchain -> LLM - here we introduce simple guardrails to further narrow the LLM’s responses based on external data, or to make sure it responds well to adversarial inputs.
  5. custom code -> guardrails+ -> langchain -> LLM+ - and finally, having the guardrails consult the same (or a different) LLM and “loop back” to vector search, external APIs, or the like.
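Steps 4 and 5 can be sketched like this. Every function below is a hypothetical stand-in for your model client, vector store, and verification step, but the flow - answer, check against external data, loop back - is the point:

```python
import re

# Rough sketch of stages 4 and 5: a guardrail fact-checks the LLM's answer
# against external data and "loops back" when it doesn't check out.
# Every function here is a hypothetical stand-in.

def call_llm(prompt: str) -> str:
    # Stand-in model client; the fake behaves better when given evidence.
    if "Use only this data" in prompt:
        return "Our store opens at 8am."
    return "Our store opens at 9am."

def vector_search(query: str) -> str:
    # Stand-in for RAG / vector search / an external API.
    return "Store hours: opens at 8am, closes at 9pm."

def is_supported(answer: str, evidence: str) -> bool:
    # Stage 5 would ask an LLM "does the evidence support this answer?";
    # here we fake it by checking any claimed times appear in the evidence.
    return all(t in evidence for t in re.findall(r"\d+am|\d+pm", answer))

def guarded_answer(question: str) -> str:
    answer = call_llm(question)
    evidence = vector_search(question)
    if not is_supported(answer, evidence):
        # Loop back: re-prompt with the retrieved evidence included.
        answer = call_llm(f"{question}\nUse only this data: {evidence}")
    return answer

print(guarded_answer("When do you open?"))  # "Our store opens at 8am."
```

The first (wrong) answer fails the evidence check, so the guardrail re-prompts with the retrieved data - the “loop back” from step 5.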

Maturity model using traditional NLP as the foundation:

I personally first reach for traditional NLP techniques - they tend to be easier to manage, cheaper to train, and cheaper to run. But that’s not to say they can’t be improved…

  1. custom code -> NLP - start with custom code consulting a traditional NLP model.
  2. custom code -> guardrails (using LLM) -> NLP - after consulting the NLP model, use an LLM to cross-check the results.

In this case, we start with traditional NLP - say, text classification for sentiment analysis. This generally works quite well, and such models can often be tuned to 95%+ accuracy.

If you want a “second opinion”, you can use the same “guardrails” technique and consult the LLM of your choosing to see if it agrees. Is that tweet about your product REALLY positive? Or was the person being sarcastic?
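That second-opinion flow, with both models stubbed out (a real version would swap in a trained classifier and an actual LLM call):

```python
# Sketch of the NLP-first flow: a cheap classifier answers first, and an
# LLM is consulted only as a guardrail / second opinion on sarcasm.
# Both "models" below are hypothetical stubs.

def nlp_sentiment(text: str) -> str:
    # Stand-in for a traditional text classifier.
    return "positive" if "great" in text.lower() else "negative"

def llm_sarcasm_check(text: str) -> bool:
    # Stand-in for asking an LLM: "is this text sarcastic?"
    return "yeah right" in text.lower()

def sentiment_with_guardrail(text: str) -> str:
    label = nlp_sentiment(text)
    # Only pay for the LLM call when the cheap model says "positive".
    if label == "positive" and llm_sarcasm_check(text):
        return "negative"  # the "positive" tweet was sarcasm
    return label

print(sentiment_with_guardrail("This product is great!"))          # positive
print(sentiment_with_guardrail("Yeah right, a GREAT product..."))  # negative
```

Note the cost-saving design: the expensive LLM check only runs on the subset of results where a wrong answer matters.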

Technology

NVIDIA - NeMo Guardrails

This framework goes quite far in creating reliable chat systems. It can work with LangChain or, in many cases, replace the need for it. It also has its own rather friendly modeling language, called Colang, for creating chunks of chat logic and chaining them together.
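To give a flavor of Colang, a simple topical rail looks roughly like this - an illustrative fragment written from my reading of the NeMo Guardrails docs, not copied from them:

```
define user ask politics
  "what do you think about the president?"
  "who should I vote for?"

define bot refuse politics
  "I'd rather keep the conversation about our products."

define flow politics
  user ask politics
  bot refuse politics
```

The `define user` block gives example utterances for intent matching, and the `flow` wires the matched intent to a canned refusal - a topical guardrail in a handful of lines.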

Microsoft - guidance

Think of this as a simple framework for doing LangChain-like work, but with a friendlier templating language that a non-coder might be able to manage.

Stanford - dspy

This takes a more programming-oriented point of view and does a lot more than “guardrails” - so if you don’t need the friendliness of the frameworks above, it may be a good option for you. It would also likely remove much of the need for LangChain.

Fine-tuning

We get a lot of value above using very simple, easy-to-manage techniques WITHOUT fine-tuning. I do think that as the technology progresses we’ll see fine-tuning get easier, cheaper, and more manageable, but until that day comes, I’d suggest sticking with what’s here and now, and not trying to push the envelope quite yet.

Fine-tuning will always be relatively harder than guardrails, though, due to the way fine-tuned models need to be trained. OpenAI even sort of alludes to this in their fine-tuning announcement from Aug 22:

Fine-tuning is most powerful when combined with other techniques such as prompt engineering, information retrieval, and function calling. Check out our fine-tuning guide to learn more. Support for fine-tuning with function calling and gpt-3.5-turbo-16k will be coming later this fall.