Large Language Models with Graceful Fallbacks

Saurabh Harak
3 min read · Mar 27, 2024


Large language models (LLMs) now sit at the heart of the rapidly changing landscape of artificial intelligence. They power applications from chatbots and virtual assistants to content generation and beyond. Like all technologies, however, they are subject to failures: an API outage, a rate limit, or simply a bad model output can seriously affect user experience and functionality.

This is precisely where fallbacks come in. They are not merely handy but essential for keeping applications robust and reliable if, and when, something unexpected happens.

The Significance of Fallbacks in Language Model Applications

Fallbacks act as a safety net, allowing applications to gracefully handle and isolate issues without compromising the overall user experience.

They are particularly crucial in large language model (LLM) applications, where the seamless integration of various components determines the success of the service. By implementing fallback strategies, developers can ensure that their applications are not only resilient but also maintain a consistent level of performance, regardless of external factors.

Mechanics of Fallbacks

Implementing fallbacks in LLM applications involves a few key concepts and methodologies. The core strategy is deliberately simple: if the first approach fails, fall back to an alternative. It can be applied at different levels of the application, from individual API calls to more complex sequences involving multiple models.
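Stripped of any framework, the pattern is just "try candidates in order until one succeeds." The sketch below illustrates it in plain Python; the function and model names are illustrative stand-ins, not part of any library.

```python
def with_fallbacks(primary, fallbacks):
    """Return a callable that tries `primary`, then each fallback in order."""
    def invoke(prompt):
        last_error = None
        for candidate in [primary, *fallbacks]:
            try:
                return candidate(prompt)
            except Exception as err:
                last_error = err  # remember the failure and try the next candidate
        raise last_error  # every candidate failed: surface the last error

    return invoke


# Stand-ins for real model clients: the first always fails, the second works.
def flaky_model(prompt):
    raise RuntimeError("rate limit exceeded")

def backup_model(prompt):
    return f"backup answer to: {prompt}"

llm = with_fallbacks(flaky_model, [backup_model])
print(llm("Why did the chicken cross the road?"))  # served by backup_model
```

The caller never sees the primary model's failure; the error only propagates if every candidate fails.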

Handling LLM API Errors

One common scenario where fallbacks are invaluable is in handling LLM API errors. For instance, when a request to an LLM API fails due to the API being down or rate limits being exceeded, having a fallback mechanism in place can prevent the application from crashing or becoming unresponsive. Instead, the application can switch to an alternative LLM or handle the error in a way that minimizes disruption.

Consider the following Python snippet, which demonstrates how to set up fallbacks with LangChain, using OpenAI as the primary provider and Anthropic as the fallback:

from langchain_community.chat_models import ChatAnthropic
from langchain_openai import ChatOpenAI
from openai import RateLimitError

# Setting up the primary and fallback LLMs
openai_llm = ChatOpenAI(max_retries=0)  # disable retries so failures surface immediately
anthropic_llm = ChatAnthropic()

# Configuring the fallback mechanism
llm = openai_llm.with_fallbacks([anthropic_llm])

# Attempting to invoke the primary LLM, falling back to the secondary LLM on failure
try:
    print(llm.invoke("Why did the chicken cross the road?"))
except RateLimitError:
    print("Hit error")

In this example, if the primary LLM (openai_llm) encounters a RateLimitError, the application automatically falls back to the secondary LLM (anthropic_llm), ensuring continuity.
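By default this kind of fallback catches any exception, which can mask genuine bugs; it is often better to fall back only on specific, expected errors (LangChain's `with_fallbacks` exposes a similar idea through its `exceptions_to_handle` parameter). Below is a framework-free sketch of that refinement; all names are illustrative, and `RateLimitError` here is a stand-in class rather than a provider's real exception.

```python
class RateLimitError(Exception):
    """Stand-in for a provider's rate-limit error."""

def with_fallbacks(primary, fallbacks, exceptions_to_handle=(Exception,)):
    """Try `primary`; on a handled error, try each fallback in turn."""
    def invoke(prompt):
        try:
            return primary(prompt)
        except exceptions_to_handle:
            for candidate in fallbacks:
                try:
                    return candidate(prompt)
                except exceptions_to_handle:
                    continue  # handled error: move on to the next fallback
            raise  # every fallback failed too: re-raise the original error
    return invoke

def rate_limited_model(prompt):
    raise RateLimitError("429: slow down")

def backup_model(prompt):
    return "backup: " + prompt

# Only rate-limit errors trigger the fallback; any other exception propagates.
llm = with_fallbacks(rate_limited_model, [backup_model],
                     exceptions_to_handle=(RateLimitError,))
print(llm("hello"))  # served by backup_model
```

Narrowing the handled exception types keeps unexpected failures loud instead of silently rerouting them.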

Fallbacks for Sequences

Another advanced use of fallbacks involves sequences of operations, where each step might depend on the success of the previous one. In such cases, creating fallbacks for entire sequences can be particularly effective.

This approach allows developers to specify alternative sequences that can be used if the primary sequence fails, offering a higher level of granularity and control.

For example, consider a scenario where the primary sequence uses a chat model that fails due to an incorrect model name. A fallback sequence built on a different model or prompt ensures that the application can still produce a meaningful output.

from langchain_core.output_parsers import StrOutputParser
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain_openai import ChatOpenAI, OpenAI

# Primary sequence: a chat model configured with a bad model name, so it fails
chat_prompt = ChatPromptTemplate.from_template(
    "Why did the {animal} cross the road?"
)
chat_model = ChatOpenAI(model="gpt-fake")  # intentionally invalid model name
bad_chain = chat_prompt | chat_model | StrOutputParser()

# Fallback sequence: a different prompt with a working completion model
prompt = PromptTemplate.from_template(
    "Answer with a short joke. Question: Why did the {animal} cross the road?"
)
llm = OpenAI()
good_chain = prompt | llm

# Configuring the fallback mechanism for sequences
chain = bad_chain.with_fallbacks([good_chain])

# Invoking the chain: the bad chain errors out, so the good chain answers
chain.invoke({"animal": "turtle"})

The Broader Implications and Real-World Applications

The implementation of fallbacks in LLM applications has far-reaching implications. Beyond ensuring reliability and robustness, fallbacks can also enhance the user experience by providing alternative solutions that may be more suitable for specific contexts or preferences. For instance, in a multilingual chatbot application, fallbacks can be used to switch between language models based on the user’s language preference or the availability of models.
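The multilingual scenario above can be sketched as a simple routing table with a default model as the fallback. Everything here is an illustrative stand-in rather than a real client; the routing logic is the point.

```python
# Hypothetical per-language model clients.
def english_model(prompt):
    return "EN: " + prompt

def hindi_model(prompt):
    return "HI: " + prompt

MODELS_BY_LANGUAGE = {"en": english_model, "hi": hindi_model}

def route_by_language(language, prompt, default=english_model):
    """Use the model registered for `language`, falling back to the default."""
    model = MODELS_BY_LANGUAGE.get(language, default)
    return model(prompt)

print(route_by_language("hi", "namaste"))  # served by the Hindi model
print(route_by_language("fr", "bonjour"))  # no French model, default is used
```

The same lookup-with-default shape works for routing by availability or cost, not just language.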

Moreover, fallbacks can play a crucial role in experimental and research settings, where the ability to seamlessly switch between different models or configurations can facilitate testing and comparison of various approaches.

Conclusion: A Step Towards More Resilient AI Applications

In conclusion, the integration of fallbacks into LLM applications represents a critical step towards building more resilient and user-friendly AI systems. By understanding and implementing the concepts and methodologies discussed, developers can ensure that their applications are equipped to handle the uncertainties of the digital world, providing seamless and uninterrupted services to users.
