RAG (Retrieval Augmented Generation)

Retrieval Augmented Generation (RAG) is a method for improving the factuality and groundedness of a machine learning model's outputs by giving the model access to an external corpus. Unconstrained generation from LLMs is prone to hallucination, and finetuning a model to add capabilities or knowledge can be difficult and error-prone. Giving the model access to a corpus of data at runtime, for example a company wiki or open source documentation, can add capabilities without requiring finetuning. It can also reduce hallucinations in cases where the model draws on retrieved information rather than generating de novo. See hallucination.
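
As a rough illustration of the retrieval step, the sketch below ranks the passages of a toy corpus by word overlap with a query and places the best match in a prompt. Everything here is an assumption made for illustration: the corpus contents, the function names (tokenize, score, retrieve, build_prompt), and the overlap scoring, which stands in for whatever retriever a real system would use (commonly embedding similarity or BM25). The call to an actual LLM is omitted.

```python
import re
from collections import Counter

# Hypothetical toy corpus standing in for a company wiki.
CORPUS = [
    "Vacation requests are submitted through the HR portal.",
    "The build server runs nightly tests at 2 AM.",
    "Expense reports must be filed within 30 days.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document):
    """Count the word occurrences shared by the query and the document."""
    shared = Counter(tokenize(query)) & Counter(tokenize(document))
    return sum(shared.values())


def retrieve(query, corpus, k=1):
    """Return the k passages with the highest overlap score."""
    ranked = sorted(corpus, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, corpus, k=1):
    """Assemble a prompt that grounds the question in retrieved passages."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query, corpus, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


query = "When do nightly tests run on the build server?"
prompt = build_prompt(query, CORPUS)
print(prompt)
```

The resulting prompt would then be passed to the model, so that the answer can be drawn from the retrieved passage rather than generated from the model's parameters alone.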