Smolagents

Rag: Retrieval-Augmented Generation

Summary

Retrieval-Augmented Generation (RAG) is a powerful technique that enhances large language models (LLMs) by allowing them to use external knowledge sources.

An Artificial Intelligence (AI) system consists of components working together to apply knowledge learned from data. Some common components of those systems are:

  • Large Language Model (LLM): Typically the core component of the system, often there is more than one. These are large models that have been trained on massive amounts of data and can make intelligent predictions based on their training.

CAG: Cache-Augmented Generation

Summary

CAG performs better but does not solve the key reason RAG was created small context windows.

Retrieval-Augmented Generation (RAG) is currently(early 2025) the most popular way to use external knowledge in current LLM opperations. RAG allows you to enhance your LLM with data beyond the data it was trained on. Ther are many great RAG solutions and products.

RAG has some drawbacks - There can be significant retreival latency as it searches for and organizes the correct data.
- There can be errors in the documents/data it selects as results for a query. For example it may select the wrong document or give priority to the wrong document. - It may interduce security and data issues 2️⃣.
- It introduces complication
- an external application to manage the data (Vector Database) - a process to continually update this data when the data goes stale

Agents: A tutorial on building agents in python

LLM Agents

Agents are used enhance and extend the functionality of LLM’s.

In this tutorial, we’ll explore what LLM agents are, how they work, and how to implement them in Python.

What Are LLM Agents?

An agent is an autonomous process that may use the LLM and other tools multiple times to achieve a goal. The LLM output often controls the workflow of the agent(s).

What is the difference between Agents and LLMs or AI?

Agents are processes that may use LLM’s and other agents to achieve a task. Agents act as orchestrators or facilitators, combining various tools and logic, whereas LLMs are the underlying generative engines.