Building AI-Powered Applications with Haystack and Ollama

Summary

In this post, I will demonstrate how to set up and use haystack with Ollama.

haystack is a framework that helps when building applications powered by LLMs.

  • It offers extensive LLM-related functionality.
  • It is open source under the Apache license.
  • It is actively developed, with numerous contributors.
  • It is widely used in production by various clients.

These are some of the key items to watch for when using a library in a project.

LiteLLM: A Lightweight Wrapper for Multi-Provider LLMs

Summary

In this post I will cover LiteLLM. I used it for my implementation of Textgrad also it was using in blog posts I did about Agents.

Working with multiple LLM providers is painful. Every provider has its own API, requiring custom integration, different pricing models, and maintenance overhead. LiteLLM solves this by offering a single, unified API that allows developers to switch between OpenAI, Hugging Face, Cohere, Anthropic, and others without modifying their code.

🧠 TextGrad: Dynamic Optimization of Your LLM

🧠 TextGrad: Dynamic Optimization of Your LLM

🧩 Summary

This post aims to be a comprehensive tutorial on Textgrad.

Textgrad enables the optimization of LLM’s using their text responses.

This will be part of SmartAnswer the ultimate LLM query tool which I will be blogging about shortly.


❓ Why TextGrad?

  • 🔄 Brings Gradient Descent to LLMs – Instead of numerical gradients, TextGrad leverages textual feedback to iteratively improve outputs.
  • 🤖 Automates Prompt Optimization – Eliminates the guesswork in refining LLM prompts.
  • 🌐 Works with Any LLM – From OpenAI’s GPT to local models like Ollama.

🧠 What is TextGrad?

Bringing Gradients to LLM Optimization

Traditional AI optimization techniques rely on numerical gradients computed via backpropagation. However in LLM-driven AI systems, inputs and outputs are often text, making standard gradient computation impossible.

The Power of Logits: Unlocking Smarter, Safer LLM Responses

Summary

In this blog post

  1. I want to fully explore logits and how they can be used to enhance AI applications
  2. I want to understand the ideas from this paper: “Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering”

This paper introduces a new approach, Selective Question Answering (SQA). This introduces confidence scores to decide when an answer should be given. In this post, we’ll cover the core insights of the paper and implement a basic confidence-based selection function in Python.

Efficient Similarity Search with FAISS and SQLite in Python

Summary

This is another component in SmartAnswer and enhanced LLM interface.

In this blog post, we introduce a wrapper class, FaissDB, which integrates FAISS with SQLite or any database to manage document embeddings and enable efficient similarity search. This approach combines FAISS’s vector search capabilities with the storage and querying power of a database, making it ideal for applications such as Retrieval-Augmented Generation (RAG) and recommendation systems.

It builds up this tool PaperSearch.

Automating Paper Retrieval and Processing with PaperSearch

Summary

This is part on in a series of blog post working towards SmartAnswer a comprehensive improvement to how Large Language Models LLMs answer questions.

This tool will be the source of data for SmartAnswer and allow it to find and research better data when generating answers.

I want this tool to be included in that solution but I dot want all the code from this tool distracting from the SmartAnswer solution. Hence this post.

SQLite: the small database that packs a big punch

Summary

SQLite is one of the most widely used database engines in the world, powering everything from mobile applications (Android, iOS) to browsers (Google Chrome, Mozilla Firefox), IoT devices, and even gaming consoles. Unlike traditional client-server databases (e.g., MySQL, PostgreSQL), SQLite is an embedded, serverless database that stores data in a single file, making it easy to manage and deploy.

Python developers frequently choose SQLite for its inherent simplicity and portability, leveraging the built-in sqlite3 module for effortless database integration.

RAFT: Reward rAnked FineTuning - A New Approach to Generative Model Alignment

Summary

This post is an explanation of this paper:RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment.

Generative foundation models, such as Large Language Models (LLMs) and diffusion models, have revolutionized AI by achieving human-like content generation. However, they often suffer from

  1. Biases – Models can learn and reinforce societal biases present in the training data (e.g., gender, racial, or cultural stereotypes).
  2. Ethical Concerns – AI-generated content can be misused for misinformation, deepfakes, or spreading harmful narratives.
  3. Alignment Issues – The model’s behavior may not match human intent, leading to unintended or harmful outputs despite good intentions.

Traditionally, Reinforcement Learning from Human Feedback (RLHF) has been used to align these models, but RLHF comes with stability and efficiency challenges. To address these limitations, RAFT (Reward rAnked FineTuning) was introduced as a more stable and scalable alternative. RAFT fine-tunes models using a ranking-based approach to filter high-reward samples, allowing generative models to improve without complex reinforcement learning setups.

Faiss: A Fast, Efficient Similarity Search Library

Summary

Searching through massive datasets efficiently is a challenge, whether in image retrieval, recommendation systems, or semantic search. Faiss (Facebook AI Similarity Search) is a powerful open-source library developed by Meta to handle high-dimensional similarity search at scale.

It’s well-suited for tasks like:

  • Image search: Finding visually similar images in a large database.
  • Recommendation systems: Recommending items (products, movies, etc.) to users based on their preferences.
  • Semantic search: Finding documents or text passages that are semantically similar to a given query.
  • Clustering: Grouping similar vectors together.

In many of the upcoming projects in this blog I will be using it. It is a good local developer solution.

K-Means Clustering

Summary

Imagine you have a dataset of customer profiles. How can you group similar customers together to tailor marketing campaigns? This is where K-Means clustering comes into play.

K-Means is a popular unsupervised learning algorithm used for clustering data points into distinct groups based on their similarities. It is widely used in various domains such as customer segmentation, image compression, and anomaly detection.

In this blog post, we’ll cover how K-Means works and demonstrate its implementation in Python using scikit-learn.