Review: What We’ve Learned So Far

😶‍🌫️ Summary

This post is a quick review of the journey so far.

We’re one third of the way through the Self-Learning Systems (100-part) series, and this checkpoint pulls together the first 33 posts and the research papers that shaped them. The table below lists each post, its place in the series, and the key references it builds on, so you can see how the system and the ideas behind it have evolved since May.

✨ TINY CRITICS: Lightweight Reasoning Checks for Large AI Systems

🥹 0. TL;DR

Large language models write fluent explanations even when they’re wrong. Verifying their reasoning usually requires another LLM, which is slow, expensive, and circular.

We needed something different:

A miniature reasoning critic (<50 KB), trained on synthetic reasoning mistakes, able to instantly detect broken reasoning in much larger models.

The Tiny Critic:

  • trains on GSM8K-style reasoning traces generated by DeepSeek or Mistral
  • uses FrontierLens and Visual Policy Maps (VPMs) to convert reasoning into canonical numerical features
  • is just a logistic regression with ~30 parameters
  • runs in microseconds
  • plugs into any agent
  • dramatically improves InitAgent, R1-Loops, and research-planning stability
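A critic this small is essentially a handful of weights and a sigmoid. As a rough sketch (the feature names and weight values below are illustrative stand-ins, not the actual FrontierLens/VPM features or trained parameters):

```python
import math

# Illustrative weights for a tiny logistic critic. The feature names
# are hypothetical stand-ins for canonical numerical features a VPM
# pipeline might emit; the real critic has ~30 such parameters.
WEIGHTS = {
    "step_count": -0.12,
    "avg_step_length": 0.03,
    "numeric_consistency": -2.1,   # consistent arithmetic lowers suspicion
    "contradiction_score": 3.4,    # contradictions raise it sharply
}
BIAS = -0.5

def critic_score(features: dict) -> float:
    """Probability that a reasoning trace is broken (1.0 = broken)."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

# A trace whose numbers contradict each other should score high;
# a consistent trace should score low.
bad = critic_score({"numeric_consistency": 0.2, "contradiction_score": 0.9})
good = critic_score({"numeric_consistency": 0.95, "contradiction_score": 0.05})
```

Because inference is a single dot product and a sigmoid, the microsecond latency claim follows directly: there is no model serving, no token generation, just arithmetic over a short feature vector.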

This post tells the full story: how we built it, why it works, and what we learned about the shape of reasoning.

Search–Solve–Prove: building a place for thoughts to develop

🌌 Summary

What if you could see an AI think? Not just the final answer, but the whole stream of reasoning: every search, every dead end, every moment of insight. We’re building exactly that: a visible, measurable thought process we call the Jitter. This post, the first in a series, shows how we’re creating the habitat where that digital thought stream can live and grow.

We’ll draw on ideas from:

The Space Between Models Has Holes: Mapping the AI Gap

🌌 Summary

What if the most valuable insights in AI evaluation aren’t in model agreements, but in systematic disagreements?

This post reveals that the “gap” between large and small reasoning models contains structured, measurable intelligence about how different architectures reason. We demonstrate how to transform model disagreements from a problem into a solution, using the space between models to make tiny networks behave more like their heavyweight counterparts.

We start by assembling a high-quality corpus (10k–50k conversation turns), score it with a local LLM to create targets, and train both HRM and Tiny models under identical conditions. Then we run fresh documents through both models, collecting not just final scores but rich auxiliary signals (uncertainty, consistency, OOD detection, etc.) and visualize what these signals reveal.
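The key step above is collecting the disagreement itself, not just the two scores. A minimal sketch of that comparison, assuming hypothetical `hrm_score` and `tiny_score` functions (stand-ins for the two trained models, each returning a score plus auxiliary signals):

```python
# Hypothetical scoring stubs standing in for the trained HRM and Tiny
# models; each returns (score, auxiliary signals) for a document.
def hrm_score(doc: str):
    return 0.82, {"uncertainty": 0.05, "ood": 0.10}

def tiny_score(doc: str):
    return 0.61, {"uncertainty": 0.22, "ood": 0.35}

def gap_record(doc: str) -> dict:
    """Run both models on a fresh document and record the gap,
    including the auxiliary-signal disagreements."""
    h, h_aux = hrm_score(doc)
    t, t_aux = tiny_score(doc)
    return {
        "doc": doc,
        "gap": h - t,  # signed score disagreement
        "uncertainty_gap": t_aux["uncertainty"] - h_aux["uncertainty"],
        "ood_gap": t_aux["ood"] - h_aux["ood"],
    }

record = gap_record("fresh document text")
```

Records like this, aggregated over many documents, are what get visualized: structure in the gaps, rather than noise, is the post’s central claim.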

A Complete Visual Reasoning Stack: From Conversations to Epistemic Fields

📝 Summary

We asked a blunt question: Can we see reasoning?
The answer surprised us: Yes, and you can click on it.

This post shows the complete stack that turns AI reasoning from a black box into an editable canvas. Watch as:

  • Your single insight becomes 10,000 reasoning variations
  • Abstract “understanding” becomes visible epistemic fields
  • Manual prompt engineering becomes automated evolution
  • Blind trust becomes visual verification

This isn’t just code; it’s a visual way of interacting with AI, where reasoning becomes something you can see, explore, and refine.

🔦 Phōs: Visualizing How AI Learns and How to Build It Yourself

“The eye sees only what the mind is prepared to comprehend.” (Henri Bergson)

🔍 We Finally See Learning

For decades, we’ve measured artificial intelligence with numbers: loss curves, accuracy scores, reward signals.
We’ve plotted progress, tuned hyperparameters, celebrated benchmarks.

But we’ve never actually seen learning happen.

Not really.

Sure, we’ve visualized attention maps or gradient flows, but those are snapshots, proxies, not processes.

What if we could watch understanding emerge not as a number going up, but as a pattern stabilizing across time?
What if reasoning itself left a visible trace?

Episteme: Distilling Knowledge into AI

🚀 Summary

“When you can measure what you are speaking about… you know something about it; but when you cannot measure it… your knowledge is of a meagre and unsatisfactory kind.” (Lord Kelvin)

Remember that time you spent an hour with an AI, and in one perfect response, it solved a problem you’d been stuck on for weeks? Where is that answer now? Lost in a scroll of chat history, a fleeting moment of brilliance that vanished as quickly as it appeared. This post is about how to make that moment permanent, and turn it into an intelligence that amplifies everything you do.

🔄 Learning from Learning: Stephanie’s Breakthrough

📖 Summary

AI has always been about absorption: first data, then feedback. But even at its best, it hit a ceiling. What if, instead of absorbing inputs, it absorbed the act of learning itself?

In our last post, we reached a breakthrough: Stephanie isn’t just learning from data or feedback, but from the process of learning itself. That realization changed our direction from building “just another AI” to building a system that absorbs knowledge, reflects on its own improvement, and evolves from the act of learning.