AI: The Future Interface to Technology

Summary

Imagine a world where you simply think of a task and invisible devices seamlessly execute it. In fact, you won’t even think about most of what used to be your daily tasks; they will be executed automatically. Sounds like science fiction? This, I believe, is the future of human-technology interaction: the technology disappears behind an AI-driven interface.

Do we currently have Artificial Intelligence?

Artificial intelligence refers to computer programs designed to mimic human cognitive abilities, 
such as understanding natural language, recognizing patterns, learning from data, and solving complex problems.
While artificial general intelligence (AGI) aims to replicate general human intelligence, 
narrow AI focuses on excelling at specific tasks within predefined parameters.

A common debate in AI discourse is whether large language models (LLMs) truly qualify as artificial intelligence or are merely sophisticated algorithms mimicking human-like behavior. Discussions about Artificial General Intelligence (AGI), a theoretical form of AI capable of replicating human cognition across all domains, are intriguing, but they distract from the practical applications of AI that already exist today. AGI may never materialize, not because it’s unachievable, but because it lacks practical utility: a godlike AI with unrestricted capabilities offers little tangible benefit compared to specialized narrow AI systems. What we have now is narrow AI, which excels at specific tasks and operates within defined parameters. Narrow AI can become broader through the use of agents, and it can self-improve and learn automatically, as I have shown in previous blog posts.

Self-Learning LLMs for Stock Forecasting: A Python Implementation with Direct Preference Optimization

Summary

Forecasting future events is a critical task in fields like finance, politics, and technology. However, improving the forecasting abilities of large language models (LLMs) often requires extensive human supervision. In this post, we explore a novel approach from the paper “LLMs Can Teach Themselves to Better Predict the Future” that enables LLMs to teach themselves better forecasting skills using self-play and Direct Preference Optimization (DPO). We’ll walk through a Python implementation of this method, step by step.
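To make the DPO step a little more concrete, here is a minimal sketch of the per-pair DPO loss in plain Python. The function name `dpo_loss`, the default `beta` of 0.1, and the idea of passing pre-computed log-probabilities are my assumptions for illustration; a real implementation would obtain these log-probabilities from the policy model and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen margin - rejected margin)))."""
    # How much more the policy likes each answer than the reference model does
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), written in a numerically stable form
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

The loss falls below log(2) when the policy prefers the chosen forecast more strongly than the reference model does, and rises above it when the preference points the wrong way.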

Using Quantization to speed up and slim down your LLM

Summary

Large Language Models (LLMs) are powerful, but their size can lead to slow inference and high memory consumption, hindering real-world deployment. Quantization, a technique that reduces the precision of model weights, offers a powerful solution. This post will explore how to use quantization tools like bitsandbytes, AutoGPTQ, and AutoRound to dramatically improve LLM inference performance.

What is Quantization?

Quantization reduces the computational and storage demands of a model by representing its weights with lower-precision data types. Let’s imagine data is water and we hold that water in buckets. Most of the time we don’t need massive floating-point buckets to hold data that can be represented by integers. Quantization is like using smaller buckets to hold the same amount of water: you save space and can move the containers more quickly. It trades a small amount of precision for significant gains in speed and memory efficiency.
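The bucket analogy can be sketched in a few lines of plain Python. This is a toy symmetric (absmax) int8 scheme chosen for illustration, not what bitsandbytes, AutoGPTQ, or AutoRound do internally; the function names and sample weights are made up.

```python
def quantize_int8(weights):
    """Symmetric (absmax) quantization of floats to the int8 range [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate floats."""
    return [q * scale for q in quantized]


weights = [0.42, -1.27, 0.05, 0.9]          # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# The rounding error per weight is at most half a quantization step
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of four (for float32), at the cost of a rounding error no larger than half of `scale`.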

Mastering LLM Fine-Tuning: A Practical Guide with LLaMA-Factory and LoRA

Summary

Large Language Models (LLMs) offer immense potential, but realizing that potential often requires fine-tuning them on task-specific data. This guide provides a comprehensive overview of LLM fine-tuning, focusing on practical implementation with LLaMA-Factory and the powerful LoRA technique.

What is Fine-Tuning?

Fine-tuning adapts a pre-trained model to a new, specific task or dataset. It leverages the general knowledge the model has already learned from a massive dataset (the source domain) and refines it with a smaller, more specialized dataset (the target domain). Compared with training a model from scratch, this approach saves time, resources, and data while often achieving better performance on the target task.
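To give a feel for why LoRA makes fine-tuning cheaper, here is a small back-of-the-envelope sketch. LoRA freezes a weight matrix W of shape (d_out x d_in) and trains two low-rank factors, B (d_out x r) and A (r x d_in), instead. The layer size and rank below are assumed values for illustration only, and the helper names are hypothetical.

```python
def full_params(d_in, d_out):
    """Parameters in the full weight matrix W (d_out x d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


d_in = d_out = 4096   # assumed layer size, for illustration only
rank = 8              # assumed LoRA rank

full = full_params(d_in, d_out)                  # 16,777,216
lora = lora_trainable_params(d_in, d_out, rank)  # 65,536
print(f"LoRA trains {lora / full:.2%} of this layer's weights")  # 0.39%
```

The smaller the rank, the fewer parameters are trained, which is the main reason LoRA fits on modest hardware.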

Debugging Jupyter Notebooks in VS Code

Summary

Visual Studio Code is the most popular editor for development.

Jupyter notebooks are among the most widely used ways to share, demonstrate, and develop code in modern AI development.

Debugging is not just for when you have a bug. After you have written any substantial piece of code, I suggest stepping through it in the debugger if possible. This can improve both your understanding and the quality of the code you have written.

DeepResearch Part 3: Getting the best web data for your research

Summary

This post details building a robust web data pipeline using SmolAgents. We’ll create tools to retrieve content from various web endpoints, convert it to a consistent format (Markdown), store it efficiently, and then evaluate its relevance and quality using Large Language Models (LLMs). This pipeline is crucial for building a knowledge base for LLM applications.

Web Data Converter (MarkdownConverter)

We leverage the MarkdownConverter class, inspired by the one in autogen, to handle the diverse formats encountered on the web. This ensures consistency for downstream processing.

DeepResearch Part 2: Building a RAG Tool for arXiv PDFs

Summary

In this post, we’ll build a Retrieval Augmented Generation (RAG) tool to process the PDF files downloaded from arXiv in the previous post DeepResearch Part 1. This RAG tool will be capable of loading, processing, and semantically searching the document content. It’s a versatile tool applicable to various text sources, including web pages.

Building the RAG Tool

Following up on our arXiv downloader, we now need a tool to process the downloaded PDFs. This post details the creation of such a tool.
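The load, chunk, and search flow can be illustrated with a toy retriever. This is only a sketch: the bag-of-words `embed` function is a stand-in for a real embedding model, the sample chunks are made up, and none of the function names come from the full post.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40):
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(chunks, query, top_k=1):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:top_k]


chunks = [
    "Quantization reduces the precision of model weights.",
    "LoRA adds low-rank adapters for fine-tuning.",
    "FFmpeg manipulates audio and video files.",
]
print(search(chunks, "How do I fine-tune with LoRA adapters?"))
```

Swapping `embed` for a sentence-embedding model turns the keyword overlap into genuine semantic search, while the chunking and ranking logic stays the same.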

DeepResearch Part 1: Building an arXiv Search Tool with SmolAgents

Summary

This post kicks off a three-part series in which we’ll build, extend, and use the open-source DeepResearch application inspired by the Hugging Face blog post. In this first part, we’ll focus on creating an arXiv search tool that can be used with SmolAgents.

DeepResearch aims to empower research by providing tools that automate and streamline the process of discovering and managing academic papers. This series will demonstrate how to build such tools, starting with a powerful arXiv search tool.
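As a rough idea of what sits at the core of such a search tool, the sketch below builds a query URL for the public arXiv API. The function name `build_arxiv_query` and its defaults are hypothetical, and how the full post wraps this for SmolAgents is not shown here; the sketch only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(terms, max_results=5, start=0):
    """Build an arXiv API URL that searches all fields for the given terms."""
    params = {
        "search_query": f"all:{terms}",  # 'all:' searches every metadata field
        "start": start,                  # offset of the first result
        "max_results": max_results,      # number of results to return
    }
    return f"{ARXIV_API}?{urlencode(params)}"


print(build_arxiv_query("direct preference optimization"))
```

Fetching that URL returns an Atom XML feed, which a tool would then parse into titles, authors, and PDF links.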

FFmpeg: A Practical Guide to Essential Command-Line Options

Introduction

FFmpeg is an incredibly versatile command-line tool for manipulating audio and video files. This post provides a practical collection of useful FFmpeg commands for common tasks.

FFmpeg Command Structure

The general structure of an FFmpeg command is:

ffmpeg [global_options] {[input_file_options] -i input_url} ... {[output_file_options] output_url} ...

Merging Video and Audio

Merging video and audio, with audio re-encoding

ffmpeg -i video.mp4 -i audio.wav -c:v copy -c:a aac output.mp4

Copying the audio without re-encoding

ffmpeg -i video.mp4 -i audio.wav -c copy output.mkv
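If you drive FFmpeg from a script, the two merge commands above can be assembled from one helper. This is a sketch; the function name `merge_command` and the `reencode_audio` flag are hypothetical, and the helper only builds the argument list rather than running FFmpeg.

```python
import shlex


def merge_command(video, audio, output, reencode_audio=True):
    """Build an ffmpeg argument list that muxes one video and one audio file."""
    cmd = ["ffmpeg", "-i", video, "-i", audio]
    if reencode_audio:
        # Copy the video stream, re-encode the audio to AAC
        cmd += ["-c:v", "copy", "-c:a", "aac"]
    else:
        # Copy both streams without re-encoding
        cmd += ["-c", "copy"]
    cmd.append(output)
    return cmd


cmd = merge_command("video.mp4", "audio.wav", "output.mp4")
print(shlex.join(cmd))
# To run it: subprocess.run(cmd, check=True)  (requires ffmpeg on PATH)
```

Passing the list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.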

Why copy audio?

Writing Neural Networks with PyTorch

Summary

This post provides a practical guide to building common neural network architectures using PyTorch. We’ll explore feedforward networks, convolutional neural networks (CNNs), recurrent neural networks (RNNs), LSTMs, transformers, autoencoders, and GANs, along with code examples and explanations.


1️⃣ Understanding PyTorch’s Neural Network Module

PyTorch provides the torch.nn module for building neural networks. It includes classes for defining layers, activation functions, and loss functions, making it easy to create and manage complex network architectures in a structured way.
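As a minimal illustration of those three building blocks (layers, an activation function, and a loss function), here is a small feedforward network. The layer sizes, class name, and dummy data are arbitrary choices for this sketch, not taken from the full post.

```python
import torch
import torch.nn as nn


class FeedForward(nn.Module):
    """A small fully connected network: Linear -> ReLU -> Linear."""

    def __init__(self, in_features=4, hidden=8, out_features=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),   # layer
            nn.ReLU(),                        # activation function
            nn.Linear(hidden, out_features),  # layer
        )

    def forward(self, x):
        return self.net(x)


model = FeedForward()
x = torch.randn(3, 4)                  # batch of 3 samples, 4 features each
logits = model(x)                      # shape: (3, 2)
targets = torch.tensor([0, 1, 0])      # dummy class labels
loss = nn.CrossEntropyLoss()(logits, targets)  # loss function
loss.backward()                        # populates gradients on the parameters
```

Subclassing `nn.Module` is what lets PyTorch track the parameters, so `model.parameters()` can be handed directly to an optimizer.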