Thinking in Primitives: Why AI Reasoning Should Learn to Point

Sun, 24 May 2026 13:16:50 +0100

From visual primitives to context-filtered reasoning, grounded verification, and AI movie repair

TL;DR

This post argues that AI reasoning should not operate over everything it can see, read, or detect. It should operate over the right primitives for the current task.

The paper "Thinking with Visual Primitives" shows that multimodal models reason better when they can point to visual entities using boxes and points. This addresses a Reference Gap: language alone is often too vague to anchor reasoning to the right part of an image.
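To make the Reference Gap concrete, here is a minimal sketch of what "pointing" could look like as data. The names Point, Box, and GroundedStep and the coordinates are hypothetical and chosen for illustration; they are not the paper's API or schema.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Point:
    """A single pixel location (hypothetical schema)."""
    x: float
    y: float


@dataclass(frozen=True)
class Box:
    """An axis-aligned box: top-left (x1, y1) and bottom-right (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, p: Point) -> bool:
        # True when the point lies inside or on the edge of the box.
        return self.x1 <= p.x <= self.x2 and self.y1 <= p.y <= self.y2


Primitive = Union[Point, Box]


@dataclass(frozen=True)
class GroundedStep:
    """One reasoning step plus the visual primitives it points at."""
    claim: str
    refs: tuple[Primitive, ...]

    def is_grounded(self) -> bool:
        # A step is grounded if it references at least one primitive.
        return len(self.refs) > 0


# Two mugs in one image (made-up coordinates): "the mug" is ambiguous in words alone.
left_mug = Box(40, 120, 110, 200)
right_mug = Box(300, 118, 372, 203)
handle = Point(365, 160)

vague = GroundedStep("The mug has a chipped handle.", refs=())
anchored = GroundedStep("This mug has a chipped handle here.", refs=(right_mug, handle))
```

In this sketch the vague step carries no references, so nothing ties "the mug" to either box, while the anchored step names one box and one point that can be checked against each other (the handle point falls inside the right mug's box and outside the left one).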
