Stephanie Framework

Stephanie's Secret: The Dawn of Reflective AI

🌅 Introduction: The Dawn of Self-Reflective AI

What if your AI could not only answer questions but also question itself about those answers? Not with programmed doubt, but with genuine self-awareness: recognizing when it’s uncertain, analyzing why it made a mistake, and systematically improving its own reasoning process. This isn’t science fiction. Today, we’re unveiling the first working implementation of an AI that doesn’t just think, but learns how to think better.

General Reasoner: The Smarter Local Agent

🔧 Summary

The General Reasoner paper shows how we can train LLMs to reason across domains using diverse data and a generative verifier. In this post, I walk through our open-source implementation, showing how we built a modular reasoning agent that generates multiple hypotheses, evaluates them with an LLM-based judge, and selects the best answer.
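
To make the selection step concrete, here is a minimal sketch of pairwise answer selection with an LLM judge. It is an illustrative assumption rather than the paper's or our framework's actual code: the `chat` callable, the `judge` helper, and the prompt wording are hypothetical, and the paper's generative verifier is simplified here to a plain A/B preference.

```python
# Minimal sketch of selecting a best answer via pairwise LLM judging.
# `chat` is any function that sends a prompt to an LLM and returns its reply;
# the helper names and prompt format are illustrative assumptions.
from typing import Callable, List


def judge(question: str, a: str, b: str, chat: Callable[[str], str]) -> str:
    """Ask the LLM judge which of two candidate answers is better; return the winner."""
    prompt = (
        f"Question:\n{question}\n\n"
        f"Answer A:\n{a}\n\n"
        f"Answer B:\n{b}\n\n"
        "Which answer is more correct and better reasoned? Reply with exactly 'A' or 'B'."
    )
    verdict = chat(prompt).strip().upper()
    return a if verdict.startswith("A") else b


def select_best(question: str, hypotheses: List[str], chat: Callable[[str], str]) -> str:
    """Run a simple single-elimination pass over the hypotheses and keep the survivor."""
    best = hypotheses[0]
    for candidate in hypotheses[1:]:
        best = judge(question, best, candidate, chat)
    return best
```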


🧠 What We Built

We built a GeneralReasonerAgent that (see the sketch after this list):

  • Dynamically generates multiple hypotheses using different reasoning strategies (e.g., cot, debate, verify_then_answer)
  • Evaluates each pair of hypotheses using either a local LLM judge or our custom MR.Q evaluator
  • Classifies the winning hypothesis using rubric dimensions
  • Logs structured results to a PostgreSQL-backed system
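
For orientation, here is a simplified structural sketch of such an agent. The class layout and the callable interfaces for generation, judging, rubric classification, and logging are assumptions for illustration, not the framework's real API; in our system the judge can be a local LLM or the MR.Q evaluator, and results are persisted to PostgreSQL.

```python
# Simplified structural sketch of a multi-strategy reasoning agent.
# Interfaces and names here are illustrative assumptions, not stephanie's actual API.
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class GeneralReasonerAgent:
    generate: Callable[[str, str], str]        # (question, strategy) -> hypothesis
    judge: Callable[[str, str, str], str]      # (question, a, b) -> winning hypothesis
    classify: Callable[[str], Dict[str, str]]  # hypothesis -> rubric dimension labels
    log: Callable[[Dict], None]                # persist a structured result (e.g. to PostgreSQL)
    strategies: List[str] = field(
        default_factory=lambda: ["cot", "debate", "verify_then_answer"]
    )

    def run(self, question: str) -> str:
        # 1. Generate one hypothesis per reasoning strategy.
        hypotheses = [self.generate(question, s) for s in self.strategies]

        # 2. Compare hypotheses pairwise, keeping the judged winner.
        best = hypotheses[0]
        for candidate in hypotheses[1:]:
            best = self.judge(question, best, candidate)

        # 3. Classify the winning hypothesis along rubric dimensions.
        rubric = self.classify(best)

        # 4. Log a structured record of the run.
        self.log({"question": question, "answer": best, "rubric": rubric})
        return best
```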

All of this was integrated with our existing stephanie framework, which includes: