Technical Guides

Dimensions of Thought: A Smarter Way to Evaluate AI

9 June 2025

📖 Summary

This post introduces a multidimensional reward modeling pipeline built on top of the stephanieanie framework. It covers:

✅ Structured Evaluation Setup: How to define custom evaluation dimensions using YAML or database-backed rubrics.
🧠 Automated Scoring with LLMs: Using the ScoreEvaluator to produce structured, rationale-backed scores for each dimension.
🧮 Embedding-Based Hypothesis Indexing: Embedding hypotheses efficiently and comparing them by similarity for contrastive learning.
🔄 Contrast Pair Generation: Creating training pairs where one hypothesis outperforms another on a given dimension.
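
To make the last step concrete, here is a minimal sketch of contrast pair generation. It assumes each hypothesis already carries a numeric score per dimension. The names `ScoredHypothesis`, `contrast_pairs`, and `min_margin` are hypothetical and chosen for illustration; they are not the framework's actual API.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class ScoredHypothesis:
    """Hypothetical container: a hypothesis plus its per-dimension scores."""
    text: str
    scores: dict = field(default_factory=dict)  # dimension name -> numeric score


def contrast_pairs(hypotheses, dimension, min_margin=1.0):
    """Build (preferred, rejected) pairs for one dimension.

    A pair is kept only when both hypotheses have a score for the
    dimension and the scores differ by at least `min_margin`
    (an assumed threshold to filter out near-ties).
    """
    pairs = []
    for a, b in combinations(hypotheses, 2):
        score_a = a.scores.get(dimension)
        score_b = b.scores.get(dimension)
        if score_a is None or score_b is None:
            continue  # skip hypotheses not scored on this dimension
        margin = abs(score_a - score_b)
        if margin < min_margin:
            continue  # too close to call; not a useful training signal
        preferred, rejected = (a, b) if score_a > score_b else (b, a)
        pairs.append(
            {
                "dimension": dimension,
                "preferred": preferred.text,
                "rejected": rejected.text,
                "margin": margin,
            }
        )
    return pairs


if __name__ == "__main__":
    sample = [
        ScoredHypothesis("Hypothesis A", {"clarity": 8.0}),
        ScoredHypothesis("Hypothesis B", {"clarity": 5.0}),
        ScoredHypothesis("Hypothesis C", {"clarity": 7.5}),
    ]
    for pair in contrast_pairs(sample, "clarity"):
        print(pair)
```

The margin filter is a design choice in this sketch: pairs whose scores are nearly equal tell a reward model little about which hypothesis is better, so they are dropped.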

A Novel Approach to Autonomous Research: Implementing NOVELSEEK with Modular AI Agents

27 May 2025

Summary

AI research tools today are often narrow: one generates summaries, another ranks models, a third suggests ideas. But real scientific discovery isn’t a single step—it’s a pipeline. It’s iterative, structured, and full of feedback loops.

In this post, I show how to build a modular AI system that mirrors this full research lifecycle. From initial idea generation to method planning, each phase is handled by a specialized agent working in concert.