Technical Guides

SIS: The Visual Dashboard That Makes Stephanie's AI Understandable

🔍 The Invisible AI Problem

How do you debug a system that generates thousands of database entries, hundreds of prompts, and dozens of knowledge artifacts for a single query?

SIS is our answer: a visual dashboard that transforms Stephanie’s complex internal processes into something developers can actually understand and improve.

📰 In This Post

  • 🔎 See how Stephanie pipelines really work – from Arxiv search to cartridges, step by step.
  • 📜 View logs and pipeline steps clearly – no more digging through raw DB entries.
  • 📝 Generate dynamic reports from pipeline runs – structured outputs you can actually use.
  • 🤖 Use pipelines to train the system – showing how runs feed back into learning.
  • 🧩 Turn raw data into functional knowledge – cartridges, scores, and reasoning traces.
  • 🔄 Move from fixed pipelines toward self-learning – what it takes to make the system teach itself.
  • 🖥️ SIS isn’t just a pretty GUI – it’s the layer that makes Stephanie’s knowledge visible and usable.
  • 🈸️ Configuring Stephanie – We will show you how to get up and running with Stephanie.
  • 💡 What we learned – the big takeaway: knowledge without direction is just documentation.

❓ Why We Built SIS

When you’re developing a self-improving AI like Stephanie, the real challenge isn’t just running pipelines; it’s making sense of the flood of logs, evaluations, and scores the system generates.

Dimensions of Thought: A Smarter Way to Evaluate AI

📖 Summary

This post introduces a multidimensional reward modeling pipeline built on top of the CO_AI framework. It covers:

  • Structured Evaluation Setup: How to define custom evaluation dimensions using YAML or database-backed rubrics.

  • 🧠 Automated Scoring with LLMs: Using the ScoreEvaluator to produce structured, rationale-backed scores for each dimension.

  • 🧮 Embedding-Based Hypothesis Indexing: Efficiently embedding hypotheses and comparing them by similarity for contrastive learning.

  • 🔄 Contrast Pair Generation: Creating training pairs where one hypothesis outperforms another on a given dimension.
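
To make the contrast pair idea concrete, here is a minimal sketch in plain Python. It is not the post's actual implementation: the names `ScoredHypothesis`, `ContrastPair`, `build_contrast_pairs`, and the `min_margin` parameter are all hypothetical, and the scores are assumed to be plain numbers keyed by dimension name.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ScoredHypothesis:
    """Hypothetical container: a hypothesis plus its per-dimension scores."""
    text: str
    scores: dict  # dimension name -> numeric score (assumed layout)


@dataclass(frozen=True)
class ContrastPair:
    """Hypothetical training pair: `preferred` beat `rejected` on `dimension`."""
    dimension: str
    preferred: str
    rejected: str
    margin: float


def build_contrast_pairs(hypotheses, dimension, min_margin=0.0):
    """Pair up hypotheses where one strictly outscores the other on `dimension`.

    Pairs whose score gap is at most `min_margin` are skipped, so ties and
    near-ties never become training examples.
    """
    pairs = []
    for a, b in combinations(hypotheses, 2):
        # Skip hypotheses that were never scored on this dimension.
        if dimension not in a.scores or dimension not in b.scores:
            continue
        diff = a.scores[dimension] - b.scores[dimension]
        if abs(diff) <= min_margin:
            continue
        winner, loser = (a, b) if diff > 0 else (b, a)
        pairs.append(ContrastPair(dimension, winner.text, loser.text, abs(diff)))
    return pairs


# Illustrative data only; these hypotheses and scores are made up.
scored = [
    ScoredHypothesis("H1", {"clarity": 8.0, "novelty": 3.0}),
    ScoredHypothesis("H2", {"clarity": 5.0, "novelty": 7.0}),
    ScoredHypothesis("H3", {"clarity": 5.0}),
]
clarity_pairs = build_contrast_pairs(scored, "clarity")
novelty_pairs = build_contrast_pairs(scored, "novelty")
```

Building pairs per dimension means the same two hypotheses can appear in opposite orders on different dimensions (here H1 wins on clarity while H2 wins on novelty), which is the signal a multidimensional reward model would need.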

A Novel Approach to Autonomous Research: Implementing NOVELSEEK with Modular AI Agents

Summary

AI research tools today are often narrow: one generates summaries, another ranks models, a third suggests ideas. But real scientific discovery isn’t a single step—it’s a pipeline. It’s iterative, structured, and full of feedback loops.

In this post, I show how to build a modular AI system that mirrors this full research lifecycle. From initial idea generation to method planning, each phase is handled by a specialized agent working in concert.

Self-Improving Agents: Applying the Sharpening Framework to Local LLMs

This is the second post in a 100-part series, where we take breakthrough AI papers and turn them into working code, building the next generation of AI one idea at a time.

🔧 Summary

In my previous post, I introduced co_ai, a modular implementation of the AI co-scientist concept, inspired by DeepMind’s recent paper Towards an AI Co-Scientist.

But now, we’re going deeper.

This isn’t just about running prompts through an agent system; it’s about building something radically different: