“By the time you dot the final I’s and cross the final T’s, the assessment is already out of date.”
— Taylor Sullivan
Episode Overview
In this episode I’m joined by Taylor Sullivan, a rising I-O rockstar and the architect of Workera’s assessment strategy. With Taylor’s guidance Workera, a verified skills intelligence platform, is doing something most of the industry is still afraid to do: going all in on using AI to build, deliver, and validate AI-based assessments.
Taylor and I (and my AI co-host Mayda Tokens) dig into how this actually works, why it’s scientifically defensible, and why the industry needs to stop waiting and start moving.
Topics Discussed & Key Insights
1. Traditional Assessment Development Is Already Broken
By the time a traditional assessment clears all the I-dotting and T-crossing, it’s often already out of date. AI changes that — enabling dynamic content generation, richer construct understanding, and real-time iteration that keeps pace with how work actually evolves.
2. Codifying Measurement Science Into a Multi-Agent System
Workera didn’t just bolt AI onto existing processes. They embedded I-O psychology’s core principles — evidence-centered design, validity frameworks, job analysis — directly into a multi-agent authoring system. Experts define the standards. Agents execute to those standards. The science drives the machine, not the other way around.
Here’s a brief sketch of how it works in practice:
* Define the purpose — Tell the agent what you’re measuring and why. This grounds everything that follows.
* Extract the construct — The agent probes the skill space using critical incident techniques, identifying what great performance actually looks like.
* Design the assessment — The agent selects question formats (multiple choice, drag and drop, voice interaction, sequencing) based on what will best elicit evidence of the skill.
* Automated quality review — Before anything goes live, the system checks for bias, language issues, and content alignment to the original skill definition.
* Monitor and improve — Once deployed, the agent tracks response patterns, flags problems, and learns from score appeals adjudicated by humans.
The skill domain is flexible — it works for cheeseburgers or cybersecurity. The methodology behind it is the same either way.
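To make the five-step loop concrete, here is a very loose sketch of it as an orchestration pipeline. All names, stages, and data shapes are hypothetical illustrations of the idea, not Workera's actual system:

```python
from dataclasses import dataclass, field

@dataclass
class Draft:
    """Hypothetical assessment draft passed between agent stages."""
    skill: str
    purpose: str
    construct: list = field(default_factory=list)
    items: list = field(default_factory=list)
    flags: list = field(default_factory=list)

def define_purpose(skill, purpose):
    # Step 1: ground everything in what we're measuring and why.
    return Draft(skill=skill, purpose=purpose)

def extract_construct(draft):
    # Step 2: stand-in for critical-incident probing of the skill space.
    draft.construct = [f"{draft.skill}: indicator of great performance"]
    return draft

def design_assessment(draft):
    # Step 3: pick item formats that best elicit evidence of the skill.
    formats = ["multiple choice", "drag and drop", "voice", "sequencing"]
    draft.items = [(formats[i % len(formats)], c)
                   for i, c in enumerate(draft.construct)]
    return draft

def quality_review(draft):
    # Step 4: placeholder checks for bias, language, and alignment.
    draft.flags = [item for item in draft.items if not item[1]]
    return draft

def monitor(draft, responses):
    # Step 5: track post-deployment signals; humans adjudicate appeals.
    return {"n_responses": len(responses), "open_flags": len(draft.flags)}

draft = quality_review(design_assessment(extract_construct(
    define_purpose("cybersecurity", "verify incident-response skill"))))
print(monitor(draft, responses=[]))
```

The point of the shape, not the placeholder logic: each stage takes the previous stage's output, and the same pipeline runs whether the skill is cheeseburgers or cybersecurity.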
3. The “Harness” — Why This Is Safe
The key to responsible agentic AI isn’t less autonomy — it’s a well-designed harness (the constrained ecosystem where the agents do their thing). Human experts define what good looks like, set quality thresholds, and build in escalation points. The agents work within those constraints and loop back when they hit uncertainty. As Taylor puts it: “It’s not running completely autonomously unchecked.”
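As a toy illustration of the harness idea (every threshold and name here is invented for the sketch, not taken from Workera): agents act autonomously only inside expert-defined constraints, and anything uncertain loops back to a human.

```python
# Expert-defined constraints the agent cannot override.
QUALITY_THRESHOLD = 0.90   # bar for auto-approval
CONFIDENCE_FLOOR = 0.75    # below this, the agent must escalate to a human

def harness_step(item, agent_score, agent_confidence):
    """Decide one item's fate inside the harness."""
    if agent_confidence < CONFIDENCE_FLOOR:
        return ("escalate_to_human", item)   # built-in escalation point
    if agent_score >= QUALITY_THRESHOLD:
        return ("approve", item)
    return ("revise", item)                  # stays inside the agent loop

print(harness_step("item-42", agent_score=0.95, agent_confidence=0.90))
# → ('approve', 'item-42')
print(harness_step("item-43", agent_score=0.95, agent_confidence=0.50))
# → ('escalate_to_human', 'item-43')
```

Note that a confident score never bypasses the confidence check: autonomy is bounded first, optimized second.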
4. This Is About Development, Not Just Hiring
Workera’s primary focus is post-hire — workforce development, upskilling, and learning. Once an assessment identifies verified gaps in a person’s skills, the platform connects those gaps directly to personalized learning plans, curating from an organization’s existing content library. Two people can get the same score on an assessment and walk away with completely different development paths based on their specific pattern of strengths and gaps.
5. Verified Skills Intelligence — What It Actually Means
In a world where AI can write a perfect resume and LinkedIn profile for anyone, credentials are noise. Verified skills intelligence cuts through that — using assessment to generate actual evidence of what someone can do, fit for the stakes of the decision being made.
Final Takeaway
The tools to move beyond multiple choice, beyond static assessments, and beyond slow validation cycles exist today. The bottleneck isn’t technology — it’s the will to trust well-designed systems. When the science is built into the machine from the start, speed and rigor aren’t in conflict. They’re the same thing.