Building an ADHD Task System That Actually Works: Multi-Agent AI for Executive Function

ADHDos was my first serious multi-agent build. Shipped as the capstone for the Google/Kaggle AI Agents Intensive in late 2025. A couple months on, the architecture still holds up, and the constraint-aware design pattern that came out of it has shaped every build since. Worth a proper write-up.

The Gap Task Apps Keep Leaving

Every ADHD task app assumes you can do the one thing ADHD makes hardest: start. Break down the project yourself. Estimate the time yourself. Decide which task fits the energy you have right now. Keep the context alive across two weeks of not touching it.

Those are the parts of executive function that don't work the same way for ADHD brains. A task app can show you a list. It can't do executive function on your behalf.

Every app on the market (Things, Todoist, TickTick, Notion templates) solves a piece of the interface problem and leaves the hard part untouched. Visual organization. Recurring tasks. Time blocking. The list shows up, and then the wall hits. The wall is the start.

What I Was Trying to Build

The AI Agents Intensive ended with a capstone project. A finishing piece to prove the multi-agent concepts had stuck. I picked the hardest problem I had a strong intuition about: build a task system that does the executive function work, not the task management work.

Task apps are tools. What ADHD brains need is closer to a partner. Something that takes a vague goal, breaks it into steps you can start on, matches those steps to your current state, and keeps the thread alive when you come back two weeks later. That's three distinct skills. That's why ADHDos ended up as three agents instead of one.

Three Agents, One Workflow

The Breakdown Agent. Takes a high-level description like "redesign the dashboard" and splits it into concrete subtasks, each sized for 15 to 45 minutes of work. Not "work on design." "Extract typography scale from Figma and create CSS variables." If you read the subtask and still feel stuck, the breakdown didn't work. There's a feedback loop: reject a breakdown as too vague and it refines.

The Matcher Agent. Looks at your current state (energy level, time until next commitment, whether you're in hyperfocus) and recommends what to do right now. Not what's due soonest. Not what's most important. What you can move on in the next 20 minutes. Tired? Review work, documentation, small fixes. Hyperfocused? Deep work with notifications blocked.
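The matching logic reduces to a scoring function over candidate tasks. A minimal sketch, with assumed inputs: a 1-5 self-reported energy level, a per-task cognitive demand rating, and a hyperfocus flag; none of these names come from the actual codebase.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    minutes: int
    demand: int  # cognitive demand, 1 (light) to 5 (deep work)

def recommend(tasks, energy: int, minutes_free: int, hyperfocus: bool = False):
    """Pick what can move in the time available, not what's due soonest."""
    def score(t: Task) -> float:
        if t.minutes > minutes_free:
            return float("-inf")            # can't start what won't fit
        fit = -abs(t.demand - energy)       # match demand to current energy
        if hyperfocus and t.demand >= 4:
            fit += 2                        # protect deep work during hyperfocus
        return fit
    best = max(tasks, key=score)
    return best if score(best) > float("-inf") else None
```

Low energy steers toward low-demand tasks (reviews, small fixes); hyperfocus biases hard toward deep work, matching the behavior described above.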

The Persistence Agent. Maintains narrative context across long gaps. Why does this task matter? What's the larger goal it serves? What did past-you learn the last time you worked on it? When motivation has evaporated, the agent surfaces that context and suggests a restart ritual. Ten minutes, maybe a coffee first. The point is reducing the activation energy of re-engaging.

The single-agent version I tried first kept conflating concerns. An agent optimizing for completion rate produced subtasks that were still too large. An agent matching current energy ignored deadlines entirely. Split the concerns across three agents, and each could optimize for the thing it was good at.

Three Design Moments That Mattered

Task size as output, not input. Most task apps let you estimate time. ADHDos doesn't ask. The Breakdown Agent sets size based on the task description, using pattern matching against what you've finished before. The user gets a sequence of subtasks with logical dependencies, not an arbitrary timer.

Hyperfocus detection. Most task apps treat focus as uniform. ADHD doesn't. ADHDos watches task switching and session duration. Three hours on one task with minimal context switching, and the interface adapts: distracting suggestions get hidden, context pre-loads for the current task, a natural stopping point gets suggested before burnout. Flow state detection exists in research. Baking it into the interface is rarer.
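A plausible version of that detection is a rolling-window heuristic over session events. The thresholds below (three hours, at most two switches) echo the example in the text but are illustrative, not the values ADHDos shipped with.

```python
from dataclasses import dataclass

@dataclass
class Event:
    task_id: str
    timestamp: float  # seconds since epoch

def in_hyperfocus(events, now, min_duration=3 * 3600, max_switches=2) -> bool:
    """Heuristic: a long continuous session with minimal context switching."""
    window = [e for e in events if now - e.timestamp <= min_duration]
    if not window or now - window[0].timestamp < min_duration * 0.9:
        return False  # session hasn't run long enough yet
    switches = sum(1 for a, b in zip(window, window[1:]) if a.task_id != b.task_id)
    return switches <= max_switches
```

When this returns true, the interface adaptations kick in: hide suggestions, pre-load context, start looking for a stopping point.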

Energy logging instead of time logging. Time assumes effort is uniform. It's not. An hour of creative work can drain you more than two hours of mechanical work. ADHDos asks a simpler question after each session: how does your energy feel, on a five-point scale? From that plus completion history, it builds a personal energy map. Meetings drain you. Planning is neutral. Coding in flow is energizing. Code reviews tank energy. The map feeds the Matcher's recommendations.
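The energy map itself can be as simple as an average delta per task type. A sketch, assuming each session logs a before and after rating on the five-point scale; the before/after pairing is my assumption about how the single check-in gets turned into a drain/energize signal.

```python
from collections import defaultdict

def energy_map(sessions):
    """Average energy change per task type from 1-5 check-ins.
    sessions: iterable of (task_type, energy_before, energy_after) tuples.
    Negative values mean the task type drains you; positive means it energizes."""
    totals = defaultdict(lambda: [0, 0])  # task_type -> [sum of deltas, count]
    for task_type, before, after in sessions:
        totals[task_type][0] += after - before
        totals[task_type][1] += 1
    return {k: round(delta / n, 2) for k, (delta, n) in totals.items()}
```

A map like `{"meeting": -1.5, "coding_flow": 1.0}` is exactly the shape the Matcher needs: a prior on what each task type will cost.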

What It Proves (And Doesn't)

User testing confirmed the system did the four things it had to do: break vague goals into clear action items, recommend appropriate next steps based on energy state, reduce the activation energy to start or resume, and help you see burnout coming before it lands. People with ADHD reported using the app consistently, finishing more, feeling less stuck.

What it didn't prove: scale to team workflows, professional deadline pressure, complex multi-project coordination. That's intentional for a capstone. First version proves the core insight. Second version handles the edge cases.

The insight the architecture settled on: when you have a clear friction point and a user constraint strong enough to force decisions, multi-agent systems beat single models. Specialized agents optimize for their specific job. Single models split attention across the whole problem and land somewhere average.

Where ADHDos Lives Now

The build isn't running publicly. Multi-agent systems at this layer cost real money per interaction (multiple model calls per request, context reloaded on every turn), and the monthly API bill for an open demo of a capstone project didn't make sense. The full case-study walkthrough is at /adhdos.


Update: April 2026

A couple of primitives landed between the original build and now that change the answer to "how would I build this today." Anthropic's Claude Managed Agents, in public beta as of this month, bundles the agent loop, tool execution, sandbox, and state persistence into a managed harness where session context lives outside the model's context window. Vercel Workflows (generally available, with a DurableAgent primitive in AI SDK v7) gives you resumable, crash-surviving workflows that can pause for minutes or months and pick up from the exact point they left off. Either one resolves the core pain I hit in the original: every session started cold, and the model burned calls reconstructing what it already knew yesterday.

A few things I'd rebuild:

  • Move the Persistence Agent out of the model entirely. What that agent did with tokens, a durable workflow runtime does with session state. Context stays alive across two weeks of not touching a task without the model reconstructing it every turn.
  • Run breakdown and matching as managed agents with session-level persistence, not single-session prompts. The session log keeps prior breakdowns and prior recommendations in a place outside the context window. Cheaper, faster, and the reasoning stays sharper over long time gaps because the agent isn't starting from zero.
  • Keep hyperfocus handling and restart rituals as Skills attached to the primary agent, not separately-prompted sub-agents. Skills compose cleanly with durable agent state in a way that wasn't true when this first got built.
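What "move persistence out of the model" looks like in miniature: state lives in durable storage keyed by session, and each turn loads it, runs the agent, and writes it back. This is illustrative only, a tiny stand-in for what a managed harness or durable workflow runtime provides out of the box; nothing here is the real API of Claude Managed Agents or Vercel's DurableAgent.

```python
import json
import sqlite3

class SessionStore:
    """A toy durable session store: state survives outside the model's context."""
    def __init__(self, path: str = "sessions.db"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, state TEXT)")

    def load(self, session_id: str) -> dict:
        row = self.db.execute(
            "SELECT state FROM sessions WHERE id = ?", (session_id,)).fetchone()
        return json.loads(row[0]) if row else {}

    def save(self, session_id: str, state: dict) -> None:
        self.db.execute("INSERT OR REPLACE INTO sessions VALUES (?, ?)",
                        (session_id, json.dumps(state)))
        self.db.commit()

def resume_turn(store, session_id, user_input, agent):
    """One agent turn: prior breakdowns and recommendations come from the store,
    not from tokens re-sent every call. agent: (input, state) -> (reply, state)."""
    state = store.load(session_id)
    reply, state = agent(user_input, state)
    store.save(session_id, state)
    return reply
```

The model never reconstructs two weeks of history; it gets handed the relevant state each turn, which is the cost and latency win the update describes.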

The constraint-aware design still stands. The architecture around it is lighter, and the feel gets closer to what the original version was reaching for.

If you spot gaps like this in your own workflows, places where existing products assume a user you're not, that's a prototype waiting to ship. Products designed for the constraint everyone else overlooked tend to work better for everyone.

If you're weighing custom multi-agent work for a team and want a second set of eyes on the architecture, that's a consulting conversation more than a course enrollment.