Slides
  1. 01 / Reframe

    An agent is not a tool list

    A better representation is this: an agent is a work unit placed inside context, roles, artifacts, and feedback loops.

    • Tools focus on capability
    • Workflows focus on position
    • Systems focus on feedback
  2. 02 / Resources

    Three primary objects

    A real workflow is not a chat trace. It is a set of objects that can be identified, assigned, checked, and reused; feedback makes them loop.

    1

    Context

    Goals, sources, constraints, decisions, and current state.

    2

    Role

    Reader, Planner, Reviewer, and Writer are work boundaries.

    3

    Artifact

    A document, task, judgment, code change, or reusable record.
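
    One way to make the three objects concrete is to give each one a type. The sketch below is an illustration under assumptions, not a schema from this deck: field names such as `sources`, `producedBy`, and `checked`, and the `reuse` helper, are hypothetical.

```typescript
// Hypothetical shapes for the three primary objects; all field names are assumptions.
type Role = "Reader" | "Planner" | "Reviewer" | "Writer";

interface Context {
  goal: string;
  sources: string[];
  constraints: string[];
  decisions: string[];
  currentState: string;
}

type ArtifactKind = "document" | "task" | "judgment" | "code change" | "record";

interface Artifact {
  id: string;          // identified
  kind: ArtifactKind;
  producedBy: Role;    // assigned: which role owns it
  checked: boolean;    // checked: has someone inspected it
  body: string;
}

// Reused: a checked artifact is added to the context as a source.
// An unchecked artifact leaves the context unchanged.
function reuse(context: Context, artifact: Artifact): Context {
  if (!artifact.checked) {
    return context;
  }
  return {
    goal: context.goal,
    sources: context.sources.concat([artifact.id]),
    constraints: context.constraints,
    decisions: context.decisions,
    currentState: context.currentState,
  };
}
```

    The helper returns a new context instead of mutating the old one, so earlier rounds stay inspectable.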

  3. 03 / Flow

    How one collaboration flows

    Work is not a single exchange but a loop you can run again: goals go in, judgment comes back.

    Input

    Set the goal

    Name what to solve and where the edges are.

    Context

    Gather context

    Hand the agent the material, limits, and past decisions.

    Run

    Run and record

    Run the tools and produce a checkable artifact.

    Feedback

    Write judgment back

    Human tradeoffs become the next round's context.
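
    The four stages above can be sketched as one function that runs once per round. This is a minimal illustration under assumptions: `runAgent` and `humanReview` are hypothetical stand-ins passed in as plain functions, not calls from any real agent library.

```typescript
// Hypothetical shapes for one round of the loop: goal in, judgment back.
interface Round {
  goal: string;        // Input: what to solve and where the edges are
  context: string[];   // Context: material, limits, past decisions
}

interface RoundResult {
  artifact: string;       // Run: the checkable output
  judgment: string;       // Feedback: the human tradeoff
  nextContext: string[];  // what the next round starts from
}

function runRound(
  round: Round,
  runAgent: (goal: string, context: string[]) => string,
  humanReview: (artifact: string) => string
): RoundResult {
  const artifact = runAgent(round.goal, round.context);
  const judgment = humanReview(artifact);
  // The judgment is appended, so the next round starts with it in context.
  return {
    artifact: artifact,
    judgment: judgment,
    nextContext: round.context.concat([judgment]),
  };
}
```

    Running the loop again means calling `runRound` with the same goal and the previous round's `nextContext`.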

  4. 04 / Baseline

    Get to a baseline

    Do not optimize the prompt first. Make the workflow repeatable, recordable, and comparable.

    workflow-baseline — zsh
    git clone <repo> && cd agent-workflow
    pnpm install
    cp .env.example .env
    pnpm eval --case before
    pnpm run session -- --record

    Every improvement needs the same inputs and the same judgment surface.
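
    "Same inputs and same judgment surface" can be read as a precondition for comparing two runs. The sketch below assumes a hypothetical `RunRecord` shape; it is not the format written by the `--record` flag above, which the deck does not specify.

```typescript
// Hypothetical record of one evaluated run; field names are assumptions.
interface RunRecord {
  caseId: string;   // which eval case was run
  input: string;    // the exact input given to the workflow
  grader: string;   // how the output was judged
  score: number;    // the grader's score
}

// Two runs are comparable only if they share the case, the input, and the grader.
function comparable(a: RunRecord, b: RunRecord): boolean {
  return a.caseId === b.caseId && a.input === b.input && a.grader === b.grader;
}

// Returns the score change, or null when the runs cannot be compared.
function improvement(before: RunRecord, after: RunRecord): number | null {
  if (!comparable(before, after)) {
    return null;
  }
  return after.score - before.score;
}
```

    Returning `null` instead of a number keeps a prompt change that also changed the input from being reported as progress.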

  5. 05 / Evals

    Define quality with verifiable tasks

    Evaluation is not a school test. It tells the workflow which part is actually improving.

    ID  Task             What it tests        Grader
    R1  Extract context  Missing constraints  human spot-check
    R2  Generate plan    Executable steps     structure match
    R3  Draft artifact   Fit for audience     human score
    R4  Review risk      Key assumptions      checklist
    R5  Handoff summary  Reusable next time   reuse rate
    Score = repeatable input + inspectable artifact + human judgment.
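
    The table above can be kept as data, with one grader written out as an example. The `checklistScore` function below is a hypothetical reading of the R4 "checklist" grader (plain substring matching); the deck does not define how any grader is implemented.

```typescript
// The five cases from the table, as data.
interface EvalCase {
  id: string;
  task: string;
  tests: string;
  grader: string;
}

const cases: EvalCase[] = [
  { id: "R1", task: "Extract context", tests: "Missing constraints", grader: "human spot-check" },
  { id: "R2", task: "Generate plan", tests: "Executable steps", grader: "structure match" },
  { id: "R3", task: "Draft artifact", tests: "Fit for audience", grader: "human score" },
  { id: "R4", task: "Review risk", tests: "Key assumptions", grader: "checklist" },
  { id: "R5", task: "Handoff summary", tests: "Reusable next time", grader: "reuse rate" },
];

// Hypothetical checklist grader: the share of required items that the artifact
// mentions, matched case-insensitively as plain substrings.
function checklistScore(artifact: string, checklist: string[]): number {
  if (checklist.length === 0) {
    return 0; // nothing to check against, so nothing is credited
  }
  const text = artifact.toLowerCase();
  const hits = checklist.filter(function (item) {
    return text.indexOf(item.toLowerCase()) !== -1;
  });
  return hits.length / checklist.length;
}
```

    Substring matching is deliberately crude; it only shows that a grader should be a function of an inspectable artifact and a fixed checklist.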
  6. 06 / Dashboard

    Know whether the workflow got better

    A strong agent system is not more fluent. It has shorter context, steadier judgment, and more reusable artifacts.

    4 work objects
    3 checkpoints
    0 hidden steps
    82% reuse rate
    Context

    Shorter input

    Background compresses into reusable chunks.

    Artifact

    Steadier output

    Every run leaves something inspectable.

    Feedback

    Judgment returns

    Human tradeoffs enter the next round.
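
    A reuse rate like the one on this dashboard needs a definition before it can be tracked. The sketch below assumes one possible definition, the share of artifacts reused in at least one later run; the deck does not say how its figure is computed, and `ArtifactUse` is a hypothetical log shape.

```typescript
// Hypothetical log entry: how many later runs reused a given artifact.
interface ArtifactUse {
  artifactId: string;
  reusedInRuns: number;
}

// Reuse rate as a whole-number percentage:
// artifacts reused at least once, divided by all artifacts.
function reuseRate(log: ArtifactUse[]): number {
  if (log.length === 0) {
    return 0;
  }
  const reused = log.filter(function (entry) {
    return entry.reusedInRuns > 0;
  }).length;
  return Math.round((reused / log.length) * 100);
}
```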

  7. 07 / Doer vs Tutor

    The Doer / Tutor boundary

    The same agent behavior can help you form a mental model, or let you bypass one.

    Doer

    Hands over the answer

    It removes search, but also removes understanding, tradeoffs, and internalization.

    Tutor

    Provides scaffolding

    It lowers extraneous load while leaving the key judgment to you.

  8. 08 / Feedback

    Write feedback into the next run

    A workflow improves because human judgment becomes future context, not because the agent remembers more.

    feedback-loop.ts
    artifact = agent.run(context)
    review = human.check(artifact)
    memory.add(review.decision)
    rules.add(review.risk)
    next.run(memory, rules)
    01

    Record decisions

    Why this tradeoff was chosen.

    02

    Keep risks

    What could invalidate the result.

    03

    Compress rules

    What should be reused next time.
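
    The three steps above can be sketched as a single write-back function. This is a minimal illustration under assumptions: the `Review` and `Store` shapes are hypothetical, and "compress" here only means skipping rules that are identical after trimming and lowercasing.

```typescript
// Hypothetical review shape: a decision (why this tradeoff) and a risk
// (what could invalidate the result).
interface Review {
  decision: string;
  risk: string;
}

interface Store {
  memory: string[]; // recorded decisions
  rules: string[];  // risks kept as reusable rules
}

// Writes one review into the store and returns the updated store.
function writeBack(store: Store, review: Review): Store {
  const rule = review.risk.trim().toLowerCase();
  const known = store.rules.indexOf(rule) !== -1;
  return {
    memory: store.memory.concat([review.decision]),       // 01 record decisions
    rules: known ? store.rules : store.rules.concat([rule]), // 02 keep risks, 03 skip duplicates
  };
}
```

    The next run would then receive `store.memory` and `store.rules` as part of its context.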

  9. 09 / Shift

    From tool list to work system

    Same AI, a different representation, and a very different long-term result.

    Before

    Tool list

    Models, plugins, commands, and buttons; it scatters as you learn.

    After

    Work system

    Objects, mechanisms, feedback, and delivery; it clarifies as you use it.

  10. 10 / Inner Map

    What remains is an inner map

    A good agent workflow does not only finish the task. It helps you recognize the next situation faster, judge the path better, and ship more steadily.

    • Redraw the object: what am I operating on?
    • Compress the structure: can it work next time?
    • Keep the judgment: who owns the tradeoff?