
WorkflowState is the shared memory that all nodes in a graph read from and write to. Think of it as a whiteboard: nodes read whatever is on it, write their results back, and the next node picks up from there.

This vignette explains how that whiteboard is structured, what reducers are and why they matter, and how to design a schema that fits your workflow.

Two kinds of memory

puppeteeR has two memory systems that coexist independently:

WorkflowState channels are explicit, structured, and inspectable. Every node reads from them with state$get("name") and writes to them by returning a named list. You define them up front in a schema.

ellmer Chat history is implicit and per-agent. Every agent$chat("...") call appends the exchange to that agent’s internal conversation. The agent accumulates context at the LLM level automatically, regardless of what WorkflowState contains.

These two systems do not synchronise. Resetting or restoring a WorkflowState from a checkpoint does not touch any agent’s Chat history. For most workflows this is fine; for workflows that fork or replay, you may need to reset agent Chat objects manually:

agent$chat_object$set_turns(list())

Channels

A channel is a named slot in the state. You declare all channels up front in a schema:

ws <- workflow_state(
  messages = list(default = list(), reducer = reducer_append()),
  status   = list(default = "pending"),
  metadata = list(default = list(), reducer = reducer_merge())
)

Each channel needs two things:

  • default — the value it holds before any node has written to it.
  • reducer — a function(old, new) that controls what happens when a node writes a new value. If you omit reducer, puppeteeR picks a sensible default automatically (more on this below).

Channel names are completely free — status, my_counter, judge_verdict — any valid R name works. The engine has no reserved names. The only rule: the name you declare in the schema must be the exact same name used in state$get() and in node return values.

Reducers

A reducer is a two-argument function, function(old, new), called every time a node writes to a channel. It decides how the incoming value is merged with the current one.

The simplest possible reducer just replaces the old value:

function(old, new) new

A more interesting one appends:

function(old, new) c(old, list(new))

puppeteeR ships four built-in reducers that cover almost every use case.

Built-in reducers

reducer_overwrite() — replace

r <- reducer_overwrite()
r("old value", "new value")
#> [1] "new value"

Discards old and returns new. This is the default for non-list channels (i.e. anything whose default is not a list()). Use it for:

  • Status flags and routing signals ("pending", "done", "approved")
  • The current version of a document
  • Counters and numeric accumulators where you compute the new value yourself
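The counter case can be sketched with plain base R (the state list below is a hypothetical stand-in for WorkflowState, and overwrite mimics reducer_overwrite()):

```r
# Base-R stand-in for reducer_overwrite(): discard old, keep new.
overwrite <- function(old, new) new

# A counter channel: the node computes the incremented value itself,
# and the reducer simply keeps whatever the node returned.
state <- list(revision_n = 0L)
new_value <- state$revision_n + 1L
state$revision_n <- overwrite(state$revision_n, new_value)
state$revision_n
#> [1] 1
```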

reducer_append() — accumulate

r <- reducer_append()
r(list("first"), "second")
#> [[1]]
#> [1] "first"
#> 
#> [[2]]
#> [1] "second"

Wraps new in list() and concatenates it to old. Use it for any channel that should grow over time: conversation history, collected results, log entries.

Warning. reducer_append() grows without bound. In a long-running workflow, an append channel fed to every LLM call will eventually exceed the context window and cause connection errors. Use reducer_last_n(n) instead when passing the channel to agents (see below).

reducer_last_n(n) — sliding window

r <- reducer_last_n(3L)
r(list("a", "b", "c"), "d")   # drops "a"
#> [[1]]
#> [1] "b"
#> 
#> [[2]]
#> [1] "c"
#> 
#> [[3]]
#> [1] "d"

Appends then trims to the most recent n entries. This is also the automatic default for list channels (those with default = list()) when no reducer is specified — with a window of 20. See the pitfall section below.

reducer_merge() — shallow merge

r <- reducer_merge()
r(list(model = "haiku", temp = 0.7), list(temp = 0.2, seed = 42L))
#> $model
#> [1] "haiku"
#> 
#> $temp
#> [1] 0.2
#> 
#> $seed
#> [1] 42

Uses modifyList(): keys in new overwrite matching keys in old; keys absent from new are preserved. Use it for configuration or metadata objects that are updated piecemeal.

The default reducer pitfall

When you declare a list channel without a reducer, puppeteeR assigns reducer_last_n(20) automatically:

# These two are equivalent:
plan = list(default = list())
plan = list(default = list(), reducer = reducer_last_n(20))

This is almost never what you want for a structured data channel. Consider a plan channel that holds a list of step objects:

planner_fn <- function(state, config) {
  response <- config$agents$planner$chat(state$get("messages")[[1]])
  steps    <- parse_plan(response)       # list of 6 step objects
  list(plan = steps)                     # writes the list of 6 steps
}

With reducer_last_n(20), this write does not replace plan with the 6 steps — it appends the entire steps list as a single element. The result is list(list(step1, step2, ...)). When the dispatcher later reads plan[[1]], it gets the whole plan, not step 1.
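A base-R sketch makes the nesting concrete (last_n here is an assumed stand-in for reducer_last_n(), not the packaged implementation):

```r
# Assumed stand-in for reducer_last_n(n): append new as ONE element, then trim.
last_n <- function(n) {
  function(old, new) {
    out <- c(old, list(new))
    if (length(out) > n) out <- out[(length(out) - n + 1):length(out)]
    out
  }
}

r     <- last_n(20L)
steps <- list("step1", "step2", "step3")  # a parsed plan of 3 steps
plan  <- r(list(), steps)                 # the write from planner_fn

length(plan)                # 1 — the whole plan became a single element
identical(plan[[1]], steps)
#> [1] TRUE
```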

Rule: any list channel that is meant to be replaced wholesale must declare reducer = reducer_overwrite() explicitly:

plan = list(default = list(), reducer = reducer_overwrite())

Only declare reducer_append() or reducer_last_n() for channels that genuinely accumulate entries over time.

Designing your schema

Append for history, overwrite for signals

The most common pattern:

state_schema <- workflow_state(
  messages      = list(default = list(), reducer = reducer_append()),
  current_route = list(default = "")
)

messages accumulates the full conversation. current_route is a transient routing signal — only the latest value matters, so overwrite is correct.
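The difference in behaviour can be seen with base-R stand-ins for the two reducers (the state list is a hypothetical substitute for WorkflowState):

```r
append_r    <- function(old, new) c(old, list(new))  # reducer_append() behaviour
overwrite_r <- function(old, new) new                # reducer_overwrite() behaviour

state <- list(messages = list(), current_route = "")

# two successive node writes
state$messages      <- append_r(state$messages, "draft v1")
state$current_route <- overwrite_r(state$current_route, "revise")
state$messages      <- append_r(state$messages, "draft v2")
state$current_route <- overwrite_r(state$current_route, "approved")

length(state$messages)   # 2 — history grows
state$current_route      # only the latest signal survives
#> [1] "approved"
```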

Separate the current draft from history

When agents revise output iteratively, separate the live draft from the audit trail:

state_schema <- workflow_state(
  messages     = list(default = list(), reducer = reducer_append()),
  latest_draft = list(default = ""),
  revision_n   = list(default = 0L)
)

Worker nodes write their output to latest_draft (overwrite — always the current version). The advisor reads latest_draft, not messages. The full history in messages serves as an audit trail without interfering with evaluation.

Dedicated channel per routing signal

Never route based on the last item in an accumulating channel. Use a dedicated overwrite channel for every routing decision:

state_schema <- workflow_state(
  messages      = list(default = list(), reducer = reducer_append()),
  judge_verdict = list(default = "continue")
)

The judge node writes list(judge_verdict = "done"). The conditional edge reads it directly:

g$add_conditional_edge("judge", function(state) state$get("judge_verdict"), route_map)

No text-scanning the last message, no fragile substring matching. If any future node appends to messages after the judge runs, routing is unaffected.

What the convenience workflows pass to agents

The messages channel accumulates the full conversation, but the built-in node functions do not all pass the full history to their LLM calls:

Workflow              Node       What the node passes to the LLM
sequential_workflow   workers    Last message only
supervisor_workflow   manager    Full history concatenated
supervisor_workflow   workers    Full history concatenated
debate_workflow       debaters   Full history concatenated
debate_workflow       judge      Full history concatenated

When building custom graphs, pass context explicitly in your node function:

writer_fn <- function(state, config) {
  msgs    <- state$get("messages")
  context <- paste(vapply(msgs, as.character, character(1)), collapse = "\n")
  list(messages = config$agents$writer$chat(context))
}

Custom reducers

Any function(old, new) works as a reducer. A common pattern is to wrap the inner function in an outer factory function so it can take parameters:

reducer_max <- function() {
  function(old, new) max(old, new, na.rm = TRUE)
}

state_schema <- workflow_state(
  messages   = list(default = list(), reducer = reducer_last_n(10L)),
  best_score = list(default = 0,      reducer = reducer_max())
)

The factory pattern (reducer_max <- function() { ... }) is not required for zero-argument reducers, but it is consistent with the built-in style and makes the schema read cleanly: reducer = reducer_max() vs reducer = function(old, new) max(old, new, na.rm = TRUE).
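Where the factory pattern earns its keep is with parameters. For example, a hypothetical reducer_keep_above() (not part of puppeteeR) that only accumulates values above a threshold:

```r
# Parameterised factory: the threshold is captured in the closure.
reducer_keep_above <- function(threshold) {
  function(old, new) {
    if (new > threshold) c(old, list(new)) else old
  }
}

r <- reducer_keep_above(0.5)
scores <- r(list(), 0.9)
scores <- r(scores, 0.2)   # below threshold — discarded
scores <- r(scores, 0.7)
length(scores)
#> [1] 2
```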

Snapshot and restore

$snapshot() returns a plain named list — a point-in-time copy of all channel values. This is what checkpointers persist, and what $restore() reads back:

ws <- workflow_state(
  messages = list(default = list(), reducer = reducer_append()),
  status   = list(default = "pending")
)

ws$update(list(messages = "hello", status = "running"))
snap <- ws$snapshot()   # plain list — safe to saveRDS(), serialize(), etc.

ws$update(list(messages = "world"))
ws$restore(snap)        # rolls back to the post-"hello" state
ws$get("messages")      # list("hello") — "world" is gone

$restore() bypasses reducers and directly overwrites channel values with the snapshot contents. This is intentional: restoration must reproduce the exact prior state, not merge into it.
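A quick base-R sketch of why bypassing matters: replaying a snapshot through an append reducer would merge into the current state rather than reproduce it (append_r is an assumed stand-in for reducer_append()):

```r
append_r <- function(old, new) c(old, list(new))

current <- list("hello", "world")
snap    <- list("hello")             # snapshot taken earlier

merged   <- append_r(current, snap)  # through the reducer: history grows
restored <- snap                     # direct overwrite: exact prior state

length(merged)            # 3 — "hello", "world", plus the nested snapshot
identical(restored, snap)
#> [1] TRUE
```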

Schema reference for the built-in workflows

The schemas below are what the convenience constructors use internally. They illustrate the design principles above in practice.

advisor_workflow()

state_schema <- workflow_state(
  messages         = list(default = list(), reducer = reducer_append()),
  latest_draft     = list(default = ""),         # current worker output — overwrite
  advisor_feedback = list(default = ""),         # advisor's revision notes — overwrite
  advisor_verdict  = list(default = "revise"),   # routing signal — overwrite
  revision_n       = list(default = 0L)          # revision counter — overwrite
)

Graph: START → worker → advisor → approved → END
                 ↑________revise____|

planner_workflow()

state_schema <- workflow_state(
  messages            = list(default = list(), reducer = reducer_append()),
  plan                = list(default = list(), reducer = reducer_overwrite()),  # replaced wholesale each planner turn
  plan_index          = list(default = 0L),              # dispatcher cursor — overwrite
  current_instruction = list(default = ""),              # active step — overwrite
  current_worker      = list(default = ""),              # active step worker — overwrite
  results             = list(default = list(), reducer = reducer_append()),
  evaluator_verdict   = list(default = ""),              # routing signal — overwrite
  replan_count        = list(default = 0L)               # overwrite
)

plan uses reducer_overwrite() because the planner replaces the entire plan on each turn — it is not accumulating steps, it is issuing a new set of them. Without the explicit reducer, reducer_last_n(20) would wrap the new plan as a single nested element, breaking the dispatcher.

Graph: START → planner → dispatcher → workers (loop) → evaluator → done → END
                  ↑_______________________replan___________|