
Self Refinement

Iterative candidate improvement with judge scores and a reflection trajectory (as a reusable subgraph)

Overview

Self Refinement is a workflow pattern where SyGra:

  • Generates a candidate output for a given input prompt
  • Uses a separate judge step to score and critique the candidate
  • Optionally refines the candidate using the critique
  • Repeats until the candidate is accepted (meets a score threshold) or a maximum number of iterations is reached

This feature is implemented as a reusable subgraph recipe: sygra.recipes.self_refinement.
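
At a high level, the control flow is the loop sketched below. This is a plain-Python paraphrase of the pattern, not SyGra code: the generate, judge, and refine callables stand in for the recipe's LLM nodes, and max_iterations mirrors the loop limit described later.

from typing import Callable

def self_refine(
    prompt: str,
    generate: Callable[[str], str],
    judge: Callable[[str, str], dict],
    refine: Callable[[str, str, str], str],
    max_iterations: int = 3,
) -> dict:
    """Illustrative control flow only; the recipe wires these steps as graph nodes."""
    trajectory = []
    candidate = generate(prompt)                      # generate_candidate step
    verdict = {"score": None, "is_good": False, "critique": ""}
    for iteration in range(max_iterations):
        verdict = judge(prompt, candidate)            # judge_candidate step: {"score", "is_good", "critique"}
        trajectory.append({"iteration": iteration, "candidate": candidate, "judge": verdict})
        if verdict["is_good"]:                        # accepted: stop refining
            break
        candidate = refine(prompt, candidate, verdict["critique"])  # refine_candidate step
    return {
        "candidate": candidate,
        "self_refine_judge": verdict,
        "reflection_trajectory": trajectory,
    }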


What you get

When used in a task, the self-refinement subgraph produces:

  • candidate: the final candidate (the accepted one, or the last one produced if the iteration limit is reached)
  • self_refine_judge: judge output containing score, pass/fail, critique, and raw parsed output
  • reflection_trajectory: an ordered list capturing each iteration’s candidate + judge result (useful for training / analysis)

Example output shape

{
  "candidate": "...final candidate...",
  "self_refine_judge": {
    "score": 4,
    "is_good": true,
    "critique": "...",
    "raw": {"score": 4, "is_good": true, "critique": "..."}
  },
  "reflection_trajectory": [
    {
      "iteration": 0,
      "candidate_key": "candidate",
      "candidate": "...candidate at iteration 0...",
      "judge": {"score": 2, "is_good": false, "critique": "...", "raw": {"...": "..."}}
    },
    {
      "iteration": 1,
      "candidate_key": "candidate",
      "candidate": "...candidate at iteration 1...",
      "judge": {"score": 4, "is_good": true, "critique": "...", "raw": {"...": "..."}}
    }
  ]
}
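
Since records follow the shape above, the reflection trajectory can be consumed directly, for example to build reflection or preference training data. A small sketch, assuming the output file is a JSON list of such records (substitute the real timestamped filename):

import json

# Placeholder path: substitute the actual timestamped output file.
with open("tasks/examples/self_refinement/output_<timestamp>.json") as f:
    records = json.load(f)  # assumed to be a list of records shaped as documented above

for record in records:
    final = record["self_refine_judge"]
    print(f"final score={final['score']} accepted={final['is_good']}")
    for step in record["reflection_trajectory"]:
        print(f"  iteration {step['iteration']}: score={step['judge']['score']}")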

How it works (recipe)

The recipe lives at:

  • sygra/recipes/self_refinement/graph_config.yaml
  • sygra/recipes/self_refinement/task_executor.py

Nodes

  • init_self_refine (lambda)
    • Initializes internal loop state and reflection_trajectory
    • Uses internal state keys prefixed with _self_refine_* to minimize collision risk in host graphs
  • generate_candidate (llm)
    • Creates the initial candidate from {prompt}
  • judge_candidate (llm + post-process)
    • Returns JSON with {score, is_good, critique}
    • Post-processor normalizes and writes (a minimal sketch of this step follows the list):
      • self_refine_judge
      • self_refine_is_good
      • self_refine_critique
  • update_trajectory (lambda)
    • Appends {candidate, judge, iteration} to reflection_trajectory
  • refine_candidate (llm)
    • Produces an improved candidate using {self_refine_critique}
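
Below is an illustrative version of that judge-output normalization, written as a standalone function. It is not the recipe's actual post-processor; the score_threshold default and the fallback behavior are assumptions.

import json

def normalize_judge_output(raw_text: str, score_threshold: int = 4) -> dict:
    """Illustrative normalization of a judge LLM response into the documented state keys."""
    try:
        parsed = json.loads(raw_text)
    except json.JSONDecodeError:
        parsed = {}
    score = parsed.get("score", 0)
    is_good = bool(parsed.get("is_good", score >= score_threshold))
    critique = parsed.get("critique", "")
    return {
        "self_refine_judge": {"score": score, "is_good": is_good, "critique": critique, "raw": parsed},
        "self_refine_is_good": is_good,
        "self_refine_critique": critique,
    }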

Looping behavior

A conditional edge (SelfRefinementLoopCondition) ends the subgraph when either:

  • The judge accepts (self_refine_is_good == True), or
  • The maximum number of refinement iterations is reached

Using self refinement as a subgraph in a task

A typical pattern is to add a subgraph node in your task and point it at the recipe.

The example task is located at:

  • tasks/examples/self_refinement/graph_config.yaml
  • tasks/examples/self_refinement/input.json

Example task config (high level)

In tasks/examples/self_refinement/graph_config.yaml, the task defines a single node:

  • self_refine
    • node_type: subgraph
    • subgraph: sygra.recipes.self_refinement
    • Overrides recipe node config via node_config_map (e.g., model selection, temperatures, and loop parameters)

It also maps the key outputs into the final dataset using output_config.output_map:

  • candidate from candidate
  • self_refine_judge from self_refine_judge
  • reflection_trajectory from reflection_trajectory

Running the example

From the repo root:

python main.py --task examples.self_refinement --num_records 2

Artifacts:

  • Output records are written under tasks/examples/self_refinement/ (typically output_<timestamp>.json)
  • Metadata is written under tasks/examples/self_refinement/metadata/ (typically metadata_tasks_examples_self_refinement_<timestamp>.json)

The metadata file includes per-node execution statistics (e.g., self_refine.generate_candidate, self_refine.judge_candidate) and aggregate token/cost summaries.
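
The metadata file is plain JSON as well, so it can be inspected programmatically. Its exact schema is not described here, so the snippet below only locates the latest file and lists its top-level keys as a starting point:

import json
from pathlib import Path

metadata_dir = Path("tasks/examples/self_refinement/metadata")
latest = max(metadata_dir.glob("metadata_*.json"))  # assumes timestamped names sort lexicographically
with open(latest) as f:
    metadata = json.load(f)

print(sorted(metadata))  # top-level keys: per-node stats, token/cost summaries, etc.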


Customization

You can tune behavior either in the recipe or (recommended) from the task via node_config_map:

  • init_self_refine.params.max_iterations
  • init_self_refine.params.score_threshold
  • judge_candidate.model.parameters.temperature (usually 0 for consistent scoring)
  • Prompts for generate_candidate, judge_candidate, and refine_candidate

If you need the recipe to write outputs under different keys (to avoid collisions in a larger host graph), you can override:

  • update_trajectory.params.candidate_key
  • update_trajectory.params.judge_key