Self Refinement
Iterative candidate improvement with judge scores and a reflection trajectory (as a reusable subgraph)
Overview
Self Refinement is a workflow pattern where SyGra:
- Generates a candidate output for a given input prompt
- Uses a separate judge step to score and critique the candidate
- Optionally refines the candidate using the critique
- Repeats until the candidate is accepted (meets a score threshold) or a maximum number of iterations is reached
This feature is implemented as a reusable subgraph recipe: sygra.recipes.self_refinement.
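The loop described above can be sketched in plain Python. This is a minimal illustration, not the recipe's actual code: the function `self_refine`, its callable parameters, and the default values for `max_iterations` and `score_threshold` are assumptions made for the example. It also assumes that "maximum number of iterations" counts refinement passes, so the candidate is judged at most `max_iterations + 1` times.

```python
from typing import Any, Callable, Dict, List


def self_refine(
    prompt: str,
    generate: Callable[[str], str],
    judge: Callable[[str, str], Dict[str, Any]],
    refine: Callable[[str, str, str], str],
    max_iterations: int = 3,      # illustrative default, not taken from the recipe
    score_threshold: float = 4,   # illustrative default, not taken from the recipe
) -> Dict[str, Any]:
    """Generate, judge, and optionally refine a candidate until accepted or capped."""
    candidate = generate(prompt)
    trajectory: List[Dict[str, Any]] = []
    iteration = 0
    while True:
        verdict = dict(judge(prompt, candidate))
        # Acceptance here means the score meets the threshold.
        verdict["is_good"] = verdict["score"] >= score_threshold
        trajectory.append(
            {"iteration": iteration, "candidate": candidate, "judge": verdict}
        )
        if verdict["is_good"] or iteration >= max_iterations:
            break
        # Feed the critique back into the next candidate.
        candidate = refine(prompt, candidate, verdict["critique"])
        iteration += 1
    return {
        "candidate": candidate,
        "self_refine_judge": verdict,
        "reflection_trajectory": trajectory,
    }
```

In SyGra the generate, judge, and refine steps are LLM nodes; here they are ordinary callables so the control flow can be read and tested in isolation.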
What you get
When used in a task, the self-refinement subgraph produces:
- candidate: the final accepted (or last) candidate
- self_refine_judge: judge output containing score, pass/fail, critique, and raw parsed output
- reflection_trajectory: an ordered list capturing each iteration’s candidate + judge result (useful for training / analysis)
Example output shape
{
  "candidate": "...final candidate...",
  "self_refine_judge": {
    "score": 4,
    "is_good": true,
    "critique": "...",
    "raw": {"score": 4, "is_good": true, "critique": "..."}
  },
  "reflection_trajectory": [
    {
      "iteration": 0,
      "candidate_key": "candidate",
      "candidate": "...candidate at iteration 0...",
      "judge": {"score": 2, "is_good": false, "critique": "...", "raw": {"...": "..."}}
    },
    {
      "iteration": 1,
      "candidate_key": "candidate",
      "candidate": "...candidate at iteration 1...",
      "judge": {"score": 4, "is_good": true, "critique": "...", "raw": {"...": "..."}}
    }
  ]
}
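Because each trajectory entry carries its own judge result, a record in this shape is easy to analyze after the run. The helpers below are a small sketch for that purpose; `summarize_trajectory` and `score_gain` are hypothetical names introduced here, and they rely only on the fields shown in the example above.

```python
from typing import Any, Dict, List, Tuple


def summarize_trajectory(record: Dict[str, Any]) -> List[Tuple[int, float, bool]]:
    """Return (iteration, score, is_good) for each step of reflection_trajectory."""
    return [
        (step["iteration"], step["judge"]["score"], step["judge"]["is_good"])
        for step in record.get("reflection_trajectory", [])
    ]


def score_gain(record: Dict[str, Any]) -> float:
    """Difference between the last and first judge scores (0 if no steps)."""
    steps = summarize_trajectory(record)
    if not steps:
        return 0
    return steps[-1][1] - steps[0][1]


# A record with the same shape as the example output (placeholder text values).
sample = {
    "candidate": "final",
    "reflection_trajectory": [
        {"iteration": 0, "candidate": "v0",
         "judge": {"score": 2, "is_good": False, "critique": "too short"}},
        {"iteration": 1, "candidate": "v1",
         "judge": {"score": 4, "is_good": True, "critique": "fine"}},
    ],
}
steps = summarize_trajectory(sample)
gain = score_gain(sample)
```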
How it works (recipe)
The recipe lives at:
- sygra/recipes/self_refinement/graph_config.yaml
- sygra/recipes/self_refinement/task_executor.py
Nodes
- init_self_refine (lambda)
  - Initializes internal loop state and reflection_trajectory
  - Uses internal state keys prefixed with _self_refine_* to minimize collision risk in host graphs
- generate_candidate (llm)
  - Creates the initial candidate from {prompt}
- judge_candidate (llm + post-process)
  - Returns JSON with {score, is_good, critique}
  - Post-processor normalizes and writes:
    - self_refine_judge
    - self_refine_is_good
    - self_refine_critique
- update_trajectory (lambda)
  - Appends {candidate, judge, iteration} to reflection_trajectory
- refine_candidate (llm)
  - Produces an improved candidate using {self_refine_critique}
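The judge post-processing step is the part most exposed to malformed model output, so it is worth seeing what "normalizes" could mean in practice. The sketch below is an assumption-laden illustration, not the code in task_executor.py: `normalize_judge_output` is a hypothetical function, and the fallback rules (score 0 on unparseable output, deriving is_good from a threshold when the judge omits it) are choices made for this example.

```python
import json
from typing import Any, Dict


def normalize_judge_output(raw_text: str, score_threshold: float = 4) -> Dict[str, Any]:
    """Parse the judge's JSON reply and produce the three self_refine_* state keys."""
    try:
        raw = json.loads(raw_text)
    except json.JSONDecodeError:
        raw = {}
    if not isinstance(raw, dict):
        raw = {}

    # Coerce the score to a number; fall back to 0 when missing or invalid.
    try:
        score = float(raw.get("score", 0))
    except (TypeError, ValueError):
        score = 0.0

    # Trust an explicit boolean verdict; otherwise derive it from the score.
    is_good = raw.get("is_good")
    if not isinstance(is_good, bool):
        is_good = score >= score_threshold

    critique = str(raw.get("critique", ""))
    judge = {"score": score, "is_good": is_good, "critique": critique, "raw": raw}
    return {
        "self_refine_judge": judge,
        "self_refine_is_good": is_good,
        "self_refine_critique": critique,
    }
```

Keeping the parsed reply under `raw` mirrors the example output shape and makes it possible to audit what the judge actually returned.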
Looping behavior
A conditional edge (SelfRefinementLoopCondition) ends the subgraph when either:
- The judge accepts (self_refine_is_good == True), or
- The maximum number of refinement iterations is reached
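The two exit conditions amount to a small routing function over the graph state. The following is a sketch of that decision only. It is not the SelfRefinementLoopCondition class itself: the state keys `_self_refine_iteration` and `_self_refine_max_iterations`, the default of 3, and the `END` / `REFINE` labels are hypothetical names that follow the `_self_refine_*` prefix convention described earlier.

```python
from typing import Any, Dict

END = "END"                  # hypothetical label for leaving the subgraph
REFINE = "refine_candidate"  # node to route to for another refinement pass


def next_step(state: Dict[str, Any]) -> str:
    """Decide whether to stop or run another refinement pass."""
    # Exit 1: the judge accepted the current candidate.
    if state.get("self_refine_is_good") is True:
        return END
    # Exit 2: the refinement budget is exhausted (key names are assumptions).
    iteration = state.get("_self_refine_iteration", 0)
    max_iterations = state.get("_self_refine_max_iterations", 3)
    if iteration >= max_iterations:
        return END
    return REFINE
```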
Using self refinement as a subgraph in a task
A typical pattern is to add a subgraph node in your task and point it at the recipe.
The example task is located at:
- tasks/examples/self_refinement/graph_config.yaml
- tasks/examples/self_refinement/input.json
Example task config (high level)
In tasks/examples/self_refinement/graph_config.yaml, the task defines a single node:
- self_refine:
  - node_type: subgraph
  - subgraph: sygra.recipes.self_refinement
  - Overrides recipe node config via node_config_map (e.g., model selection, temperatures, and loop parameters)
It also maps the key outputs into the final dataset using output_config.output_map:
- candidate from candidate
- self_refine_judge from self_refine_judge
- reflection_trajectory from reflection_trajectory
Running the example
From the repo root:
python main.py --task examples.self_refinement --num_records 2
Artifacts:
- Output records are written under tasks/examples/self_refinement/ (typically output_<timestamp>.json)
- Metadata is written under tasks/examples/self_refinement/metadata/ (typically metadata_tasks_examples_self_refinement_<timestamp>.json)
The metadata file includes per-node execution statistics (e.g., self_refine.generate_candidate, self_refine.judge_candidate) and aggregate token/cost summaries.
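After a run, the newest output file can be located and loaded with a few lines of standard-library Python. This is a convenience sketch: `latest_output` and `load_records` are hypothetical helpers, the `output_*.json` glob follows the "typically" naming noted above, and treating the file as a JSON list of records is an assumption about its layout.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional


def latest_output(task_dir: str) -> Optional[Path]:
    """Return the most recently modified output_*.json in task_dir, if any."""
    files = sorted(
        Path(task_dir).glob("output_*.json"),
        key=lambda p: p.stat().st_mtime,
    )
    return files[-1] if files else None


def load_records(path: Path) -> List[Dict[str, Any]]:
    """Load records from an output file; a single object is wrapped in a list."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data if isinstance(data, list) else [data]
```

From the repo root, `latest_output("tasks/examples/self_refinement")` would then point at the file produced by the command above, provided the run wrote one.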
Customization
You can tune behavior either in the recipe or (recommended) from the task via node_config_map:
- init_self_refine.params.max_iterations
- init_self_refine.params.score_threshold
- judge_candidate.model.parameters.temperature (usually 0 for consistent scoring)
- Prompts for generate_candidate, judge_candidate, and refine_candidate
If you need the recipe to write outputs under different keys (to avoid collisions in a larger host graph), you can override:
- update_trajectory.params.candidate_key
- update_trajectory.params.judge_key
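The dotted paths above read naturally as nested keys: node name first, then the path inside that node's config. The sketch below expands such paths into a nested dictionary to make the structure explicit. `expand_overrides` is a hypothetical helper and the values are illustrative. Whether node_config_map accepts exactly this nested shape is an assumption, so check it against the example task's graph_config.yaml.

```python
from typing import Any, Dict


def expand_overrides(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {"a.b.c": v} style overrides into nested dictionaries."""
    nested: Dict[str, Any] = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = nested
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


# Illustrative values only; they are not defaults taken from the recipe.
overrides = expand_overrides(
    {
        "init_self_refine.params.max_iterations": 3,
        "init_self_refine.params.score_threshold": 4,
        "judge_candidate.model.parameters.temperature": 0,
        "update_trajectory.params.candidate_key": "answer",
    }
)
```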