Self Improving Code Assistant¶

This tutorial shows how to build a self-improving code assistant using the GraSP framework. You’ll implement a feedback loop where an LLM generates code, critiques its own answers, and refines them iteratively.

Key Features You’ll Learn
self-critique, iterative refinement, code generation, feedback loop, stopping conditions

Prerequisites¶

GraSP framework installed (see Installation Guide)
Familiarity with Python and coding problems

What You’ll Build¶

You’ll create a system that: - Loads coding problems from a dataset - Generates code solutions using an LLM - Evaluates solutions against references - Refines answers based on critique

Step 1: Project Structure¶

glaive_code_assistant/
├── task_executor.py     # Critique cycle logic
├── graph_config.yaml    # Workflow graph

Step 2: Pipeline Implementation¶

Parent Graph (`graph_config.yaml`)¶

The main pipeline is defined in glaive_code_assistant/graph_config.yaml:

Data Source: We will be using the glaiveai/glaive-code-assistant-v2 dataset from HuggingFace, which contains a variety of coding problems along with reference solutions. The dataset is loaded and the task_id field is renamed to id for consistency.
Nodes:
generate_answer: An LLM node that generates a code solution based on the question. The prompt instructs the model to respond only with code and revise based on critique.
critique_answer: An LLM node with custom pre- and post-processors. It acts as a teacher, using the reference solution to critique the generated answer and recommend improvements. If the answer is correct, it responds with 'NO MORE FEEDBACK'.
Edges: The graph cycles between answer generation and critique, controlled by a custom edge condition (ShouldContinueCondition). The loop ends if the solution is correct or after a maximum number of rounds.
Output Config: Custom output formatting is handled by the output generator in task_executor.py.

Reference: glaive_code_assistant/graph_config.yaml

Task Executor (`task_executor.py`)¶

This file implements custom logic for the pipeline: - CritiqueAnsNodePreProcessor: Prepares conversation state for grading by the critique node. - CritiqueAnsNodePostProcessor: Updates the state after critique. - ShouldContinueCondition: Custom edge condition to control the critique/answer loop, ending when the solution is correct or after 8 rounds. - CodeGenOutputGenerator: Formats the output to include all conversation turns, critiques, and improvements.

Reference: task_executor.py

Step 3: Output Collection¶

Collects all turns, critiques, and improvements in a structured format

Step 4: Running the Pipeline¶

From your GraSP project root, run:

python main.py --task path/to/your/glaive_code_assistant

Example Output¶

[
  {
    "id": "299392dcbd6a991853554c2869deeff6f97b13db6a3a4c7f6e25fb41778dafc2",
    "conversation": [
      { "role": "user", "content": "I want to create a function in Python that checks whether a given substring is present in a given string. How can I do that?" },
      { "role": "assistant", "content": "def is_substring_present(main_string, substring): return substring in main_string" },
      { "role": "user", "content": "NO MORE FEEDBACK" }
    ],
    "taxonomy": [ { "category": "Coding" } ],
    "tags": [ "glaiveai/glaive-code-assistant-v2", "reannotate", "self-critique" ]
  }
]

Try It Yourself¶

Add new coding problems
Adjust critique logic for stricter or more lenient feedback

Next Steps¶

Explore custom subgraphs for modular workflows
Learn about structured output for standardized results