Output Record Generator
The Output Record Generator is a flexible component that translates the final state of the graph into the desired output format for each record.
Key Concept:
Configure the final record fields in a YAML block called output_map. It tells the generator how to fill each field from the graph's state or from a static value, and optionally which transform to apply.
Usage:
In graph_config.yaml, declare an output_config section that specifies a generator, such as grasp.processors.output_record_generator.CodeGenOutputGenerator.
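Because the generator is given as a dotted Python path, it may help to see how such a path can be turned into a class. The following is a minimal sketch using the standard library; load_class is a hypothetical helper written for illustration, not the framework's own loader, and it is demonstrated on a standard-library class so that it runs without the framework installed.

```python
import importlib


def load_class(dotted_path: str) -> type:
    """Resolve 'package.module.ClassName' to the class object it names."""
    # Split off the final component: everything before it is the module path.
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Demonstrated with a standard-library class (an assumption for portability);
# a generator path from output_config would be resolved the same way.
ordered_dict_cls = load_class("collections.OrderedDict")
```

A path that cannot be imported raises ImportError, and a missing class name raises AttributeError, so a typo in the YAML surfaces as soon as the path is resolved.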
1. Simple Derived Class
Subclass BaseOutputGenerator and implement the generate() method, which returns the final record built from the state.
Example:
- YAML Configuration
    output_config:
      generator: grasp.processors.output_record_generator.CodeGenOutputGenerator
- Python Code
    from typing import Any, Optional

    # BaseOutputGenerator, GraphState and utils are provided by the framework.

    class CodeGenOutputGenerator(BaseOutputGenerator):
        def generate(self, state: GraphState) -> Optional[dict[str, Any]]:
            # No record is produced when the state holds no messages
            if "messages" not in state:
                return None

            chat_format_messages = utils.convert_messages_from_langchain_to_chat_format(
                state["messages"]
            )

            # Only produce a record when the last message contains "no more feedback"
            if (
                len(chat_format_messages) < 1
                or "no more feedback"
                not in chat_format_messages[-1]["content"].lower().strip()
            ):
                return None

            # remove the last message if it contains "no more feedback"
            chat_format_messages = chat_format_messages[:-1]

            # prepend the rephrased question as the first user turn
            chat_format_messages.insert(
                0,
                {
                    "role": "user",
                    "content": state["rephrased_text"].replace(
                        "PARAPHRASED QUESTION: ", ""
                    ),
                },
            )

            return {
                "id": state.get("id", ""),
                "conversation": chat_format_messages,
                "taxonomy": [{"category": "Coding", "subcategory": ""}],
                "annotation_type": ["mistral-large"],
                "language": ["en"],
                "tags": ["mbpp", "reannotate", "self-critique"],
            }
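To make the effect of generate() concrete, here is a self-contained sketch that applies the same steps to a small sample state. It is an illustration under assumptions: build_record and sample_state are hypothetical names, the messages are already plain chat-format dictionaries (so the LangChain conversion step is skipped), and only the id and conversation fields are built.

```python
from typing import Any, Optional

STOP_PHRASE = "no more feedback"


def build_record(state: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Mirror the generate() steps on messages already in chat format."""
    messages = list(state.get("messages", []))
    # No record unless the last message carries the stop phrase.
    if not messages or STOP_PHRASE not in messages[-1]["content"].lower():
        return None
    messages = messages[:-1]  # drop the closing "no more feedback" message
    question = state["rephrased_text"].replace("PARAPHRASED QUESTION: ", "")
    messages.insert(0, {"role": "user", "content": question})
    return {"id": state.get("id", ""), "conversation": messages}


# Hypothetical sample data, invented for this illustration.
sample_state = {
    "id": "42",
    "rephrased_text": "PARAPHRASED QUESTION: Reverse a string in Python.",
    "messages": [
        {"role": "assistant", "content": "def rev(s): return s[::-1]"},
        {"role": "user", "content": "NO MORE FEEDBACK"},
    ],
}
record = build_record(sample_state)
```

With this sample, the resulting conversation starts with the rephrased question as a user turn, followed by the assistant's answer; the closing feedback message is removed.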
2. YAML-driven Mapping & Transform
In this approach, a YAML configuration maps each output field to a key in the graph's state or to a static value, and optionally applies a transform.
Example:
- YAML Configuration
    output_config:
      generator: grasp.processors.output_record_generator.CodeGenOutputGenerator
      output_map:
        id:
          from: "id"                        # Copy from state["id"]
        conversation:
          from: "messages"
          transform: "build_conversation"   # Apply a transform method on state["messages"]
        taxonomy:
          value:
            - category: "Coding"            # Hard-coded literal value
              subcategory: ""
        annotation_type:
          value: [ "mistral-large" ]        # Hard-coded literal list
        language:
          value: "en"                       # Hard-coded literal value
        tags:
          value: ["mbpp", "reannotate", "self-critique"]   # Hard-coded literal list
- Python Code
    from typing import Any

    # BaseOutputGenerator and utils are provided by the framework.

    class CodeGenOutputGenerator(BaseOutputGenerator):
        """
        Example specialized generator class, which defines a transform method
        for building a conversation from messages, removing 'no more feedback', etc.
        """

        @staticmethod
        def build_conversation(data: Any, state: dict[str, Any]) -> Any:
            chat_format_messages = utils.convert_messages_from_langchain_to_chat_format(data)

            # Example logic: return None unless the last message
            # contains "no more feedback"
            if (
                not chat_format_messages
                or "no more feedback"
                not in chat_format_messages[-1]["content"].lower().strip()
            ):
                return None

            # remove the last message
            chat_format_messages.pop()

            # insert rephrased question if available
            if "rephrased_text" in state and state["rephrased_text"]:
                question = state["rephrased_text"].replace("PARAPHRASED QUESTION: ", "")
                chat_format_messages.insert(0, {"role": "user", "content": question})

            return chat_format_messages
- generator: The dotted Python path to the output generator class (e.g., CodeGenOutputGenerator).
- output_map: Each key (e.g., id, conversation, tags) becomes a field in the final record.
    - from: Means "read from the in-memory state with that key."
    - value: Means "use this literal value."
    - transform: Means "call a method on the generator class to transform the data." This is optional.
Note: If the YAML includes "transform": "my_custom_logic", a method named my_custom_logic must be defined in the generator class.
In the example above, the build_conversation method takes data (the value read from state["messages"]) and state as arguments and returns the transformed conversation.
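The from, value, and transform rules described above can be sketched as a small resolver. This is a minimal illustration under assumptions, not the framework's implementation: MapDrivenGenerator and upper_case are hypothetical names, and the state is a plain dictionary. Transforms use the same (data, state) signature as build_conversation.

```python
from typing import Any


class MapDrivenGenerator:
    """Hypothetical stand-in; not the framework's BaseOutputGenerator."""

    def __init__(self, output_map: dict[str, dict[str, Any]]):
        self.output_map = output_map

    def generate(self, state: dict[str, Any]) -> dict[str, Any]:
        record = {}
        for field, rule in self.output_map.items():
            # "from" reads a key from the state; otherwise use the literal "value".
            if "from" in rule:
                data = state.get(rule["from"])
            else:
                data = rule.get("value")
            # "transform" names a method on this class; getattr raises
            # AttributeError if that method has not been defined.
            if "transform" in rule:
                transform = getattr(self, rule["transform"])
                data = transform(data, state)
            record[field] = data
        return record

    @staticmethod
    def upper_case(data: Any, state: dict[str, Any]) -> Any:
        # Illustrative transform: upper-case strings, pass other values through.
        return data.upper() if isinstance(data, str) else data


# Hypothetical mapping and state, invented for this illustration.
output_map = {
    "id": {"from": "id"},
    "title": {"from": "title", "transform": "upper_case"},
    "language": {"value": "en"},
}
record = MapDrivenGenerator(output_map).generate({"id": "7", "title": "hello"})
```

Looking the transform up by name keeps the YAML declarative: adding a new transform only requires defining another method on the generator class and referencing it from output_map.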