Utilities¶
WebArena-Verified provides several utility commands for managing and optimizing benchmark data.
Trimming Network Logs¶
Network log files (HAR format) can become large due to static resources like CSS, JavaScript, images, and fonts. Since these resources are not evaluated (only HTML pages, API requests, and form submissions matter), you can significantly reduce file sizes by removing them.
The trim-network-logs command removes entries for static resources, which the evaluator skips, while preserving all evaluation-relevant events:
    webarena-verified trim-network-logs \
        --input logs/task_123.har \
        --output logs/task_123_trimmed.har
What Gets Removed¶
The utility uses the same logic as NetworkEvent.is_evaluation_event to identify static resources:
- CSS files (.css)
- JavaScript files (.js)
- Images (.png, .jpg, .jpeg, .gif, .svg, .webp, .ico)
- Fonts (.woff, .woff2, .ttf, .eot)
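As a rough illustration of the extension list above, the check can be approximated by looking at the URL path. This is a minimal sketch, not the actual `NetworkEvent.is_evaluation_event` implementation, which may consider more than the URL; `looks_like_static_resource` and `STATIC_EXTENSIONS` are hypothetical names introduced here.

```python
from urllib.parse import urlparse

# Extensions copied from the list above. The real check lives in
# NetworkEvent.is_evaluation_event and may use additional signals.
STATIC_EXTENSIONS = (
    ".css", ".js",
    ".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".ico",
    ".woff", ".woff2", ".ttf", ".eot",
)


def looks_like_static_resource(url: str) -> bool:
    """Hypothetical helper: True if the URL path ends in a listed extension."""
    # Only the path is inspected, so query strings such as ?v=3 are ignored.
    path = urlparse(url).path.lower()
    return path.endswith(STATIC_EXTENSIONS)
```

Matching on the parsed path rather than the raw URL keeps cache-busting query strings from hiding the extension.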
What Gets Kept¶
All evaluation-relevant network events are preserved:
- HTML pages
- API endpoints
- Form submissions
- All other navigation and data requests
Benefits¶
- 76-90% file size reduction on typical logs
- Evaluation results unchanged - trimmed files produce identical scores
- Faster processing - smaller files load and parse more quickly
- Reduced storage costs - especially important for large-scale evaluations
Example Size Reduction
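To check the reduction on your own logs, compare the file sizes before and after trimming. The following is a minimal sketch; `size_reduction_percent` is a hypothetical helper and not part of WebArena-Verified.

```python
import os


def size_reduction_percent(original_path: str, trimmed_path: str) -> float:
    """Return how much smaller the trimmed file is, as a percentage."""
    original = os.path.getsize(original_path)
    trimmed = os.path.getsize(trimmed_path)
    if original == 0:
        # Avoid dividing by zero for an empty input file.
        return 0.0
    return 100.0 * (original - trimmed) / original
```

For example, `size_reduction_percent("logs/task_123.har", "logs/task_123_trimmed.har")` reports the saving for the files used in the command shown earlier.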
Batch Trimming¶
You can trim multiple files in a loop:
    # Trim the network log in every task directory under output/
    for task_dir in output/*/; do
        task_id=$(basename "$task_dir")
        echo "Trimming task $task_id"
        webarena-verified trim-network-logs \
            --input "${task_dir}network.har" \
            --output "${task_dir}network_trimmed.har"
    done
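The same batch can be driven from Python, which makes it easier to skip directories without a log and to stop on the first failure. This is a sketch that assumes the `output/<task>/network.har` layout from the loop above and uses only the `--input` and `--output` flags shown there; `build_trim_command` and `trim_all` are hypothetical helpers.

```python
import subprocess
from pathlib import Path


def build_trim_command(task_dir: Path) -> list[str]:
    """Build the CLI call for one task directory (same flags as the loop above)."""
    return [
        "webarena-verified", "trim-network-logs",
        "--input", str(task_dir / "network.har"),
        "--output", str(task_dir / "network_trimmed.har"),
    ]


def trim_all(output_root: Path) -> list[Path]:
    """Trim every task directory that contains network.har; return those processed."""
    processed = []
    for task_dir in sorted(p for p in output_root.iterdir() if p.is_dir()):
        if not (task_dir / "network.har").is_file():
            continue  # no log recorded for this task
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(build_trim_command(task_dir), check=True)
        processed.append(task_dir)
    return processed
```

Calling `trim_all(Path("output"))` processes the same directories as the shell loop.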
Technical Details¶
The trimming utility:
- Loads the HAR file and converts entries to NetworkEvent objects
- Uses NetworkEvent.is_evaluation_event to identify evaluation-relevant events
- Filters out static resources while preserving evaluation events
- Writes the trimmed HAR file with the same structure
Because the trimmer reuses the evaluator's own check, it stays consistent with the evaluation logic, and trimmed logs produce identical evaluation results.
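The load, filter, and write steps can be sketched with the standard library alone. This is an illustration under assumptions, not the tool's code: the real utility converts entries to `NetworkEvent` objects and calls `NetworkEvent.is_evaluation_event`, whereas here a plain callable `keep` stands in for that check, and `trim_har` and `keep_non_css` are hypothetical names. The only structure relied on is the standard HAR layout, where entries live under `log.entries`.

```python
import json
from typing import Callable


def trim_har(input_path: str, output_path: str,
             keep: Callable[[dict], bool]) -> tuple[int, int]:
    """Copy a HAR file, keeping only entries for which keep(entry) is true.

    Returns (entries_before, entries_after).
    """
    with open(input_path, encoding="utf-8") as f:
        har = json.load(f)

    entries = har["log"]["entries"]
    kept = [entry for entry in entries if keep(entry)]

    # Replace only the entries list so every other HAR field is preserved.
    har["log"]["entries"] = kept
    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(har, f)
    return len(entries), len(kept)


def keep_non_css(entry: dict) -> bool:
    """Placeholder predicate for illustration: drop only .css requests."""
    url_without_query = entry["request"]["url"].split("?", 1)[0]
    return not url_without_query.endswith(".css")
```

Passing the predicate as a parameter mirrors the design described above: the filtering decision stays in one place, and the trimmer only applies it.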