Benchmark Run Details

System Prompt

You are taking a vocabulary test. Your task is to select the word that best matches 
a given definition from a list of choices. Respond with only the correct word, nothing else.

Run Summary

Model: qwen3:4b:Q4_K_M
Benchmark: 0020_definitions
Normed Score: 100
Run Timestamp: 2025-04-29 04:29:34

Question-Level Details

Question ID Score Evaluation Time (ms) Debug Info
0020_definitions:batch_gemma2_9b:0 100 11706 { "response": "\n\nvalley", "expected": "valley", "is_correct": true }
0020_definitions:batch_gemma2_9b:1 100 10019 { "response": "\n\nopal", "expected": "opal", "is_correct": true }
0020_definitions:batch_gemma2_9b:10 100 19603 { "response": "\n\necho", "expected": "echo", "is_correct": true }
0020_definitions:batch_gemma2_9b:11 100 19871 { "response": "\n\nstrength", "expected": "strength", "is_correct": true }
0020_definitions:batch_gemma2_9b:12 100 28660 { "response": "\n\nswift", "expected": "swift", "is_correct": true }
0020_definitions:batch_gemma2_9b:13 100 30411 { "response": "\n\nproud", "expected": "proud", "is_correct": true }
0020_definitions:batch_gemma2_9b:14 100 33041 { "response": "\n\nmagic", "expected": "magic", "is_correct": true }
0020_definitions:batch_gemma2_9b:15 100 26090 { "response": "\n\nadventure", "expected": "adventure", "is_correct": true }
0020_definitions:batch_gemma2_9b:16 100 21483 { "response": "\n\nzeal", "expected": "zeal", "is_correct": true }
0020_definitions:batch_gemma2_9b:17 100 19710 { "response": "\n\nmonster", "expected": "monster", "is_correct": true }
0020_definitions:batch_gemma2_9b:18 100 26847 { "response": "\n\nenergy", "expected": "energy", "is_correct": true }
0020_definitions:batch_gemma2_9b:19 100 15888 { "response": "\n\njustice", "expected": "justice", "is_correct": true }
0020_definitions:batch_gemma2_9b:2 100 24597 { "response": "\n\ngentle", "expected": "gentle", "is_correct": true }
0020_definitions:batch_gemma2_9b:20 100 16782 { "response": "\n\nknight", "expected": "knight", "is_correct": true }
0020_definitions:batch_gemma2_9b:21 100 24789 { "response": "\n\nhonor", "expected": "honor", "is_correct": true }
0020_definitions:batch_gemma2_9b:22 100 22758 { "response": "\n\nqueen", "expected": "queen", "is_correct": true }
0020_definitions:batch_gemma2_9b:23 100 20633 { "response": "\n\nignite", "expected": "ignite", "is_correct": true }
0020_definitions:batch_gemma2_9b:24 100 24705 { "response": "\n\nzeal", "expected": "zeal", "is_correct": true }
0020_definitions:batch_gemma2_9b:3 100 10400 { "response": "\n\nglory", "expected": "glory", "is_correct": true }
0020_definitions:batch_gemma2_9b:4 100 9391 { "response": "\n\nfighter", "expected": "fighter", "is_correct": true }
0020_definitions:batch_gemma2_9b:5 100 11693 { "response": "\n\nbelief", "expected": "belief", "is_correct": true }
0020_definitions:batch_gemma2_9b:6 100 15418 { "response": "\n\njustice", "expected": "justice", "is_correct": true }
0020_definitions:batch_gemma2_9b:7 100 19745 { "response": "\n\nscared", "expected": "scared", "is_correct": true }
0020_definitions:batch_gemma2_9b:8 100 48475 { "response": "\n\nhonest", "expected": "honest", "is_correct": true }
0020_definitions:batch_gemma2_9b:9 100 18016 { "response": "\n\nvillage", "expected": "village", "is_correct": true }
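
Every response in the table starts with two newline characters yet is still marked correct, which suggests the harness normalizes whitespace before comparing. The sketch below shows one way such scoring could work. It is an assumption, not the benchmark's actual code: the function names `score_response` and `normed_score` are hypothetical, and treating the normed score as the mean of per-question scores is a guess that happens to agree with the values shown above.

```python
def score_response(response: str, expected: str) -> int:
    """Return 100 if the whitespace-stripped response equals the expected word, else 0.

    Assumption: the real harness may normalize differently (e.g. case folding).
    """
    return 100 if response.strip() == expected.strip() else 0


def normed_score(scores: list[int]) -> float:
    """Assumed aggregation: the arithmetic mean of per-question scores."""
    return sum(scores) / len(scores) if scores else 0.0


# Three rows copied from the Debug Info column above.
rows = [
    {"response": "\n\nvalley", "expected": "valley"},
    {"response": "\n\nopal", "expected": "opal"},
    {"response": "\n\necho", "expected": "echo"},
]

scores = [score_response(r["response"], r["expected"]) for r in rows]
print(scores, normed_score(scores))
```

Under these assumptions, a run in which every question scores 100 yields a normed score of 100, matching the Run Summary.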