Benchmark Run Details

System Prompt

        You are a linguistic expert specialized in lemmatization.
        
        Lemmatization is the process of finding the base form (lemma) of a word:
        - For nouns: the singular form (e.g., "cats" → "cat")
        - For verbs: the infinitive form without "to" (e.g., "running" → "run")
        - For adjectives and adverbs: the positive form (e.g., "better" → "good")
        
        For each word you are given, identify and return only its lemma.
        

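The prompt asks the model to return only the lemma, so a response can be checked by direct comparison against a reference lemma. The following is a minimal sketch of such a check, not the benchmark's actual scoring code: the function name `score_response` and the trim-and-lowercase normalization are assumptions. The 100/0 scale and the example pairs are taken from the question-level details of this run.

```python
def score_response(model_response: str, correct_lemma: str) -> int:
    """Return 100 on an exact match after trimming and lowercasing, else 0.

    Hypothetical helper: the normalization used by the real benchmark is unknown.
    """
    normalized_response = model_response.strip().lower()
    normalized_lemma = correct_lemma.strip().lower()
    return 100 if normalized_response == normalized_lemma else 0


# (model_response, correct_lemma) pairs as recorded in this run
examples = [("run", "run"), ("good", "good"), ("quick", "quickly")]
scores = [score_response(response, lemma) for response, lemma in examples]
```

Under this assumption, the response "quick" for the word "quickly" scores 0 because the reference lemma is "quickly".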
Run Summary

Model: gemma2:9b:Q4_0
Benchmark: 0033_lemma
Normed Score: 97
Run Timestamp: 2025-04-04 15:53:27

Question-Level Details

Question ID Score Evaluation Time (ms) Debug Info
0033_lemma:0 100 6367 { "inflected_word": "running", "model_response": "run", "correct_lemma": "run", "response_text": "", "is_correct": true }
0033_lemma:1 100 1200 { "inflected_word": "better", "model_response": "good", "correct_lemma": "good", "response_text": "", "is_correct": true }
0033_lemma:10 100 1194 { "inflected_word": "criteria", "model_response": "criterion", "correct_lemma": "criterion", "response_text": "", "is_correct": true }
0033_lemma:11 100 1180 { "inflected_word": "speaking", "model_response": "speak", "correct_lemma": "speak", "response_text": "", "is_correct": true }
0033_lemma:12 100 1184 { "inflected_word": "taken", "model_response": "take", "correct_lemma": "take", "response_text": "", "is_correct": true }
0033_lemma:13 100 1189 { "inflected_word": "sang", "model_response": "sing", "correct_lemma": "sing", "response_text": "", "is_correct": true }
0033_lemma:14 100 1197 { "inflected_word": "fastest", "model_response": "fast", "correct_lemma": "fast", "response_text": "", "is_correct": true }
0033_lemma:15 100 1194 { "inflected_word": "women", "model_response": "woman", "correct_lemma": "woman", "response_text": "", "is_correct": true }
0033_lemma:16 100 1186 { "inflected_word": "was", "model_response": "be", "correct_lemma": "be", "response_text": "", "is_correct": true }
0033_lemma:17 100 1215 { "inflected_word": "more beautiful", "model_response": "beautiful", "correct_lemma": "beautiful", "response_text": "", "is_correct": true }
0033_lemma:18 100 1252 { "inflected_word": "oxen", "model_response": "ox", "correct_lemma": "ox", "response_text": "", "is_correct": true }
0033_lemma:19 100 1128 { "inflected_word": "geese", "model_response": "goose", "correct_lemma": "goose", "response_text": "", "is_correct": true }
0033_lemma:2 100 1175 { "inflected_word": "mice", "model_response": "mouse", "correct_lemma": "mouse", "response_text": "", "is_correct": true }
0033_lemma:20 100 1178 { "inflected_word": "feet", "model_response": "foot", "correct_lemma": "foot", "response_text": "", "is_correct": true }
0033_lemma:21 100 1211 { "inflected_word": "wrote", "model_response": "write", "correct_lemma": "write", "response_text": "", "is_correct": true }
0033_lemma:22 100 1225 { "inflected_word": "phenomena", "model_response": "phenomenon", "correct_lemma": "phenomenon", "response_text": "", "is_correct": true }
0033_lemma:23 100 1206 { "inflected_word": "told", "model_response": "tell", "correct_lemma": "tell", "response_text": "", "is_correct": true }
0033_lemma:24 100 1222 { "inflected_word": "cacti", "model_response": "cactus", "correct_lemma": "cactus", "response_text": "", "is_correct": true }
0033_lemma:25 100 1099 { "inflected_word": "happiest", "model_response": "happy", "correct_lemma": "happy", "response_text": "", "is_correct": true }
0033_lemma:26 100 1104 { "inflected_word": "seen", "model_response": "see", "correct_lemma": "see", "response_text": "", "is_correct": true }
0033_lemma:27 100 1189 { "inflected_word": "indices", "model_response": "index", "correct_lemma": "index", "response_text": "", "is_correct": true }
0033_lemma:28 100 1159 { "inflected_word": "brought", "model_response": "bring", "correct_lemma": "bring", "response_text": "", "is_correct": true }
0033_lemma:29 100 984 { "inflected_word": "alumni", "model_response": "alumnus", "correct_lemma": "alumnus", "response_text": "", "is_correct": true }
0033_lemma:3 100 1232 { "inflected_word": "drove", "model_response": "drive", "correct_lemma": "drive", "response_text": "", "is_correct": true }
0033_lemma:30 100 1113 { "inflected_word": "paper", "model_response": "paper", "correct_lemma": "paper", "response_text": "", "is_correct": true }
0033_lemma:31 100 1112 { "inflected_word": "jump", "model_response": "jump", "correct_lemma": "jump", "response_text": "", "is_correct": true }
0033_lemma:32 100 832 { "inflected_word": "happy", "model_response": "happy", "correct_lemma": "happy", "response_text": "", "is_correct": true }
0033_lemma:33 100 1093 { "inflected_word": "book", "model_response": "book", "correct_lemma": "book", "response_text": "", "is_correct": true }
0033_lemma:34 100 1099 { "inflected_word": "dance", "model_response": "dance", "correct_lemma": "dance", "response_text": "", "is_correct": true }
0033_lemma:35 100 1133 { "inflected_word": "blue", "model_response": "blue", "correct_lemma": "blue", "response_text": "", "is_correct": true }
0033_lemma:36 100 1164 { "inflected_word": "computer", "model_response": "computer", "correct_lemma": "computer", "response_text": "", "is_correct": true }
0033_lemma:37 0 856 { "inflected_word": "quickly", "model_response": "quick", "correct_lemma": "quickly", "response_text": "", "is_correct": false }
0033_lemma:38 100 1132 { "inflected_word": "house", "model_response": "house", "correct_lemma": "house", "response_text": "", "is_correct": true }
0033_lemma:39 100 1093 { "inflected_word": "water", "model_response": "water", "correct_lemma": "water", "response_text": "", "is_correct": true }
0033_lemma:4 100 1100 { "inflected_word": "cities", "model_response": "city", "correct_lemma": "city", "response_text": "", "is_correct": true }
0033_lemma:5 100 1202 { "inflected_word": "went", "model_response": "go", "correct_lemma": "go", "response_text": "", "is_correct": true }
0033_lemma:6 100 1190 { "inflected_word": "children", "model_response": "child", "correct_lemma": "child", "response_text": "", "is_correct": true }
0033_lemma:7 100 1174 { "inflected_word": "worst", "model_response": "bad", "correct_lemma": "bad", "response_text": "", "is_correct": true }
0033_lemma:8 100 1132 { "inflected_word": "bought", "model_response": "buy", "correct_lemma": "buy", "response_text": "", "is_correct": true }
0033_lemma:9 100 1198 { "inflected_word": "teeth", "model_response": "tooth", "correct_lemma": "tooth", "response_text": "", "is_correct": true }
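
The question-level scores above can be related to the reported Normed Score of 97 with a short calculation. This is only a sketch: it assumes the normed score is the mean of the per-question scores, and that the mean is truncated to an integer for display. Neither assumption is confirmed by the run output.

```python
import math

num_questions = 40  # IDs 0033_lemma:0 through 0033_lemma:39
num_failed = 1      # only 0033_lemma:37 ("quickly") scored 0

# Per-question scores as listed in the table: 100 for correct, 0 for incorrect.
scores = [100] * (num_questions - num_failed) + [0] * num_failed

mean_score = sum(scores) / len(scores)   # 3900 / 40 = 97.5
truncated = math.floor(mean_score)       # assumption: truncation, giving 97

print(mean_score, truncated)
```

The mean is 97.5, so the displayed value of 97 is consistent with truncation rather than rounding half up.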