Benchmark Run Details

Run Summary

Model:          smollm2:1.7b:Q8_0
Benchmark:      0050_translation_en_fr
Normed Score:   56
Run Timestamp:  2025-03-26 19:38:16

Question-Level Details

Question ID                Score  Evaluation Time (ms)  Debug Info
0050_translation_en_fr:0   0      1821                  { "response": "bléme", "expected": "fleur" }
0050_translation_en_fr:1   0      410                   { "response": "se diriger", "expected": "conduire" }
0050_translation_en_fr:10  0      394                   { "response": "pointu", "expected": "montagne" }
0050_translation_en_fr:11  100    458                   {}
0050_translation_en_fr:12  0      401                   { "response": "coeur", "expected": "cœur" }
0050_translation_en_fr:13  100    508                   {}
0050_translation_en_fr:14  100    461                   {}
0050_translation_en_fr:15  100    474                   {}
0050_translation_en_fr:16  100    417                   {}
0050_translation_en_fr:17  100    482                   {}
0050_translation_en_fr:18  100    415                   {}
0050_translation_en_fr:19  100    441                   {}
0050_translation_en_fr:2   100    405                   {}
0050_translation_en_fr:20  100    388                   {}
0050_translation_en_fr:21  100    438                   {}
0050_translation_en_fr:22  0      412                   { "response": "roue", "expected": "rond" }
0050_translation_en_fr:23  100    407                   {}
0050_translation_en_fr:24  100    420                   {}
0050_translation_en_fr:25  100    410                   {}
0050_translation_en_fr:26  0      393                   { "response": "pluie", "expected": "profond" }
0050_translation_en_fr:27  0      460                   { "response": "glisser", "expected": "nager" }
0050_translation_en_fr:28  100    423                   {}
0050_translation_en_fr:29  100    434                   {}
0050_translation_en_fr:3   100    409                   {}
0050_translation_en_fr:30  0      440                   { "response": "sucré", "expected": "nuage" }
0050_translation_en_fr:31  0      460                   { "response": "se reléve", "expected": "sourire" }
0050_translation_en_fr:32  0      459                   { "response": "loutre", "expected": "doux" }
0050_translation_en_fr:33  100    436                   {}
0050_translation_en_fr:34  0      436                   { "response": "montagne", "expected": "grandir" }
0050_translation_en_fr:35  0      445                   { "response": "rapidité", "expected": "rapide" }
0050_translation_en_fr:36  100    428                   {}
0050_translation_en_fr:37  100    433                   {}
0050_translation_en_fr:38  0      429                   { "response": "vitalité", "expected": "lourd" }
0050_translation_en_fr:39  100    408                   {}
0050_translation_en_fr:4   100    381                   {}
0050_translation_en_fr:40  100    436                   {}
0050_translation_en_fr:41  0      410                   { "response": "raide", "expected": "pointu" }
0050_translation_en_fr:42  0      402                   { "response": "salle", "expected": "sel" }
0050_translation_en_fr:43  100    407                   {}
0050_translation_en_fr:44  0      388                   { "response": "sucré", "expected": "lisse" }
0050_translation_en_fr:45  0      443                   { "response": "montagne", "expected": "herbe" }
0050_translation_en_fr:46  100    421                   {}
0050_translation_en_fr:47  100    436                   {}
0050_translation_en_fr:48  100    409                   {}
0050_translation_en_fr:49  0      534                   { "response": "faire de l'essui", "expected": "tomber" }
0050_translation_en_fr:5   0      388                   { "response": "belles", "expected": "beau/belle" }
0050_translation_en_fr:50  0      404                   { "response": "lumière", "expected": "léger" }
0050_translation_en_fr:6   0      476                   { "response": "l'eau", "expected": "eau" }
0050_translation_en_fr:7   100    464                   {}
0050_translation_en_fr:8   100    394                   {}
0050_translation_en_fr:9   0      500                   { "response": "commencer à rire", "expected": "rire" }
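
Note on the Normed Score

The report does not say how the Normed Score is derived from the per-question scores. In the question-level results, 29 of the 51 questions scored 100 and the other 22 scored 0, so the mean is about 56.86. That is consistent with the reported 56 if the mean is truncated rather than rounded. The sketch below checks this arithmetic in Python. The function normed_score and the truncation rule are assumptions made for illustration, not the benchmark's documented method.

```python
import math

# Question indices that scored 100 in the question-level results (29 of 51).
# All other indices in 0..50 scored 0.
PASSED = {2, 3, 4, 7, 8, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25,
          28, 29, 33, 36, 37, 39, 40, 43, 46, 47, 48}

# Map each question index to its score as listed in the report.
scores = {i: (100 if i in PASSED else 0) for i in range(51)}


def normed_score(per_question):
    """Hypothetical aggregation: mean of per-question scores, truncated.

    This rule is an assumption; the report only shows the final value.
    """
    return math.floor(sum(per_question.values()) / len(per_question))


if __name__ == "__main__":
    mean = sum(scores.values()) / len(scores)
    print(f"mean = {mean:.2f}, truncated = {normed_score(scores)}")
```

Rounding to the nearest integer would give 57 instead, so under the mean-based assumption the reported value points to truncation.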