Benchmark Run Details

Run Summary

Model: llama3.2:3b:Q4_K_M
Benchmark: 0050_translation_en_fr
Normed Score: 94
Run Timestamp: 2025-03-26 19:39:51

Question-Level Details

Question ID Score Evaluation Time (ms) Debug Info
0050_translation_en_fr:0 100 2130 {}
0050_translation_en_fr:1 100 581 {}
0050_translation_en_fr:10 100 519 {}
0050_translation_en_fr:11 100 522 {}
0050_translation_en_fr:12 100 458 {}
0050_translation_en_fr:13 100 533 {}
0050_translation_en_fr:14 100 544 {}
0050_translation_en_fr:15 100 689 {}
0050_translation_en_fr:16 100 491 {}
0050_translation_en_fr:17 100 473 {}
0050_translation_en_fr:18 100 528 {}
0050_translation_en_fr:19 100 460 {}
0050_translation_en_fr:2 100 522 {}
0050_translation_en_fr:20 100 368 {}
0050_translation_en_fr:21 100 447 {}
0050_translation_en_fr:22 100 457 {}
0050_translation_en_fr:23 100 358 {}
0050_translation_en_fr:24 100 472 {}
0050_translation_en_fr:25 100 465 {}
0050_translation_en_fr:26 100 405 {}
0050_translation_en_fr:27 100 454 {}
0050_translation_en_fr:28 100 470 {}
0050_translation_en_fr:29 0 459 { "response": "sucre", "expected": "sucré" }
0050_translation_en_fr:3 100 424 {}
0050_translation_en_fr:30 100 456 {}
0050_translation_en_fr:31 100 483 {}
0050_translation_en_fr:32 100 454 {}
0050_translation_en_fr:33 100 403 {}
0050_translation_en_fr:34 100 470 {}
0050_translation_en_fr:35 100 455 {}
0050_translation_en_fr:36 100 537 {}
0050_translation_en_fr:37 100 541 {}
0050_translation_en_fr:38 100 400 {}
0050_translation_en_fr:39 100 427 {}
0050_translation_en_fr:4 100 596 {}
0050_translation_en_fr:40 100 557 {}
0050_translation_en_fr:41 100 438 {}
0050_translation_en_fr:42 100 429 {}
0050_translation_en_fr:43 100 519 {}
0050_translation_en_fr:44 100 491 {}
0050_translation_en_fr:45 100 385 {}
0050_translation_en_fr:46 100 524 {}
0050_translation_en_fr:47 100 473 {}
0050_translation_en_fr:48 100 392 {}
0050_translation_en_fr:49 100 398 {}
0050_translation_en_fr:5 0 486 { "response": "beau", "expected": "beau/belle" }
0050_translation_en_fr:50 0 466 { "response": "légère", "expected": "léger" }
0050_translation_en_fr:6 100 444 {}
0050_translation_en_fr:7 100 528 {}
0050_translation_en_fr:8 100 639 {}
0050_translation_en_fr:9 100 486 {}
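
The report does not say how the Normed Score is derived or how a response is compared with the expected answer. The sketch below is one plausible reading, not the benchmark tool's actual logic: it assumes the Normed Score is the rounded mean of the per-question scores, and it checks whether each failed response differs from its expected value only by accents. The names `normed_score`, `strip_accents`, and `accent_only_mismatch` are hypothetical and exist only for this illustration.

```python
import unicodedata

# Scores transcribed from the table above: question IDs 0 through 50,
# each 100 except the three rows whose Debug Info reports a mismatch.
FAILED = {
    5: ("beau", "beau/belle"),
    29: ("sucre", "sucré"),
    50: ("légère", "léger"),
}
scores = {qid: (0 if qid in FAILED else 100) for qid in range(51)}


def normed_score(per_question):
    """Assumed rule: mean of the per-question scores, rounded to an integer."""
    return round(sum(per_question.values()) / len(per_question))


def strip_accents(text):
    """Drop combining marks after NFD decomposition, so 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_only_mismatch(response, expected):
    """True when the two strings differ only in their accents."""
    return response != expected and strip_accents(response) == strip_accents(expected)


print(normed_score(scores))  # 94, since 4800 / 51 is about 94.12
for qid, (response, expected) in sorted(FAILED.items()):
    print(qid, response, expected, accent_only_mismatch(response, expected))
```

Under these assumptions the mean reproduces the reported score of 94. Only question 29 is an accent-only miss; question 5 returned one of two listed variants, and question 50 returned a different grammatical form of the expected word.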