Benchmark Run Details

Run Summary

Model: gpt-4o-mini-2024-07-18
Benchmark: 0050_translation_en_fr
Normed Score: 98
Run Timestamp: 2025-03-26 19:47:08

Question-Level Details

Question ID Score Evaluation Time (ms) Debug Info
0050_translation_en_fr:0 100 1245 {}
0050_translation_en_fr:1 100 590 {}
0050_translation_en_fr:10 100 587 {}
0050_translation_en_fr:11 100 499 {}
0050_translation_en_fr:12 100 478 {}
0050_translation_en_fr:13 100 555 {}
0050_translation_en_fr:14 100 1335 {}
0050_translation_en_fr:15 100 506 {}
0050_translation_en_fr:16 100 512 {}
0050_translation_en_fr:17 100 614 {}
0050_translation_en_fr:18 100 614 {}
0050_translation_en_fr:19 100 696 {}
0050_translation_en_fr:2 100 563 {}
0050_translation_en_fr:20 100 474 {}
0050_translation_en_fr:21 100 548 {}
0050_translation_en_fr:22 100 506 {}
0050_translation_en_fr:23 100 435 {}
0050_translation_en_fr:24 100 512 {}
0050_translation_en_fr:25 100 614 {}
0050_translation_en_fr:26 100 718 {}
0050_translation_en_fr:27 100 408 {}
0050_translation_en_fr:28 100 511 {}
0050_translation_en_fr:29 100 472 {}
0050_translation_en_fr:3 100 498 {}
0050_translation_en_fr:30 100 392 {}
0050_translation_en_fr:31 100 774 {}
0050_translation_en_fr:32 100 535 {}
0050_translation_en_fr:33 100 588 {}
0050_translation_en_fr:34 100 484 {}
0050_translation_en_fr:35 100 436 {}
0050_translation_en_fr:36 100 511 {}
0050_translation_en_fr:37 100 517 {}
0050_translation_en_fr:38 100 608 {}
0050_translation_en_fr:39 100 614 {}
0050_translation_en_fr:4 100 676 {}
0050_translation_en_fr:40 100 510 {}
0050_translation_en_fr:41 100 615 {}
0050_translation_en_fr:42 100 521 {}
0050_translation_en_fr:43 100 528 {}
0050_translation_en_fr:44 100 586 {}
0050_translation_en_fr:45 100 615 {}
0050_translation_en_fr:46 100 409 {}
0050_translation_en_fr:47 100 370 {}
0050_translation_en_fr:48 100 447 {}
0050_translation_en_fr:49 100 491 {}
0050_translation_en_fr:5 0 493 { "response": "beau", "expected": "beau/belle" }
0050_translation_en_fr:50 100 466 {}
0050_translation_en_fr:6 100 485 {}
0050_translation_en_fr:7 100 557 {}
0050_translation_en_fr:8 100 1635 {}
0050_translation_en_fr:9 100 542 {}
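
Score Check

The report does not state how the Normed Score is derived. The sketch below assumes it is the arithmetic mean of the per-question scores, rounded to the nearest integer. Under that assumption the 51 rows above (IDs 0 through 50, all scored 100 except question 5) reproduce the reported value of 98. The function name `normed_score` is hypothetical and is not taken from the benchmark tooling.

```python
# Per-question scores copied from the table above: 51 questions (IDs 0-50),
# all scored 100 except question 5, which scored 0.
scores = {qid: 100 for qid in range(51)}
scores[5] = 0


def normed_score(per_question):
    """Hypothetical aggregation (an assumption): mean of per-question scores."""
    return sum(per_question.values()) / len(per_question)


mean = normed_score(scores)
print(f"{mean:.2f}")  # 98.04
print(round(mean))    # 98
```

If the benchmark weights questions or normalizes differently, this check would not apply as written.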