Word Length
Language
ID: 0011_word_length
Description:
A benchmark to evaluate a model's ability to count
the total number of letters in a given word.
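As a rough illustration of the task, a reference answer could be computed as in the sketch below. This is not the benchmark's actual scorer; the helper name `count_letters` and the decision to count only alphabetic characters (ignoring hyphens, digits, and spaces) are assumptions made for the example.

```python
def count_letters(word: str) -> int:
    """Count the alphabetic characters in `word` (hypothetical helper)."""
    # Assumption: punctuation, digits, and whitespace are not letters.
    return sum(1 for ch in word if ch.isalpha())


print(count_letters("benchmark"))  # 9
```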
|
|
Score: 7/100, Avg Time: 254.3ms, Run ID: 110
|
Score: 52/100, Avg Time: 289.6ms, Run ID: 112
|
Score: 5/100, Avg Time: 500.7ms, Run ID: 111
|
Score: 32/100, Avg Time: 380.7ms, Run ID: 104
|
Score: 60/100, Avg Time: 500.3ms, Run ID: 106
|
Score: 75/100, Avg Time: 1507.2ms, Run ID: 246
|
Score: 77/100, Avg Time: 542.2ms, Run ID: 109
|
Score: 72/100, Avg Time: 996.0ms, Run ID: 107
|
Score: 20/100, Avg Time: 1386.0ms, Run ID: 232
|
Score: 17/100, Avg Time: 1193.5ms, Run ID: 254
|
Score: 70/100, Avg Time: 1358.8ms, Run ID: 105
|
Score: 75/100, Avg Time: 2601.2ms, Run ID: 281
|
Score: 100/100, Avg Time: 2018.5ms, Run ID: 108
|
Score: 100/100, Avg Time: 1061.0ms, Run ID: 137
|
Score: 100/100, Avg Time: 430.6ms, Run ID: 174
|
Score: 100/100, Avg Time: 906.3ms, Run ID: 175
|
Score: 85/100, Avg Time: 623.3ms, Run ID: 113
|
Score: 100/100, Avg Time: 1101.4ms, Run ID: 201
|
Letter Count
Language
ID: 0012_letter_count
Description:
A benchmark to evaluate a model's ability to count
how many times a specific letter appears in a word.
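A reference answer for this task could be sketched as follows. The helper name `count_occurrences` and the case-insensitive comparison are assumptions for illustration; the source does not say how the benchmark treats case.

```python
def count_occurrences(word: str, letter: str) -> int:
    """Count how often `letter` appears in `word` (hypothetical helper)."""
    # Assumption: matching ignores case.
    return word.lower().count(letter.lower())


print(count_occurrences("Strawberry", "r"))  # 3
```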
|
|
Score: 26/100, Avg Time: 292.3ms, Run ID: 90
|
Score: 34/100, Avg Time: 289.9ms, Run ID: 92
|
Score: 20/100, Avg Time: 514.9ms, Run ID: 91
|
Score: 10/100, Avg Time: 311.4ms, Run ID: 84
|
Score: 50/100, Avg Time: 415.0ms, Run ID: 86
|
Score: 48/100, Avg Time: 1500.3ms, Run ID: 247
|
Score: 16/100, Avg Time: 723.3ms, Run ID: 89
|
Score: 30/100, Avg Time: 859.5ms, Run ID: 87
|
Score: 42/100, Avg Time: 1999.7ms, Run ID: 233
|
Score: 44/100, Avg Time: 2756.1ms, Run ID: 255
|
Score: 40/100, Avg Time: 1067.4ms, Run ID: 85
|
Score: 52/100, Avg Time: 4226.2ms, Run ID: 282
|
Score: 38/100, Avg Time: 2209.5ms, Run ID: 88
|
Score: 44/100, Avg Time: 937.7ms, Run ID: 138
|
Score: 44/100, Avg Time: 643.6ms, Run ID: 176
|
Score: 68/100, Avg Time: 739.8ms, Run ID: 177
|
Score: 52/100, Avg Time: 621.2ms, Run ID: 93
|
Score: 96/100, Avg Time: 1246.1ms, Run ID: 202
|
Spell Check
Language
ID: 0015_spell_check
Description:
A benchmark to evaluate a model's ability to identify
misspelled words in a sentence and provide their correct spelling.
|
|
Score: 69/100, Avg Time: 413.7ms, Run ID: 19
|
Score: 75/100, Avg Time: 481.4ms, Run ID: 21
|
Score: 85/100, Avg Time: 805.3ms, Run ID: 20
|
Score: 54/100, Avg Time: 653.0ms, Run ID: 13
|
Score: 96/100, Avg Time: 847.5ms, Run ID: 15
|
Score: 92/100, Avg Time: 2396.5ms, Run ID: 248
|
Score: 74/100, Avg Time: 1036.4ms, Run ID: 18
|
Score: 96/100, Avg Time: 1674.2ms, Run ID: 16
|
Score: 81/100, Avg Time: 3253.5ms, Run ID: 234
|
Score: 78/100, Avg Time: 2036.8ms, Run ID: 256
|
Score: 96/100, Avg Time: 2157.9ms, Run ID: 14
|
|
Score: 98/100, Avg Time: 3443.7ms, Run ID: 17
|
Score: 100/100, Avg Time: 1020.6ms, Run ID: 139
|
Score: 99/100, Avg Time: 631.1ms, Run ID: 178
|
Score: 99/100, Avg Time: 934.9ms, Run ID: 179
|
Score: 100/100, Avg Time: 687.3ms, Run ID: 22
|
Score: 99/100, Avg Time: 1409.0ms, Run ID: 203
|
Antonym Identification
Language
ID: 0016_antonym
Description: Tests the ability to identify the correct antonym from a list of options.
|
Score: 0/100, Avg Time: 1609.6ms, Run ID: 285
|
Score: 85/100, Avg Time: 262.3ms, Run ID: 30
|
Score: 90/100, Avg Time: 309.4ms, Run ID: 32
|
Score: 100/100, Avg Time: 581.4ms, Run ID: 31
|
Score: 82/100, Avg Time: 420.8ms, Run ID: 24
|
Score: 90/100, Avg Time: 470.0ms, Run ID: 26
|
Score: 97/100, Avg Time: 832.0ms, Run ID: 243
|
Score: 80/100, Avg Time: 554.8ms, Run ID: 29
|
Score: 100/100, Avg Time: 928.9ms, Run ID: 27
|
Score: 77/100, Avg Time: 1227.5ms, Run ID: 229
|
Score: 50/100, Avg Time: 2215.5ms, Run ID: 252
|
Score: 100/100, Avg Time: 1388.8ms, Run ID: 25
|
Score: 100/100, Avg Time: 2714.7ms, Run ID: 278
|
Score: 100/100, Avg Time: 2041.7ms, Run ID: 28
|
Score: 100/100, Avg Time: 981.4ms, Run ID: 134
|
Score: 100/100, Avg Time: 612.7ms, Run ID: 168
|
Score: 100/100, Avg Time: 1107.7ms, Run ID: 169
|
Score: 100/100, Avg Time: 580.7ms, Run ID: 33
|
Score: 100/100, Avg Time: 1594.8ms, Run ID: 198
|
Definitions
Language
ID: 0020_definitions
Description:
A benchmark to evaluate a model's ability to identify
the correct definition of words.
|
Score: 48/100, Avg Time: 261.0ms, Run ID: 286
|
Score: 88/100, Avg Time: 118.4ms, Run ID: 70
|
Score: 100/100, Avg Time: 185.5ms, Run ID: 72
|
Score: 100/100, Avg Time: 338.3ms, Run ID: 71
|
Score: 72/100, Avg Time: 204.9ms, Run ID: 64
|
Score: 96/100, Avg Time: 414.8ms, Run ID: 66
|
Score: 100/100, Avg Time: 21229.2ms, Run ID: 244
|
Score: 100/100, Avg Time: 444.0ms, Run ID: 69
|
Score: 100/100, Avg Time: 961.2ms, Run ID: 67
|
Score: 96/100, Avg Time: 699.6ms, Run ID: 230
|
Score: 96/100, Avg Time: 635.5ms, Run ID: 253
|
Score: 100/100, Avg Time: 858.1ms, Run ID: 65
|
Score: 100/100, Avg Time: 1054.2ms, Run ID: 279
|
Score: 100/100, Avg Time: 2294.7ms, Run ID: 68
|
Score: 96/100, Avg Time: 705.8ms, Run ID: 135
|
Score: 100/100, Avg Time: 618.1ms, Run ID: 170
|
Score: 100/100, Avg Time: 860.8ms, Run ID: 171
|
Score: 100/100, Avg Time: 665.8ms, Run ID: 73
|
Score: 100/100, Avg Time: 2185.1ms, Run ID: 199
|
Unit Conversion
Language
ID: 0022_unit_conversion
Description:
A benchmark to evaluate a model's ability to accurately convert
between different units of measurement.
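One common way to produce reference answers for such a task is to route every conversion through a base unit. The sketch below does this for lengths only; the table `TO_METERS`, the function `convert_length`, and the selection of units are assumptions for illustration, since the source does not list which units the benchmark covers.

```python
# Hypothetical factor table: how many meters one of each unit represents.
TO_METERS = {
    "mm": 0.001,
    "cm": 0.01,
    "m": 1.0,
    "km": 1000.0,
    "in": 0.0254,
    "ft": 0.3048,
    "mi": 1609.344,
}


def convert_length(value: float, src: str, dst: str) -> float:
    """Convert `value` from unit `src` to unit `dst` via meters."""
    return value * TO_METERS[src] / TO_METERS[dst]


print(convert_length(5, "km", "m"))  # 5000.0
```

Because floating-point factors rarely divide evenly, a scorer built this way would likely compare answers within a tolerance rather than for exact equality.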
|
|
Score: 0/100, Avg Time: 430.2ms, Run ID: 80
|
Score: 55/100, Avg Time: 463.1ms, Run ID: 82
|
Score: 27/100, Avg Time: 826.3ms, Run ID: 81
|
Score: 42/100, Avg Time: 659.7ms, Run ID: 74
|
Score: 50/100, Avg Time: 720.3ms, Run ID: 76
|
Score: 75/100, Avg Time: 872.2ms, Run ID: 257
|
Score: 35/100, Avg Time: 943.5ms, Run ID: 79
|
Score: 90/100, Avg Time: 1590.5ms, Run ID: 77
|
Score: 82/100, Avg Time: 2668.3ms, Run ID: 235
|
Score: 80/100, Avg Time: 1637.6ms, Run ID: 258
|
Score: 90/100, Avg Time: 2140.1ms, Run ID: 75
|
|
Score: 100/100, Avg Time: 3927.1ms, Run ID: 78
|
Score: 100/100, Avg Time: 1008.0ms, Run ID: 140
|
Score: 97/100, Avg Time: 556.6ms, Run ID: 180
|
Score: 100/100, Avg Time: 1354.6ms, Run ID: 181
|
Score: 97/100, Avg Time: 604.8ms, Run ID: 196
|
Score: 100/100, Avg Time: 1348.3ms, Run ID: 197
|
Part of Speech
Language
ID: 0032_part_of_speech
Description:
A benchmark to evaluate a model's ability to identify
the part of speech of a specific word in a sentence.
|
|
Score: 95/100, Avg Time: 316.2ms, Run ID: 100
|
Score: 85/100, Avg Time: 298.0ms, Run ID: 102
|
Score: 90/100, Avg Time: 590.2ms, Run ID: 101
|
Score: 90/100, Avg Time: 517.2ms, Run ID: 94
|
Score: 100/100, Avg Time: 560.8ms, Run ID: 96
|
Score: 100/100, Avg Time: 916.2ms, Run ID: 259
|
Score: 95/100, Avg Time: 623.4ms, Run ID: 99
|
Score: 100/100, Avg Time: 1135.7ms, Run ID: 97
|
Score: 90/100, Avg Time: 2392.6ms, Run ID: 236
|
Score: 95/100, Avg Time: 1653.3ms, Run ID: 260
|
Score: 100/100, Avg Time: 1777.3ms, Run ID: 95
|
|
Score: 100/100, Avg Time: 2284.1ms, Run ID: 98
|
Score: 100/100, Avg Time: 969.7ms, Run ID: 141
|
Score: 100/100, Avg Time: 468.1ms, Run ID: 182
|
Score: 100/100, Avg Time: 1079.2ms, Run ID: 183
|
Score: 95/100, Avg Time: 756.0ms, Run ID: 103
|
Score: 95/100, Avg Time: 1215.2ms, Run ID: 204
|
Lemma Identification
Language
ID: 0033_lemma
Description:
A benchmark to evaluate a model's ability to identify the lemma (base form)
of a given word. The lemma is the dictionary form:
- For nouns: the singular form (e.g., "cats" → "cat")
- For verbs: the infinitive form without "to" (e.g., "running" → "run")
- For adjectives: the positive form (e.g., "better" → "good")
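The three rules above can be illustrated with a toy answer checker. This is only a sketch: the lookup table `EXAMPLE_LEMMAS` holds just the examples from the description, and the name `is_correct_lemma` and the case-insensitive, whitespace-trimmed comparison are assumptions, not the benchmark's documented behavior.

```python
# Toy lexicon built from the examples in the description; a real
# lemmatizer would need a full dictionary or morphological rules.
EXAMPLE_LEMMAS = {
    "cats": "cat",      # noun: singular form
    "running": "run",   # verb: infinitive without "to"
    "better": "good",   # adjective: positive form
}


def is_correct_lemma(word: str, answer: str) -> bool:
    """Return True if `answer` matches the known lemma of `word`."""
    expected = EXAMPLE_LEMMAS.get(word.lower())
    return expected is not None and answer.strip().lower() == expected


print(is_correct_lemma("better", "Good"))  # True
```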
|
|
Score: 62/100, Avg Time: 221.7ms, Run ID: 152
|
Score: 95/100, Avg Time: 230.9ms, Run ID: 154
|
Score: 85/100, Avg Time: 449.8ms, Run ID: 153
|
Score: 95/100, Avg Time: 383.9ms, Run ID: 146
|
Score: 97/100, Avg Time: 395.1ms, Run ID: 148
|
Score: 90/100, Avg Time: 686.5ms, Run ID: 261
|
Score: 97/100, Avg Time: 527.7ms, Run ID: 151
|
Score: 97/100, Avg Time: 877.0ms, Run ID: 149
|
Score: 97/100, Avg Time: 1857.1ms, Run ID: 237
|
Score: 92/100, Avg Time: 1243.8ms, Run ID: 262
|
Score: 97/100, Avg Time: 1277.3ms, Run ID: 147
|
|
Score: 100/100, Avg Time: 1677.4ms, Run ID: 150
|
Score: 100/100, Avg Time: 779.8ms, Run ID: 156
|
Score: 97/100, Avg Time: 451.4ms, Run ID: 184
|
Score: 100/100, Avg Time: 1028.3ms, Run ID: 185
|
Score: 100/100, Avg Time: 688.8ms, Run ID: 155
|
Score: 100/100, Avg Time: 913.6ms, Run ID: 205
|
Translation (EN → FR)
Language
ID: 0050_translation_en_fr
Description: Tests the ability to translate EN words to FR, validated with multiple-choice options.
|
|
Score: 84/100, Avg Time: 338.2ms, Run ID: 40
|
Score: 98/100, Avg Time: 359.5ms, Run ID: 42
|
Score: 88/100, Avg Time: 638.0ms, Run ID: 41
|
Score: 56/100, Avg Time: 457.8ms, Run ID: 34
|
Score: 94/100, Avg Time: 511.9ms, Run ID: 36
|
Score: 100/100, Avg Time: 741.9ms, Run ID: 263
|
Score: 98/100, Avg Time: 697.2ms, Run ID: 39
|
Score: 98/100, Avg Time: 948.5ms, Run ID: 37
|
Score: 98/100, Avg Time: 2166.8ms, Run ID: 238
|
Score: 88/100, Avg Time: 1566.3ms, Run ID: 264
|
Score: 98/100, Avg Time: 1298.7ms, Run ID: 35
|
|
Score: 98/100, Avg Time: 2874.6ms, Run ID: 38
|
Score: 98/100, Avg Time: 957.4ms, Run ID: 142
|
Score: 98/100, Avg Time: 552.9ms, Run ID: 186
|
Score: 98/100, Avg Time: 989.4ms, Run ID: 187
|
Score: 98/100, Avg Time: 586.2ms, Run ID: 43
|
Score: 100/100, Avg Time: 1410.6ms, Run ID: 206
|
Translation (EN → ZH)
Language
ID: 0050_translation_en_zh
Description: Tests the ability to translate EN words to ZH, validated with multiple-choice options.
|
|
Score: 92/100, Avg Time: 332.7ms, Run ID: 50
|
Score: 92/100, Avg Time: 364.2ms, Run ID: 52
|
Score: 96/100, Avg Time: 624.3ms, Run ID: 51
|
Score: 68/100, Avg Time: 471.9ms, Run ID: 44
|
Score: 60/100, Avg Time: 844.6ms, Run ID: 46
|
Score: 98/100, Avg Time: 745.1ms, Run ID: 265
|
Score: 100/100, Avg Time: 813.4ms, Run ID: 49
|
Score: 96/100, Avg Time: 1595.3ms, Run ID: 47
|
Score: 96/100, Avg Time: 2228.2ms, Run ID: 239
|
Score: 98/100, Avg Time: 1419.2ms, Run ID: 266
|
Score: 98/100, Avg Time: 1736.3ms, Run ID: 45
|
|
Score: 98/100, Avg Time: 4027.9ms, Run ID: 48
|
Score: 100/100, Avg Time: 964.5ms, Run ID: 143
|
Score: 100/100, Avg Time: 546.4ms, Run ID: 188
|
Score: 100/100, Avg Time: 919.3ms, Run ID: 189
|
Score: 100/100, Avg Time: 473.1ms, Run ID: 53
|
Score: 100/100, Avg Time: 1362.7ms, Run ID: 207
|
Translation (SW → KO)
Language
ID: 0050_translation_sw_ko
Description: Tests the ability to translate SW words to KO, validated with multiple-choice options.
|
|
Score: 9/100, Avg Time: 392.6ms, Run ID: 60
|
Score: 19/100, Avg Time: 398.9ms, Run ID: 62
|
Score: 21/100, Avg Time: 757.5ms, Run ID: 61
|
Score: 3/100, Avg Time: 613.0ms, Run ID: 54
|
Score: 25/100, Avg Time: 630.7ms, Run ID: 56
|
Score: 23/100, Avg Time: 763.5ms, Run ID: 267
|
Score: 39/100, Avg Time: 894.3ms, Run ID: 59
|
Score: 23/100, Avg Time: 1954.3ms, Run ID: 57
|
Score: 41/100, Avg Time: 2460.0ms, Run ID: 240
|
Score: 29/100, Avg Time: 2254.3ms, Run ID: 268
|
Score: 66/100, Avg Time: 1433.9ms, Run ID: 55
|
|
Score: 66/100, Avg Time: 3836.5ms, Run ID: 58
|
Score: 78/100, Avg Time: 922.4ms, Run ID: 144
|
Score: 80/100, Avg Time: 481.6ms, Run ID: 190
|
Score: 80/100, Avg Time: 822.0ms, Run ID: 191
|
Score: 82/100, Avg Time: 473.6ms, Run ID: 63
|
Score: 98/100, Avg Time: 1761.5ms, Run ID: 208
|
Pinyin Letter Count
Language
ID: 0051_pinyin_letters
Description: A benchmark to evaluate a model's ability to count
how many times a specific letter appears in the Pinyin representation
of a Chinese sentence.
|
|
Score: 52/100, Avg Time: 234.8ms, Run ID: 120
|
Score: 47/100, Avg Time: 290.6ms, Run ID: 122
|
Score: 36/100, Avg Time: 537.1ms, Run ID: 121
|
Score: 31/100, Avg Time: 481.8ms, Run ID: 114
|
Score: 36/100, Avg Time: 546.9ms, Run ID: 116
|
Score: 5/100, Avg Time: 1634.5ms, Run ID: 245
|
Score: 42/100, Avg Time: 623.5ms, Run ID: 119
|
Score: 15/100, Avg Time: 1061.6ms, Run ID: 117
|
Score: 47/100, Avg Time: 1307.7ms, Run ID: 231
|
|
Score: 26/100, Avg Time: 1309.6ms, Run ID: 115
|
Score: 36/100, Avg Time: 2778.9ms, Run ID: 280
|
Score: 42/100, Avg Time: 2188.6ms, Run ID: 118
|
Score: 52/100, Avg Time: 1278.3ms, Run ID: 136
|
Score: 31/100, Avg Time: 453.2ms, Run ID: 172
|
Score: 47/100, Avg Time: 1002.1ms, Run ID: 173
|
Score: 42/100, Avg Time: 654.9ms, Run ID: 123
|
Score: 84/100, Avg Time: 1851.6ms, Run ID: 200
|
English to IPA
Language
ID: 0061_english_to_ipa
Description:
A benchmark to evaluate a model's ability to convert English words
to their IPA (International Phonetic Alphabet) pronunciation.
|
|
Score: 5/100, Avg Time: 276.1ms, Run ID: 163
|
Score: 2/100, Avg Time: 314.6ms, Run ID: 165
|
Score: 12/100, Avg Time: 519.7ms, Run ID: 164
|
Score: 7/100, Avg Time: 530.0ms, Run ID: 157
|
Score: 12/100, Avg Time: 513.2ms, Run ID: 159
|
Score: 12/100, Avg Time: 838.4ms, Run ID: 269
|
Score: 17/100, Avg Time: 695.4ms, Run ID: 162
|
Score: 25/100, Avg Time: 988.8ms, Run ID: 160
|
Score: 15/100, Avg Time: 2809.1ms, Run ID: 241
|
Score: 12/100, Avg Time: 2967.0ms, Run ID: 270
|
Score: 15/100, Avg Time: 1340.8ms, Run ID: 158
|
|
Score: 45/100, Avg Time: 2100.2ms, Run ID: 161
|
Score: 50/100, Avg Time: 999.4ms, Run ID: 167
|
Score: 30/100, Avg Time: 727.6ms, Run ID: 192
|
Score: 37/100, Avg Time: 1215.8ms, Run ID: 193
|
Score: 52/100, Avg Time: 618.8ms, Run ID: 166
|
Score: 50/100, Avg Time: 1950.2ms, Run ID: 209
|
Geography Knowledge
Language
ID: 0120_geography
Description:
A benchmark to evaluate a model's knowledge of world geography through
multiple-choice questions about countries, capitals, physical features,
and other geographical information.
|
|
Score: 87/100, Avg Time: 313.1ms, Run ID: 130
|
Score: 100/100, Avg Time: 388.9ms, Run ID: 132
|
Score: 100/100, Avg Time: 778.4ms, Run ID: 131
|
Score: 80/100, Avg Time: 495.2ms, Run ID: 124
|
Score: 100/100, Avg Time: 403.4ms, Run ID: 126
|
Score: 100/100, Avg Time: 893.2ms, Run ID: 271
|
Score: 97/100, Avg Time: 1184.8ms, Run ID: 129
|
Score: 100/100, Avg Time: 1154.6ms, Run ID: 127
|
Score: 95/100, Avg Time: 2554.2ms, Run ID: 242
|
Score: 100/100, Avg Time: 2425.7ms, Run ID: 272
|
Score: 100/100, Avg Time: 1486.2ms, Run ID: 125
|
|
Score: 100/100, Avg Time: 3340.2ms, Run ID: 128
|
Score: 100/100, Avg Time: 981.8ms, Run ID: 145
|
Score: 100/100, Avg Time: 441.9ms, Run ID: 194
|
Score: 100/100, Avg Time: 918.3ms, Run ID: 195
|
Score: 100/100, Avg Time: 559.6ms, Run ID: 133
|
Score: 100/100, Avg Time: 1471.2ms, Run ID: 210
|