mirror of https://github.com/deepseek-ai/DeepSeek-V3.git
synced 2025-02-22 13:48:56 -05:00
Update README.md
fix(table): correct bold formatting for TriviaQA EM comparison
- Remove redundant bolding on LLaMA3.1 405B (82.7)
- Retain single bold style for DeepSeek-V3's highest score (82.9)
- Aligns with evaluation convention of highlighting only the best performance
parent b5d872ead0
commit d5c08b384b
@@ -130,7 +130,7 @@ For developers looking to dive deeper, we recommend exploring [README_WEIGHTS.md
 | | WinoGrande (Acc.) | 5-shot | **86.3** | 82.3 | 85.2 | 84.9 |
 | | RACE-Middle (Acc.) | 5-shot | 73.1 | 68.1 | **74.2** | 67.1 |
 | | RACE-High (Acc.) | 5-shot | 52.6 | 50.3 | **56.8** | 51.3 |
-| | TriviaQA (EM) | 5-shot | 80.0 | 71.9 | **82.7** | **82.9** |
+| | TriviaQA (EM) | 5-shot | 80.0 | 71.9 | 82.7 | **82.9** |
 | | NaturalQuestions (EM) | 5-shot | 38.6 | 33.2 | **41.5** | 40.0 |
 | | AGIEval (Acc.) | 0-shot | 57.5 | 75.8 | 60.6 | **79.6** |
 | Code | HumanEval (Pass@1) | 0-shot | 43.3 | 53.0 | 54.9 | **65.2** |
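The convention named in the commit message (bold only the best score in a row) can be checked mechanically. The following is a minimal sketch, not part of the repository: the helper name `misbolded_scores` is hypothetical, and it assumes every metric in the row is higher-is-better and that scores are plain decimal numbers optionally wrapped in `**`.

```python
import re

# Matches a table cell holding one decimal score, optionally wrapped in ** for bold.
SCORE = re.compile(r"(\*\*)?(\d+(?:\.\d+)?)(\*\*)?")


def misbolded_scores(row: str) -> list[float]:
    """Return bolded scores in a Markdown table row that are below the row maximum.

    Hypothetical helper: assumes a higher score is better for the row's metric.
    """
    scores = []  # (value, is_bold) for each numeric cell
    for cell in row.strip().strip("|").split("|"):
        match = SCORE.fullmatch(cell.strip())
        if match:
            is_bold = match.group(1) is not None and match.group(3) is not None
            scores.append((float(match.group(2)), is_bold))
    if not scores:
        return []
    best = max(value for value, _ in scores)
    return [value for value, is_bold in scores if is_bold and value < best]


# The TriviaQA row before and after this commit.
old_row = "| | TriviaQA (EM) | 5-shot | 80.0 | 71.9 | **82.7** | **82.9** |"
new_row = "| | TriviaQA (EM) | 5-shot | 80.0 | 71.9 | 82.7 | **82.9** |"

print(misbolded_scores(old_row))  # [82.7]
print(misbolded_scores(new_row))  # []
```

Non-numeric cells such as `5-shot` or the benchmark name are skipped because they do not match the score pattern. Rows for lower-is-better metrics would need the comparison inverted.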