Update README.md (#29)

DeepSeekPH 2024-01-09 12:57:32 +08:00 committed by GitHub
parent 867e0f68ec
commit 70aaeb30ff
GPG Key ID: 4AEE18F83AFDEB23

@@ -130,13 +130,12 @@ In line with Grok-1, we have evaluated the model's mathematical capabilities usi
 ---
-**Instruction Following Evaluation:** On Nov 15th, 2023, Google released an [instruction following evaluation dataset](https://arxiv.org/pdf/2311.07911.pdf). They identified 25 types of verifiable instructions and constructed around 500 prompts, with each prompt containing one or more verifiable instructions. We use the prompt-level loose metric to evaluate all models.
+**Instruction Following Evaluation:** On Nov 15th, 2023, Google released an [instruction following evaluation dataset](https://arxiv.org/pdf/2311.07911.pdf). They identified 25 types of verifiable instructions and constructed around 500 prompts, with each prompt containing one or more verifiable instructions. We use the prompt-level loose metric to evaluate all models. Here, we used the first version released by Google for the evaluation. For the Google revised test set evaluation results, please refer to the number in our paper.
 <div align="center">
 <img src="images/if_eval.png" alt="result" width="70%">
 </div>
 ---
 **LeetCode Weekly Contest:**