Mirror of https://github.com/deepseek-ai/DeepSeek-R1.git, synced 2025-05-02 00:19:06 -04:00

Update README.md

Fixed some spelling errors and incorrect grammar.

parent fdf883c014 · commit a7f72e1aea

 README.md | 18 +++++++++---------
@@ -49,12 +49,12 @@
 ## 1. Introduction
 
 We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.
-DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.
+DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable reasoning performance.
 With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors.
-However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance,
-we introduce DeepSeek-R1, which incorporates cold-start data before RL.
+However, DeepSeek-R1-Zero encounters challenges like endless repetition, poor readability, and language mixing.
+We introduce DeepSeek-R1, which incorporates cold-start data before RL to address these issues and enhance reasoning performance.
 DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
-To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
+We have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen to support the research community. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
 
 <p align="center">
   <img width="80%" src="figures/benchmark.jpg">
@@ -92,7 +92,7 @@ To support the research community, we have open-sourced DeepSe
 </div>
 
 DeepSeek-R1-Zero & DeepSeek-R1 are trained based on DeepSeek-V3-Base.
-For more details regrading the model architecture, please refer to [DeepSeek-V3](https://github.com/deepseek-ai/DeepSeek-V3) repository.
+For more details regarding the model architecture, please refer to the [DeepSeek-V3](https://github.com/deepseek-ai/DeepSeek-V3) repository.
 
 ### DeepSeek-R1-Distill Models
 
@@ -104,18 +104,18 @@ For more details regrading the model architecture, please refer to [DeepSeek-V3]
 | DeepSeek-R1-Distill-Qwen-7B | [Qwen2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B) | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) |
 | DeepSeek-R1-Distill-Llama-8B | [Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) |
 | DeepSeek-R1-Distill-Qwen-14B | [Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) |
-|DeepSeek-R1-Distill-Qwen-32B | [Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) |
+| DeepSeek-R1-Distill-Qwen-32B | [Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) |
 | DeepSeek-R1-Distill-Llama-70B | [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) | [🤗 HuggingFace](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) |
 
 </div>
 
-DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1.
-We slightly change their configs and tokenizers. Please use our setting to run these models.
+DeepSeek-R1-Distill models are fine-tuned based on open-source models using samples generated by DeepSeek-R1.
+We slightly change their configs and tokenizers. Please use our settings to run these models.
 
 ## 4. Evaluation Results
 
 ### DeepSeek-R1-Evaluation
-For all our models, the maximum generation length is set to 32,768 tokens. For benchmarks requiring sampling, we use a temperature of $0.6$, a top-p value of $0.95$, and generate 64 responses per query to estimate pass@1.
+The maximum generation length for all our models is 32,768 tokens. For benchmarks requiring sampling, we use a temperature of $0.6$, a top-p value of $0.95$, and generate 64 responses per query to estimate pass@1.
 <div align="center">
 
 
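For reference, the "settings" the diff refers to are the configs and tokenizers shipped with the distilled checkpoints. Below is a minimal sketch of running one of the distilled models from the table above with the sampling settings quoted in this diff (temperature 0.6, top-p 0.95, up to 32,768 generated tokens). It assumes the standard Hugging Face transformers API, not an official DeepSeek script.

```python
# Sketch: run a DeepSeek-R1-Distill model with the sampling settings
# quoted in the diff. Assumes the standard Hugging Face transformers API.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # any row from the table above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Chat-style prompt; the distill models ship with slightly modified
# configs and tokenizers, as the README notes.
messages = [{"role": "user", "content": "What is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    do_sample=True,
    temperature=0.6,       # sampling temperature from the evaluation setup
    top_p=0.95,            # nucleus sampling threshold from the evaluation setup
    max_new_tokens=32768,  # maximum generation length quoted in the README
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```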
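The evaluation paragraph estimates pass@1 by drawing 64 samples per query and averaging per-sample correctness over all queries. A small sketch of that estimator follows; `is_correct` is a hypothetical stand-in for a benchmark-specific grader (exact match, unit tests, etc.), not anything from the repository.

```python
# Sketch of the pass@1 estimate described in the diff: sample k responses
# per query, score each, and average the per-query accuracy.
from statistics import mean


def is_correct(response: str, reference: str) -> bool:
    # Hypothetical grader: real benchmarks use exact match or unit tests.
    return reference in response


def pass_at_1(responses_per_query: list[list[str]], answers: list[str]) -> float:
    """pass@1 as the fraction of correct samples per query, averaged over queries."""
    per_query = []
    for responses, answer in zip(responses_per_query, answers):
        correct = sum(is_correct(r, answer) for r in responses)  # k = 64 in the README
        per_query.append(correct / len(responses))
    return mean(per_query)


# Tiny usage example with 3 samples per query (the README uses 64):
print(pass_at_1([["408", "409", "the answer is 408"]], ["408"]))  # -> 0.666...
```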