Mirror of https://github.com/deepseek-ai/DeepSeek-R1.git (synced 2025-02-23 06:09:00 -05:00)
Update README.md

commit dbc3a01195 (parent 1108785f81)
@@ -53,6 +53,13 @@ we introduce DeepSeek-R1, which incorporates cold-start data before RL.
 DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
 
 To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
+
+**Key Features**
+- State-of-the-art performance in reasoning tasks
+- Open-source availability of both main models
+- Six dense distilled models based on Llama and Qwen architectures
+- 32,768 token context length support
+- Comprehensive benchmark results across multiple domains
 
 **NOTE: Before running DeepSeek-R1 series models locally, we kindly recommend reviewing the [Usage Recommendation](#usage-recommendations) section.**
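The added NOTE points readers to the usage recommendations before running the models locally. As a concrete illustration, the sketch below loads one of the released distilled checkpoints with the Hugging Face transformers library; the model ID matches the published release, while the sampling settings (temperature 0.6, top-p 0.95, instructions in the user turn rather than a system prompt) follow the README's usage recommendations. The prompt itself is only a placeholder.

```python
# Minimal sketch: running DeepSeek-R1-Distill-Qwen-32B locally with
# Hugging Face transformers. Settings follow the README's usage
# recommendations; the prompt is a placeholder example.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Per the usage recommendations: avoid a system prompt and put all
# instructions in the user message.
messages = [{
    "role": "user",
    "content": "Solve 2x + 3 = 11. Please reason step by step, "
               "and put your final answer within \\boxed{}.",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Temperature in the recommended 0.5-0.7 range (0.6 here), top-p 0.95.
output = model.generate(
    input_ids,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same checkpoint can also be served with vLLM or SGLang, as described in the full README; this sketch only covers a single local generation.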