Mirror of https://github.com/deepseek-ai/DeepSeek-R1.git (synced 2025-04-19 18:18:58 -04:00)
alert
parent ed4409d2a9
commit e2d4c31b6f
@@ -53,7 +53,8 @@ we introduce DeepSeek-R1, which incorporates cold-start data before RL.
 DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
 To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
 
-**NOTE: Before running DeepSeek-R1 series models locally, we kindly recommend reviewing the [Usage Recommendation](#usage-recommendations) section.**
+> [!NOTE]
+> Before running DeepSeek-R1 series models locally, we kindly recommend reviewing the [Usage Recommendation](#usage-recommendations) section.
 
 <p align="center">
 <img width="80%" src="figures/benchmark.jpg">
@@ -180,7 +181,8 @@ We also provide OpenAI-Compatible API at DeepSeek Platform: [platform.deepseek.c
 
 Please visit [DeepSeek-V3](https://github.com/deepseek-ai/DeepSeek-V3) repo for more information about running DeepSeek-R1 locally.
 
-**NOTE: Hugging Face's Transformers has not been directly supported yet.**
+> [!NOTE]
+> Hugging Face's Transformers has not been directly supported yet.
 
 ### DeepSeek-R1-Distill Models
 