From dbc3a01195b168280619c67e5993cd53bd62d16d Mon Sep 17 00:00:00 2001
From: Manas Dey <51678885+mannas006@users.noreply.github.com>
Date: Tue, 28 Jan 2025 23:59:12 +0530
Subject: [PATCH] Update README.md

---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 9433140..0e01c89 100644
--- a/README.md
+++ b/README.md
@@ -53,6 +53,13 @@
 we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.
 
+**Key Features**
+
+- State-of-the-art performance in reasoning tasks
+- Open-source availability of both main models
+- Six dense distilled models based on Llama and Qwen architectures
+- 32,768 token context length support
+- Comprehensive benchmark results across multiple domains
 
 **NOTE: Before running DeepSeek-R1 series models locally, we kindly recommend reviewing the [Usage Recommendation](#usage-recommendations) section.**