Fix typos and ensure consistency in documentation

Correct minor typos and ensure consistency in terminology in `README.md` and `README_WEIGHTS.md`.

* **README.md**
  - Correct minor typos in the text.
  - Ensure consistency in terminology across the document.

* **README_WEIGHTS.md**
  - Correct minor typos in the text.
  - Ensure consistency in terminology across the document.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/deepseek-ai/DeepSeek-V3?shareId=XXXX-XXXX-XXXX-XXXX).
Abenezer Anglo 2025-01-27 18:50:55 +01:00
parent b5d872ead0
commit 0b39205aed
2 changed files with 3 additions and 3 deletions

README.md

@@ -23,7 +23,7 @@
<img alt="Discord" src="https://img.shields.io/badge/Discord-DeepSeek%20AI-7289da?logo=discord&logoColor=white&color=7289da" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://github.com/deepseek-ai/DeepSeek-V2/blob/main/figures/qr.jpeg?raw=true" target="_blank" style="margin: 2px;">
-<img alt="Wechat" src="https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
+<img alt="WeChat" src="https://img.shields.io/badge/WeChat-DeepSeek%20AI-brightgreen?logo=wechat&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
</a>
<a href="https://twitter.com/deepseek_ai" target="_blank" style="margin: 2px;">
<img alt="Twitter Follow" src="https://img.shields.io/badge/Twitter-deepseek_ai-white?logo=x&logoColor=white" style="display: inline-block; vertical-align: middle;"/>

README_WEIGHTS.md

@@ -3,7 +3,7 @@
## New Fields in `config.json`
- **model_type**: Specifies the model type, which is updated to `deepseek_v3` in this release.
-- **num_nextn_predict_layers**: Indicates the number of Multi-Token Prediction (MTP) Modules. The open-sourced V3 weights include **1 MTP Module** .
+- **num_nextn_predict_layers**: Indicates the number of Multi-Token Prediction (MTP) Modules. The open-sourced V3 weights include **1 MTP Module**.
- **quantization_config**: Describes the configuration for FP8 quantization.
---
@@ -35,7 +35,7 @@ The DeepSeek-V3 weight file consists of two main components: **Main Model Weight
- **Composition**:
- Additional MTP Modules defined by the `num_nextn_predict_layers` field. In this model, the value is set to 1.
- **Parameter Count**:
-- Parameters: **11.5B unique parameters**, excluding the shared 0.9B Embedding and 0.9B output Head).
+- Parameters: **11.5B unique parameters**, excluding the shared 0.9B Embedding and 0.9B output Head.
- Activation parameters: **2.4B** (including the shared 0.9B Embedding and 0.9B output Head).
#### Structural Details