mirror of https://github.com/deepseek-ai/DeepSeek-V3.git, synced 2025-04-20 02:28:57 -04:00
docs(readme): improve table formatting and readability
parent 5ee97a83f0
commit fbdd5dcfeb
@@ -104,7 +104,7 @@ Throughout the entire training process, we did not experience any irrecoverable
 <div align="center">
 
 | **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download** |
-| :------------: | :------------: | :------------: | :------------: | :------------: |
+| :--------------: | :---------------: | :-------------------: | :----------------: | :--------------------------------------------------------------------: |
 | DeepSeek-V3-Base | 671B | 37B | 128K | [🤗 Hugging Face](https://huggingface.co/deepseek-ai/DeepSeek-V3-Base) |
 | DeepSeek-V3 | 671B | 37B | 128K | [🤗 Hugging Face](https://huggingface.co/deepseek-ai/DeepSeek-V3) |
 
@@ -125,7 +125,7 @@ For developers looking to dive deeper, we recommend exploring [README_WEIGHTS.md
 
 
 | | Benchmark (Metric) | # Shots | DeepSeek-V2 | Qwen2.5 72B | LLaMA3.1 405B | DeepSeek-V3 |
-|---|-------------------|----------|--------|-------------|---------------|---------|
+| ------------ | --------------------------- | ------- | ----------- | ----------- | ------------- | ----------- |
 | | Architecture | - | MoE | Dense | Dense | MoE |
 | | # Activated Params | - | 21B | 72B | 405B | 37B |
 | | # Total Params | - | 236B | 72B | 405B | 671B |
@@ -180,7 +180,7 @@ Evaluation results on the ``Needle In A Haystack`` (NIAH) tests. DeepSeek-V3 pe
 <div align="center">
 
 | | **Benchmark (Metric)** | **DeepSeek V2-0506** | **DeepSeek V2.5-0905** | **Qwen2.5 72B-Inst.** | **Llama3.1 405B-Inst.** | **Claude-3.5-Sonnet-1022** | **GPT-4o 0513** | **DeepSeek V3** |
-|---|---------------------|---------------------|----------------------|---------------------|----------------------|---------------------------|----------------|----------------|
+| ------- | -------------------------- | -------------------- | ---------------------- | --------------------- | ----------------------- | -------------------------- | --------------- | --------------- |
 | | Architecture | MoE | MoE | Dense | Dense | - | - | MoE |
 | | # Activated Params | 21B | 21B | 72B | 405B | - | - | 37B |
 | | # Total Params | 236B | 236B | 72B | 405B | - | - | 671B |
@@ -220,7 +220,7 @@ Evaluation results on the ``Needle In A Haystack`` (NIAH) tests. DeepSeek-V3 pe
 
 
 | Model | Arena-Hard | AlpacaEval 2.0 |
-|-------|------------|----------------|
+| ---------------------- | ---------- | -------------- |
 | DeepSeek-V2.5-0905 | 76.2 | 50.5 |
 | Qwen2.5-72B-Instruct | 81.2 | 49.1 |
 | LLaMA-3.1 405B | 69.3 | 40.5 |