DeepSeek-V3

mirror of https://github.com/deepseek-ai/DeepSeek-V3.git synced 2025-07-05 07:51:38 -04:00

Gabriel Caetano a7bab5c920 Clean up and optimize Triton FP8 kernels		2025-04-08 22:33:48 -03:00
- Improved readability and structure of Triton kernels for FP8 weight dequantization and matrix multiplication (GEMM)
- Added comments for clarity
- Replaced hardcoded block sizes with configurable parameters
- Improved safety using tl.cdiv and masking
- Renamed variables and ensured consistency in naming
configs	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
convert.py	Enhance documentation and update .gitignore for model conversion scripts	2025-01-05 18:18:18 +00:00
fp8_cast_bf16.py	Enhance documentation and update .gitignore for model conversion scripts	2025-01-05 18:18:18 +00:00
generate.py	Change	2025-01-30 22:47:39 -03:00
kernel.py	Clean up and optimize Triton FP8 kernels	2025-04-08 22:33:48 -03:00
model.py	Updated model.py docstrings	2025-01-05 18:24:31 +00:00
requirements.txt	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00