DeepSeek-V3

mirror of https://github.com/deepseek-ai/DeepSeek-V3.git synced 2025-07-03 23:11:36 -04:00


Gabriel Caetano 61790e1653 Update 2		2025-01-31 19:33:00 -03:00
    Refactored init_distributed function: extracted the distributed setup logic into a separate function.
    Updated sample function: replaced the exponential approach with torch.multinomial for sampling.
    Improved argument validation: replaced the assert in main with a more user-friendly check that at least one of input-file or interactive is provided.
    Refactored interactive mode logic: kept the user interaction logic but moved the init_distributed call to the beginning of main.
configs	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
convert.py	Enhance documentation and update .gitignore for model conversion scripts	2025-01-05 18:18:18 +00:00
fp8_cast_bf16.py	Enhance documentation and update .gitignore for model conversion scripts	2025-01-05 18:18:18 +00:00
generate.py	Change	2025-01-30 22:47:39 -03:00
kernel.py	Update 2	2025-01-31 19:33:00 -03:00
model.py	Updated model.py docstrings	2025-01-05 18:24:31 +00:00
requirements.txt	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
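
The "Update 2" commit message mentions replacing an assert in main with a friendlier check that at least one of input-file or interactive is given. A minimal sketch of that kind of check, using only the standard library, might look like the following. The function name parse_args, the flag defaults, and the error text are assumptions made for illustration and are not taken from generate.py.

```python
import argparse


def parse_args(argv=None):
    """Parse CLI flags and require --input-file or --interactive.

    Hypothetical helper: the flag names follow the commit message,
    everything else is assumed for this sketch.
    """
    parser = argparse.ArgumentParser(description="Illustrative generate-style CLI")
    parser.add_argument("--input-file", type=str, default="")
    parser.add_argument("--interactive", action="store_true")
    args = parser.parse_args(argv)

    # Instead of `assert args.input_file or args.interactive`, report a
    # usage error. parser.error prints the usage line plus the message
    # to stderr and exits with status 2.
    if not args.input_file and not args.interactive:
        parser.error("provide --input-file or --interactive")
    return args


if __name__ == "__main__":
    print(parse_args())
```

Unlike an assert, this check still runs under python -O and gives the user a usage message rather than a traceback.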