Mirror of https://github.com/deepseek-ai/DeepSeek-V3.git, synced 2025-02-23 06:08:58 -05:00
Improvements made to the code:

- Refactored `init_distributed`: extracted the distributed setup logic into a separate function.
- Updated `sample`: replaced the exponential approach with `torch.multinomial` for sampling.
- Improved argument validation: replaced the `assert` in `main` with a more user-friendly check that at least one of `input-file` or `interactive` is provided.
- Refactored interactive mode: kept the user interaction logic unchanged but moved the `init_distributed` call to the beginning of `main`.

| Name |
|---|
| configs |
| convert.py |
| fp8_cast_bf16.py |
| generate.py |
| kernel.py |
| model.py |
| requirements.txt |
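
The argument-validation change described in the commit message can be illustrated with a minimal sketch. This is not the repository's actual code: the `parse_args` helper, the help strings, and the error text are assumptions made for illustration, and only the two option names (`input-file`, `interactive`) come from the commit message. It shows one common way to replace an `assert` with a user-facing error, using the standard library's `argparse`.

```python
import argparse


def parse_args(argv=None):
    """Parse CLI arguments and require at least one input source.

    Hypothetical helper: only the option names come from the commit
    message; everything else is an assumption for illustration.
    """
    parser = argparse.ArgumentParser(description="Sketch of a generation CLI")
    parser.add_argument("--input-file", type=str, default="",
                        help="path to a file containing prompts")
    parser.add_argument("--interactive", action="store_true",
                        help="read prompts interactively")
    args = parser.parse_args(argv)

    # Unlike `assert`, this check is not stripped under `python -O`.
    # parser.error() prints the usage text to stderr and exits with status 2.
    if not args.input_file and not args.interactive:
        parser.error("provide --input-file or --interactive")
    return args
```

Calling `parse_args(["--interactive"])` returns a namespace with `interactive` set to `True`, while calling it with no options exits with a usage message instead of raising a bare `AssertionError`.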