DeepSeek-V3

mirror of https://github.com/deepseek-ai/DeepSeek-V3.git synced 2025-06-30 13:31:34 -04:00

Yang Wang	65d8f5f1e9	Add CUDA cache clearing in memory management. Added torch.cuda.empty_cache() to free up unused memory on the GPU	2024-12-26 23:18:39 +08:00
configs	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
convert.py	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
fp8_cast_bf16.py	Add CUDA cache clearing in memory management	2024-12-26 23:18:39 +08:00
generate.py	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
kernel.py	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
model.py	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
requirements.txt	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00