DeepSeek-V3/inference
Yang Wang 65d8f5f1e9
Add CUDA cache clearing in memory management
Added torch.cuda.empty_cache() to free up unused memory on the GPU,
2024-12-26 23:18:39 +08:00
Name              Last commit                                   Date
configs           Release DeepSeek-V3                           2024-12-26 19:01:57 +08:00
convert.py        Release DeepSeek-V3                           2024-12-26 19:01:57 +08:00
fp8_cast_bf16.py  Add CUDA cache clearing in memory management  2024-12-26 23:18:39 +08:00
generate.py       Release DeepSeek-V3                           2024-12-26 19:01:57 +08:00
kernel.py         Release DeepSeek-V3                           2024-12-26 19:01:57 +08:00
model.py          Release DeepSeek-V3                           2024-12-26 19:01:57 +08:00
requirements.txt  Release DeepSeek-V3                           2024-12-26 19:01:57 +08:00
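
The latest commit message mentions calling torch.cuda.empty_cache() to free unused GPU memory. The sketch below illustrates that call in isolation; it is not the code from fp8_cast_bf16.py. The helper name release_cached_gpu_memory is hypothetical, and the guarded import is an assumption made only so the snippet also runs on machines without PyTorch or without a GPU. Note that torch.cuda.empty_cache() only returns cached blocks that no live tensor occupies; memory held by tensors that are still referenced stays allocated.

```python
import gc

try:
    import torch  # optional here so the sketch still imports without PyTorch
except ImportError:
    torch = None


def release_cached_gpu_memory() -> bool:
    """Hypothetical helper: free unreachable objects, then empty the CUDA cache.

    Returns True if torch.cuda.empty_cache() was called, False otherwise.
    """
    # Collect unreachable Python objects first so that any tensors they
    # hold are released back to PyTorch's caching allocator.
    gc.collect()

    if torch is not None and torch.cuda.is_available():
        # Hand unoccupied cached blocks back to the CUDA driver.
        torch.cuda.empty_cache()
        return True
    return False


if __name__ == "__main__":
    print("CUDA cache emptied:", release_cached_gpu_memory())
```

A helper like this would typically be called after dropping the last reference to a large tensor, for example between files when converting weights in a loop.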