DeepSeek-V3

mirror of https://github.com/deepseek-ai/DeepSeek-V3.git synced 2025-07-18 15:19:09 -04:00


Cristian Cezar Moisés eee820cc36 Update fp8_cast_bf16.py	2025-01-27 23:13:11 -03:00

Type Hints & Path Management:
- Added comprehensive type annotations
- Used pathlib.Path for safer path handling

Enhanced Error Handling:
- Structured exception handling throughout
- Clear error messages with context
- Safe resource cleanup

Memory Management:
- LRU cache implementation with OrderedDict
- Configurable cache size
- Explicit GPU memory cleanup

Logging System:
- Configurable logging levels
- Detailed progress tracking
- Structured error reporting

Code Organization:
- Split into focused, testable functions
- Clear separation of concerns
- Documented public methods

Validation & Safety:
- Input path validation
- Weight type checking
- Clone tensors to prevent reference issues

Performance:
- Optimized file loading with LRU cache
- Batched tensor processing
- Asynchronous CUDA operations

Metadata & Traceability:
- Added conversion metadata to output files
- Preserved original index structure
- Enhanced output index information

Configuration:
- Centralized constants
- Device-aware execution (CUDA/CPU)

Progress Tracking:
- Nested progress bars
- Detailed file processing status
configs	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
convert.py	Update convert.py	2025-01-27 23:10:08 -03:00
fp8_cast_bf16.py	Update fp8_cast_bf16.py	2025-01-27 23:13:11 -03:00
generate.py	Enhance documentation and update .gitignore for model conversion scripts	2025-01-05 18:18:18 +00:00
kernel.py	Enhance documentation and update .gitignore for model conversion scripts	2025-01-05 18:18:18 +00:00
model.py	Updated model.py docstrings	2025-01-05 18:24:31 +00:00
requirements.txt	Release DeepSeek-V3	2024-12-26 19:01:57 +08:00
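
The latest commit message mentions an "LRU cache implementation with OrderedDict" and a "configurable cache size" for file loading. The following is a minimal sketch of that general technique, not the code from fp8_cast_bf16.py. The class name `LRUCache`, the `loader` callable, and the `max_size` parameter are hypothetical names chosen for illustration, and the loader here returns a plain dict rather than real tensor data.

```python
from collections import OrderedDict
from typing import Callable, Generic, Hashable, List, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class LRUCache(Generic[K, V]):
    """Hypothetical least-recently-used cache backed by an OrderedDict."""

    def __init__(self, loader: Callable[[K], V], max_size: int = 2) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._loader = loader        # called on a cache miss
        self._max_size = max_size    # configurable cache size
        self._entries: "OrderedDict[K, V]" = OrderedDict()

    def get(self, key: K) -> V:
        if key in self._entries:
            # Cache hit: mark the entry as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: load the value, then evict the oldest entry if over capacity.
        value = self._loader(key)
        self._entries[key] = value
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)
        return value

    def keys(self) -> List[K]:
        """Return cached keys, least recently used first."""
        return list(self._entries)


if __name__ == "__main__":
    # Stand-in loader (assumption): a real script would read a weight file here.
    cache = LRUCache(lambda name: {"file": name}, max_size=2)
    for name in ["a", "b", "a", "c"]:
        cache.get(name)
    print(cache.keys())  # prints ['a', 'c']
```

With `max_size=2`, requesting "a", "b", "a", "c" evicts "b", because the second request for "a" marked it as more recently used. In a conversion script, evicting an entry would also be the natural point to release any memory held by the evicted file's contents.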