7 Commits

Author SHA1 Message Date
sudopacman
687f06b004
Update requirements.txt
pip does not currently provide 'torch' 2.4.1 or 'triton' 3.0.0, so 'requirements.txt' has been relaxed to minimum versions that pip can install.
2025-02-05 01:53:45 +08:00
enoch kan
bc77f22afc Updated model.py docstrings 2025-01-05 18:24:31 +00:00
enoch kan
a1296f099e Enhance documentation and update .gitignore for model conversion scripts 2025-01-05 18:18:18 +00:00
GeeeekExplorer
fd011c11aa torch rmsnorm 2025-01-05 14:33:48 +08:00
Xingkai Yu
8710ec2ecb require model-parallel in convert.py 2024-12-31 18:05:55 +08:00
Yang Wang
8f1c9488b5
handle missing scale_inv_name (#2)
* handle missing scale_inv_name

Fixed an issue where `weight` and `weight_scale_inv` (e.g. `model.layers.39.mlp.experts.92.gate_proj.weight` and `model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv`) were not in the same SafeTensor, causing an assertion error due to scale_inv_name not being in the state_dict.

* sort filename to reduce memory costs

* Add CUDA cache clearing in memory management

Added torch.cuda.empty_cache() to free up unused memory on the GPU.
2024-12-27 09:34:38 +08:00
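
The three changes described in the commit above (looking up a `weight_scale_inv` that lives in a different shard, visiting shards in sorted filename order, and evicting old shards from memory) can be illustrated with a minimal sketch. This is not the repository's code: `convert_shards`, `shards`, `weight_map`, and `max_cached` are hypothetical names, plain Python dicts and floats stand in for SafeTensor files and tensors, and multiplication stands in for the real dequantization. The toy also treats every weight as quantized, which the real script may not.

```python
from collections import OrderedDict


def convert_shards(shards, weight_map, max_cached=2):
    """Toy conversion loop.

    shards:     hypothetical {file_name: {tensor_name: value}} standing in
                for SafeTensor files on disk.
    weight_map: hypothetical {tensor_name: file_name} index.
    """
    cache = OrderedDict()  # file_name -> loaded shard, oldest first

    def get_tensor(name):
        file_name = weight_map[name]  # raises KeyError if the tensor is not indexed
        if file_name not in cache:
            cache[file_name] = shards[file_name]  # stands in for loading the file
        return cache[file_name][name]

    converted, skipped = {}, []
    # Sorted order keeps neighbouring shards together, so fewer files
    # need to stay loaded at once.
    for file_name in sorted(shards):
        cache[file_name] = shards[file_name]
        for name, weight in shards[file_name].items():
            if name.endswith("_scale_inv"):
                continue  # scales are consumed together with their weight
            try:
                scale_inv = get_tensor(f"{name}_scale_inv")
            except KeyError:
                # Missing scale: keep the weight unchanged instead of asserting.
                skipped.append(name)
                converted[name] = weight
                continue
            converted[name] = weight * scale_inv  # stand-in for dequantization
        # Evict the oldest shards; with real GPU tensors this is where a call
        # such as torch.cuda.empty_cache() would release cached memory.
        while len(cache) > max_cached:
            cache.popitem(last=False)
    return converted, skipped


# The scale for "a.weight" lives in a different shard from the weight itself.
shards = {
    "model-00002.safetensors": {"a.weight_scale_inv": 0.5, "b.weight": 3.0},
    "model-00001.safetensors": {"a.weight": 4.0},
}
weight_map = {name: f for f, tensors in shards.items() for name in tensors}
converted, skipped = convert_shards(shards, weight_map)
```

Here `a.weight` is scaled using the value found in the other shard, while `b.weight` has no matching scale and is passed through and reported in `skipped`.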
stack-heap-overflow
4c2fdb8f55 Release DeepSeek-V3 2024-12-26 19:01:57 +08:00