Commit Graph

3 Commits

Author SHA1 Message Date
Yang Wang
1e3a83629e
handle missing scale_inv_name
Fixed an issue where a `weight` tensor and its `weight_scale_inv` tensor (e.g. `model.layers.39.mlp.experts.92.gate_proj.weight` and `model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv`) were stored in different safetensors files, which caused an assertion error because `scale_inv_name` was not in the `state_dict`.
2024-12-26 23:09:17 +08:00
stack-heap-overflow
4c2fdb8f55
Release DeepSeek-V3
2024-12-26 19:01:57 +08:00
stack-heap-overflow
4b58dc6bfc
Initial commit
2024-12-26 17:52:41 +08:00
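
The situation described in the `handle missing scale_inv_name` commit can be illustrated with a minimal sketch. It is not the code from that commit. The helper name `find_scale_inv`, the `load_shard` callback, and the plain-dict shards are hypothetical stand-ins, and `weight_map` is assumed to map each tensor name to the shard file that stores it, in the style of a sharded checkpoint index. The idea is to look for the scale tensor in the current shard first, fall back to the shard named by the index, and return `None` instead of asserting when it is absent everywhere.

```python
from typing import Any, Callable, Dict, Optional

# Assumed naming convention: the scale tensor is "<weight name>_scale_inv".
SCALE_SUFFIX = "_scale_inv"


def find_scale_inv(
    weight_name: str,
    current_shard: Dict[str, Any],
    weight_map: Dict[str, str],
    load_shard: Callable[[str], Dict[str, Any]],
) -> Optional[Any]:
    """Return the scale tensor for weight_name, or None if it is not found.

    current_shard: tensors of the shard being processed (name -> tensor).
    weight_map:    hypothetical index (tensor name -> shard file name).
    load_shard:    hypothetical callback that loads a shard by file name.
    """
    scale_name = weight_name + SCALE_SUFFIX

    # Common case: weight and scale live in the same shard.
    if scale_name in current_shard:
        return current_shard[scale_name]

    # Otherwise ask the index which shard holds the scale tensor.
    shard_file = weight_map.get(scale_name)
    if shard_file is None:
        # Not present anywhere: let the caller decide (skip, warn, ...).
        return None
    return load_shard(shard_file).get(scale_name)


# Toy data: floats stand in for tensors, dicts stand in for shard files.
shards = {
    "shard-a": {"model.layers.39.mlp.experts.92.gate_proj.weight": 1.0},
    "shard-b": {"model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv": 0.5},
}
weight_map = {name: f for f, tensors in shards.items() for name in tensors}

scale = find_scale_inv(
    "model.layers.39.mlp.experts.92.gate_proj.weight",
    shards["shard-a"],
    weight_map,
    shards.__getitem__,
)
```

Returning `None` rather than raising leaves the policy to the caller, which can skip the tensor, log a warning, or fail loudly as appropriate.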