DeepSeek-V3/inference
Yang Wang 1e3a83629e
handle missing scale_inv_name
Fixed an assertion error raised when a `weight` and its `weight_scale_inv` (e.g. `model.layers.39.mlp.experts.92.gate_proj.weight` and `model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv`) were stored in different safetensors files, so `scale_inv_name` was not in the state_dict of the file being processed.
2024-12-26 23:09:17 +08:00
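The failure mode described in the commit message can be illustrated with a small sketch. It is not the code in `fp8_cast_bf16.py`: the names `find_tensor` and `dequantize_shard`, the dict-based shards, and the scalar "tensors" are all hypothetical stand-ins, and plain multiplication is only a placeholder for the real dequantization. The assumption is that a name-to-file index lets the scale be looked up in any shard, and that a weight whose scale cannot be found is passed through unchanged instead of triggering an assertion.

```python
from typing import Dict, Optional

# Hypothetical stand-in: a shard maps tensor names to scalar "tensors".
Shard = Dict[str, float]


def find_tensor(name: str,
                weight_map: Dict[str, str],
                shards: Dict[str, Shard]) -> Optional[float]:
    """Resolve a tensor through the index, so it may live in any shard."""
    shard_name = weight_map.get(name)
    if shard_name is None:
        return None
    return shards[shard_name].get(name)


def dequantize_shard(current: str,
                     weight_map: Dict[str, str],
                     shards: Dict[str, Shard]) -> Shard:
    """Convert one shard, tolerating scales stored elsewhere or missing."""
    out: Shard = {}
    for name, value in shards[current].items():
        if name.endswith("_scale_inv"):
            continue  # scales are consumed during conversion, not emitted
        if name.endswith(".weight"):
            scale_inv = find_tensor(name + "_scale_inv", weight_map, shards)
            if scale_inv is None:
                # No scale in any shard: keep the weight as-is (assumption).
                out[name] = value
            else:
                # Placeholder for the real dequantization step.
                out[name] = value * scale_inv
        else:
            out[name] = value
    return out


# The weight and its scale sit in different shards, as in the bug report.
shards = {
    "a.safetensors": {"w.weight": 2.0},
    "b.safetensors": {"w.weight_scale_inv": 0.5, "norm.weight": 1.0},
}
weight_map = {name: fname for fname, shard in shards.items() for name in shard}
converted = dequantize_shard("a.safetensors", weight_map, shards)
```

Looking the scale up only in the shard currently being converted would fail for `w.weight` here, which is the situation the commit message reports.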
configs Release DeepSeek-V3 2024-12-26 19:01:57 +08:00
convert.py Release DeepSeek-V3 2024-12-26 19:01:57 +08:00
fp8_cast_bf16.py handle missing scale_inv_name 2024-12-26 23:09:17 +08:00
generate.py Release DeepSeek-V3 2024-12-26 19:01:57 +08:00
kernel.py Release DeepSeek-V3 2024-12-26 19:01:57 +08:00
model.py Release DeepSeek-V3 2024-12-26 19:01:57 +08:00
requirements.txt Release DeepSeek-V3 2024-12-26 19:01:57 +08:00