update readme

ZihanWang314 2025-05-21 21:48:26 +00:00
parent 579a7711e4
commit 98fd21ce21

@@ -55,23 +55,26 @@ python eval_multigpu.py \
 This script calculates the scores for each expert based on the evaluation datasets.
 **Usage:**
 ```bash
+export PYTHONPATH=$PYTHONPATH:$(pwd)
 python scripts/expert/get_expert_scores.py \
---eval_dataset=translation \
+--eval_dataset=intent \
 --base_model_path=deepseek-ai/ESFT-vanilla-lite \
---output_dir=results/expert_scores/translation \
+--output_dir=results/expert_scores/intent \
 --n_sample_tokens=131072 \
 --world_size=4 \
 --gpus_per_rank=2
+# for N gpus, world_size should be N / gpus_per_rank
 ```
 3. **generate_expert_config.py**
 This script generates the configuration to convert a MoE model with only task-relevant tasks trained based on evaluation scores.
 **Usage:**
 ```bash
+export PYTHONPATH=$PYTHONPATH:$(pwd)
 python scripts/expert/generate_expert_config.py \
---eval_datasets=intent,summary,law,translation \
---expert_scores_dir=results/expert_scores \
---output_dir=results/expert_configs \
+--eval_dataset=intent \
+--expert_scores_dir=results/expert_scores/intent \
+--output_path=results/expert_configs/intent.json \
 --score_function=token \
 --top_p=0.2 # the scoring function and top_p are hyperparameters
 ```
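
The comment added in this commit (for N GPUs, `world_size` should be N / `gpus_per_rank`) can be checked with a small shell sketch. `compute_world_size` is a hypothetical helper, not something the repository provides, and the requirement that the GPU count divide evenly by `gpus_per_rank` is an assumption the README does not state explicitly.

```shell
# Hypothetical helper (not part of the repository): derive --world_size
# from the total GPU count N and --gpus_per_rank.
compute_world_size() {
  n_gpus=$1
  gpus_per_rank=$2
  # Assumption: N must be a positive multiple of gpus_per_rank.
  if [ "$n_gpus" -le 0 ] || [ "$gpus_per_rank" -le 0 ] \
     || [ $(( n_gpus % gpus_per_rank )) -ne 0 ]; then
    echo "n_gpus must be a positive multiple of gpus_per_rank" >&2
    return 1
  fi
  echo $(( n_gpus / gpus_per_rank ))
}

# 8 GPUs with 2 GPUs per rank gives the values used in the README example.
WORLD_SIZE=$(compute_world_size 8 2)
echo "--world_size=${WORLD_SIZE} --gpus_per_rank=2"
# prints: --world_size=4 --gpus_per_rank=2
```

With 4 GPUs and the same `gpus_per_rank=2`, the helper would give `--world_size=2`.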