update readme

2025-05-23 02:37:07 -04:00 · 2025-05-21 21:48:26 +00:00 · 98fd21ce21
commit 98fd21ce21
parent 579a7711e4
1 changed file with 8 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -55,23 +55,26 @@ python eval_multigpu.py \
 This script calculates the scores for each expert based on the evaluation datasets.
 **Usage:**
 ```bash
 export PYTHONPATH=$PYTHONPATH:$(pwd)
 python scripts/expert/get_expert_scores.py \
-    --eval_dataset=translation \
+    --eval_dataset=intent \
    --base_model_path=deepseek-ai/ESFT-vanilla-lite \
-    --output_dir=results/expert_scores/translation \
+    --output_dir=results/expert_scores/intent \
    --n_sample_tokens=131072 \
    --world_size=4 \
    --gpus_per_rank=2
    # for N gpus, world_size should be N / gpus_per_rank
 ```
 3. **generate_expert_config.py**
 This script generates the configuration to convert a MoE model so that only task-relevant experts, selected by their evaluation scores, are trained.
 **Usage:**
 ```bash
 export PYTHONPATH=$PYTHONPATH:$(pwd)
 python scripts/expert/generate_expert_config.py \
-    --eval_datasets=intent,summary,law,translation \
+    --eval_dataset=intent \
-    --expert_scores_dir=results/expert_scores \
+    --expert_scores_dir=results/expert_scores/intent \
-    --output_dir=results/expert_configs \
+    --output_path=results/expert_configs/intent.json \
    --score_function=token \
    --top_p=0.2 # the scoring function and top_p are hyperparameters
 ```