Commit Graph

  • 9c370c0638 Merge 2b3c49c9cb into b8b0f8ce09 #41 jayesh thakare 2025-06-19 10:37:50 -0400
  • 2b3c49c9cb Add commented options for output attentions and hidden states in DeepSeekMathConfig #41 jayeshthk 2025-06-19 17:46:50 +0530
  • 735546a4f9 Add List import to model.py for type hinting jayeshthk 2025-06-19 17:33:52 +0530
  • b53e984052 Implement DeepSeek-Math model and training pipeline with dataset handling and distributed training support jayeshthk 2025-06-19 17:27:05 +0530
  • c77291ddff Merge ddf18bb444 into b8b0f8ce09 #17 Dylancer 2024-04-15 10:18:15 -0500
  • b8b0f8ce09 Update summarize_results.py main Zhihong Shao 2024-04-15 15:55:36 +0800
  • ddf18bb444 [fixed] the merging output is incorrect, when parallel_num=1 #17 Dylancer1998 2024-04-11 09:30:32 +0000
  • 7c34ad4fa4 Merge pull request #6 from chenxwh/main Daya Guo 2024-02-19 17:05:02 +0800
  • a0fdfa2682 replicate #6 chenxwh 2024-02-12 21:30:24 +0000
  • 555ba27526 repliate chenxwh 2024-02-11 14:51:06 +0000
  • 32b2faf06e replicate chenxwh 2024-02-11 14:45:44 +0000
  • db877abb91 update submit_eval_jobs ZhihongShao 2024-02-09 15:23:28 +0800
  • 83d3c4cc6a update instruct_results.png ZhihongShao 2024-02-07 02:44:27 +0800
  • 4efb75d131 Update README.md ZHU QIHAO 2024-02-06 19:24:28 +0800
  • a36f67d7d8 Update README.md ZHU QIHAO 2024-02-06 19:19:55 +0800
  • 21cc5c6701 init ZhihongShao 2024-02-06 10:27:40 +0800