jayesh thakare | 9c370c0638 | Merge 2b3c49c9cb into b8b0f8ce09 | 2025-06-19 10:37:50 -04:00
jayeshthk | 2b3c49c9cb | Add commented options for output attentions and hidden states in DeepSeekMathConfig | 2025-06-19 17:46:50 +05:30
jayeshthk | 735546a4f9 | Add List import to model.py for type hinting | 2025-06-19 17:33:52 +05:30
jayeshthk | b53e984052 | Implement DeepSeek-Math model and training pipeline with dataset handling and distributed training support | 2025-06-19 17:27:05 +05:30
Zhihong Shao | b8b0f8ce09 | Update summarize_results.py | 2024-04-15 15:55:36 +08:00
Daya Guo | 7c34ad4fa4 | Merge pull request #6 from chenxwh/main: Add Replicate demo and API | 2024-02-19 17:05:02 +08:00
chenxwh | a0fdfa2682 | replicate | 2024-02-12 21:30:24 +00:00
chenxwh | 555ba27526 | repliate | 2024-02-11 14:51:06 +00:00
chenxwh | 32b2faf06e | replicate | 2024-02-11 14:45:44 +00:00
ZhihongShao | db877abb91 | update submit_eval_jobs | 2024-02-09 15:23:28 +08:00
ZhihongShao | 83d3c4cc6a | update instruct_results.png | 2024-02-07 02:44:27 +08:00
ZHU QIHAO | 4efb75d131 | Update README.md | 2024-02-06 19:24:28 +08:00
ZHU QIHAO | a36f67d7d8 | Update README.md | 2024-02-06 19:19:55 +08:00
ZhihongShao | 21cc5c6701 | init | 2024-02-06 10:27:40 +08:00