Mirror of https://github.com/deepseek-ai/awesome-deepseek-integration.git, synced 2025-02-23 06:09:02 -05:00
Curator
Curator is an open-source tool for curating large-scale datasets for post-training LLMs.
Curator was used to curate Bespoke-Stratos-17k, a reasoning dataset used to train the fully open reasoning model Bespoke-Stratos.
Curator supports:
- Calling the DeepSeek API for scalable synthetic data curation
- Easy structured data extraction
- Caching and automatic recovery
- Dataset visualization
- Saving money by using batch mode
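
To make the "caching and automatic recovery" item concrete, here is a minimal, self-contained sketch of the general idea: each response is written to disk as soon as it arrives, so a rerun skips prompts that already finished. This is not Curator's implementation or API. The names `curate`, `load_cache`, `cache_key`, and `fake_generate` are hypothetical, and `fake_generate` stands in for a real model call.

```python
import hashlib
import json
import os
import tempfile


def cache_key(prompt: str) -> str:
    """Stable key for a prompt: SHA-256 of its UTF-8 bytes."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def load_cache(path: str) -> dict:
    """Read previously saved responses from a JSONL file, if it exists."""
    cache = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                cache[record["key"]] = record["response"]
    return cache


def curate(prompts, generate, cache_path):
    """Return one response per prompt, calling `generate` only on cache misses."""
    cache = load_cache(cache_path)
    results = []
    with open(cache_path, "a", encoding="utf-8") as f:
        for prompt in prompts:
            key = cache_key(prompt)
            if key not in cache:
                cache[key] = generate(prompt)
                # Persist immediately so an interrupted run can resume here.
                f.write(json.dumps({"key": key, "response": cache[key]}) + "\n")
                f.flush()
            results.append(cache[key])
    return results


calls = []


def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a model API call; records each invocation."""
    calls.append(prompt)
    return prompt.upper()


cache_file = os.path.join(tempfile.mkdtemp(), "responses.jsonl")
first = curate(["a", "b"], fake_generate, cache_file)
second = curate(["a", "b", "c"], fake_generate, cache_file)
```

In this sketch the second call reuses the stored answers for `"a"` and `"b"` and only invokes `fake_generate` for `"c"`, which is the behavior that keeps repeated or resumed runs from paying for the same request twice.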