awesome-deepseek-integration/docs/curator/README.md
Shreyas Pimpalgaonkar 1547c531a2 add curator
2025-01-27 11:11:52 -08:00

1.1 KiB

image

Curator

Curator is an open-source tool to curate large scale datasets for post-training LLMs.

Curator was used to curate Bespoke-Stratos-17k, a reasoning dataset to train a fully open reasoning model Bespoke-Stratos.

Curator supports:

  • Calling Deepseek API for scalable synthetic data curation
  • Easy structured data extraction
  • Caching and automatic recovery
  • Dataset visualization
  • Saving $ using batch mode

Call Deepseek API with Curator easily:

image

Get Started here