# promptfoo
[promptfoo](https://promptfoo.dev) is an open-source framework for testing and evaluating LLM outputs. It helps you compare DeepSeek models with other LLMs (such as o1, GPT-4o, Claude 3.5, Llama 3.3, and Gemini) and test LLMs and LLM applications for security vulnerabilities. You can:
- Run side-by-side comparisons between models
- Check output quality and consistency
- Generate test reports
## Setup
1. Install promptfoo:
```bash
npm install -g promptfoo
# or
brew install promptfoo
```
2. Configure API keys:
```bash
export DEEPSEEK_API_KEY=your_api_key
# Add keys for other providers as needed, e.g. OPENAI_API_KEY for openai:o1
```
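A small preflight check can catch a missing key before any requests are sent. The sketch below is an assumption, not part of promptfoo: `require_env` is a hypothetical helper, and `OPENAI_API_KEY` is included only because the Quick Start config also uses `openai:o1`.

```bash
# Hypothetical helper (not a promptfoo command): fail if a variable is unset or empty.
require_env() {
  # POSIX-compatible indirection: read the variable whose name is in $1.
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "Missing required environment variable: $1" >&2
    return 1
  fi
}

# Check every key the providers in your config need.
for key in DEEPSEEK_API_KEY OPENAI_API_KEY; do
  require_env "$key" || echo "Set $key before running promptfoo eval"
done
```

Adjust the list of variable names to match the providers you actually use.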
## Quick Start
Create a configuration file `promptfooconfig.yaml`:
```yaml
providers:
  - deepseek:deepseek-reasoner # DeepSeek-R1
  - openai:o1
prompts:
  - 'Solve this step by step: {{math_problem}}'
tests:
  - vars:
      math_problem: 'What is the derivative of x^3 + 2x with respect to x?'
    assert:
      - type: contains
        value: '3x^2' # Check for correct answer
      - type: llm-rubric
        value: 'Response shows clear steps'
      - type: cost
        threshold: 0.05 # Maximum cost per request
```
Run tests:
```bash
promptfoo eval
```
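For scripted or CI runs it can help to fail fast and keep a copy of the results. The wrapper below is a sketch under assumptions: `run_eval` and `results.json` are illustrative names chosen here, while `-c` (config path) and `-o` (output file) are options of `promptfoo eval`.

```bash
# Hypothetical wrapper: validate the environment, then run the evaluation.
run_eval() {
  config="${1:-promptfooconfig.yaml}"

  # Stop early if the CLI is not installed.
  if ! command -v promptfoo >/dev/null 2>&1; then
    echo "promptfoo not found on PATH; install it first" >&2
    return 127
  fi

  # Stop early if the config file does not exist.
  if [ ! -f "$config" ]; then
    echo "Config file not found: $config" >&2
    return 2
  fi

  # Run the tests and also write the results to a JSON file (illustrative name).
  promptfoo eval -c "$config" -o results.json
}
```

Call `run_eval` for the default config, or `run_eval path/to/config.yaml` for a different file.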
View results in your browser:
```bash
promptfoo view
```
## Example Project
Check out our [example](https://github.com/promptfoo/promptfoo/tree/main/examples/deepseek-r1-vs-openai-o1), which compares DeepSeek-R1 and OpenAI o1 on MMLU.
## Resources
- [Documentation](https://promptfoo.dev/docs/providers/deepseek)
- [GitHub Repository](https://github.com/promptfoo/promptfoo)
- [Community Discord](https://discord.gg/promptfoo)