# promptfoo

[promptfoo](https://promptfoo.dev) is an open-source framework for testing and evaluating LLM outputs. It lets you compare DeepSeek models with other LLMs (such as o1, GPT-4o, Claude 3.5, Llama 3.3, and Gemini) and test LLMs and LLM applications for security vulnerabilities. You can:

- Run side-by-side comparisons between models
- Check output quality and consistency
- Generate test reports

## Setup

1. Install promptfoo:

```bash
npm install -g promptfoo
# or
brew install promptfoo
```

2. Configure API keys:

```bash
export DEEPSEEK_API_KEY=your_api_key
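# Add keys for any other providers you use, e.g. for the openai:o1 provider in the Quick Start:
export OPENAI_API_KEY=your_openai_api_key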
```
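
A missing key usually only surfaces once a provider call fails, so it can help to check the environment first. The following is a minimal sketch, not part of promptfoo: `check_keys` is a hypothetical helper, and `OPENAI_API_KEY` is assumed here only because the Quick Start config uses the `openai:o1` provider.

```bash
# check_keys is a hypothetical helper, not a promptfoo command.
# It prints the name of every listed variable that is unset or empty
# and returns non-zero if at least one is missing.
check_keys() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_keys DEEPSEEK_API_KEY OPENAI_API_KEY && promptfoo eval` would run the evaluation only when both variables are set.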

## Quick Start

Create a configuration file `promptfooconfig.yaml`:

```yaml
providers:
  - deepseek:deepseek-reasoner # DeepSeek-R1
  - openai:o1

prompts:
  - 'Solve this step by step: {{math_problem}}'

tests:
  - vars:
      math_problem: 'What is the derivative of x^3 + 2x with respect to x?'
    assert:
      - type: contains
        value: '3x^2' # Check for correct answer
      - type: llm-rubric
        value: 'Response shows clear steps'
      - type: cost
        threshold: 0.05 # Maximum cost per request
```
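
The `contains` assertion above is a case-sensitive substring match, so it depends on how each model formats its answer. The sketch below approximates that check in plain shell to show one way it can fail; `output_contains` and the two sample outputs are hypothetical and are not taken from promptfoo or from real model responses.

```bash
# output_contains is a hypothetical stand-in for a case-sensitive
# substring check; it is not promptfoo's implementation.
output_contains() {
  # $1 = model output, $2 = expected substring
  case "$1" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Plain-text answer: the substring is present.
output_contains 'The derivative is 3x^2 + 2.' '3x^2' && echo "plain text: pass"

# LaTeX-style answer: the braces break the literal match.
output_contains 'The derivative is $3x^{2} + 2$.' '3x^2' || echo "LaTeX braces: fail"
```

If a model tends to answer in LaTeX, a looser assertion type from the promptfoo documentation may be a better fit than an exact substring.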

Run tests:

```bash
promptfoo eval
```
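
To keep a copy of the results, for example as a report to share, the evaluation can also be written to a file with the `-o` option. This is a small sketch: `run_eval` is a hypothetical wrapper, and the default file names are assumptions chosen to match the Quick Start config.

```bash
# run_eval is a hypothetical wrapper, not a promptfoo command.
run_eval() {
  config="${1:-promptfooconfig.yaml}"   # config file to evaluate
  output="${2:-results.json}"           # file to write the results to
  if ! command -v promptfoo >/dev/null 2>&1; then
    echo "promptfoo not found; install it first" >&2
    return 127
  fi
  promptfoo eval -c "$config" -o "$output"
}
```

Calling `run_eval` with no arguments evaluates `promptfooconfig.yaml` and writes `results.json`; passing two arguments overrides both names.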

View results in your browser:

```bash
promptfoo view
```

## Example Project

See the [example](https://github.com/promptfoo/promptfoo/tree/main/examples/deepseek-r1-vs-openai-o1) that compares DeepSeek-R1 and OpenAI o1 on the MMLU benchmark.

## Resources

- [Documentation](https://promptfoo.dev/docs/providers/deepseek)
- [GitHub Repository](https://github.com/promptfoo/promptfoo)
- [Community Discord](https://discord.gg/promptfoo)