mirror of
https://github.com/deepseek-ai/awesome-deepseek-integration.git
synced 2025-02-23 06:09:02 -05:00
feat(integration): add promptfoo LLM testing framework
Add promptfoo to the awesome-deepseek-integration library with: - English and Chinese documentation - Basic setup and usage guides - Example configuration for DeepSeek model testing - Integration entry in both README.md and README_cn.md
This commit is contained in:
parent
eed78498c8
commit
25e2ae3f20
@ -323,4 +323,9 @@ English/[简体中文](https://github.com/deepseek-ai/awesome-deepseek-integrati
|
|||||||
<td> <a href="https://geneplore.com/bot"> Geneplore AI </a> </td>
|
<td> <a href="https://geneplore.com/bot"> Geneplore AI </a> </td>
|
||||||
<td> Geneplore AI runs one of the largest AI Discord bots, now with Deepseek v3 and R1. </td>
|
<td> Geneplore AI runs one of the largest AI Discord bots, now with Deepseek v3 and R1. </td>
|
||||||
</tr>
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td> <img src="https://www.promptfoo.dev/img/logo-panda.svg" alt="Icon" width="64" height="auto" /> </td>
|
||||||
|
<td> <a href="docs/promptfoo/README.md"> promptfoo </a> </td>
|
||||||
|
<td> Test and evaluate LLM prompts, including DeepSeek models. Compare different LLM providers, catch regressions, and evaluate responses. </td>
|
||||||
|
</tr>
|
||||||
</table>
|
</table>
|
@ -100,7 +100,7 @@
|
|||||||
<tr>
|
<tr>
|
||||||
<td> <img src="https://github.com/LiberSonora/LiberSonora/blob/main/assets/avatar.jpeg?raw=true" alt="Icon" width="64" height="auto" /> </td>
|
<td> <img src="https://github.com/LiberSonora/LiberSonora/blob/main/assets/avatar.jpeg?raw=true" alt="Icon" width="64" height="auto" /> </td>
|
||||||
<td> <a href="https://github.com/LiberSonora/LiberSonora">LiberSonora</a> </td>
|
<td> <a href="https://github.com/LiberSonora/LiberSonora">LiberSonora</a> </td>
|
||||||
<td> LiberSonora,寓意“自由的声音”,是一个 AI 赋能的、强大的、开源有声书工具集,包含智能字幕提取、AI标题生成、多语言翻译等功能,支持 GPU 加速、批量离线处理</td>
|
<td> LiberSonora,寓意"自由的声音",是一个 AI 赋能的、强大的、开源有声书工具集,包含智能字幕提取、AI标题生成、多语言翻译等功能,支持 GPU 加速、批量离线处理</td>
|
||||||
</tr>
|
</tr>
|
||||||
<tr>
|
<tr>
|
||||||
<td> <img src="https://raw.githubusercontent.com/ripperhe/Bob/master/docs/_media/icon_128.png" alt="Icon" width="64" height="auto" /> </td>
|
<td> <img src="https://raw.githubusercontent.com/ripperhe/Bob/master/docs/_media/icon_128.png" alt="Icon" width="64" height="auto" /> </td>
|
||||||
@ -242,4 +242,9 @@
|
|||||||
<td> <a href="https://github.com/rubickecho/n8n-deepseek"> n8n-nodes-deepseek </a> </td>
|
<td> <a href="https://github.com/rubickecho/n8n-deepseek"> n8n-nodes-deepseek </a> </td>
|
||||||
<td> 一个 N8N 的社区节点,支持直接使用 DeepSeek API 集成到工作流中 </td>
|
<td> 一个 N8N 的社区节点,支持直接使用 DeepSeek API 集成到工作流中 </td>
|
||||||
</tr>
|
</tr>
|
||||||
|
<tr>
|
||||||
|
<td> <img src="https://www.promptfoo.dev/img/logo-panda.svg" alt="Icon" width="64" height="auto" /> </td>
|
||||||
|
<td> <a href="docs/promptfoo/README.md"> promptfoo </a> </td>
|
||||||
|
<td> 测试和评估LLM提示,包括DeepSeek模型。比较不同的LLM提供商,捕获回归,并评估响应。 </td>
|
||||||
|
</tr>
|
||||||
</table>
|
</table>
|
||||||
|
70
docs/promptfoo/README.md
Normal file
70
docs/promptfoo/README.md
Normal file
@ -0,0 +1,70 @@
|
|||||||
|
# promptfoo
|
||||||
|
|
||||||
|
[promptfoo](https://promptfoo.dev) is an open-source framework for testing and evaluating LLM outputs. It helps you compare DeepSeek models with other LLMs (like o1, GPT-4o, Claude 3.5, Llama3.3, and Gemini) and test LLMs and LLM applications for security vulnerabilities. You can:
|
||||||
|
|
||||||
|
- Run side-by-side comparisons between models
|
||||||
|
- Check output quality and consistency
|
||||||
|
- Generate test reports
|
||||||
|
|
||||||
|
## Setup
|
||||||
|
|
||||||
|
1. Install promptfoo:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
npm install -g promptfoo
|
||||||
|
# or
|
||||||
|
brew install promptfoo
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Configure API keys:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export DEEPSEEK_API_KEY=your_api_key
|
||||||
|
# Add other API keys as needed
|
||||||
|
```
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
Create a configuration file `promptfooconfig.yaml`:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
providers:
|
||||||
|
- deepseek:deepseek-reasoner # DeepSeek-R1
|
||||||
|
- openai:o1
|
||||||
|
|
||||||
|
prompts:
|
||||||
|
- 'Solve this step by step: {{math_problem}}'
|
||||||
|
|
||||||
|
tests:
|
||||||
|
- vars:
|
||||||
|
math_problem: 'What is the derivative of x^3 + 2x with respect to x?'
|
||||||
|
assert:
|
||||||
|
- type: contains
|
||||||
|
value: '3x^2' # Check for correct answer
|
||||||
|
- type: llm-rubric
|
||||||
|
value: 'Response shows clear steps'
|
||||||
|
- type: cost
|
||||||
|
threshold: 0.05 # Maximum cost per request
|
||||||
|
```
|
||||||
|
|
||||||
|
Run tests:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
promptfoo eval
|
||||||
|
```
|
||||||
|
|
||||||
|
View results in your browser:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
promptfoo view
|
||||||
|
```
|
||||||
|
|
||||||
|
## Example Project
|
||||||
|
|
||||||
|
Check out our [example](https://github.com/promptfoo/promptfoo/tree/main/examples/deepseek-r1-vs-openai-o1) that compares r1 and o1 on MMLU.
|
||||||
|
|
||||||
|
## Resources
|
||||||
|
|
||||||
|
- [Documentation](https://promptfoo.dev/docs/providers/deepseek)
|
||||||
|
- [GitHub Repository](https://github.com/promptfoo/promptfoo)
|
||||||
|
- [Community Discord](https://discord.gg/promptfoo)
|
70
docs/promptfoo/README_cn.md
Normal file
70
docs/promptfoo/README_cn.md
Normal file
@ -0,0 +1,70 @@
|
|||||||
|
# promptfoo
|
||||||
|
|
||||||
|
[promptfoo](https://promptfoo.dev) 是一个开源框架,用于测试和评估 LLM 输出。它可以帮助您将 DeepSeek 模型与其他 LLM(如 o1、GPT-4o、Claude 3.5、Llama 3.3 和 Gemini)进行比较,并测试 LLM 及其应用的安全漏洞。您可以:
|
||||||
|
|
||||||
|
- 对不同模型进行并排比较
|
||||||
|
- 检查输出质量和一致性
|
||||||
|
- 生成测试报告
|
||||||
|
|
||||||
|
## 安装设置
|
||||||
|
|
||||||
|
1. 安装 promptfoo:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
npm install -g promptfoo
|
||||||
|
# 或者使用 brew
|
||||||
|
brew install promptfoo
|
||||||
|
```
|
||||||
|
|
||||||
|
2. 配置 API 密钥:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
export DEEPSEEK_API_KEY=your_api_key
|
||||||
|
# 根据需要添加其他 API 密钥
|
||||||
|
```
|
||||||
|
|
||||||
|
## 快速开始
|
||||||
|
|
||||||
|
创建配置文件 `promptfooconfig.yaml`:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
providers:
|
||||||
|
- deepseek:deepseek-reasoner # DeepSeek-R1
|
||||||
|
- openai:o1
|
||||||
|
|
||||||
|
prompts:
|
||||||
|
- '请逐步解决这个问题:{{math_problem}}'
|
||||||
|
|
||||||
|
tests:
|
||||||
|
- vars:
|
||||||
|
math_problem: '求 x^3 + 2x 对 x 的导数'
|
||||||
|
assert:
|
||||||
|
- type: contains
|
||||||
|
value: '3x^2' # 检查正确答案
|
||||||
|
- type: llm-rubric
|
||||||
|
value: '回答需要展示清晰的步骤'
|
||||||
|
- type: cost
|
||||||
|
threshold: 0.05 # 每次请求的最大成本
|
||||||
|
```
|
||||||
|
|
||||||
|
运行测试:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
promptfoo eval
|
||||||
|
```
|
||||||
|
|
||||||
|
在浏览器中查看结果:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
promptfoo view
|
||||||
|
```
|
||||||
|
|
||||||
|
## 示例项目
|
||||||
|
|
||||||
|
查看我们的[示例](https://github.com/promptfoo/promptfoo/tree/main/examples/deepseek-r1-vs-openai-o1),展示了 r1 和 o1 在 MMLU 上的比较。
|
||||||
|
|
||||||
|
## 资源
|
||||||
|
|
||||||
|
- [文档](https://promptfoo.dev/docs/providers/deepseek)
|
||||||
|
- [GitHub 仓库](https://github.com/promptfoo/promptfoo)
|
||||||
|
- [社区 Discord](https://discord.gg/promptfoo)
|
Loading…
Reference in New Issue
Block a user