deepseekmirror/awesome-deepseek-integration

mirror of https://github.com/deepseek-ai/awesome-deepseek-integration.git synced 2025-02-23 06:09:02 -05:00

Michael D'Angelo 25e2ae3f20 feat(integration): add promptfoo LLM testing framework

Add promptfoo to the awesome-deepseek-integration library with:
- English and Chinese documentation
- Basic setup and usage guides
- Example configuration for DeepSeek model testing
- Integration entry in both README.md and README_cn.md

2025-01-25 23:33:01 -08:00


promptfoo

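promptfoo is an open-source framework for testing and evaluating LLM outputs. It helps you compare DeepSeek models with other LLMs (such as o1, GPT-4o, Claude 3.5, Llama 3.3, and Gemini) and test LLMs and their applications for security vulnerabilities. You can:

- Compare different models side by side
- Check output quality and consistency
- Generate test reports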

Setup

Install promptfoo:

npm install -g promptfoo
# or with brew
brew install promptfoo

Configure your API keys:

export DEEPSEEK_API_KEY=your_api_key
# Add other API keys as needed
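
A missing key typically only surfaces once an evaluation is already running, so it can help to check the environment first. The following is a minimal sketch; `require_env` is a hypothetical helper written for this example and is not part of promptfoo.

```shell
# Hypothetical helper (not part of promptfoo): fail early when a
# required environment variable is unset or empty.
require_env() {
  var_name="$1"
  # Indirect variable lookup that also works in plain POSIX sh.
  eval "value=\${$var_name:-}"
  if [ -z "$value" ]; then
    echo "Missing environment variable: $var_name" >&2
    return 1
  fi
}

# Example usage: only start the evaluation when the key is present.
# require_env DEEPSEEK_API_KEY && promptfoo eval
```

The same check can be repeated for any other provider key the configuration relies on.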

Quick Start

Create a configuration file named promptfooconfig.yaml:

providers:
  - deepseek:deepseek-reasoner # DeepSeek-R1
  - openai:o1

prompts:
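  - 'Solve this problem step by step: {{math_problem}}'

tests:
  - vars:
      math_problem: 'Find the derivative of x^3 + 2x with respect to x'
    assert:
      - type: contains
        value: '3x^2' # Check for the correct answer
      - type: llm-rubric
        value: 'The response should show clear steps'
      - type: cost
        threshold: 0.05 # Maximum cost per request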
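
To make the `contains` assertion concrete, here is a rough shell analogue of a literal substring check. It is an illustration only, not promptfoo's implementation; `output_contains` and the sample answer are made up for this sketch.

```shell
# Rough analogue of a `contains` assertion: succeed when the expected
# substring appears literally in the model output.
# Illustration only, not promptfoo's implementation.
output_contains() {
  output="$1"
  expected="$2"
  case "$output" in
    *"$expected"*) return 0 ;;  # quoted pattern is matched literally
    *) return 1 ;;
  esac
}

# Hypothetical model answer used for illustration.
answer="The derivative of x^3 + 2x is 3x^2 + 2."
if output_contains "$answer" "3x^2"; then
  echo "pass"
else
  echo "fail"
fi
```

A literal match is strict: an otherwise correct answer written as `3x^{2}` or `3 x^2` would not match, which is one reason to pair it with a rubric-style assertion as the configuration above does.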

Run the tests:

promptfoo eval

View the results in your browser:

promptfoo view

Example Project

See our example, which compares r1 and o1 on MMLU.
