Mwessc 2025-06-14 09:02:52 +03:00 committed by GitHub
commit 1274ebcdf1
10 changed files with 322 additions and 0 deletions

deepseek_content_moderation/README.md (View File)

@@ -0,0 +1,100 @@
# DeepSeek Content Moderation Tool
## Overview
This project provides a Python tool for detecting potentially sensitive content in text. It uses a configurable list of keywords and phrases across various categories to analyze input text and flag any matches. This tool is intended as a basic building block for more complex content moderation systems.
## Features
- **Configurable Categories**: Define your own categories of sensitive content and the keywords/phrases for each.
- **JSON Configuration**: Sensitive word lists are managed in an easy-to-edit `config.json` file.
- **Regex-Based Matching**: Uses regular expressions for case-insensitive, whole-word matching (illustrated in the sketch after this list).
- **Returns Matched Categories and Words**: Provides a dictionary of categories that were triggered and the specific words found.
- **Extensible**: Designed to be integrated into larger applications or workflows.
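The matching behaviour is easiest to see with a small standalone sketch. The pattern below mirrors the one built in `moderator.py`; the terms are placeholders, not the shipped word lists:
```python
import re

# Placeholder terms; the real lists live in config.json.
terms = ["example_swear", "derogatory_term"]
pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b", re.IGNORECASE)

print(pattern.findall("An Example_Swear here, but not example_swearing."))
# ['Example_Swear'] -- case-insensitive, whole words only
```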
## Project Structure
```
deepseek_content_moderation/
├── __init__.py
├── config.json
├── moderator.py
├── README.md
└── tests/
├── __init__.py
└── test_moderator.py
```
## Setup and Installation
1. **Prerequisites**:
* Python 3.7+
* `pytest` (for running tests): `pip install pytest`
2. **Configuration (`config.json`)**:
The `config.json` file stores the categories and lists of sensitive words. You can edit this file to add, remove, or modify categories and their associated terms.
Example structure:
```json
{
"Profanity": ["example_swear", "another_bad_word"],
"HateSpeech": ["example_slur", "derogatory_term"],
// ... other categories
}
```
*Initially, the file contains a predefined set of categories and example terms based on common types of sensitive content.*
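Because `config.json` is edited by hand, a quick structural check can catch malformed entries before the `Moderator` loads them. A minimal sketch, where the `validate_config` helper is illustrative and not part of the project:
```python
import json

def validate_config(path="config.json"):
    """Hypothetical helper: check every category maps to a non-empty list of strings."""
    with open(path, "r") as f:
        config = json.load(f)
    for category, terms in config.items():
        if not (isinstance(terms, list) and terms):
            raise ValueError(f"{category}: expected a non-empty list of terms")
        if not all(isinstance(term, str) for term in terms):
            raise ValueError(f"{category}: every term must be a string")
    return config

print(sorted(validate_config().keys()))
```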
## Usage
The core functionality is provided by the `Moderator` class in `moderator.py`.
```python
from deepseek_content_moderation.moderator import Moderator
# Initialize the moderator (by default it loads config.json from the current working directory)
# You can also provide a custom path to a config file:
# moderator = Moderator(config_path="path/to/your/custom_config.json")
moderator = Moderator()
text_to_analyze = "This text contains an example_swear and a derogatory_term."
analysis_result = moderator.analyze_text(text_to_analyze)
if analysis_result:
print("Sensitive content found:")
for category, words in analysis_result.items():
print(f" Category: {category}, Words: {', '.join(words)}")
else:
print("No sensitive content detected.")
# Example Output:
# Sensitive content found:
# Category: Profanity, Words: example_swear
# Category: HateSpeech, Words: derogatory_term
```
### `analyze_text(text: str) -> dict`
- **Input**: A string of text to analyze.
- **Output**: A dictionary where keys are the names of sensitive categories found in the text. The value for each key is a list of unique words/phrases from the input text that matched the sensitive terms in that category. If no sensitive content is found, an empty dictionary is returned.
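For example, assuming a `config.json` that contains the example terms shown earlier, the raw return value looks like this (illustrative output; actual matches depend on your configuration):
```python
from deepseek_content_moderation.moderator import Moderator

moderator = Moderator()  # assumes config.json defines the example terms above
result = moderator.analyze_text("This text contains an example_swear and a derogatory_term.")
print(result)
# {'Profanity': ['example_swear'], 'HateSpeech': ['derogatory_term']}
```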
## Running Tests
To run the unit tests, navigate to the parent directory of `deepseek_content_moderation` and run:
```bash
python -m pytest
```
Alternatively, navigate into the `deepseek_content_moderation` directory and run `pytest` directly. In either case, make sure `pytest` is installed (`pip install pytest`).
## Disclaimer
This tool provides basic keyword-based detection. It is not a comprehensive solution for content moderation, which often requires more sophisticated NLP techniques, contextual understanding, and human oversight. The initial lists of sensitive words in `config.json` are illustrative and will likely need significant expansion and refinement for any practical application.
## Contributing
Feel free to expand upon this project. Suggestions for improvement include:
- More sophisticated matching algorithms (e.g., Levenshtein distance for typos); see the sketch after this list.
- Support for multiple languages.
- Integration with machine learning models for nuanced detection.
- More granular reporting.
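As a starting point for the first suggestion, here is a rough sketch of approximate matching built on the standard library's `difflib` (a stand-in for a true Levenshtein distance). The `fuzzy_hits` helper is hypothetical, not part of the tool, and assumes the configured terms are lowercase:
```python
import difflib
import re

def fuzzy_hits(text, terms, cutoff=0.85):
    """Hypothetical helper: return configured terms that approximately match a token in the text."""
    tokens = re.findall(r"\w+", text.lower())
    hits = set()
    for token in tokens:
        # get_close_matches uses difflib's similarity ratio as a rough edit-distance proxy.
        hits.update(difflib.get_close_matches(token, terms, n=1, cutoff=cutoff))
    return sorted(hits)

print(fuzzy_hits("contains an example_swaer with a typo", ["example_swear"]))
# ['example_swear']
```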

View File

deepseek_content_moderation/config.json (View File)

@@ -0,0 +1,57 @@
{
"Profanity": [
"swearword1",
"vulgarterm1",
"explicitlang1"
],
"HateSpeech": [
"hatespeech_slur1",
"derogatory_term1",
"incitement_example1"
],
"DiscriminatoryLanguage": [
"stereotype_example1",
"biased_phrase1",
"microaggression_example1"
],
"SexuallyExplicitLanguage": [
"sexual_act_description1",
"explicit_anatomical_term1",
"suggestive_innuendo1"
],
"ViolenceGore": [
"graphic_violence_desc1",
"torture_example1",
"weapon_for_harm1"
],
"SelfHarmSuicide": [
"selfharm_method1",
"suicidal_ideation_phrase1",
"encouragement_selfharm1"
],
"IllegalActivities": [
"drug_use_term1",
"illegal_weapon_term1",
"terrorism_related_term1"
],
"BlasphemyReligiousInsults": [
"religious_insult1",
"disrespectful_term_religion1",
"offensive_to_belief1"
],
"MedicalMisinformation": [
"unproven_medical_advice1",
"dangerous_health_claim1",
"harmful_pseudo_treatment1"
],
"PrivacyViolatingPII": [
"personal_name_example",
"address_example_term",
"phone_number_example_term"
],
"OffensiveSlangCulturalInsults": [
"cultural_slang_insult1",
"derogatory_cultural_term1",
"offensive_local_slang1"
]
}

deepseek_content_moderation/moderator.py (View File)

@@ -0,0 +1,55 @@
import json
import re
class Moderator:
def __init__(self, config_path="config.json"):
with open(config_path, 'r') as f:
self.config = json.load(f)
self._compile_regexes()
def _compile_regexes(self):
self.category_regexes = {}
for category, words in self.config.items():
# Escape special characters in words and join with | for OR logic
# Use \b word boundaries for whole-word matching
escaped_words = [re.escape(word) for word in words]
regex_pattern = r"\b(" + "|".join(escaped_words) + r")\b"
# Compile with IGNORECASE
self.category_regexes[category] = re.compile(regex_pattern, re.IGNORECASE)
def analyze_text(self, text: str) -> dict:
found_sensitivities = {}
if not text:
return found_sensitivities
for category, regex_pattern in self.category_regexes.items():
matches = regex_pattern.findall(text)
if matches:
# Store unique matches
found_sensitivities[category] = sorted(list(set(matches)))
return found_sensitivities
if __name__ == '__main__':
# Example Usage (optional, for basic testing)
moderator = Moderator()
test_text_1 = "This is a test with swearword1 and another bad term like HATEspeech_slur1."
analysis_1 = moderator.analyze_text(test_text_1)
print(f"Analysis for '{test_text_1}': {analysis_1}")
test_text_2 = "This text is clean and should pass."
analysis_2 = moderator.analyze_text(test_text_2)
print(f"Analysis for '{test_text_2}': {analysis_2}")
test_text_3 = "Another example with UnProVeN_MeDiCaL_AdViCe1 and suggestive_innuendo1."
analysis_3 = moderator.analyze_text(test_text_3)
print(f"Analysis for '{test_text_3}': {analysis_3}")
test_text_4 = "Testing PII like personal_name_example here."
analysis_4 = moderator.analyze_text(test_text_4)
print(f"Analysis for '{test_text_4}': {analysis_4}")
test_text_5 = "This has drug_use_term1 and also drug_use_term1 again."
analysis_5 = moderator.analyze_text(test_text_5)
print(f"Analysis for '{test_text_5}': {analysis_5}")

deepseek_content_moderation/tests/test_moderator.py (View File)

@@ -0,0 +1,110 @@
import pytest
import json
import os
from deepseek_content_moderation.moderator import Moderator # Adjusted import path
# Helper to create a temporary config file for testing
@pytest.fixture
def temp_config_file(tmp_path):
config_data = {
"Profanity": ["badword", "swear"],
"HateSpeech": ["hateful_term", "slur"],
"SpecificCategory": ["unique_term_for_test"]
}
config_file = tmp_path / "test_config.json"
with open(config_file, 'w') as f:
json.dump(config_data, f)
return str(config_file) # Return path as string
@pytest.fixture
def moderator_instance(temp_config_file):
# Ensure the moderator uses the temp config by passing the path
return Moderator(config_path=temp_config_file)
def test_config_loading(moderator_instance):
assert "Profanity" in moderator_instance.config
assert "swear" in moderator_instance.config["Profanity"]
assert "HateSpeech" in moderator_instance.category_regexes
assert moderator_instance.category_regexes["Profanity"].pattern == r"\b(badword|swear)\b"
def test_analyze_text_no_sensitivities(moderator_instance):
analysis = moderator_instance.analyze_text("This is a clean sentence.")
assert analysis == {}
def test_analyze_text_single_category_single_word(moderator_instance):
analysis = moderator_instance.analyze_text("This sentence contains a badword.")
assert "Profanity" in analysis
assert analysis["Profanity"] == ["badword"]
def test_analyze_text_single_category_multiple_words(moderator_instance):
analysis = moderator_instance.analyze_text("This sentence has badword and also swear.")
assert "Profanity" in analysis
assert sorted(analysis["Profanity"]) == sorted(["badword", "swear"])
def test_analyze_text_multiple_categories(moderator_instance):
analysis = moderator_instance.analyze_text("A sentence with badword and a hateful_term.")
assert "Profanity" in analysis
assert analysis["Profanity"] == ["badword"]
assert "HateSpeech" in analysis
assert analysis["HateSpeech"] == ["hateful_term"]
def test_analyze_text_case_insensitivity(moderator_instance):
analysis = moderator_instance.analyze_text("This has a BADWORD and HATEFUL_TERM.")
assert "Profanity" in analysis
assert analysis["Profanity"] == ["BADWORD"] # The regex returns the found casing
assert "HateSpeech" in analysis
assert analysis["HateSpeech"] == ["HATEFUL_TERM"]
def test_analyze_text_empty_string(moderator_instance):
analysis = moderator_instance.analyze_text("")
assert analysis == {}
def test_analyze_text_words_within_words_whole_word_matching(moderator_instance):
# 'swear' is a configured term; 'swearinger' contains it but is not a whole-word match.
analysis = moderator_instance.analyze_text("He is swearinger but not swear.")
assert "Profanity" in analysis
assert analysis["Profanity"] == ["swear"]
# A configured term surrounded by other words and punctuation still matches as a whole word.
analysis_substring = moderator_instance.analyze_text("This is just a test, not a hateful_term at all.")
assert "HateSpeech" in analysis_substring
assert analysis_substring["HateSpeech"] == ["hateful_term"]
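    # A longer word that merely contains a configured term as a prefix should not match.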
analysis_no_match = moderator_instance.analyze_text("This sentence has a term but not the specific unique_term_for_testing.")
assert "SpecificCategory" not in analysis_no_match
def test_analyze_text_repeated_words(moderator_instance):
analysis = moderator_instance.analyze_text("This badword is a badword again badword.")
assert "Profanity" in analysis
assert analysis["Profanity"] == ["badword"] # Should only list unique matches
def test_analyze_text_with_punctuation(moderator_instance):
analysis = moderator_instance.analyze_text("Is this a badword? Yes, badword!")
assert "Profanity" in analysis
assert analysis["Profanity"] == ["badword"]
analysis_slur = moderator_instance.analyze_text("No slur, okay?")
assert "HateSpeech" in analysis_slur
assert analysis_slur["HateSpeech"] == ["slur"]
# Notes on project layout and running these tests:
#
#   deepseek_content_moderation/
#       __init__.py
#       config.json
#       moderator.py
#       tests/
#           __init__.py
#           test_moderator.py
#
# The import `from deepseek_content_moderation.moderator import Moderator` above resolves when
# the package's parent directory is on sys.path. The simplest way to guarantee that is to run
# `python -m pytest` (or plain `pytest`) from that parent directory; the `__init__.py` files
# mark `deepseek_content_moderation` and `tests` as packages so pytest can locate the module.
# Ensure `pytest` is installed (`pip install pytest`).