Add an hCaptcha-based CAPTCHA solution to improve accessibility for visually impaired users.
* **Add dependencies**: Add `hcaptcha` and `flask` to `requirements.txt`.
* **Implement hCaptcha in Flask app**: Create `app.py` to set up the Flask application, configure hCaptcha, and add sign-up and login routes with hCaptcha verification (a sketch follows this list).
* **Create sign-up form**: Add `templates/signup.html` with HTML form for user sign-up, integrating hCaptcha widget and adding ARIA labels and roles for screen reader compatibility.
* **Create login form**: Add `templates/login.html` with HTML form for user login, integrating hCaptcha widget and adding ARIA labels and roles for screen reader compatibility.
* **Add CSS styles**: Add `static/css/styles.css` to style form elements and hCaptcha widget, ensuring high contrast and readability for visually impaired users.
* **Write unit tests**: Add `tests/test_accessibility.py` to test accessibility features using NVDA and verify hCaptcha integration and screen reader compatibility.
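
Below is a minimal sketch of what `app.py` could look like. It verifies the token server-side with a plain `requests` call rather than through the `hcaptcha` package (whose exact API may differ); the route names, template variables, environment variables, and the verification endpoint shown here are assumptions based on hCaptcha's documented siteverify flow.

```python
import os
import requests
from flask import Flask, render_template, request

app = Flask(__name__)

# Assumed config: keys come from the hCaptcha dashboard, read from the environment.
HCAPTCHA_SITE_KEY = os.environ.get("HCAPTCHA_SITE_KEY", "")
HCAPTCHA_SECRET_KEY = os.environ.get("HCAPTCHA_SECRET_KEY", "")
HCAPTCHA_VERIFY_URL = "https://api.hcaptcha.com/siteverify"  # hCaptcha's server-side verification endpoint


def hcaptcha_passed(form) -> bool:
    """Verify the widget's token server-side; return True only on success."""
    token = form.get("h-captcha-response", "")
    if not token:
        return False
    resp = requests.post(
        HCAPTCHA_VERIFY_URL,
        data={"secret": HCAPTCHA_SECRET_KEY, "response": token},
        timeout=10,
    )
    return resp.json().get("success", False)


@app.route("/signup", methods=["GET", "POST"])
def signup():
    if request.method == "POST":
        if not hcaptcha_passed(request.form):
            return render_template("signup.html", site_key=HCAPTCHA_SITE_KEY,
                                   error="CAPTCHA verification failed."), 400
        # ... create the account here ...
        return "Signed up", 201
    return render_template("signup.html", site_key=HCAPTCHA_SITE_KEY)


@app.route("/login", methods=["GET", "POST"])
def login():
    if request.method == "POST":
        if not hcaptcha_passed(request.form):
            return render_template("login.html", site_key=HCAPTCHA_SITE_KEY,
                                   error="CAPTCHA verification failed."), 400
        # ... authenticate here ...
        return "Logged in", 200
    return render_template("login.html", site_key=HCAPTCHA_SITE_KEY)
```

The `signup.html` and `login.html` templates would render the widget container (`<div class="h-captcha" data-sitekey="{{ site_key }}">`) inside a labelled form group so screen readers can announce it.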
---
For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/twlhitesh/DeepSeek-V3?shareId=XXXX-XXXX-XXXX-XXXX).
BREAKING CHANGE: Restructured `model.py` into dedicated modules under `inference/models/`
Key Changes:
- Split monolithic `model.py` into focused, single-responsibility modules:
  - `config.py`: Model configuration and hyperparameters
  - `attention.py`: Multi-head Latent Attention (MLA) implementation
  - `moe.py`: Mixture of Experts components (Gate, Expert, MoE)
  - `linear.py`: Linear layer variants with parallel processing support
  - `__init__.py`: Clean public API exports
Benefits:
- Improved code organization and maintainability
- Better separation of concerns
- Enhanced testability of individual components
- Clearer dependency management
- Simplified future modifications and extensions
Migration:
- Update imports to use the new module structure (see the import sketch after this list)
- No functional changes to existing implementations
- Backwards compatible with current model weights
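
A hedged illustration of the import migration; the module paths follow the layout above, but the exact symbols re-exported from `__init__.py` may differ.

```python
# Before: everything lived in the monolithic module.
from inference.model import ModelArgs, MLA, MoE, Transformer

# After: import from the package root (re-exported via __init__.py) ...
from inference.models import ModelArgs, MLA, MoE, Transformer

# ... or from the focused modules directly.
from inference.models.config import ModelArgs
from inference.models.attention import MLA
from inference.models.moe import Gate, Expert, MoE
from inference.models.linear import ColumnParallelLinear, RowParallelLinear
```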
* Handle missing `scale_inv_name`
Fixed an issue where `weight` and `weight_scale_inv` (e.g. `model.layers.39.mlp.experts.92.gate_proj.weight` and `model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv`) were not stored in the same SafeTensors shard, which caused an assertion error because `scale_inv_name` was missing from the loaded `state_dict`.
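
One possible shape of the fix, sketched below: when `weight_scale_inv` is not in the shard currently being processed, look it up through the checkpoint's `weight_map` (from the index file) and load the owning shard lazily. The helper name and signature are hypothetical.

```python
import os
from safetensors.torch import load_file

_shard_cache = {}  # shard filename -> loaded state_dict


def get_tensor(name, current_state_dict, weight_map, checkpoint_dir):
    """Return tensor `name`, loading its shard on demand if it is not part of
    the shard currently being converted."""
    if name in current_state_dict:
        return current_state_dict[name]
    shard_file = weight_map[name]  # e.g. "model-00001-of-000163.safetensors"
    if shard_file not in _shard_cache:
        _shard_cache[shard_file] = load_file(
            os.path.join(checkpoint_dir, shard_file), device="cpu"
        )
    return _shard_cache[shard_file][name]
```

The FP8 dequantization step can then fetch `weight_scale_inv` through such a helper instead of asserting that it is present in the same `state_dict` as `weight`.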
* Sort filenames to reduce memory costs
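
A sketch of the idea, assuming the conversion script globs the checkpoint shards:

```python
import os
from glob import glob

checkpoint_dir = "path/to/fp8/checkpoint"  # placeholder

# Process shards in sorted order so related tensors (e.g. a weight and its
# weight_scale_inv) are encountered close together, keeping the number of
# shards that must be held in memory at once small.
for file_path in sorted(glob(os.path.join(checkpoint_dir, "*.safetensors"))):
    ...  # convert the tensors in this shard
```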
* Add CUDA cache clearing in memory management
Added `torch.cuda.empty_cache()` to free up unused memory on the GPU.
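
For reference, a minimal sketch of how the call is typically used (the function name is illustrative):

```python
import torch


def free_gpu_cache():
    """Release memory held by PyTorch's caching allocator back to the device
    so subsequent large allocations are less likely to run out of memory."""
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


# e.g. call free_gpu_cache() after each shard has been converted and saved.
```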