Add an hCaptcha-based CAPTCHA solution to improve accessibility for visually impaired users.
* **Add dependencies**: Add `hcaptcha` and `flask` to `requirements.txt`.
* **Implement hCaptcha in Flask app**: Create `app.py` to set up the Flask application, configure hCaptcha, and add sign-up and login routes with hCaptcha verification (a sketch follows this list).
* **Create sign-up form**: Add `templates/signup.html` with HTML form for user sign-up, integrating hCaptcha widget and adding ARIA labels and roles for screen reader compatibility.
* **Create login form**: Add `templates/login.html` with HTML form for user login, integrating hCaptcha widget and adding ARIA labels and roles for screen reader compatibility.
* **Add CSS styles**: Add `static/css/styles.css` to style form elements and hCaptcha widget, ensuring high contrast and readability for visually impaired users.
* **Write unit tests**: Add `tests/test_accessibility.py` to test accessibility features using NVDA and verify hCaptcha integration and screen reader compatibility.
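
Below is a minimal sketch of what `app.py` could look like. It verifies the token server-side with a plain `requests` call rather than through the `hcaptcha` package (whose exact API may differ); the route names, template variables, environment variables, and the verification endpoint shown here are assumptions based on hCaptcha's documented siteverify flow.

```python
import os
import requests
from flask import Flask, render_template, request

app = Flask(__name__)

# Assumed config: keys come from the hCaptcha dashboard, read from the environment.
HCAPTCHA_SITE_KEY = os.environ.get("HCAPTCHA_SITE_KEY", "")
HCAPTCHA_SECRET_KEY = os.environ.get("HCAPTCHA_SECRET_KEY", "")
HCAPTCHA_VERIFY_URL = "https://api.hcaptcha.com/siteverify"  # hCaptcha's server-side verification endpoint


def hcaptcha_passed(form) -> bool:
    """Verify the widget's token server-side; return True only on success."""
    token = form.get("h-captcha-response", "")
    if not token:
        return False
    resp = requests.post(
        HCAPTCHA_VERIFY_URL,
        data={"secret": HCAPTCHA_SECRET_KEY, "response": token},
        timeout=10,
    )
    return resp.json().get("success", False)


@app.route("/signup", methods=["GET", "POST"])
def signup():
    if request.method == "POST":
        if not hcaptcha_passed(request.form):
            return render_template("signup.html", site_key=HCAPTCHA_SITE_KEY,
                                   error="CAPTCHA verification failed."), 400
        # ... create the account here ...
        return "Signed up", 201
    return render_template("signup.html", site_key=HCAPTCHA_SITE_KEY)


@app.route("/login", methods=["GET", "POST"])
def login():
    if request.method == "POST":
        if not hcaptcha_passed(request.form):
            return render_template("login.html", site_key=HCAPTCHA_SITE_KEY,
                                   error="CAPTCHA verification failed."), 400
        # ... authenticate here ...
        return "Logged in", 200
    return render_template("login.html", site_key=HCAPTCHA_SITE_KEY)
```

The `signup.html` and `login.html` templates would render the widget container (`<div class="h-captcha" data-sitekey="{{ site_key }}">`) inside a labelled form group so screen readers can announce it.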
---
For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/twlhitesh/DeepSeek-V3?shareId=XXXX-XXXX-XXXX-XXXX).
BREAKING CHANGE: Restructured `model.py` into dedicated modules under `inference/models/`
Key Changes:
- Split monolithic `model.py` into focused, single-responsibility modules:
  - `config.py`: Model configuration and hyperparameters
  - `attention.py`: Multi-head Latent Attention (MLA) implementation
  - `moe.py`: Mixture of Experts components (Gate, Expert, MoE)
  - `linear.py`: Linear layer variants with parallel processing support
  - `__init__.py`: Clean public API exports
Benefits:
- Improved code organization and maintainability
- Better separation of concerns
- Enhanced testability of individual components
- Clearer dependency management
- Simplified future modifications and extensions
Migration:
- Update imports to use the new module structure (see the import sketch after this list)
- No functional changes to existing implementations
- Backwards compatible with current model weights
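
A hedged illustration of the import migration; the module paths follow the layout above, but the exact symbols re-exported from `__init__.py` may differ.

```python
# Before: everything lived in the monolithic module.
from inference.model import ModelArgs, MLA, MoE, Transformer

# After: import from the package root (re-exported via __init__.py) ...
from inference.models import ModelArgs, MLA, MoE, Transformer

# ... or from the focused modules directly.
from inference.models.config import ModelArgs
from inference.models.attention import MLA
from inference.models.moe import Gate, Expert, MoE
from inference.models.linear import ColumnParallelLinear, RowParallelLinear
```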
* Handle missing `scale_inv_name`
Fixed an issue where `weight` and `weight_scale_inv` (e.g. `model.layers.39.mlp.experts.92.gate_proj.weight` and `model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv`) were not stored in the same SafeTensors shard, which caused an assertion error because `scale_inv_name` was missing from the loaded `state_dict`.
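
One possible shape of the fix, sketched below: when `weight_scale_inv` is not in the shard currently being processed, look it up through the checkpoint's `weight_map` (from the index file) and load the owning shard lazily. The helper name and signature are hypothetical.

```python
import os
from safetensors.torch import load_file

_shard_cache = {}  # shard filename -> loaded state_dict


def get_tensor(name, current_state_dict, weight_map, checkpoint_dir):
    """Return tensor `name`, loading its shard on demand if it is not part of
    the shard currently being converted."""
    if name in current_state_dict:
        return current_state_dict[name]
    shard_file = weight_map[name]  # e.g. "model-00001-of-000163.safetensors"
    if shard_file not in _shard_cache:
        _shard_cache[shard_file] = load_file(
            os.path.join(checkpoint_dir, shard_file), device="cpu"
        )
    return _shard_cache[shard_file][name]
```

The FP8 dequantization step can then fetch `weight_scale_inv` through such a helper instead of asserting that it is present in the same `state_dict` as `weight`.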
* Sort filenames to reduce memory costs
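
A sketch of the idea, assuming the conversion script globs the checkpoint shards:

```python
import os
from glob import glob

checkpoint_dir = "path/to/fp8/checkpoint"  # placeholder

# Process shards in sorted order so related tensors (e.g. a weight and its
# weight_scale_inv) are encountered close together, keeping the number of
# shards that must be held in memory at once small.
for file_path in sorted(glob(os.path.join(checkpoint_dir, "*.safetensors"))):
    ...  # convert the tensors in this shard
```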
* Add CUDA cache clearing in memory management
Added `torch.cuda.empty_cache()` to free up unused memory on the GPU.
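
For reference, a minimal sketch of how the call is typically used (the function name is illustrative):

```python
import torch


def free_gpu_cache():
    """Release memory held by PyTorch's caching allocator back to the device
    so subsequent large allocations are less likely to run out of memory."""
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


# e.g. call free_gpu_cache() after each shard has been converted and saved.
```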