Implement a more robust Mixture of Experts (MoE) solution that handles
dynamic shapes in PyTorch. The implementation avoids GuardOnDataDependentSymNode
errors raised under torch.compile by:
- Using masked operations instead of data-dependent control flow
- Providing a cleaner alternative to error suppression
- Including a test file to verify both regular and compiled model behavior
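A minimal sketch of the masked-operations idea: each expert processes every token densely, and per-token routing weights zero out unrouted contributions, so no data-dependent indexing (e.g. `.nonzero()` or `.item()`) reaches the compiler. The class and parameter names (`MaskedMoE`, `num_experts`, `top_k`) are illustrative, not taken from the actual implementation.

```python
import torch
import torch.nn as nn


class MaskedMoE(nn.Module):
    """Illustrative MoE layer using masked ops instead of
    data-dependent control flow (names are assumptions)."""

    def __init__(self, dim: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(dim, dim) for _ in range(num_experts)
        )
        self.gate = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim)
        weights = torch.softmax(self.gate(x), dim=-1)       # (tokens, E)
        topv, topi = weights.topk(self.top_k, dim=-1)       # (tokens, k)
        topv = topv / topv.sum(dim=-1, keepdim=True)        # renormalize kept weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Dense masking: every expert sees all tokens; tokens not
            # routed to this expert get weight 0, so no branch depends
            # on runtime data and no guard on a SymNode is created.
            w = torch.where(topi == e, topv, torch.zeros_like(topv))
            out = out + w.sum(dim=-1, keepdim=True) * expert(x)
        return out
```

Because the loop bounds and tensor shapes are independent of the routing decisions, the same graph serves any batch of tokens, which is what lets the compiled model handle dynamic shapes.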
The solution offers two approaches:
1. Quick fix via torch._dynamo.config.suppress_errors
2. Robust implementation using masked operations and proper weight handling
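The quick fix in option 1 amounts to a single config flag; a sketch of how it would be applied, shown here only for comparison with the masked-operations approach:

```python
import torch._dynamo

# Option 1 (quick fix): swallow Dynamo errors such as
# GuardOnDataDependentSymNode and fall back to eager execution.
# This hides failures rather than fixing them, which is why the
# masked-operations implementation is the preferred option.
torch._dynamo.config.suppress_errors = True
```

With suppression enabled, a graph break falls back to eager mode silently, so compiled and eager behavior can diverge in performance without any visible error; the masked implementation avoids that trade-off.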