DeepSeek-V3/inference
Cristian Cezar Moisés ebbbf84d35
Update generate.py
    Distributed Training Enhancements:
        Proper NCCL/Gloo backend selection
        Distributed timeout handling
        Rank-aware input broadcasting
        Graceful process group cleanup
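
The backend selection and timeout items above can be sketched as follows. This is an illustration, not the code from this commit: `select_backend`, `process_group_kwargs`, and the 1800-second default are hypothetical names and values chosen for the example. The only real API assumed is that `torch.distributed.init_process_group` accepts `backend` and `timeout` keyword arguments.

```python
from datetime import timedelta
from typing import Any, Dict


def select_backend(cuda_available: bool) -> str:
    """Pick NCCL when GPUs are usable, otherwise fall back to Gloo (CPU)."""
    return "nccl" if cuda_available else "gloo"


def process_group_kwargs(cuda_available: bool, timeout_s: int = 1800) -> Dict[str, Any]:
    """Build keyword arguments for a process-group initialisation call.

    The 1800 s default is an assumption made for this sketch.
    """
    if timeout_s <= 0:
        raise ValueError("timeout_s must be positive")
    return {
        "backend": select_backend(cuda_available),
        "timeout": timedelta(seconds=timeout_s),
    }


# Hypothetical usage with PyTorch (not executed here):
#   import torch, torch.distributed as dist
#   dist.init_process_group(**process_group_kwargs(torch.cuda.is_available()))
#   try:
#       ...  # run generation
#   finally:
#       dist.destroy_process_group()  # cleanup even if generation fails
```

Keeping the selection logic in a pure function makes it testable on a machine with no GPU and no initialised process group.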

    Error Handling & Validation:
        Comprehensive path validation
        Config schema validation
        Tokenization error handling
        Batch processing safeguards
        CUDA OOM fallback handling
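
One common way to implement an out-of-memory fallback is to split a failing batch in half and retry each half. The sketch below shows that pattern only and is not necessarily what `generate.py` does. `run_with_fallback` is a hypothetical helper, and Python's built-in `MemoryError` stands in for a CUDA OOM exception so the example runs without a GPU.

```python
from typing import Callable, List, Sequence, Type


def run_with_fallback(
    process: Callable[[Sequence[int]], List[int]],
    batch: Sequence[int],
    oom_error: Type[BaseException] = MemoryError,
) -> List[int]:
    """Process `batch`; on an OOM-style error, halve it and retry each half.

    Re-raises when a single-item batch still fails, since it cannot be split.
    """
    try:
        return list(process(batch))
    except oom_error:
        if len(batch) <= 1:
            raise
        mid = len(batch) // 2
        left = run_with_fallback(process, batch[:mid], oom_error)
        right = run_with_fallback(process, batch[mid:], oom_error)
        return left + right  # output order matches input order


def fake_model(batch: Sequence[int]) -> List[int]:
    """Stand-in for a model call that 'runs out of memory' above 2 items."""
    if len(batch) > 2:
        raise MemoryError("simulated OOM")
    return [x * 2 for x in batch]
```

With a real model, the exception type passed as `oom_error` would be whichever OOM error the framework raises.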

    Generation Improvements:
        Top-k sampling support
        Repetition penalty
        Dynamic sequence length management
        Progress tracking with tqdm
        Sequence truncation warnings
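
For readers unfamiliar with the first two items above, here is a minimal pure-Python sketch of top-k sampling combined with a repetition penalty. It is not the commit's implementation. The function names and the default penalty of 1.2 are assumptions for the example, and the penalty rule (divide positive logits, multiply negative ones) is the commonly used formulation.

```python
import math
import random
from typing import Iterable, List, Optional, Sequence


def apply_repetition_penalty(
    logits: Sequence[float], generated: Iterable[int], penalty: float = 1.2
) -> List[float]:
    """Lower the score of every token id that has already been generated."""
    out = list(logits)
    for tok in set(generated):
        # Dividing a positive logit or multiplying a negative one both reduce it.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def top_k_sample(
    logits: Sequence[float],
    k: int,
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample a token id from the k highest-scoring logits."""
    if rng is None:
        rng = random.Random()
    k = max(1, min(k, len(logits)))
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]
```

Setting `k=1` reduces this to greedy decoding, which is a convenient way to check the candidate-selection step deterministically.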

    Performance Optimizations:
        Device-aware tensor placement
        Batch tokenization
        Memory-efficient generation loop
        Model parallelism support

    User Experience:
        Interactive mode enhancements:
            Command history
            Input validation
            Graceful exit handling

        Batch processing:
            Progress tracking
            Error resilience
            Clean output formatting

    Code Quality:
        Type hints throughout
        Configurable constants
        Modular architecture
        Docstrings with examples
        Logging integration

    Safety Features:
        Tokenizer trust_remote_code handling
        Config validation
        Input sanitization
        Resource cleanup guarantees
2025-01-27 23:16:21 -03:00
Name              Last commit                                                               Date
configs           Release DeepSeek-V3                                                       2024-12-26 19:01:57 +08:00
convert.py        Update convert.py                                                         2025-01-27 23:10:08 -03:00
fp8_cast_bf16.py  Update fp8_cast_bf16.py                                                   2025-01-27 23:13:11 -03:00
generate.py       Update generate.py                                                        2025-01-27 23:16:21 -03:00
kernel.py         Enhance documentation and update .gitignore for model conversion scripts  2025-01-05 18:18:18 +00:00
model.py          Updated model.py docstrings                                               2025-01-05 18:24:31 +00:00
requirements.txt  Release DeepSeek-V3                                                       2024-12-26 19:01:57 +08:00