Commit Graph

9 Commits

Author SHA1 Message Date
Cristian Cezar Moisés
ebbbf84d35
Update generate.py
    Distributed Training Enhancements:
        Proper NCCL/Gloo backend selection
        Distributed timeout handling
        Rank-aware input broadcasting
        Graceful process group cleanup
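
A minimal sketch of what backend selection with a timeout could look like. The commit does not show its code, so `process_group_kwargs` and the 30-minute default are hypothetical; the only assumption about PyTorch is that `torch.distributed.init_process_group` accepts `backend` and `timeout` keyword arguments.

```python
from datetime import timedelta
from typing import Any, Dict


def process_group_kwargs(cuda_available: bool, timeout_s: int = 1800) -> Dict[str, Any]:
    """Build keyword arguments for torch.distributed.init_process_group.

    Hypothetical helper: NCCL is the usual choice for GPU runs and Gloo for
    CPU-only runs. torch is deliberately not imported so the sketch runs anywhere.
    """
    if timeout_s <= 0:
        raise ValueError("timeout_s must be positive")
    return {
        "backend": "nccl" if cuda_available else "gloo",
        "timeout": timedelta(seconds=timeout_s),
    }


if __name__ == "__main__":
    # In a real script: torch.distributed.init_process_group(**kwargs)
    print(process_group_kwargs(cuda_available=False))
```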

    Error Handling & Validation:
        Comprehensive path validation
        Config schema validation
        Tokenization error handling
        Batch processing safeguards
        CUDA OOM fallback handling

    Generation Improvements:
        Top-k sampling support
        Repetition penalty
        Dynamic sequence length management
        Progress tracking with tqdm
        Sequence truncation warnings
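
The commit names top-k sampling and a repetition penalty without showing how they are computed. The sketch below uses one common formulation of each on plain Python lists; the function names and signatures are hypothetical and are not taken from generate.py.

```python
import math
import random
from typing import List, Optional, Sequence


def apply_repetition_penalty(
    logits: Sequence[float], generated: Sequence[int], penalty: float
) -> List[float]:
    """Make already-generated tokens less likely (penalty > 1)."""
    out = list(logits)
    for tok in set(generated):
        # Divide positive logits and multiply negative ones, so the score
        # always moves toward "less likely".
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def sample_top_k(
    logits: Sequence[float],
    k: int,
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample a token id from the k highest-scoring logits (temperature > 0)."""
    rng = rng or random.Random()
    k = max(1, min(k, len(logits)))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` this reduces to greedy decoding, which is a convenient sanity check.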

    Performance Optimizations:
        Device-aware tensor placement
        Batch tokenization
        Memory-efficient generation loop
        Model parallelism support

    User Experience:
        Interactive mode enhancements:
            Command history
            Input validation
            Graceful exit handling
        Batch processing:
            Progress tracking
            Error resilience
            Clean output formatting

    Code Quality:
        Type hints throughout
        Configurable constants
        Modular architecture
        Docstrings with examples
        Logging integration

    Safety Features:
        Tokenizer trust_remote_code handling
        Config validation
        Input sanitization
        Resource cleanup guarantees
2025-01-27 23:16:21 -03:00
Cristian Cezar Moisés
eee820cc36
Update fp8_cast_bf16.py
    Type Hints & Path Management:
        Added comprehensive type annotations
        Used pathlib.Path for safer path handling

    Enhanced Error Handling:
        Structured exception handling throughout
        Clear error messages with context
        Safe resource cleanup

    Memory Management:
        LRU cache implementation with OrderedDict
        Configurable cache size
        Explicit GPU memory cleanup
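
As an illustration of the LRU cache idea, here is a small sketch built on `OrderedDict`. The class name `LRUCache` and the `loader` callback are hypothetical; in the conversion script the loader would presumably open a weight file, and eviction might be followed by explicit GPU memory cleanup.

```python
from collections import OrderedDict
from typing import Callable, Generic, List, TypeVar

V = TypeVar("V")


class LRUCache(Generic[V]):
    """Keep at most max_size loaded values, evicting the least recently used."""

    def __init__(self, max_size: int, loader: Callable[[str], V]) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._loader = loader
        self._items: "OrderedDict[str, V]" = OrderedDict()

    def get(self, key: str) -> V:
        if key in self._items:
            # Cache hit: mark the entry as most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        value = self._loader(key)
        self._items[key] = value
        if len(self._items) > self._max_size:
            # Cache full: drop the oldest entry (front of the OrderedDict).
            self._items.popitem(last=False)
        return value

    def keys(self) -> List[str]:
        """Cached keys, ordered from least to most recently used."""
        return list(self._items)
```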

    Logging System:
        Configurable logging levels
        Detailed progress tracking
        Structured error reporting

    Code Organization:
        Split into focused, testable functions
        Clear separation of concerns
        Documented public methods

    Validation & Safety:
        Input path validation
        Weight type checking
        Clone tensors to prevent reference issues

    Performance:
        Optimized file loading with LRU cache
        Batched tensor processing
        Asynchronous CUDA operations

    Metadata & Traceability:
        Added conversion metadata to output files
        Preserved original index structure
        Enhanced output index information
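
One way to add conversion metadata while preserving the original index structure is sketched below. It assumes the index is a JSON object with `metadata` and `weight_map` keys, as in the usual safetensors index layout; the `annotate_index` helper and the `conversion` key are hypothetical, not the names the commit uses.

```python
import copy
import json
from typing import Any, Dict


def annotate_index(index: Dict[str, Any], source_dtype: str, target_dtype: str) -> Dict[str, Any]:
    """Return a copy of the index with a conversion note added to its metadata."""
    out = copy.deepcopy(index)  # leave the caller's index untouched
    metadata = out.setdefault("metadata", {})
    # Hypothetical key name, chosen only for this illustration.
    metadata["conversion"] = {"from": source_dtype, "to": target_dtype}
    return out


if __name__ == "__main__":
    original = {"metadata": {"total_size": 123}, "weight_map": {"w.weight": "a.safetensors"}}
    print(json.dumps(annotate_index(original, "fp8", "bf16"), indent=2))
```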

    Configuration:
        Centralized constants
        Device-aware execution (CUDA/CPU)

    Progress Tracking:
        Nested progress bars
        Detailed file processing status
2025-01-27 23:13:11 -03:00
Cristian Cezar Moisés
a26fca4a41
Update convert.py 2025-01-27 23:10:08 -03:00
enoch kan
bc77f22afc Updated model.py docstrings 2025-01-05 18:24:31 +00:00
enoch kan
a1296f099e Enhance documentation and update .gitignore for model conversion scripts 2025-01-05 18:18:18 +00:00
GeeeekExplorer
fd011c11aa torch rmsnorm 2025-01-05 14:33:48 +08:00
Xingkai Yu
8710ec2ecb
require model-parallel in convert.py 2024-12-31 18:05:55 +08:00
Yang Wang
8f1c9488b5
handle missing scale_inv_name (#2)
* handle missing scale_inv_name

Fixed an issue where a `weight` and its `weight_scale_inv` (e.g. `model.layers.39.mlp.experts.92.gate_proj.weight` and `model.layers.39.mlp.experts.92.gate_proj.weight_scale_inv`) were stored in different safetensors files, which caused an assertion error because `scale_inv_name` was not in the `state_dict`.

* sort filenames to reduce memory costs

* Add CUDA cache clearing in memory management

Added `torch.cuda.empty_cache()` to free up unused memory on the GPU.
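
The cross-file lookup described in the first item can be sketched as follows, with plain dictionaries standing in for loaded safetensors files. The names `find_scale_inv` and `load_shard` are hypothetical, and the sketch assumes the index maps each tensor name to the file that holds it.

```python
from typing import Any, Callable, Dict, Optional


def find_scale_inv(
    weight_name: str,
    current_shard: Dict[str, Any],
    weight_map: Dict[str, str],
    load_shard: Callable[[str], Dict[str, Any]],
) -> Optional[Any]:
    """Return the scale tensor for weight_name, even if it lives in another file."""
    scale_inv_name = f"{weight_name}_scale_inv"
    if scale_inv_name in current_shard:
        return current_shard[scale_inv_name]
    # Not in the current file: ask the index which file holds it.
    shard_file = weight_map.get(scale_inv_name)
    if shard_file is None:
        return None  # no scale recorded anywhere; the caller decides what to do
    return load_shard(shard_file)[scale_inv_name]


if __name__ == "__main__":
    shards = {
        "a.safetensors": {"w.weight": "W"},
        "b.safetensors": {"w.weight_scale_inv": "S"},
    }
    index = {"w.weight": "a.safetensors", "w.weight_scale_inv": "b.safetensors"}
    print(find_scale_inv("w.weight", shards["a.safetensors"], index, shards.__getitem__))
```

Returning `None` instead of asserting lets the caller skip or log a weight that has no scale, rather than aborting the whole conversion.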
2024-12-27 09:34:38 +08:00
stack-heap-overflow
4c2fdb8f55 Release DeepSeek-V3 2024-12-26 19:01:57 +08:00