Commit Graph

  • cc66d60c67 Optimize Multi-head Latent Attention (MLA) for Short Sequences XxAlonexX 2025-02-19 10:31:28 +0530
  • 92931a9514 Update model.py #681 helme 2025-02-18 06:55:48 -1200
  • 5c2346ddff Update model.py #680 helme 2025-02-18 06:52:08 -1200
  • f09f5fa321 Merge pull request #616 from Konano/chore-readme Huang Panpan 2025-02-18 18:04:06 +0800
  • 3189313c24 sudo su && README.md #678 deepseekr1d2 2025-02-17 15:16:29 -0500
  • 1766d255bc Create python-publish.yml #674 NKCSRairdrop NFT Finance Guardian 2025-02-17 15:36:16 +0700
  • e598e95674 Update LICENSE-CODE #671 codechamp12345 2025-02-16 23:44:22 +0530
  • 4a65fd9221 fix an args description. #666 oyzh 2025-02-15 11:02:28 +0800
  • 1398800ebf fix scores mask Xingkai Yu 2025-02-14 20:26:45 +0800
  • 4e570a99a7 Fix incorrect comment in linear function regarding weight.element_size() #662 iamvalenciia 2025-02-14 03:09:07 -0500
  • f07bccc49e fix: resolve center alignment issue in preview #616 Konano 2025-02-14 12:12:16 +0800
  • 0866cab5f9 chore: update README.md to improve layout and image attributes Konano 2025-02-14 12:02:10 +0800
  • bd38425b0e [Edited] Fix minor bug in the main function #657 MayureshMore 2025-02-13 08:58:05 -0800
  • b3dfcef550 Automated change: No ML label MayureshMore 2025-02-12 10:07:29 -0800
  • fed8284309 Update README.md #641 Can Deliktaş 2025-02-11 16:32:03 +0300
  • d98f935545 Update README.md Can Deliktaş 2025-02-11 16:28:36 +0300
  • 7f8ae677e4 Update translates TR README.md Can Deliktaş 2025-02-11 16:21:30 +0300
  • 2c1de9ff1c Update translates TR README_WEIGHTS.md Can Deliktaş 2025-02-11 16:20:40 +0300
  • 5342a74995 Update translates TR README.md Can Deliktaş 2025-02-11 16:18:36 +0300
  • 22db85a39a Update translates TR README_WEIGHTS.md Can Deliktaş 2025-02-11 16:17:25 +0300
  • 9a9554dfe6 Update translates TR README_WEIGHTS.md Can Deliktaş 2025-02-11 16:16:13 +0300
  • 315dcb7e20 Update translates TR README_WEIGHTS.md Can Deliktaş 2025-02-11 16:14:53 +0300
  • 6939d4380f Update translates TR README_WEIGHTS.md Can Deliktaş 2025-02-11 16:13:21 +0300
  • a68a1814de Update README.md Can Deliktaş 2025-02-11 16:08:31 +0300
  • 339e500ec2 Update README.md Can Deliktaş 2025-02-11 16:07:58 +0300
  • 117240e2b8 Update README.md Can Deliktaş 2025-02-11 16:06:34 +0300
  • a8f596f900 Update README.md Can Deliktaş 2025-02-11 16:05:31 +0300
  • ca2dc67021 Update README.md Can Deliktaş 2025-02-11 16:04:25 +0300
  • c0511bfa74 Update README.md Can Deliktaş 2025-02-11 16:02:37 +0300
  • d389e53687 Update README.md Can Deliktaş 2025-02-11 15:59:59 +0300
  • 854ddc8ee9 Create 陈诚 #636 cc-helper 2025-02-11 11:47:34 +0800
  • 897291478c Refactor checkpoint conversion script for improved readability and efficiency #633 Tanmay Das 2025-02-10 18:40:56 -0500
  • 83cdc4c226 Create DeepSeek-V3، #627 ASA700 2025-02-10 06:33:19 +0300
  • d700e0056d Merge pull request #1 from wowrakibul/imgbot #618 Wow Rakibul 2025-02-09 02:02:52 +0600
  • 361d0bcc1c Merge pull request #2 from wowrakibul/fix/convert-py-improvements Wow Rakibul 2025-02-09 02:02:17 +0600
  • 35703ca641 mprove convert.py with error handling and code optimization Wow Rakibul 2025-02-09 01:55:23 +0600
  • d3be6c9d91 [ImgBot] Optimize images ImgBotApp 2025-02-08 19:36:42 +0000
  • 5c7c0312bc Update LICENSE-MODEL #617 Quirinus kolhoze 2025-02-08 19:12:20 +0800
  • e15f67af1c chore: update README.md to improve layout and image attributes Konano 2025-02-08 18:28:40 +0800
  • 2f7b80eece Merge pull request #611 from Konano/chore-stale Huang Panpan 2025-02-08 16:10:06 +0800
  • f10ff9c262 Update kernel.py #612 messagezsl 2025-02-08 16:07:47 +0800
  • 76d8d39560 chore: add stale issue management configuration #611 Konano 2025-02-08 15:12:09 +0800
  • 6bb22e0c15 Merge branch 'main' into refactor/codebase #444 Pratiyank Kumar 2025-02-08 09:14:14 +0530
  • fbdd5dcfeb docs(readme): improve table formatting and readability #607 乐平 2025-02-08 02:58:48 +0800
  • 29875620c0 Merge branch 'deepseek-ai:main' into docs/issue-config #593 Azis Alvriyanto 2025-02-07 16:11:05 +0700
  • 5ee97a83f0 fix comment Xingkai Yu 2025-02-07 16:42:55 +0800
  • 43cffb1ae3 Minor grammatical tense corrections to README.md #600 Benjamin Winkler 2025-02-07 01:01:40 -0500
  • acbbd8938c docs: fix contact mail link and BibTeX citation casing Azis Alvriyanto 2025-02-06 17:48:13 +0700
  • 745bdf2c44 chore: add issue template config Azis Alvriyanto 2025-02-06 17:46:23 +0700
  • 0e0ba3aea0 Set up a small multinode addcmul test #584 DrJessop 2025-02-05 17:38:22 +0000
  • 426feee9f7 improve docs gate bias #580 Cerebrovinny 2025-02-05 13:38:29 +0000
  • 1d7d440461 Merge pull request #432 from luislh-dev/main Xingkai Yu 2025-02-05 16:53:53 +0800
  • 09d108620a Merge pull request #440 from spenserblack/main Xingkai Yu 2025-02-05 16:50:03 +0800
  • d0f8c4fca3 Merge pull request #528 from WSL0809/main Xingkai Yu 2025-02-05 16:33:18 +0800
  • 87a01053e4 Merge pull request #556 from XxAlonexX/main Xingkai Yu 2025-02-05 16:23:02 +0800
  • a157077c61 Merge pull request #408 from fitzjalen/refactor Huang Panpan 2025-02-05 12:03:02 +0800
  • 014842a143 Merge 70ff909fdc into c32c957fb0 #391 Pratiyank Kumar 2025-02-05 11:58:31 +0800
  • c32c957fb0 Merge pull request #364 from Dhie-boop/feature/table-of-content Huang Panpan 2025-02-05 11:39:08 +0800
  • d8c34b007e updated Model Summary verbiage to be past tense to help with understanding #565 aquashere 2025-02-04 10:11:55 -0800
  • 687f06b004 Update requirements.txt #564 sudopacman 2025-02-05 01:53:45 +0800
  • dca08f2cfd fix(fp8_cast): Add robust memory management and error handling #563 ajwise9 2025-02-04 16:36:07 +0000
  • 0b8ca63f78 Update model.py #561 felipjah 2025-02-04 21:10:07 +0800
  • 6a30b43249 Fix Linear Layer Bias Initialization #556 XxAlonexX 2025-02-04 10:38:45 +0530
  • 30e0e84022 Create SECURITY.md #555 i-v12 2025-02-04 02:43:28 +0300
  • 0e59df7e70 Update README.md Quirinus kolhoze 2025-02-04 06:50:49 +0800
  • a5336884cf fix: Update triton dependency to use the latest version from GitHub #469 Nripesh Niketan 2025-02-03 21:50:02 +0000
  • 97b35f1fca docs: remove redundant asterisks in note #432 luislopez-developer 2025-02-03 15:02:04 -0500
  • a7d0553e80 Merge 77c46698b9 into b5d872ead0 #533 CodingParadigm1 2025-02-03 18:49:12 +0000
  • 77c46698b9 added assert to Parser class #533 CodingParadigm1 2025-02-03 10:24:23 -0700
  • c8146ec360 small patch CodingParadigm1 2025-02-03 10:18:58 -0700
  • 267e7ba685 add functionality CodingParadigm1 2025-02-03 10:06:38 -0700
  • 1ff79421f3 fixed typo and grammer #549 Ankush1oo8 2025-02-03 15:40:48 +0530
  • 07de76f5ee small patch CodingParadigm1 2025-02-02 06:14:17 -0700
  • 35244be39f moved dir calls CodingParadigm1 2025-02-02 06:12:46 -0700
  • f2636dd366 updated async file reading CodingParadigm1 2025-02-02 03:55:47 -0700
  • 1cb3e2a63f reduced breaking changes CodingParadigm1 2025-02-02 03:49:23 -0700
  • 73efe7c631 Memory management update Nripesh Niketan 2025-02-02 10:41:20 +0000
  • 60e3466ebf Revert "optimized" CodingParadigm1 2025-02-02 03:04:15 -0700
  • d8fd403950 Update README.md #531 Nathan Do 2025-02-02 05:55:21 +0700
  • 15ed430ffe Saw the file to spy on it, hehehehe #530 myakoobi 2025-02-01 17:25:16 -0500
  • d5c08b384b Update README.md #528 wangsl 2025-02-02 02:34:59 +0800
  • 970922e236 Update LICENSE-CODE copyright year range to 2023-2025 #519 Akihito Koriyama 2025-02-01 16:27:43 +0900
  • 3f5a2ebfc9 optimized CodingParadigm1 2025-01-31 17:15:10 -0700
  • 61790e1653 Update 2 Gabriel Caetano 2025-01-31 19:33:00 -0300
  • 85cc9ac9ce Create generator-generic-ossf-slsa3-publish.yml #506 Omony Denis 2025-01-31 16:22:49 +0300
  • 013e2213cc Merge 40ec3a3f21 into b5d872ead0 #499 Evan Wallace 2025-01-30 22:05:24 -0800
  • aed3416158 del other #656 Mirjakhon Kakhkharov 2025-01-31 10:58:24 +0500
  • 40ec3a3f21 Optimization to Model Script #499 Evan Wallace 2025-01-30 21:52:56 -0800
  • a8c585657a I suggest a change to the README file #496 paultb3 2025-01-31 00:04:15 -0500
  • 89882a94f6 Change Gabriel Caetano 2025-01-30 22:47:39 -0300
  • 0c23830e9e fold #529 musvaage 2025-01-30 12:01:46 -0600
  • daba5c1f78 Update generate.py #488 Ivan Lloyd Roquero 2025-01-31 01:19:43 +0800
  • b6e3910fd0 Fix small error Nripesh Niketan 2025-01-30 16:04:00 +0000
  • 736ec4af98 Issue #456: Fixed #486 Harikrishna Srinivasan 2025-01-30 21:08:16 +0530
  • b4e06d883e Improve DeepSeek-V3 Weight File Documentation for Clarity and Readability #481 Muhammad-Noraeii 2025-01-30 14:05:58 +0330
  • e75ce46245 feat: Enhance device compatibility and update PyTorch version Nripesh Niketan 2025-01-30 00:06:55 +0000
  • e0dde63571 Fix the Readme.md #467 harshsj1504 2025-01-30 04:54:15 +0530
  • a0a75d0692 Added optional GPU Memory Logging #459 nikola 2025-01-29 16:50:22 +0000
  • c2ae9bae78 Merge e965eec9c0 into b5d872ead0 #461 Anand 2025-01-29 22:53:15 +0530
  • ae72a44356 Update README.md #441 Fatai Alimi 2025-01-29 17:21:54 +0000