Pull requests: NVIDIA/Megatron-LM
(Draft)[Main] fix cg missing wgrad hook
bug
Something isn't working
#3074
opened Jan 26, 2026 by
Wohox
6 tasks
[Megatron-FSDP] Add dtype customization to Megatron-FSDP.
Expert Review
Apply this label to indicate that your PR is ready for expert review.
module: megatron-fsdp
Add block hash tracking for prefix caching
#3063
opened Jan 23, 2026 by
lmcafee-nvidia
Draft
6 tasks
μP: Maximal Update Parameterization
community-request
#3058
opened Jan 23, 2026 by
plugyawn
3 of 6 tasks
fix(fsdp): add CLI argument for outer_dp_sharding_strategy
community-request
Final Review
Apply this label to indicate that your PR is ready for final review.
module: megatron-fsdp
needs-follow-up
Issue needs follow-up
Added --ft-num-warmup-iters option.
complexity: low
Final Review
module: training
[fix] Bug fix for offloading in evaluate()
Expert Review
needs-follow-up
[main] perf(moe): Refine gated delta net implementation
#3042
opened Jan 22, 2026 by
yuzhongw-nvidia
Draft
6 tasks
Fuse MLA DOWN projection GEMMs
community-request
complexity: medium
dev2main: mbridge
dev to main: this PR is needed in main for mbridge
Expert Review