
Conversation

@BestJuly (Contributor) commented on Jan 26, 2026

What does this PR do?

Cherry-pick two merged PRs from the dev branch to main.

The following is taken from the original PR descriptions.

PR#2581

The condition `if self.config.mtp_num_layers is not None` evaluates to `True` when `mtp_num_layers=0`, so the code enters the MTP branch and crashes on `labels.clone()` when `labels=None`.

Since `range(0)` produces an empty loop and `torch.chunk(..., 1)` is essentially an identity operation, entering this branch with `mtp_num_layers=0` does nothing useful, yet it still requires `labels` to be passed.
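The fix can be sketched as follows. This is a minimal stand-in, not the actual Megatron code: `Config`, `maybe_enter_mtp`, and the `labels[:]` slice (standing in for `labels.clone()`) are simplified placeholders that only illustrate the guard change.

```python
class Config:
    """Minimal stand-in for the real transformer config."""
    def __init__(self, mtp_num_layers=None):
        self.mtp_num_layers = mtp_num_layers

def maybe_enter_mtp(config, labels=None):
    """Return True if the MTP branch runs (simplified sketch)."""
    # Old guard: `config.mtp_num_layers is not None` is True when
    # mtp_num_layers == 0, so the branch runs and labels.clone()
    # (stood in for here by labels[:]) raises TypeError on labels=None.
    # New guard: a truthiness check, which is falsy for both None and 0.
    if config.mtp_num_layers:
        _ = labels[:]  # stand-in for labels.clone()
        return True    # MTP branch entered
    return False       # MTP branch skipped
```

With the truthiness guard, `mtp_num_layers=0` and `labels=None` skips the branch instead of crashing.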

PR#2776

Add a `None` check for `current_max_attn_logits` in `clip_qk`.

When virtual pipeline parallelism is enabled with size > 1, only some model chunks initialize `current_max_attn_logits` during the first iteration step; the remaining chunks keep `current_max_attn_logits = None`, which leads to a `TypeError` when `all_reduce` is called on them in the following step.
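The shape of the guard can be sketched like this. Note this is a single-process illustration under assumed names: `all_reduce` here is a trivial stand-in for the real `torch.distributed.all_reduce`, and `reduce_max_attn_logits` is not the actual `clip_qk` implementation.

```python
def all_reduce(value):
    """Single-process stand-in for torch.distributed.all_reduce."""
    return value

def reduce_max_attn_logits(per_chunk_logits):
    """Only reduce stats for chunks that have been initialized."""
    reduced = []
    for logits in per_chunk_logits:
        if logits is None:
            # Under virtual pipeline parallelism (size > 1), some model
            # chunks have not run a forward pass in the first iteration,
            # so their current_max_attn_logits is still None. Skipping
            # them avoids the TypeError from calling all_reduce on None.
            continue
        reduced.append(all_reduce(logits))
    return reduced
```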

⚠️ For major changes (either in lines of code or in its impact), please make sure to first share a design doc with the team. If you're unsure what's the best way to do so, contact the @mcore-oncall.

Contribution process

```mermaid
flowchart LR
    A[Pre-checks] --> B[PR Tests]
    subgraph Code Review/Approval
        C1[Expert Review] --> C2[Final Review]
    end
    B --> C1
    C2 --> D[Merge]
```

Pre-checks

  • I want this PR in a versioned release and have added the appropriate Milestone (e.g., Core 0.8)
  • I have added relevant unit tests
  • I have added relevant functional tests
  • I have added proper typing to my code (see the Typing guidelines)
  • I have added relevant documentation
  • I have run the autoformatter.sh on my PR

Code review

The following process is enforced via the CODEOWNERS file for changes into megatron/core. For changes outside of megatron/core, it is up to the PR author whether or not to tag the Final Reviewer team.

For MRs into `main` branch

Feel free to message or comment the @mcore-oncall to help accelerate your merge into main. The less complex your PR is, the faster it will be approved and merged!

(Step 1): Add PR label Expert Review

(Step 2): Collect the expert reviewers' reviews

  1. Attach the Expert Review label when your PR is ready for review.
  2. GitHub auto-assigns expert reviewers based on your changes. They will get notified and pick up your PR soon.

⚠️ Only proceed to the next step once all reviewers have approved, merge conflicts are resolved, and the CI is passing.
The Final Review may be declined if these requirements are not fulfilled.

(Step 3): Final Review

  1. Add Final Review label
  2. GitHub auto-assigns final reviewers based on your changes. They will get notified and pick up your PR soon.

(Optional Step 4): Cherry-pick into release branch

If this PR also needs to be merged into core_r* release branches, after this PR has been merged, select Cherry-pick to open a new PR into the release branch.

For MRs into `dev` branch

The proposed review process for the `dev` branch is under active discussion.

MRs are mergeable after one approval from either eharper@nvidia.com or zijiey@nvidia.com.

Merging your PR

Any member of core-adlr and core-nemo will be able to merge your PR.

Co-authored-by: Xin Yao <xiny@nvidia.com>
@BestJuly BestJuly changed the title Fix: don't enter branch if mtp_num_layers == 0 Cherry-pick PR#2581 and PR#2776 to main Jan 26, 2026
@maanug-nv maanug-nv added the Expert Review, complexity: low, and module: optimizer labels Jan 27, 2026
@maanug-nv (Contributor) commented:

Since these are both bug fixes, it seems like a good idea to add unit tests to this PR to ensure the same issues don't happen again. For example, test that model creation with mtp_num_layers=0 doesn't crash.
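A regression test along the lines suggested might look like the sketch below. All names here are hypothetical placeholders, not the actual Megatron test API: `MockTransformerConfig` and `forward_step` merely reproduce the fixed guard so the test has something to exercise.

```python
class MockTransformerConfig:
    """Hypothetical minimal config; only the field under test."""
    def __init__(self, mtp_num_layers=None):
        self.mtp_num_layers = mtp_num_layers

def forward_step(config, labels=None):
    """Hypothetical stand-in for a forward pass with the fixed guard."""
    if config.mtp_num_layers:   # fixed guard: skips the branch for 0 and None
        _ = labels[:]           # would raise TypeError if labels is None
    return "ok"

def test_mtp_num_layers_zero_does_not_crash():
    # Regression test for PR#2581: mtp_num_layers=0 with labels=None
    # must not enter the MTP branch.
    assert forward_step(MockTransformerConfig(mtp_num_layers=0)) == "ok"

test_mtp_num_layers_zero_does_not_crash()
```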
