Skip to content
/ server Public

Conversation

@mariadb-TafzeelShams
Copy link
Contributor

  • The Jira issue number for this PR is: MDEV-38140

Description

MDEV-38140: InnoDB index corruption after UPDATE affecting virtual
columns

Issue:

  • Purge thread attempts to purge a secondary index record that is not
    delete-marked.

Root Cause:

  • When a secondary index includes a virtual column whose v_pos is
    greater than the number of fields in the clustered index record, the
    virtual column is incorrectly skipped while reading from the undo
    record.
  • This leads the purge logic to incorrectly assume it is safe to purge
    the secondary index record.
  • The code also confuses the nth virtual column with the nth stored
    column when writing ordering columns at the end of the undo record.

Fix:

  • In trx_undo_update_rec_get_update(): Skip a virtual column only
    when v_pos == FIL_NULL, not when v_pos is greater than the number
    of fields.
  • In trx_undo_page_report_modify(): Ensure ordering columns are
    written based on the correct stored-column positions, without
    confusing them with virtual-column positions.

Release Notes

Fixed a potential corruption issue for virtual index.

How can this PR be tested?

Modified mysql-test/suite/innodb/t/innodb-virtual-columns.test and mysql-test/suite/innodb/r/innodb-virtual-columns.result to cover the changes.

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

Comment on lines 1224 to 1233
for (i = 0; i < update->n_fields; i++) {
const upd_field_t* fld =
upd_get_nth_field(update, i);
if (upd_fld_is_virtual_col(fld))
continue;
const ulint field_no
= upd_get_nth_field(update, i)
->field_no;
= fld->field_no;
if (field_no >= index->n_fields
|| dict_index_get_nth_field(
index, field_no)->col
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is changing the way how UPDATE of indexed virtual columns is logged. However, there is no mention how upgrades or downgrades are being affected by this.

I see that you posted some analysis to MDEV-38140. However, you did not comment on my earlier comment regarding the function row_purge_vc_matches_cluster(). Can you please review that logic as well?

Copy link
Contributor Author

@mariadb-TafzeelShams mariadb-TafzeelShams Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is changing the way how UPDATE of indexed virtual columns is logged.

  • Hmm, actually this code part is about decision of whether to write non-virtual indexed columns to undo record. After we have completed writing the UPDATE changes of both non-virtual and virtual columns.
    (Before this code change we are incorrectly skipping writing the nth non-virtual indexed column into undo record if we already have nth virtual indexed column written as part of UPDATE changes.)

  • After completing the writing of non-virtual indexed columns we then jump to writing the virtual indexed columns.
    (Have made no changes to its logic)

However, there is no mention how upgrades or downgrades are being affected by this.

Regarding upgrades and downgrades, I don't see any issues because, even without patch we do write the non-virtual indexed columns into undo record. And while reading from undo record we handle these columns. For example in row_upd_replace_vcol()

Can you please review that logic as well?

Will do..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it doesn't change how indexed virtual columns are logged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, this change is affecting the length of undo log records when some indexes on virtual columns are affected by an UPDATE operation. Therefore, it is changing the format of the undo log records and how indexed virtual columns are logged.

The last substantial changes to the undo log record format were made in MDEV-14799 (see the merge d361401 of d384ead) and MDEV-14051 (439a7c9).

I would like to see an example that shows how the undo log record format is changed. Under which circumstances would the format change cause an UPDATE to fail with ER_UNDO_RECORD_TOO_BIG? @mariadb-TafzeelShams, can you write a test case for that, or prove that it is impossible? Even if such a regression were introduced, we may accept this fix, but I would like to make an informed decision.

Related to this, I think that it would be useful to introduce DBUG_EXECUTE_IF instrumentation to simulate the old behaviour, independently in trx_undo_page_report_modify() and trx_undo_update_rec_get_update(). This would allow the test suite to be extended with checks upgrade and downgrade test scenarios.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on this description I may be mistaken above, and this fix only ensures that updated non-virtual columns will be logged, that is, it would be a follow-up fix to MDEV-14799. In that case, my suggestion about DBUG_EXECUTE_IF instrumentation may not be meaningful.

But, what about the condition that immediately follows the changed lines; is it dead code now that !upd_fld_is_virtual_col(fld) would always hold at that spot?

Copy link
Contributor Author

@mariadb-TafzeelShams mariadb-TafzeelShams Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think its a dead code.

We are in a nested loop.
Gist of outer loop is to traverse all the non-virtual columns in clustered index.
Gist of inner loop is to traverse the fields/column in update vector. (these fields are already written in undo log)
And now we need to check if columns from both outer loop and inner loop matches.

So below condition will hold if the non-virtual ordering field gets UPDATED , we will not have to log it again.

if (field_no >= index->n_fields || dict_index_get_nth_field( index, field_no)->col ==  col)

I think field_no >= index->n_fields is probably an extra condition, just to make sure that we call dict_index_get_nth_field() with valid field_no, but not sure.

Rough logic

for (col :every non-virtual ordered column present in clustered index/table)
{
    for (fld : entries in update vector i.e columns that got updated with UPDATE and are already written in undo log)
    {
        if (fld is virtual)
            continue;
        if (fld == col)
            // i.e. we have col in update vector and has
            // already been written to undo log earlier
            // no need to log it again
            goto already logged;
    }
    log col into undo record
}

Comment on lines 1224 to 1233
for (i = 0; i < update->n_fields; i++) {
const upd_field_t* fld =
upd_get_nth_field(update, i);
if (upd_fld_is_virtual_col(fld))
continue;
const ulint field_no
= upd_get_nth_field(update, i)
->field_no;
= fld->field_no;
if (field_no >= index->n_fields
|| dict_index_get_nth_field(
index, field_no)->col
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it doesn't change how indexed virtual columns are logged.

columns

Issue:
- Purge thread attempts to purge a secondary index record that is not
  delete-marked.

Root Cause:
- When a secondary index includes a virtual column whose v_pos is
  greater than the number of fields in the clustered index record, the
  virtual column is incorrectly skipped while reading from the undo
  record.
- This leads the purge logic to incorrectly assume it is safe to purge
  the secondary index record.
- The code also confuses the nth virtual column with the nth stored
  column when writing ordering columns at the end of the undo record.

Fix:
- In trx_undo_update_rec_get_update(): Skip a virtual column only
  when v_pos == FIL_NULL, not when v_pos is greater than the number
  of fields.
- In trx_undo_page_report_modify(): Ensure ordering columns are
  written based on the correct stored-column positions, without
  confusing them with virtual-column positions.
Copy link
Contributor

@dr-m dr-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this rather complex issue.

Comment on lines +333 to +339
BEGIN;
INSERT INTO t1(id,value) VALUES (1,'0F');
UPDATE t1 SET value = '1S';
UPDATE t1 SET value = '0F';
COMMIT;

SET GLOBAL innodb_max_purge_lag_wait=0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that this test is fixed (a crash in the purge of history is avoided) by the following:

diff --git a/storage/innobase/trx/trx0rec.cc b/storage/innobase/trx/trx0rec.cc
index 37c64d21532..53e802d8fcc 100644
--- a/storage/innobase/trx/trx0rec.cc
+++ b/storage/innobase/trx/trx0rec.cc
@@ -1524,7 +1524,7 @@ trx_undo_update_rec_get_update(
 				&field_no);
 			first_v_col = false;
 			/* This column could be dropped or no longer indexed */
-			if (field_no >= index->n_fields) {
+			if (field_no == FIL_NULL) {
 				/* Mark this is no longer needed */
 				upd_field->field_no = REC_MAX_N_FIELDS;
 

Comment on lines +360 to +364
INSERT INTO t1(col1,col2,col3,col4) VALUES (1,1,1,'ABCDE');
UPDATE t1 SET col2=2, col4 = 'PQRST';
SET GLOBAL innodb_max_purge_lag_wait=0;

CHECK TABLE t1 EXTENDED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that the change to trx_undo_page_report_modify() will prevent the following warning from being issued here:

test.t1	check	Warning	InnoDB: Clustered index record not found for index `key2` of table `test`.`t1`: COMPACT RECORD(info_bits=32, 3 fields): {[4]    (0x80000001),[4]    (0x80000001),[4]    (0x80000001)}

Note: we have KEY key2(col2, col3) defined on the non-virtual columns col2 int, col3 int. The fix is basically correcting the incorrect merge d361401 of d384ead into 10.2, which was the first major version that supported indexed virtual columns. The fix does not really constitute a change of the undo log record format, because the following change to the test would make the CHECK TABLE pass even if trx_undo_page_report_modify() were not changed:

diff --git a/mysql-test/suite/innodb/t/innodb-virtual-columns.test b/mysql-test/suite/innodb/t/innodb-virtual-columns.test
index 6bb9ddc8dc4..ec4507451f3 100644
--- a/mysql-test/suite/innodb/t/innodb-virtual-columns.test
+++ b/mysql-test/suite/innodb/t/innodb-virtual-columns.test
@@ -352,7 +352,6 @@ CREATE TABLE t1 (
   v_col3 char(1) AS (substr(col4,3,1)),
   v_col4 char(1) AS (substr(col4,4,1)),
   v_col5 char(1) AS (substr(col4,5,1)),
-  KEY key1(v_col2, v_col1, v_col3, v_col4, v_col5),
   KEY key2(col2, col3)
 ) ENGINE=InnoDB;
 

@dr-m dr-m merged commit 7fed014 into 10.6 Jan 15, 2026
12 of 13 checks passed
@dr-m dr-m deleted the 10.6-MDEV-38140 branch January 15, 2026 07:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants