[auto-merge] release/26.02 to main [skip ci] [bot] #14245

nvauto · 2026-02-02T18:22:06Z

Auto-merge triggered by GitHub Actions on release/26.02 to create a PR that keeps main up to date. If this PR cannot be merged because of conflicts, it will remain open until the conflicts are fixed manually.

…oid> inference (#14243)

Closes #14233

## Description

This PR addresses test failures in `test_parquet_testing_valid_files` for `null_list.parquet` on Spark 4.1.0+.

## Root Cause

**SPARK-54220** introduced correct NullType/VOID/UNKNOWN type support in Parquet schema inference starting from Spark 4.1.0. This upstream change causes different schema inference behavior:

- **Spark 3.5.0 - 4.0.x**: Incorrectly infers `array<int>` for null arrays with UNKNOWN logical type
- **Spark 4.1.0+**: Correctly infers `array<void>` for null arrays with UNKNOWN logical type (per SPARK-54220)

The `null_list.parquet` file from the `parquet-testing` repository has a physical schema with `optional int32 item` but uses the UNKNOWN logical type annotation.

RAPIDS plugin does not support `array<void>` on GPU (TypeSig.NULL is not included in nested types for Parquet cudfRead), causing the test to fail with:

```
IllegalArgumentException: Part of the plan is not columnar class org.apache.spark.sql.execution.ColumnarToRowExec
ReadSchema: struct<emptylist:array<void>>
```

## Solution

Add a version-conditional xfail for `null_list.parquet` on Spark 4.1.0+ to reflect the upstream schema inference improvement and RAPIDS' current limitation with `array<void>` support.

## Changes

- Updated `parquet_testing_test.py` to xfail `null_list.parquet` for Spark 4.1.0+
- Added `is_spark_411_or_later()` import
- Updated copyright year to 2026

## Related Issues

- Related to #14242 (audit issue for SPARK-54220)

### Checklists

- [ ] This PR has added documentation for new or modified features or behaviors.
- [x] This PR has added new tests or modified existing tests to cover new code paths.
- [ ] Performance testing has been performed and its results are added in the PR description. Or, an issue has been filed with a link in the PR description.

Signed-off-by: Chong Gao <[email protected]>
Signed-off-by: Chong Gao <[email protected]>
Co-authored-by: Chong Gao <[email protected]>
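The version-conditional xfail described above can be sketched roughly as follows. This is an illustrative, standard-library-only sketch, not the code from `parquet_testing_test.py`: `parse_version`, `is_at_least`, `build_xfail_files`, and `gen_params` are hypothetical names, and the reason string is made up for the example. The description speaks of Spark 4.1.0+ while the imported helper is named `is_spark_411_or_later()`, so the sketch takes the threshold as a parameter instead of assuming one.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '4.1.1' into (4, 1, 1); a suffix such as '-SNAPSHOT' is dropped."""
    core = version.split("-")[0]
    return tuple(int(part) for part in core.split("."))


def is_at_least(version: str, minimum: str) -> bool:
    """Compare dotted versions numerically, so '4.10.0' sorts after '4.9.0'."""
    return parse_version(version) >= parse_version(minimum)


def build_xfail_files(spark_version: str, minimum: str = "4.1.1") -> Dict[str, str]:
    """Map file name -> xfail reason for the given Spark version.

    The default threshold is an assumption based on the helper's name.
    """
    xfail_files: Dict[str, str] = {}
    if is_at_least(spark_version, minimum):
        # Hypothetical reason text; the real test references issue #14242.
        xfail_files["null_list.parquet"] = (
            "array<void> is not supported on GPU, see #14242"
        )
    return xfail_files


def gen_params(files: List[str], xfail_files: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair each file with its xfail reason ('' means a regular test case).

    In a pytest suite, a non-empty reason would typically become
    pytest.param(file, marks=pytest.mark.xfail(reason=reason)).
    """
    return [(name, xfail_files.get(name, "")) for name in files]
```

Keeping the reason next to the file name means the expected failure stays traceable to the tracking issue, and the entry disappears automatically for Spark versions below the threshold.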

nvauto · 2026-02-02T18:22:09Z

SUCCESS - auto-merge

greptile-apps · 2026-02-02T18:24:32Z

Greptile Overview

Greptile Summary

This PR merges changes from the release/26.02 branch to main, including a fix for Spark 4.1.1+ compatibility with null_list.parquet testing.

Updated copyright year from 2025 to 2026
Added import for is_spark_411_or_later function from spark_session module
Added conditional xfail for null_list.parquet when running on Spark 4.1.1+ due to schema inference differences (array<int> vs array<void>)
The xfail is properly documented with a reference to issue [AUDIT][SPARK-54220] NullType/VOID/UNKNOWN Type Support in Parquet causes test failures in Spark 4.1.0+ #14242

The changes follow existing patterns in the codebase for version-specific test handling and are consistent with similar conditional logic for Spark 3.5.0+ compatibility.

Confidence Score: 5/5

This PR is safe to merge with minimal risk
The changes are straightforward and low-risk: a copyright year update, an import addition, and a version-specific xfail for a known incompatibility. The implementation follows established patterns in the codebase and is well-documented with issue references.
No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| integration_tests/src/main/python/parquet_testing_test.py | Added xfail for null_list.parquet on Spark 4.1.1+ due to the array<void> inference change, updated copyright year to 2026 |

Sequence Diagram

sequenceDiagram
    participant Test as Parquet Testing Test
    participant SparkSession as spark_session.py
    participant TestFramework as PyTest Framework
    
    Note over Test: Module initialization
    Test->>SparkSession: is_spark_411_or_later()
    SparkSession-->>Test: Returns boolean (True/False)
    
    alt Spark >= 4.1.1
        Test->>Test: Add null_list.parquet to _xfail_files
        Note over Test: Mark as expected failure<br/>due to array<void> inference
    end
    
    Test->>Test: gen_testing_params_for_valid_files()
    Test->>Test: Check if null_list.parquet in _xfail_files
    
    alt File is in _xfail_files
        Test->>TestFramework: Create test with xfail marker
        Note over TestFramework: Test will be marked as<br/>expected failure
    else File is valid
        Test->>TestFramework: Create regular test parameter
    end
    
    TestFramework->>Test: Execute test_parquet_testing_valid_files()
    Test->>Test: assert_gpu_and_cpu_are_equal_collect()
    
    alt Test marked as xfail and fails
        Test-->>TestFramework: Expected failure (xfail)
        Note over TestFramework: Test passes as expected
    else Test succeeds
        Test-->>TestFramework: Success
    end
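The last branch of the diagram compresses how pytest reports a test marked xfail. A minimal model of that reporting, assuming standard pytest semantics, is sketched below; `report_outcome` is a hypothetical helper written for illustration and is not part of pytest or of the plugin's test framework.

```python
def report_outcome(marked_xfail: bool, test_passed: bool, strict: bool = False) -> str:
    """Model how pytest reports a test, depending on its xfail marker.

    strict mirrors the `strict` option of pytest.mark.xfail, which is
    off by default unless configured otherwise.
    """
    if not marked_xfail:
        return "passed" if test_passed else "failed"
    if not test_passed:
        # The expected failure happened: reported as xfailed, the run stays green.
        return "xfailed"
    # The test passed even though a failure was expected.
    return "failed" if strict else "xpassed"
```

Under this model, `null_list.parquet` on an affected Spark version would be reported as xfailed, and as xpassed once `array<void>` is supported on GPU, which is the cue to remove the marker.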

greptile-apps

1 file reviewed, no comments


nvauto merged commit c125e89 into main Feb 2, 2026

greptile-apps bot reviewed Feb 2, 2026
