@codeflash-ai codeflash-ai bot commented Oct 12, 2025

⚡️ This pull request contains optimizations for PR #4019

If you approve this dependent PR, these changes will be merged into the original PR branch fix-ruff-py310.

This PR will be automatically closed if the original PR is merged.


📄 117% (1.17x) speedup for get_type_for_field in strawberry/experimental/pydantic/object_type.py

⏱️ Runtime : 826 microseconds → 381 microseconds (best of 73 runs)
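The headline figure can be reproduced from the two runtimes. The sketch below assumes the percentage is computed as (original - optimized) / optimized; that formula is an assumption, not something the PR states, but it matches both the summary and the per-test annotations. The variable names are illustrative only.

```python
# Runtimes reported in the PR summary, in microseconds.
original_us = 826
optimized_us = 381

# Assumed definition: time saved relative to the optimized runtime.
speedup = (original_us - optimized_us) / optimized_us

# Plain ratio of old runtime to new runtime, for comparison.
ratio = original_us / optimized_us

print(f"speedup: {speedup:.0%}")
print(f"ratio:   {ratio:.2f}x")
```

Under this reading, "117% (1.17x) speedup" means the optimized code takes a little under half the original time, so the old-to-new runtime ratio is about 2.17.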

📝 Explanation and details

The optimization introduces a fast-path check that avoids expensive type reconstruction when recursive type processing doesn't actually change any type arguments.

Key optimization:

  • Early return for unchanged types: After recursively processing type arguments, the code now compares converted == orig_args. This is a tuple equality check, which compares each pair of elements by identity first and falls back to their == only when they are different objects. If the processed arguments equal the original ones, it returns the existing replaced_type immediately instead of creating new type objects.

Why this is faster:

  • Eliminates expensive copy_with() calls: The original code unconditionally called replaced_type.copy_with(converted), which creates new type objects even when no changes occurred. This call accounted for 45.6% of the original runtime.
  • Reduces object allocations: For generics whose arguments come back unchanged (leaf types such as int and str, or unchanged nested generics), the optimization avoids creating new GenericAlias or Union objects.
  • Leverages cheap tuple comparison: Type arguments are stored as tuples, and tuple equality (==) short-circuits on identical elements, so the check cheaply detects when recursion transformed nothing.
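The fast path described above can be sketched in isolation. This is a minimal illustration, not the code in strawberry: Model, ModelType, REPLACEMENTS, replace_leaf and convert are hypothetical stand-ins, and the sketch only handles typing generics such as List and Dict, which expose copy_with().

```python
from typing import Dict, List, get_args


class Model:
    """Hypothetical stand-in for a Pydantic model."""


class ModelType:
    """Hypothetical stand-in for the converted Strawberry type."""


# Hypothetical replacement table; the real code derives this from the model.
REPLACEMENTS = {Model: ModelType}


def replace_leaf(type_):
    # Swap a known model for its converted type, otherwise keep the type.
    return REPLACEMENTS.get(type_, type_)


def convert(type_):
    replaced = replace_leaf(type_)
    orig_args = get_args(replaced)
    if not orig_args:
        # Leaf type: nothing to recurse into.
        return replaced

    converted = tuple(convert(arg) for arg in orig_args)

    # Fast path: nothing changed, so reuse the existing type object.
    if converted == orig_args:
        return replaced

    # Slow path: rebuild the generic with the converted arguments.
    return replaced.copy_with(converted)


unchanged = Dict[str, List[int]]
print(convert(unchanged) is unchanged)
print(convert(Dict[str, List[Model]]))
```

The first call never reaches copy_with() at any nesting level, which is where the reported savings come from; the second still rebuilds both the inner List and the outer Dict because one leaf changed.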

Performance gains by test case type:

  • Basic types (int, str): Minimal improvement since they're already fast paths
  • Generic types (List[int], Dict[str, int]): ~170-200% faster - these benefit most from avoiding copy_with() when inner types don't change
  • Complex nested types (List[Dict[str, Union[int, float]]]): ~200-450% faster - compound benefits as multiple levels avoid reconstruction
  • Annotated types: ~170% faster - similar benefits from avoiding unnecessary object creation

The optimization is most effective for type hierarchies with many unchanged leaf nodes, which is common in real-world Pydantic schemas.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 42 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from types import UnionType as TypingUnionType
from typing import Annotated, Any, Dict, List, Optional, Tuple, Union

# imports
import pytest  # used for our unit tests
from strawberry.experimental.pydantic.object_type import get_type_for_field


# Mocks and stubs for dependencies
class DummyCompat:
    """A dummy PydanticCompat for testing."""
    def get_basic_type(self, type_):
        # For testing, just return the type itself
        return type_

class DummyBaseModel:
    """A dummy BaseModel for simulating Pydantic models."""
    pass

class DummyStrawberryType:
    """A dummy Strawberry type for simulating converted types."""
    pass

class DummyField:
    """A mock for CompatModelField."""
    def __init__(
        self,
        outer_type_,
        is_v1=False,
        allow_none=False,
    ):
        self.outer_type_ = outer_type_
        self.is_v1 = is_v1
        self.allow_none = allow_none

# Patch for get_args and get_origin
def get_args(type_):
    # For typing generics, get __args__ if present
    return getattr(type_, "__args__", ())

def get_origin(type_):
    # For typing generics, get __origin__ if present
    return getattr(type_, "__origin__", None)

# ------------------ UNIT TESTS ------------------

# BASIC TEST CASES

def test_basic_int_type():
    """Test field with int type."""
    field = DummyField(int)
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 2.85μs -> 2.87μs (0.349% slower)

def test_basic_str_type():
    """Test field with str type."""
    field = DummyField(str)
    codeflash_output = get_type_for_field(field, is_input=True, compat=DummyCompat()); result = codeflash_output # 2.81μs -> 2.67μs (5.25% faster)

def test_basic_list_int_type():
    """Test field with List[int] type."""
    field = DummyField(List[int])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 18.4μs -> 6.04μs (205% faster)

def test_basic_dict_str_int_type():
    """Test field with Dict[str, int] type."""
    field = DummyField(Dict[str, int])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 18.5μs -> 6.88μs (169% faster)

def test_basic_union_type():
    """Test field with Union[int, str] type."""
    field = DummyField(Union[int, str])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 9.15μs -> 6.86μs (33.3% faster)

def test_basic_annotated_type():
    """Test field with Annotated[int, 'meta'] type."""
    field = DummyField(Annotated[int, 'meta'])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 21.1μs -> 7.74μs (173% faster)

# EDGE TEST CASES

def test_none_type():
    """Test field with NoneType."""
    field = DummyField(type(None))
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 2.85μs -> 2.75μs (3.64% faster)

def test_optional_type_v1_allow_none():
    """Test field with Optional[int] and allow_none=True (v1)."""
    field = DummyField(Optional[int], is_v1=True, allow_none=True)
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 11.5μs -> 9.81μs (17.3% faster)
    # Should be Optional[int]
    try:
        pass
    except TypeError:
        pass

def test_optional_type_v1_disallow_none():
    """Test field with Optional[int] and allow_none=False (v1)."""
    field = DummyField(Optional[int], is_v1=True, allow_none=False)
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 9.01μs -> 6.98μs (29.0% faster)
    if get_origin(result) is Union:
        pass

def test_empty_list_type():
    """Test field with List[NoneType] (edge case)."""
    field = DummyField(List[type(None)])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 16.1μs -> 6.20μs (160% faster)

def test_nested_union_type():
    """Test field with Union[int, Union[str, float]]."""
    field = DummyField(Union[int, Union[str, float]])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 9.59μs -> 7.64μs (25.4% faster)
    # Flattened union args
    args = set()
    for arg in get_args(result):
        if get_origin(arg) is Union:
            args.update(get_args(arg))
        else:
            args.add(arg)

def test_tuple_type():
    """Test field with Tuple[int, str]."""
    field = DummyField(Tuple[int, str])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 18.8μs -> 7.02μs (168% faster)

def test_custom_model_type_replacement():
    """Test replacement of a custom Pydantic model type."""
    field = DummyField(DummyBaseModel)
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 2.83μs -> 2.83μs (0.000% faster)

def test_annotated_nested_type():
    """Test field with Annotated[List[int], 'meta']."""
    field = DummyField(Annotated[List[int], 'meta'])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 31.9μs -> 9.02μs (254% faster)

def test_union_with_none_type():
    """Test field with Union[int, NoneType]."""
    field = DummyField(Union[int, type(None)])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 9.00μs -> 7.17μs (25.4% faster)

# LARGE SCALE TEST CASES

def test_large_list_type():
    """Test field with List[int] of large scale."""
    field = DummyField(List[int])
    # Simulate large scale by checking with a large list type
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 17.8μs -> 5.94μs (199% faster)
    # Actually instantiate a large list to check performance
    large_list = [1] * 1000

def test_large_dict_type():
    """Test field with Dict[str, int] of large scale."""
    field = DummyField(Dict[str, int])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 18.7μs -> 6.90μs (171% faster)
    # Instantiate a large dict
    large_dict = {str(i): i for i in range(1000)}

def test_large_nested_type():
    """Test field with List[Dict[str, Union[int, float]]] of large scale."""
    field = DummyField(List[Dict[str, Union[int, float]]])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 34.7μs -> 11.3μs (206% faster)
    inner_type = get_args(result)[0]
    key_type, value_type = get_args(inner_type)
    # Instantiate a large nested structure
    large_nested = [
        {f"key{i}": i if i % 2 == 0 else float(i) for i in range(10)}
        for _ in range(100)
    ]

def test_large_union_type():
    """Test field with Union of many types."""
    many_types = tuple([int, str, float, bool, list, dict, set, tuple, type(None)])
    field = DummyField(Union[many_types])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 13.7μs -> 11.8μs (16.9% faster)

def test_large_tuple_type():
    """Test field with Tuple of many types."""
    many_types = tuple([int, str, float, bool, list, dict, set, tuple])
    field = DummyField(Tuple[many_types])
    codeflash_output = get_type_for_field(field, is_input=False, compat=DummyCompat()); result = codeflash_output # 20.8μs -> 10.9μs (91.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from dataclasses import dataclass
from typing import Annotated, Any, Dict, List, Optional, Tuple, Type, Union

# imports
import pytest  # used for our unit tests
from pydantic import BaseModel
from pydantic import Field as PydanticField
from strawberry.experimental.pydantic.object_type import get_type_for_field


class StrawberryObjectDefinition:
    def __init__(self, name):
        self.name = name

class CompatModelField:
    def __init__(
        self,
        outer_type_,
        allow_none=False,
        is_v1=True,
        name="field",
    ):
        self.outer_type_ = outer_type_
        self.allow_none = allow_none
        self.is_v1 = is_v1
        self.name = name

class PydanticCompat:
    # For simplicity, assume get_basic_type returns the input type itself
    def get_basic_type(self, type_):
        return type_

def get_args(type_):
    # For typing generics and unions
    if hasattr(type_, "__args__"):
        return type_.__args__
    return ()

def get_origin(type_):
    # For typing generics and unions
    if hasattr(type_, "__origin__"):
        return type_.__origin__
    return None


# Helper classes for testing
class ExampleModel(BaseModel):
    a: int
ExampleModel._strawberry_type = StrawberryObjectDefinition("ExampleModel")
ExampleModel._strawberry_input_type = StrawberryObjectDefinition("ExampleModelInput")

class CustomType:
    pass

# Unit tests for get_type_for_field

@pytest.fixture
def compat():
    # Returns a basic PydanticCompat instance
    return PydanticCompat()

# 1. Basic Test Cases

def test_basic_int_field(compat):
    # Test a simple int field
    field = CompatModelField(int)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 3.54μs -> 3.71μs (4.59% slower)

def test_basic_str_field(compat):
    # Test a simple str field
    field = CompatModelField(str)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 3.11μs -> 3.15μs (1.27% slower)

def test_basic_optional_field(compat):
    # Optional[int] field, allow_none=True should add None to type
    field = CompatModelField(Optional[int], allow_none=True)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 12.7μs -> 10.8μs (18.0% faster)

def test_basic_list_field(compat):
    # List[int] field
    field = CompatModelField(List[int])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 19.5μs -> 6.77μs (188% faster)

def test_basic_union_field(compat):
    # Union[int, str] field
    field = CompatModelField(Union[int, str])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 9.16μs -> 7.47μs (22.5% faster)

def test_basic_annotated_field(compat):
    # Annotated[int, "meta"] field
    field = CompatModelField(Annotated[int, "meta"])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 22.0μs -> 8.03μs (175% faster)

def test_basic_pydantic_model_type(compat):
    # Pydantic model type should be replaced with _strawberry_type
    field = CompatModelField(ExampleModel)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 4.16μs -> 4.32μs (3.73% slower)

def test_basic_pydantic_model_input_type(compat):
    # Pydantic model type should be replaced with _strawberry_input_type for input
    field = CompatModelField(ExampleModel)
    codeflash_output = get_type_for_field(field, is_input=True, compat=compat); result = codeflash_output # 4.38μs -> 4.69μs (6.63% slower)

# 2. Edge Test Cases

def test_none_type_field(compat):
    # Field with type NoneType
    field = CompatModelField(type(None))
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 2.92μs -> 3.03μs (3.34% slower)

def test_field_with_custom_type(compat):
    # Field with a custom type
    field = CompatModelField(CustomType)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 2.90μs -> 2.86μs (1.72% faster)


def test_field_with_nested_optional_union(compat):
    # Field with Union[Optional[int], str]
    field = CompatModelField(Union[Optional[int], str])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 11.0μs -> 10.6μs (3.89% faster)

def test_field_with_nested_list_of_union(compat):
    # Field with List[Union[int, str]]
    field = CompatModelField(List[Union[int, str]])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 27.7μs -> 9.83μs (182% faster)

def test_field_with_dict_of_list(compat):
    # Field with Dict[str, List[int]]
    field = CompatModelField(Dict[str, List[int]])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 29.3μs -> 9.19μs (219% faster)

def test_field_with_tuple_types(compat):
    # Field with Tuple[int, str]
    field = CompatModelField(Tuple[int, str])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 18.8μs -> 7.32μs (157% faster)

def test_field_with_annotated_union(compat):
    # Field with Annotated[Union[int, str], "meta"]
    field = CompatModelField(Annotated[Union[int, str], "meta"])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 26.6μs -> 10.9μs (144% faster)


def test_field_with_allow_none_and_union(compat):
    # Field with Union[int, str], allow_none=True should add None
    field = CompatModelField(Union[int, str], allow_none=True)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 11.5μs -> 9.94μs (15.3% faster)

def test_field_with_is_v2(compat):
    # Field with is_v1=False should not add None even if allow_none=True
    field = CompatModelField(int, allow_none=True, is_v1=False)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 2.81μs -> 2.86μs (1.44% slower)

# 3. Large Scale Test Cases

def test_large_list_field(compat):
    # Field with List[int], with a large number of elements hypothetically
    field = CompatModelField(List[int])
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 18.8μs -> 6.62μs (184% faster)

def test_large_nested_union(compat):
    # Field with Union of many types
    types = tuple([int, str] + [float] * 10)
    union_type = Union[types]
    field = CompatModelField(union_type)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 9.88μs -> 7.76μs (27.2% faster)

def test_large_nested_list_of_dict(compat):
    # Field with List[Dict[str, ...]] nested 10 levels deep
    type_ = int
    for _ in range(10):  # keep nesting reasonable (<1000 elements)
        type_ = List[Dict[str, type_]]
    field = CompatModelField(type_)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 200μs -> 36.7μs (446% faster)
    # Check that the next layer is dict
    inner = get_args(result)[0]

def test_large_union_with_optional(compat):
    # Field with Union[int, str, float, bool]; allow_none=True should add None
    union_type = Union[int, str, float, bool]
    field = CompatModelField(union_type, allow_none=True)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 12.1μs -> 10.8μs (11.5% faster)

def test_large_tuple_field(compat):
    # Field with Tuple of many types
    types = tuple([int] * 100)
    tuple_type = Tuple[types]
    field = CompatModelField(tuple_type)
    codeflash_output = get_type_for_field(field, is_input=False, compat=compat); result = codeflash_output # 82.5μs -> 64.6μs (27.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr4019-2025-10-12T14.23.40 and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 12, 2025

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Summary

This PR introduces a performance optimization to replace_types_recursively in strawberry/experimental/pydantic/fields.py by adding an early-return check when recursively processed type arguments are unchanged.

Key Changes:

  • Added an early-return optimization at lines 50-51 that checks whether converted == orig_args
  • When type arguments remain unchanged after recursive processing, returns the existing replaced_type immediately instead of reconstructing it
  • Avoids expensive copy_with() calls and object allocations for unchanged types

Performance Impact:

  • The PR description reports a 117% speedup (826 μs → 381 μs) with 42 passing generated regression tests
  • Most beneficial for generic types (List, Dict, etc.) and nested type hierarchies where leaf types don't change
  • Generated tests show speedups of roughly 160% to 450% for generic and nested types

Correctness Verification:

  • The optimization is sound because StrawberryType.__eq__ uses identity equality (self is other), ensuring the tuple equality check only returns True when type arguments are truly unchanged (same object references)
  • For basic Python types (int, str, etc.), the recursive call returns the same object, making the optimization effective
  • The check correctly leverages Python's tuple equality semantics combined with type identity to detect when reconstruction can be skipped
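The tuple semantics this argument relies on can be checked with plain Python. The sketch below is general language behavior, not strawberry code; AlwaysEqual and same_objects are hypothetical names introduced only for illustration. It shows that tuple == accepts identical elements without calling __eq__, and that it is still an equality check, so it is only as strict as the element types' own __eq__.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ is deliberately permissive."""

    def __eq__(self, other):
        return True


def same_objects(left, right):
    # Strict identity comparison, element by element.
    return len(left) == len(right) and all(a is b for a, b in zip(left, right))


a, b = AlwaysEqual(), AlwaysEqual()

# Distinct objects can still make the tuples compare equal.
equal_but_distinct = (a,) == (b,)
identical = same_objects((a,), (b,))

# Identity is checked first: nan != nan, yet the tuples are equal.
nan = float("nan")
nan_tuple_equal = (nan,) == (nan,)

print(equal_but_distinct, identical, nan_tuple_equal)
```

With identity-based __eq__ on the element types, as the review describes for StrawberryType, the two comparisons agree, which is why the plain == check is sufficient in that case.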

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The optimization is mathematically sound and preserves correctness. The early-return check leverages StrawberryType's identity-based equality semantics, ensuring it only skips reconstruction when type arguments are genuinely unchanged. The change is minimal (3 lines), well-tested (42 regression tests passing), and provides significant performance benefits without altering behavior.
  • No files require special attention

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| strawberry/experimental/pydantic/fields.py | 5/5 | Adds performance optimization with early-return check when recursive type conversion yields unchanged type arguments, avoiding expensive object reconstruction |

Sequence Diagram

sequenceDiagram
    participant Caller
    participant replace_types_recursively
    participant get_args
    participant replace_pydantic_types
    
    Caller->>replace_types_recursively: type_ (e.g., List[int])
    replace_types_recursively->>replace_pydantic_types: basic_type
    replace_pydantic_types-->>replace_types_recursively: replaced_type
    
    alt has type args
        replace_types_recursively->>get_args: replaced_type
        get_args-->>replace_types_recursively: orig_args
        
        loop for each arg in orig_args
            replace_types_recursively->>replace_types_recursively: recursive call
            replace_types_recursively-->>replace_types_recursively: converted arg
        end
        
        alt NEW: converted == orig_args
            Note over replace_types_recursively: Early return optimization!<br/>Avoids expensive reconstruction
            replace_types_recursively-->>Caller: replaced_type (unchanged)
        else args changed
            replace_types_recursively->>replace_types_recursively: reconstruct with new args<br/>(copy_with, GenericAlias, etc.)
            replace_types_recursively-->>Caller: new type object
        end
    else no type args
        replace_types_recursively-->>Caller: replaced_type
    end

1 file reviewed, no comments

@bellini666 bellini666 closed this Oct 14, 2025
@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-pr4019-2025-10-12T14.23.40 branch October 14, 2025 16:10