Problem Description
In the context of evaluating synthetic data, we care more about whether the synthetic data has preserved trends that were strongly present in the real data. If the real data didn't have a strong trend to begin with, then it's fairly typical for the synthetic data to also not have a trend; synthesizers don't usually invent correlations when there are none to begin with. So this case is uninteresting.
Right now, if the real data doesn't have any strong trends, the CorrelationSimilarity metric will typically report a very high score, since the synthetic data would also lack a strong trend. Instead, it would be better if the metric itself could be configured with a threshold. If that threshold is not met for the real data (i.e. there is no strong trend to begin with), then the metric would return NaN instead.
Expected behavior
In CorrelationSimilarity, add a parameter called real_correlation_threshold. The metric would then work as follows:
- It would compute the correlation on the real data
- If the absolute value of the real data's correlation exceeds the threshold, then the correlation is considered "strong" and the rest of the metric computation continues.
- Otherwise, the metric score would be NaN instead. There is no need to compute the correlation on the synthetic data.
The default value of this parameter can be 0 (meaning the behavior is the same as the status quo), but it will be easy for the user (or a report) to set a new value when running the metric.
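
As a rough sketch of the proposed logic (not the actual sdmetrics internals; it assumes the existing score is computed as `1 - |r_real - r_synth| / 2` using the `scipy.stats` correlation functions):

```python
import numpy as np
from scipy import stats


def correlation_similarity_with_threshold(real_data, synthetic_data,
                                          coefficient='Pearson',
                                          real_correlation_threshold=0.0):
    """Sketch of the proposed behavior, not the actual CorrelationSimilarity code."""
    corr_fn = stats.pearsonr if coefficient == 'Pearson' else stats.spearmanr

    # Correlation between the two columns of the real data.
    real_corr, _ = corr_fn(real_data.iloc[:, 0], real_data.iloc[:, 1])

    # No strong trend in the real data: return NaN without touching the synthetic data.
    if abs(real_corr) < real_correlation_threshold:
        return np.nan

    # Otherwise compute the synthetic correlation and score as before
    # (assumed normalization: identical correlations -> 1, opposite extremes -> 0).
    synth_corr, _ = corr_fn(synthetic_data.iloc[:, 0], synthetic_data.iloc[:, 1])
    return 1 - abs(real_corr - synth_corr) / 2
```

With the default threshold of 0, the NaN branch is never taken, so existing behavior is preserved. Usage would then look the same as today, just with the extra parameter: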
```python
from sdmetrics.column_pairs import CorrelationSimilarity

CorrelationSimilarity.compute(
    real_data=real_table[['column_1', 'column_2']],
    synthetic_data=synthetic_table[['column_1', 'column_2']],
    coefficient='Pearson',
    real_correlation_threshold=0.5
)
```
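
A report or a loop over many column pairs could then skip the pairs that come back as NaN, for example (the `column_pairs` list and the averaging step below are just illustrative, not part of sdmetrics):

```python
import numpy as np

from sdmetrics.column_pairs import CorrelationSimilarity

scores = []
for col_a, col_b in column_pairs:  # hypothetical list of (column, column) tuples
    score = CorrelationSimilarity.compute(
        real_data=real_table[[col_a, col_b]],
        synthetic_data=synthetic_table[[col_a, col_b]],
        coefficient='Pearson',
        real_correlation_threshold=0.5,
    )
    if not np.isnan(score):
        scores.append(score)

# Average only over the pairs that had a strong trend in the real data.
overall = np.mean(scores) if scores else float('nan')
```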