Commit 37254bb

add references to the paper
1 parent c9429c1 commit 37254bb

1 file changed: +15 -1 lines changed

README.md

Lines changed: 15 additions & 1 deletion
@@ -4,7 +4,7 @@
 The basic idea is running a linear or logistic regression of the target on the Shapley values of
 the original features, on the validation set,
 discarding the features with negative coefficients, and ranking/filtering the rest according to their
-statistical significance. For motivation and details, see the [example notebook](https://github.com/transferwise/shap-select/blob/main/docs/Quick%20feature%20selection%20through%20regression%20on%20Shapley%20values.ipynb)
+statistical significance. For motivation and details, refer to our [research paper](https://arxiv.org/abs/2410.06815) see the [example notebook](https://github.com/transferwise/shap-select/blob/main/docs/Quick%20feature%20selection%20through%20regression%20on%20Shapley%20values.ipynb)
 
 Earlier packages using Shapley values for feature selection exist, the advantages of this one are
 * Regression on the **validation set** to combat overfitting
@@ -109,3 +109,17 @@ selected_features_df = shap_select(model, X_val, y_val, task="multiclass", thres
 </table>
 
 
+## Citation
+
+If you use `shap-select` in your research, please cite our paper:
+
+```bibtex
+@misc{kraev2024shapselectlightweightfeatureselection,
+title={Shap-Select: Lightweight Feature Selection Using SHAP Values and Regression},
+author={Egor Kraev and Baran Koseoglu and Luca Traverso and Mohammed Topiwalla},
+year={2024},
+eprint={2410.06815},
+archivePrefix={arXiv},
+primaryClass={cs.LG},
+url={https://arxiv.org/abs/2410.06815},
+}
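
The README context in the first hunk describes the method only in prose. As a rough illustration, the sketch below regresses a validation target on per-feature Shapley values, drops features with negative coefficients, and keeps the rest only if their t-statistic is large. It is not the `shap-select` implementation. The synthetic data, the hypothetical "fitted" linear model weights, the helper names (`invert`, `ols_t_stats`, `linear_shap`, `select_features`), and the `t_threshold` cutoff are all assumptions made for the example. It also uses the closed-form Shapley value of a linear model with independent features, `w_j * (x_j - mean(x_j))`, in place of a SHAP explainer.

```python
import math
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_t_stats(columns, y):
    """Ordinary least squares with an intercept; returns slopes and their t-statistics."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [yi - sum(bj * rj for bj, rj in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    t_stats = [beta[a] / math.sqrt(sigma2 * inv[a][a]) for a in range(k)]
    return beta[1:], t_stats[1:]  # drop the intercept


def linear_shap(values, weight):
    """Shapley values of one feature for a linear model (independent-features assumption)."""
    mean = sum(values) / len(values)
    return [weight * (v - mean) for v in values]


def select_features(shap_columns, y, t_threshold=2.0):
    """Keep features with a positive and significant coefficient on their Shapley values."""
    names = list(shap_columns)
    coefs, t_stats = ols_t_stats([shap_columns[name] for name in names], y)
    return [name for name, c, t in zip(names, coefs, t_stats)
            if c > 0 and t > t_threshold]


# Synthetic validation set (assumption): both features truly drive the target.
rng = random.Random(0)
n = 500
X_val = {
    "x0": [rng.gauss(0.0, 1.0) for _ in range(n)],
    "x1": [rng.gauss(0.0, 1.0) for _ in range(n)],
}
y_val = [2.0 * a + 1.0 * b + rng.gauss(0.0, 1.0)
         for a, b in zip(X_val["x0"], X_val["x1"])]

# Hypothetical fitted model: correct weight on x0, wrong sign on x1.
weights = {"x0": 2.0, "x1": -1.0}

shap_columns = {name: linear_shap(X_val[name], w) for name, w in weights.items()}
selected = select_features(shap_columns, y_val)
print(selected)
```

In this toy setup `x1` is dropped because the hypothetical model uses it with the wrong sign: its Shapley values move against the target on the validation data, so the regression coefficient comes out negative. The fixed t-statistic cutoff is a simplification chosen to avoid extra dependencies; a proper significance test would be the more usual filter.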
