Uniqueness Shapley

Uniqueness Shapley is an exploratory data analysis (EDA) tool based on the feature importance method Cohort Shapley. It quantifies the extent to which different features in a dataset make subjects in that dataset more identifiable. Using the code in this repository, one can calculate the Uniqueness Shapley value of each feature for each subject or instance in the dataset. These values can also be aggregated to answer questions about subpopulations of interest. For more details on the method and how to interpret the results, please see the paper:

Seiler, B., Mase, M., & Owen, A. B. (2023). "What makes you unique?" Electronic Journal of Statistics, 17(1), 1-18.

Install

Clone the repository and install the package locally with pip:

git clone https://github.com/cohortshapley/uniquenessshapley
pip install -e uniquenessshapley

Prerequisites

This code has been tested with:

Python 3.8.8
NumPy 1.20.1
Pandas 1.2.4
scipy 1.6.2
requests 2.25.1

The example notebooks additionally require:

jupyter 1.0.0

Getting Started

See the example Jupyter notebook in the notebooks directory.

Usage

This implementation, described in Section 4 of the paper, uses ADTrees. Its runtime is linear in the number of rows but exponential in the number of features.
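To see where the exponential dependence on the number of features comes from, the following is a minimal brute-force sketch of a Shapley computation over feature subsets. It does not use this package's API or ADTrees. The function names are hypothetical, and it assumes the value of a feature subset is the negative log base 2 of the fraction of rows that match the target subject on those features; see the paper for the exact definition. Each feature's value is a weighted sum over all subsets of the other features, which is what makes an exact calculation expensive as features are added.

```python
from itertools import combinations
from math import factorial, log2


def cohort_bits(rows, target, subset):
    """Assumed value function: -log2 of the fraction of rows that
    match `target` on every feature index in `subset`."""
    matches = sum(all(r[j] == target[j] for j in subset) for r in rows)
    return -log2(matches / len(rows))  # matches >= 1: the target matches itself


def uniqueness_shapley_bruteforce(rows, target_index):
    """Hypothetical helper: exact Shapley values for one subject,
    enumerating every subset of the remaining features."""
    target = rows[target_index]
    d = len(target)
    phi = [0.0] * d
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            # Standard Shapley weight for a subset of this size.
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                gain = (cohort_bits(rows, target, subset + (j,))
                        - cohort_bits(rows, target, subset))
                phi[j] += weight * gain
    return phi


# Toy data, made up for illustration: four subjects, two categorical features.
rows = [
    ("A", "x"),
    ("A", "y"),
    ("B", "x"),
    ("B", "x"),
]
phi = uniqueness_shapley_bruteforce(rows, target_index=1)
print(phi)
```

In this toy example the values for the second subject sum to the value of the full feature set, which is the usual efficiency property of Shapley values. The inner loop visits every subset of the other features, so the work roughly doubles with each added feature.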

Future Additions

We plan to add an approximate method for handling larger numbers of features.

Sources

The files ArrayRecord.py, IteratedTreeContingencyTable.py, and SparseADTree.py are from uraplutonium and used under their license. No changes have been made to these files except to include references to their source at the top.

The dataset.py script lets you download the data used for the examples in the paper. The voter registration data come from the North Carolina State Board of Elections, and the solar flare data come from the UCI Machine Learning Repository.
