speeddating_AI2

General

Author: Yap Jheng Khin

The notebook is available at https://www.kaggle.com/polarbearyap/speeddating-part-ii

This notebook continues Part I. I have since discovered many mistakes in Part I, so Part II serves as both an improvement and a postmortem.

The mistakes I made in Part I are:

- Preprocessing the whole dataset at once, which causes train-test contamination.
- Performing plain cross validation instead of nested cross validation.
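
As a rough illustration of how both mistakes can be avoided together, here is a minimal sketch using scikit-learn. It is not the code from the notebook: the synthetic data, the logistic regression model, and the `C` grid are assumptions chosen only to keep the example small. Wrapping the scaler in a `Pipeline` means it is refit on the training portion of every fold, and running `GridSearchCV` inside `cross_val_score` gives nested cross validation.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: not the speed dating dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# The scaler lives inside the pipeline, so it is fit on training folds only
# and never sees the held-out fold it is later applied to.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Inner loop tunes hyperparameters; outer loop estimates generalization.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(
    pipeline,
    param_grid={"clf__C": [0.1, 1.0, 10.0]},  # illustrative grid only
    cv=inner_cv,
    scoring="roc_auc",
)

# Each outer fold runs a full inner grid search on its own training split.
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")
print("ROC AUC per outer fold:", outer_scores)
print("Mean ROC AUC:", outer_scores.mean())
```

The outer scores estimate how the whole procedure (preprocessing plus tuning) generalizes, rather than how one tuned model scores on data that was also used to pick its hyperparameters.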

My learning expectations for Part II are:

- Discover various ways to detect correlated features.
- Perform feature selection to reduce model complexity.
- Apply nested cross validation to areas like hyperparameter tuning.
- Discover XAI techniques that can be used to explain black-box models.
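
One simple way to detect correlated features is to scan the pairwise Pearson correlation matrix for pairs above a threshold. The sketch below assumes NumPy; the helper name `correlated_pairs`, the 0.9 threshold, and the toy columns `a`, `b`, `c` are hypothetical and are not taken from the notebook, which may use other techniques.

```python
import numpy as np


def correlated_pairs(X, names, threshold=0.9):
    """Return (name_i, name_j, r) for column pairs with |Pearson r| > threshold."""
    # rowvar=False treats each column of X as one variable.
    corr = np.corrcoef(X, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Toy data (assumption): "b" is a noisy copy of "a", "c" is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.05 * rng.normal(size=500)
c = rng.normal(size=500)
X = np.column_stack([a, b, c])

pairs = correlated_pairs(X, ["a", "b", "c"])
print(pairs)
```

A pair flagged this way is a candidate for dropping one of its members during feature selection. Pearson correlation only captures linear relationships, so this check is a starting point rather than a complete test.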

Metadata

This data was gathered from participants in experimental speed dating events from 2002-2004.
During the events, the attendees would have a four-minute "first date" with every other participant of the opposite sex. At the end of their four minutes, participants were asked if they would like to see their date again. They were also asked to rate their date on six attributes: Attractiveness, Sincerity, Intelligence, Fun, Ambition, and Shared Interests.
The dataset also includes questionnaire data gathered from participants at different points in the process. These fields include: demographics, dating habits, self-perception across key attributes, beliefs on what others find valuable in a mate, and lifestyle information.

Attribute Information

The dataset contains 56 preprocessed features, such as 'd_funny', which represent the corresponding attributes in discrete form. It also includes 'has_null', which indicates whether a sample contains any null values. Features prefixed with 'expected_' describe the participants' expectations of their partners.

Feature Type	Examples
age-related features	age, age_o, d_age
unknown feature	wave
field	field_sociology, field_money
interest-related features	shopping, music
partner-related features	intelligence_partner, funny_partner
race-related features	race, importance_same_race
features about partner's preference	pref_o_intelligence, pref_o_ambitious
features about partner's rating on self	intelligence_o, funny_o
features about self's preference	ambition_important, funny_important
features about self's rating of themselves	funny, intelligence
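
To make the 'has_null' flag described in this section concrete, here is a minimal pandas sketch of how such a flag could be derived. This is an assumption about the flag's meaning based on the description in this section, not the dataset's actual preprocessing code, and the three-row `toy` frame is made up for illustration.

```python
import pandas as pd

# Hypothetical toy rows; the values are not taken from speeddating.csv.
toy = pd.DataFrame({
    "age": [24.0, float("nan"), 27.0],
    "funny": [7.0, 8.0, float("nan")],
    "intelligence": [8.0, 6.0, 9.0],
})

# Flag a row with 1 if any of its columns is missing, else 0.
toy["has_null"] = toy.isna().any(axis=1).astype(int)
print(toy)
```

The same expression can be used to check whether an existing 'has_null' column agrees with the missing values actually present in the other columns.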

Source of the Dataset

Published by Joaquin Vanschoren in 2016 on OpenML: https://www.openml.org/d/40536
Available in the following formats:
- csv, arff
- json
- xml
- rdf
This dataset is also available on Kaggle.

Relevant Paper

Raymond Fisman; Sheena S. Iyengar; Emir Kamenica; Itamar Simonson. Gender Differences in Mate Selection: Evidence From a Speed Dating Experiment. The Quarterly Journal of Economics, Volume 121, Issue 2, 1 May 2006, Pages 673–697, https://doi.org/10.1162/qjec.2006.121.2.673
