Airbnb Project Description
What are we trying to accomplish?
Using the data made available by Airbnb, predict the country of a new user's first booking destination.
What data is available?
List of users:
    demographics
    web session records
    some summary statistics
Training data:
    User data
    User's first booking destination
There are 12 possible outcomes of the destination country:
'US',
'FR',
'CA',
'GB',
'ES',
'IT',
'PT',
'NL',
'DE',
'AU',
'NDF' (no destination found, implying they did not book),
and 'other'.
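For modelling, the 12 outcomes listed above usually need to be turned into integer class labels. The following is a minimal sketch in plain Python, not part of the dataset: the names DESTINATIONS, LABEL_TO_ID, encode and decode are made up for illustration, and the id order is simply the order in which the outcomes are listed above.

```python
# The 12 possible values of country_destination, in the order listed above.
DESTINATIONS = ['US', 'FR', 'CA', 'GB', 'ES', 'IT', 'PT', 'NL', 'DE', 'AU',
                'NDF', 'other']

# Hypothetical lookup table (an assumption, not shipped with the data):
# label -> integer class id, following the list order.
LABEL_TO_ID = {label: i for i, label in enumerate(DESTINATIONS)}


def encode(label):
    """Return the integer class id for a destination label.

    Raises KeyError for a label outside the 12 known outcomes.
    """
    return LABEL_TO_ID[label]


def decode(class_id):
    """Return the destination label for an integer class id."""
    return DESTINATIONS[class_id]
```

Raising on unknown labels rather than mapping them to 'other' keeps data problems visible; 'other' is a real outcome in this dataset, not a catch-all for bad input.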
File descriptions
train_users.csv - the training set of users
test_users.csv - the test set of users
    id: user id
    date_account_created: the date of account creation
    timestamp_first_active: timestamp of the first activity; note that it can be earlier than date_account_created or date_first_booking because a user can search before signing up
    date_first_booking: date of first booking
    gender
    age
    signup_method
    signup_flow: the page a user came to sign up from
    language: international language preference
    affiliate_channel: what kind of paid marketing
    affiliate_provider: where the marketing is, e.g. google, craigslist, other
    first_affiliate_tracked: the first marketing the user interacted with before signing up
    signup_app
    first_device_type
    first_browser
    country_destination: this is the target variable you are to predict
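Because timestamp_first_active can be earlier than date_account_created, the gap between the two may be a useful feature. The sketch below shows one way to compute it with the standard library. It rests on assumptions this README does not state: that timestamp_first_active is stored as 14 digits (YYYYMMDDHHMMSS), that the date columns use YYYY-MM-DD, and that an empty value means the date is missing. The function names and the example row are hypothetical; check the formats against the actual files first.

```python
from datetime import datetime


def parse_first_active(raw):
    """Parse timestamp_first_active.

    ASSUMPTION: the value is 14 digits in the form YYYYMMDDHHMMSS.
    """
    return datetime.strptime(str(raw), "%Y%m%d%H%M%S")


def parse_date(raw):
    """Parse a date column such as date_account_created.

    ASSUMPTION: the value is YYYY-MM-DD, and an empty value means missing.
    """
    if not raw:
        return None
    return datetime.strptime(raw, "%Y-%m-%d")


def days_active_before_signup(row):
    """Calendar days between first activity and account creation."""
    created = parse_date(row["date_account_created"])
    first_active = parse_first_active(row["timestamp_first_active"])
    return (created.date() - first_active.date()).days


# Made-up row for illustration only, not taken from the dataset.
example_row = {
    "date_account_created": "2014-01-05",
    "timestamp_first_active": "20140102183000",
}
example_gap = days_active_before_signup(example_row)  # 3 calendar days
```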
sessions.csv - web sessions log for users
    user_id: to be joined with the column 'id' in users table
    action
    action_type
    action_detail
    device_type
    secs_elapsed
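The join between sessions.csv and the users table (user_id against id) can be sketched as below, using only the standard library. The sessions are first reduced to one summary row per user, then attached to each user as a left join so that users without session records are kept. The function names, the choice of summary fields, the treatment of a blank secs_elapsed as 0, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def summarize_sessions(session_rows):
    """Aggregate session rows per user_id: action count and total secs_elapsed."""
    summary = defaultdict(lambda: {"n_actions": 0, "total_secs": 0.0})
    for row in session_rows:
        stats = summary[row["user_id"]]
        stats["n_actions"] += 1
        # ASSUMPTION: a blank secs_elapsed is counted as 0 seconds.
        stats["total_secs"] += float(row["secs_elapsed"] or 0)
    return summary


def join_users_with_sessions(user_rows, session_summary):
    """Left join on users.id == sessions.user_id; users with no sessions get zeros."""
    joined = []
    for user in user_rows:
        stats = session_summary.get(
            user["id"], {"n_actions": 0, "total_secs": 0.0}
        )
        joined.append({**user, **stats})
    return joined


# Made-up rows for illustration only; the real files have more columns.
USERS_CSV = "id,age\nu1,34\nu2,27\n"
SESSIONS_CSV = (
    "user_id,action,secs_elapsed\n"
    "u1,search,12\n"
    "u1,show,\n"
    "u1,search,30\n"
)

users = list(csv.DictReader(io.StringIO(USERS_CSV)))
sessions = csv.DictReader(io.StringIO(SESSIONS_CSV))
joined = join_users_with_sessions(users, summarize_sessions(sessions))
```

With the real files, the io.StringIO objects would be replaced by open file handles for train_users.csv (or test_users.csv) and sessions.csv.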
countries.csv - summary statistics of destination countries in this dataset and their locations
age_gender_bkts.csv - summary statistics of users' age group, gender, country of destination
sample_submission.csv - correct format for submitting your predictions