-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathCLT
32 lines (24 loc) · 1004 Bytes
/
CLT
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
The central limit theorem states that a sampling distribution of a sample statistic approaches the normal distribution as you take more samples, no matter the original distribution being sampled from.
This means, if we draw repeated samples enough time:
mean
median
proportion
are all given in a normal distribtution, with average representing the whole population figures. This is regardless of the type of distribution we are drawing our data from: norma, flat, biomial, poisson etc. as long as the draws are independednt
```
# Set seed to 104
np.random.seed(104)
sample_means = []
# Loop 100 times
for i in range(100):
# Take sample of 20 num_users
samp_20 = amir_deals['num_users'].sample(20, replace=True)
# Calculate mean of samp_20
samp_20_mean = np.mean(samp_20)
# Append samp_20_mean to sample_means
sample_means.append(samp_20_mean)
# Convert to Series and plot histogram
sample_means_series = pd.Series(sample_means)
sample_means_series.hist()
# Show plot
plt.show()
```