A parameter is the "true" value of interest in the population. We typically estimate the parameter using a sample statistic as a point estimate.
Common population parameters of interest and their corresponding sample statistics:
Quantity | Parameter | Statistic |
---|---|---|
Mean | $\mu$ | $\bar{x}$ |
Variance | $\sigma^2$ | $s^2$ |
Standard deviation | $\sigma$ | $s$ |
Proportion | $p$ | $\hat{p}$ |
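As a minimal sketch of how each statistic in the table is computed in base R: the vectors `x` and `outcomes` below are made-up illustration data, not part of the organ donor example.

```r
# Hypothetical numeric sample (values invented for illustration only)
x <- c(4.1, 5.3, 3.8, 6.0, 5.1, 4.7)

x_bar <- mean(x)  # sample mean, the point estimate of mu
s2    <- var(x)   # sample variance (n - 1 denominator), estimates sigma^2
s     <- sd(x)    # sample standard deviation, estimates sigma

# Hypothetical categorical sample (also invented)
outcomes <- c("yes", "no", "no", "yes", "no")

# Sample proportion of "yes": the mean of a logical vector
# is the fraction of TRUE values, which estimates p
p_hat <- mean(outcomes == "yes")
```

Note that `var()` and `sd()` divide by n - 1, which is the usual convention for the sample statistics $s^2$ and $s$.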
Statistical hypothesis testing is the procedure that assesses evidence provided by the data in favor of or against some claim about the population (often about a population parameter or potential associations).
People providing an organ for donation sometimes seek the help of a special medical consultant. These consultants assist the patient in all aspects of the surgery, with the goal of reducing the possibility of complications during the medical procedure and recovery. Patients might choose a consultant based in part on the historical complication rate of the consultant's clients.
One consultant tried to attract patients by noting that the average complication rate for liver donor surgeries in the US is about 10%, but her clients have only had 3 complications in the 62 liver donor surgeries she has facilitated. She claims this is strong evidence that her work meaningfully contributes to reducing complications (and therefore she should be hired!).
organ_donor %>%
  count(outcome)
## # A tibble: 2 x 2
##   outcome             n
##   <chr>           <int>
## 1 complication        3
## 2 no complication    59
Parameter, $p$: true rate of complication
Statistic, $\hat{p}$: rate of complication in the sample, $3/62 \approx 0.048$
Is it possible to assess the consultant's claim using the data?
No. The claim is that there is a causal connection, but the data are observational. For example, maybe patients who can afford a medical consultant can afford better medical care, which can also lead to a lower complication rate.
While it is not possible to assess the causal claim, it is still possible to test for an association using these data.
For this question we ask: how likely is it that the low observed complication rate of $\hat{p} = 0.048$ is due solely to chance?
Null hypothesis, $H_0$: The complication rate for this consultant is no different from the US average of 10%.
Alternative hypothesis, $H_a$: The complication rate for this consultant is lower than the US average of 10%.
In statistical hypothesis testing we always first assume that the null hypothesis is true and then see whether we reject or fail to reject this claim.
Null hypothesis, $H_0$: Defendant is innocent
Alternative hypothesis, $H_a$: Defendant is guilty
Present the evidence: Collect data
Judge the evidence: "Could these data plausibly have happened by chance if the null hypothesis were true?"
1️⃣ Start with two hypotheses about the population: the null hypothesis and the alternative hypothesis.
2️⃣ Choose a (representative) sample, collect data, and analyze the data.
3️⃣ Figure out how likely it is to see data like what we observed, or more extreme, IF the null hypothesis were in fact true. This probability is called the p-value.
4️⃣ If our data would have been extremely unlikely were the null hypothesis true, then we reject it in favor of the alternative hypothesis.
Otherwise, we fail to reject the null hypothesis.
Remember, the null and alternative hypotheses are defined for parameters, not statistics
What will our null and alternative hypotheses be for this example?
Expressed in symbols: $H_0: p = 0.10$ versus $H_a: p < 0.10$
With these two hypotheses, we now take our sample and summarize the data.
Which summary statistic we calculate depends on the type of data. In our example, we use the sample proportion:
$\hat{p} = 3/62 \approx 0.048$
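A minimal base R sketch of the same calculation. The vector `outcome` is rebuilt here from the counts shown earlier (3 complications, 59 without), as a stand-in for the `outcome` column of `organ_donor`.

```r
# Rebuild the 62 outcomes from the reported counts
# (stand-in for organ_donor$outcome, which is not loaded here)
outcome <- c(rep("complication", 3), rep("no complication", 59))

n     <- length(outcome)               # sample size
p_hat <- mean(outcome == "complication")  # sample proportion of complications

round(p_hat, 3)
```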
Next, we calculate the probability of getting data like ours, or more extreme, if $H_0$ were in fact true.
This is a conditional probability: "given that $H_0$ is true, i.e. $p = 0.10$, what would be the probability of observing $\hat{p} = 3/62$ or less?"
This probability is known as the p-value.
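For reference, this conditional probability can also be computed exactly, without simulation. The sketch below is an alternative to the simulation-based approach and assumes the 62 surgeries are independent, so that the number of complications under $H_0$ follows a Binomial(62, 0.10) distribution.

```r
n     <- 62    # number of surgeries
x_obs <- 3     # observed number of complications
p0    <- 0.10  # complication rate under the null hypothesis

# P(X <= 3) when X ~ Binomial(62, 0.10): the lower-tail probability
p_value_exact <- pbinom(x_obs, size = n, prob = p0)

# The same quantity, written as an explicit sum of point probabilities
p_value_sum <- sum(dbinom(0:x_obs, size = n, prob = p0))

p_value_exact
```

A simulated p-value should land near this exact value, with more repetitions giving a closer match.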
Let's return to the organ transplant scenario.
Since $H_0: p = 0.10$, we need to simulate a distribution for $\hat{p}$ under the null hypothesis: 62 patients, each with a probability of complication of 0.10.
This null distribution for $\hat{p}$ represents the distribution of the sample proportions we might expect to observe if the null hypothesis were true.
When sampling from the null distribution, what is the expected proportion of complications?
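One way to check the answer is a quick simulation. This is a base R sketch, not the `specify()`/`generate()` pipeline; the seed and the number of repetitions are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n    <- 62     # patients per simulated sample
p0   <- 0.10   # complication probability under the null hypothesis
reps <- 10000  # number of simulated samples (arbitrary)

# Each draw is the number of complications among 62 patients;
# dividing by n turns the counts into sample proportions
null_props <- rbinom(reps, size = n, prob = p0) / n

mean(null_props)  # should be close to p0 = 0.10
```

The simulated proportions are centered near 0.10, the value of $p$ assumed under the null hypothesis.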
glimpse(organ_donor)
## Rows: 62
## Columns: 1
## $ outcome <chr> "complication", "complication", "complication", "no complicat…
organ_donor %>%
  count(outcome)
## # A tibble: 2 x 2
##   outcome             n
##   <chr>           <int>
## 1 complication        3
## 2 no complication    59
null_dist <- organ_donor %>%
  specify(response = outcome, success = "complication") %>%
  hypothesize(null = "point",
              p = c("complication" = 0.10, "no complication" = 0.90)) %>%
  generate(reps = 100, type = "simulate") %>%
  calculate(stat = "prop")

The steps of the pipeline:
specify(): response is outcome, the column in the organ_donor data frame; success is "complication", the level of outcome we're interested in studying.
hypothesize(): null is "point", since we're testing the point null hypothesis $H_0: p = 0.10$; p sets the null probabilities, "complication" = 0.10 and "no complication" = 0.90.
generate(): reps is the number of repetitions, and we generate 100 here; type is "simulate" for testing a point null for categorical data ("bootstrap" is for estimation, "permute" is for testing independence).
calculate(): stat = "prop", the sample proportion in each repetition.
null_dist
## # A tibble: 100 x 2
##    replicate  stat
##        <dbl> <dbl>
##  1         1 0.048
##  2         2 0.097
##  3         3 0.081
##  4         4 0.081
##  5         5 0.161
##  6         6 0.081
##  7         7 0.081
##  8         8 0.032
##  9         9 0.113
## 10        10 0.113
## # … with 90 more rows
What would you expect the center of the null distribution to be?
null_dist %>%
  filter(stat <= (3/62)) %>%
  summarise(p_value = n() / nrow(null_dist))
## # A tibble: 1 x 1
##   p_value
##     <dbl>
## 1    0.15
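With only 100 repetitions, a simulated p-value can vary noticeably from run to run. The sketch below uses base R; `sim_p_value()` is a hypothetical helper written for this illustration, and the seed and repetition counts are arbitrary.

```r
set.seed(2)  # arbitrary seed, only for reproducibility

# Hypothetical helper: simulated lower-tail p-value for the
# organ donor example, using `reps` simulated samples
sim_p_value <- function(reps, n = 62, p0 = 0.10, x_obs = 3) {
  # Number of complications in each simulated sample under H0
  counts <- rbinom(reps, size = n, prob = p0)
  # Fraction of simulated samples with x_obs or fewer complications
  mean(counts <= x_obs)
}

p_100   <- sim_p_value(100)    # same number of repetitions as the slides
p_10000 <- sim_p_value(10000)  # many more repetitions, less simulation noise

c(reps_100 = p_100, reps_10000 = p_10000)
```

Comparing counts rather than proportions avoids any floating-point concerns in the `<=` comparison.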
We reject the null hypothesis if the p-value is small enough, i.e. if it would be very unlikely to observe our data, or something more extreme, were $H_0$ actually true.
What is "small enough"? We often consider a threshold (the significance level or α-level) defined prior to conducting the analysis.
We often use 5% as the cutoff for whether the p-value is low enough that the data are unlikely to have come from the null model.
If p-value < α, reject $H_0$ in favor of $H_a$.
If p-value ≥ α, fail to reject $H_0$.
Importantly, we never "accept" the null hypothesis.
When we fail to reject the null hypothesis, we are stating that there is insufficient evidence to conclude that it is false. This could be due to any number of reasons:
The p-value, 0.15, is greater than the significance level, α = 0.05, so we fail to reject the null hypothesis.
The data do not provide sufficient evidence that the true complication rate for this consultant's clients is less than the US rate, $p = 0.10$.
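The decision rule can be written as a small function. This is a sketch in base R; `decide()` is a hypothetical helper, not a function from any package.

```r
# Hypothetical helper implementing the decision rule:
# reject H0 only when the p-value is strictly below alpha
decide <- function(p_value, alpha = 0.05) {
  stopifnot(p_value >= 0, p_value <= 1, alpha > 0, alpha < 1)
  if (p_value < alpha) "reject H0" else "fail to reject H0"
}

decide(0.15)  # the simulated p-value from the organ donor example
```

The function never returns "accept H0": failing to reject only means the evidence against the null hypothesis was insufficient.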