Hypothesis Testing in Python
Hypothesis testing is a statistical technique that allows us to draw conclusions about a population based on a sample of data. It is often used in fields like medicine, psychology, and economics to test the effectiveness of new treatments, analyze consumer behavior, or estimate the impact of policy changes.
In Python, hypothesis testing is supported by modules such as scipy.stats and statsmodels.stats. In this article, we’ll explore three examples of hypothesis testing in Python: the one sample t-test, the two sample t-test, and the paired samples t-test.
For each test, we’ll provide a brief explanation of the underlying concepts, an example of a research question that can be answered using the test, and a step-by-step guide to performing the test in Python. Let’s get started!
One Sample t-test
The one sample t-test is used to compare a sample mean to a known or hypothesized population mean. This allows us to determine whether the sample mean is significantly different from the population mean.
The test assumes that the data are normally distributed and that the sample is randomly drawn from the population. Example research question: Is the mean weight of a species of turtle significantly different from a known or hypothesized value?
Step-by-step guide:
- Define the null hypothesis (H0) and alternative hypothesis (Ha).
- Collect a random sample of data.
- Calculate the sample mean (x), sample standard deviation (s), and standard error (SE).
- Calculate the t-value using the formula t = (x - μ) / SE, where μ is the hypothesized population mean.
- Calculate the p-value using a t-distribution table or a Python function like scipy.stats.ttest_1samp().
- Compare the p-value to the level of significance (α), typically set to 0.05.
The null hypothesis is typically that the population mean is equal to the hypothesized value. The alternative hypothesis is that it is not.
For example:
H0: The mean weight of a species of turtle is 100 grams.
Ha: The mean weight of a species of turtle is not 100 grams.
A random sample can be drawn using Python’s random module or loaded from a file. For example:
weight_sample = [95, 105, 110, 98, 102, 116, 101, 99, 104, 108]
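Next, calculate the sample mean, sample standard deviation, and standard error. Passing ddof=1 to np.std gives the sample standard deviation (the default, ddof=0, gives the population value):
import numpy as np
x = sum(weight_sample)/len(weight_sample)
s = np.std(weight_sample, ddof=1)
SE = s / (len(weight_sample)**0.5)
Then calculate the t-value:
t = (x - 100) / SE
Finally, calculate the p-value:
import scipy.stats
p_value = scipy.stats.ttest_1samp(weight_sample, 100).pvalue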
If the p-value is less than α, reject the null hypothesis and conclude that there is sufficient evidence to support the alternative hypothesis.
If the p-value is greater than or equal to α, fail to reject the null hypothesis and conclude that there is insufficient evidence to support the alternative hypothesis. For example:
if p_value < 0.05:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
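The steps above can be tied together in one script. The following is a minimal sketch, assuming NumPy and SciPy are installed; the names mu_0, t_manual, and p_manual are illustrative and not part of any library. It computes the t-value by hand, reads the two-sided p-value from the t-distribution with n - 1 degrees of freedom, and compares both with what scipy.stats.ttest_1samp() reports.

```python
import numpy as np
from scipy import stats

weight_sample = [95, 105, 110, 98, 102, 116, 101, 99, 104, 108]
mu_0 = 100  # hypothesized population mean under H0 (illustrative name)

n = len(weight_sample)
x_bar = np.mean(weight_sample)
s = np.std(weight_sample, ddof=1)  # ddof=1 gives the sample standard deviation
se = s / np.sqrt(n)                # standard error of the mean
t_manual = (x_bar - mu_0) / se

# Two-sided p-value from the t-distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Library result for comparison
result = stats.ttest_1samp(weight_sample, mu_0)

print(f"manual: t = {t_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  t = {result.statistic:.4f}, p = {result.pvalue:.4f}")
```

The two printed lines should agree, which is a quick check that the hand calculation uses the same standard deviation and degrees of freedom as the library function.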
Two Sample t-test
The two sample t-test is used to compare the means of two independent samples. This allows us to determine whether the means are significantly different from each other.
The test assumes that the data are normally distributed and that the samples are randomly drawn from their respective populations. Example research question: Is the mean weight of two different species of turtles significantly different from each other?
Step-by-step guide:
- Define the null hypothesis (H0) and alternative hypothesis (Ha).
- Collect two random samples of data.
- Calculate the sample means (x1, x2), sample standard deviations (s1, s2), and pooled standard error (SE).
- Calculate the t-value using the formula t = (x1 - x2) / SE, where x1 and x2 are the sample means.
- Calculate the p-value using a t-distribution table or a Python function like scipy.stats.ttest_ind().
- Compare the p-value to the level of significance (α), typically set to 0.05.
The null hypothesis is typically that the two population means are equal. The alternative hypothesis is that they are not equal.
For example:
H0: The mean weight of species A is equal to the mean weight of species B.
Ha: The mean weight of species A is not equal to the mean weight of species B.
Random samples can be drawn using Python's random module or loaded from a file. For example:
species_a = [95, 105, 110, 98, 102]
species_b = [116, 101, 99, 104, 108]
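Next, calculate the sample means, the sample standard deviations (ddof=1), and the pooled standard error:
import numpy as np
x1 = sum(species_a)/len(species_a)
x2 = sum(species_b)/len(species_b)
s1 = np.std(species_a, ddof=1)
s2 = np.std(species_b, ddof=1)
n1 = len(species_a)
n2 = len(species_b)
SE = (((n1-1)*s1**2 + (n2-1)*s2**2)/(n1+n2-2))**0.5 * (1/n1 + 1/n2)**0.5
Then calculate the t-value:
t = (x1 - x2) / SE
Finally, calculate the p-value:
import scipy.stats
p_value = scipy.stats.ttest_ind(species_a, species_b).pvalue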
If the p-value is less than α, reject the null hypothesis and conclude that there is sufficient evidence to support the alternative hypothesis.
If the p-value is greater than or equal to α, fail to reject the null hypothesis and conclude that there is insufficient evidence to support the alternative hypothesis. For example:
if p_value < 0.05:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
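As a cross-check on the two sample procedure, the sketch below recomputes the pooled test by hand and compares it with scipy.stats.ttest_ind(), which pools the variances by default (equal_var=True). It assumes NumPy and SciPy are available; pooled_var, t_manual, and p_manual are illustrative names.

```python
import numpy as np
from scipy import stats

species_a = [95, 105, 110, 98, 102]
species_b = [116, 101, 99, 104, 108]

n1, n2 = len(species_a), len(species_b)
x1, x2 = np.mean(species_a), np.mean(species_b)
s1 = np.std(species_a, ddof=1)  # sample standard deviations
s2 = np.std(species_b, ddof=1)

# Pooled variance, weighted by each sample's degrees of freedom
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
se = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_manual = (x1 - x2) / se
df = n1 + n2 - 2
p_manual = 2 * stats.t.sf(abs(t_manual), df=df)  # two-sided p-value

result = stats.ttest_ind(species_a, species_b)  # equal_var=True by default

print(f"manual: t = {t_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  t = {result.statistic:.4f}, p = {result.pvalue:.4f}")
```

The sign of t depends only on the order of the two samples; the two-sided p-value is unaffected by swapping them.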
Paired Samples t-test
The paired samples t-test is used to compare the means of two related samples. This allows us to determine whether the means are significantly different from each other, while accounting for individual differences between the samples.
The test assumes that the differences between paired observations are normally distributed. Example research question: Is there a significant difference in the max vertical jump of basketball players before and after a training program?
Step-by-step guide:
- Define the null hypothesis (H0) and alternative hypothesis (Ha).
- Collect two related samples of data.
- Calculate the differences between the paired observations, then their mean (d), sample standard deviation (s), and standard error (SE).
- Calculate the t-value using the formula t = (d - μ) / SE, where μ is the hypothesized population mean difference (usually zero).
- Calculate the p-value using a t-distribution table or a Python function like scipy.stats.ttest_rel().
- Compare the p-value to the level of significance (α), typically set to 0.05.
The null hypothesis is typically that the mean difference is equal to zero. The alternative hypothesis is that it is not equal to zero.
For example:
H0: The mean difference in max vertical jump before and after training is zero.
Ha: The mean difference in max vertical jump before and after training is not zero.
Paired data are collected by measuring the same variable in the same subjects before and after a treatment or intervention. For example:
before = [72, 69, 77, 71, 76]
after = [80, 70, 75, 74, 78]
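Next, calculate the paired differences, their mean, their sample standard deviation (ddof=1), and the standard error:
import numpy as np
differences = [a - b for a, b in zip(after, before)]
d = sum(differences)/len(differences)
s = np.std(differences, ddof=1)
SE = s / (len(differences)**0.5)
Then calculate the t-value:
t = (d - 0) / SE
Finally, calculate the p-value:
import scipy.stats
p_value = scipy.stats.ttest_rel(after, before).pvalue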
If the p-value is less than α, reject the null hypothesis and conclude that there is sufficient evidence to support the alternative hypothesis.
If the p-value is greater than or equal to α, fail to reject the null hypothesis and conclude that there is insufficient evidence to support the alternative hypothesis. For example:
if p_value < 0.05:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
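Because the paired test works entirely on the differences, it is equivalent to a one sample t-test of those differences against zero. The sketch below illustrates that equivalence, assuming NumPy and SciPy are installed; the variable names paired and one_sample are illustrative.

```python
import numpy as np
from scipy import stats

before = [72, 69, 77, 71, 76]
after = [80, 70, 75, 74, 78]

# Element-wise differences between the paired observations
differences = np.array(after) - np.array(before)

# Paired test on the two samples
paired = stats.ttest_rel(after, before)

# One sample test of the differences against a mean of zero
one_sample = stats.ttest_1samp(differences, 0)

print(f"ttest_rel:   t = {paired.statistic:.4f}, p = {paired.pvalue:.4f}")
print(f"ttest_1samp: t = {one_sample.statistic:.4f}, p = {one_sample.pvalue:.4f}")
```

Both calls should report the same t-value and p-value, which is why the normality assumption applies to the differences rather than to each sample separately.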
Two Sample t-test in Python
The two sample t-test is used to compare two independent samples and determine if there is a significant difference between the means of the two populations. In this test, the null hypothesis is that the two population means are equal, while the alternative hypothesis is that they are not equal.
Example research question: Is the mean weight of two different species of turtles significantly different from each other?
Step-by-step guide:
- Define the null hypothesis (H0) and alternative hypothesis (Ha).
- Collect a random sample of data for each species.
- Calculate the sample means (x1, x2), sample standard deviations (s1, s2), and pooled standard error (SE).
- Calculate the t-value using the formula t = (x1 - x2) / SE, where x1 and x2 are the sample means.
- Calculate the p-value using a t-distribution table or a Python function like ttest_ind().
- Compare the p-value to the level of significance (α), typically set to 0.05.
The null hypothesis is that the mean weight of the two turtle species is the same. The alternative hypothesis is that the mean weights are not equal. For example:
H0: The mean weight of species A is equal to the mean weight of species B.
Ha: The mean weight of species A is not equal to the mean weight of species B.
Next, collect a random sample of data for each species. For example:
species_a = [4.3, 3.9, 5.1, 4.6, 4.2, 4.8]
species_b = [4.9, 5.2, 5.5, 5.3, 5.0, 4.7]
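Calculate the sample statistics and the pooled standard error, using ddof=1 so that np.std returns the sample standard deviation:
import numpy as np
from scipy.stats import ttest_ind
x1 = np.mean(species_a)
x2 = np.mean(species_b)
s1 = np.std(species_a, ddof=1)
s2 = np.std(species_b, ddof=1)
n1 = len(species_a)
n2 = len(species_b)
# Pooled standard error, matching the default equal_var=True of ttest_ind
SE = np.sqrt(((n1-1)*s1**2 + (n2-1)*s2**2)/(n1+n2-2) * (1/n1 + 1/n2))
Calculate the t-value:
t = (x1 - x2) / SE
Calculate the p-value:
p_value = ttest_ind(species_a, species_b).pvalue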
If the p-value is less than α, reject the null hypothesis and conclude that there is sufficient evidence to support the alternative hypothesis. If the p-value is greater than or equal to α, fail to reject the null hypothesis and conclude that there is insufficient evidence to support the alternative hypothesis.
For example:
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
In this example, if the p-value is less than 0.05, we would reject the null hypothesis and conclude that there is a significant difference between the mean weight of the two turtle species.
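The standard error for two independent samples can be written in a pooled form or in an unpooled form, sqrt(s1²/n1 + s2²/n2). The two coincide when the sample sizes are equal, as they are here, but in general they differ. The sketch below compares them and also runs Welch's t-test, which SciPy exposes through the equal_var=False argument of ttest_ind() and which does not assume equal population variances. The names se_pooled, se_unpooled, student, and welch are illustrative.

```python
import numpy as np
from scipy import stats

species_a = [4.3, 3.9, 5.1, 4.6, 4.2, 4.8]
species_b = [4.9, 5.2, 5.5, 5.3, 5.0, 4.7]

n1, n2 = len(species_a), len(species_b)
v1 = np.var(species_a, ddof=1)  # sample variances
v2 = np.var(species_b, ddof=1)

# Pooled form (assumes equal population variances)
se_pooled = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2) * (1 / n1 + 1 / n2))

# Unpooled form (used by Welch's test)
se_unpooled = np.sqrt(v1 / n1 + v2 / n2)

student = stats.ttest_ind(species_a, species_b)                 # pooled variances
welch = stats.ttest_ind(species_a, species_b, equal_var=False)  # Welch's t-test

print(f"SE pooled = {se_pooled:.4f}, SE unpooled = {se_unpooled:.4f}")
print(f"Student: t = {student.statistic:.4f}, p = {student.pvalue:.4f}")
print(f"Welch:   t = {welch.statistic:.4f}, p = {welch.pvalue:.4f}")
```

With equal sample sizes the t-values match, and only the degrees of freedom (and therefore the p-value) can differ. When the sample sizes or spreads differ noticeably, Welch's version is often the safer choice.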
Paired Samples t-test in Python
The paired samples t-test is used to compare the means of two related samples. In this test, the null hypothesis is that the mean difference between the paired observations is zero, while the alternative hypothesis is that it is not zero.
Example research question: Is there a significant difference in the max vertical jump of basketball players before and after a training program?
Step-by-step guide:
- Define the null hypothesis (H0) and alternative hypothesis (Ha).
- Collect two related samples of data, such as the max vertical jump of basketball players before and after a training program.
- Calculate the differences between the paired observations, then their mean (d), sample standard deviation (s), and standard error (SE).
- Calculate the t-value using the formula t = (d - μ) / SE, where μ is the hypothesized population mean difference (usually zero).
- Calculate the p-value using a t-distribution table or a Python function like ttest_rel().
- Compare the p-value to the level of significance (α), typically set to 0.05.
The null hypothesis is that the mean difference in max vertical jump before and after the training program is zero. The alternative hypothesis is that it is not zero. For example:
H0: The mean difference in max vertical jump before and after the training program is zero.
Ha: The mean difference in max vertical jump before and after the training program is not zero.
Next, collect the two related samples. For example:
before_training = [58, 64, 62, 70, 68]
after_training = [62, 66, 64, 74, 70]
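Calculate the paired differences and their summary statistics, using ddof=1 for the sample standard deviation:
import numpy as np
from scipy.stats import ttest_rel
differences = [a - b for a, b in zip(after_training, before_training)]
d = np.mean(differences)
s = np.std(differences, ddof=1)
n = len(differences)
SE = s / np.sqrt(n)
Calculate the t-value:
t = d / SE
Calculate the p-value:
p_value = ttest_rel(after_training, before_training).pvalue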
If the p-value is less than α, reject the null hypothesis and conclude that there is sufficient evidence to support the alternative hypothesis.
If the p-value is greater than or equal to α, fail to reject the null hypothesis and conclude that there is insufficient evidence to support the alternative hypothesis. For example:
alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
In this example, if the p-value is less than 0.05, we would reject the null hypothesis and conclude that there is a significant difference in the max vertical jump of basketball players before and after the training program.
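One detail matters when computing the t-value by hand: np.std() defaults to ddof=0, the population standard deviation, whereas the t-test is built on the sample standard deviation (ddof=1). The sketch below shows the effect on the jump data, assuming NumPy and SciPy are installed; t_population_sd and t_sample_sd are illustrative names.

```python
import numpy as np
from scipy import stats

before_training = [58, 64, 62, 70, 68]
after_training = [62, 66, 64, 74, 70]

differences = np.array(after_training) - np.array(before_training)
n = len(differences)
d = differences.mean()

# NumPy default (ddof=0): divides by n, understating the spread of a sample
t_population_sd = d / (np.std(differences) / np.sqrt(n))

# Sample standard deviation (ddof=1): divides by n - 1, as the t-test requires
t_sample_sd = d / (np.std(differences, ddof=1) / np.sqrt(n))

result = stats.ttest_rel(after_training, before_training)

print(f"t with ddof=0: {t_population_sd:.4f}")
print(f"t with ddof=1: {t_sample_sd:.4f}")
print(f"ttest_rel:     {result.statistic:.4f}")
```

Only the ddof=1 version matches ttest_rel(). The ddof=0 version inflates the t-value, and the gap is largest for small samples like this one.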
Conclusion
Hypothesis testing is an essential tool in statistical analysis because it lets us draw conclusions about populations from limited data. The one sample, two sample, and paired samples t-tests are popular methods for comparing means and determining whether they are significantly different.
Python makes these tests accessible and convenient in practice. In this article, we have provided a step-by-step guide to performing them, so that researchers can carry out rigorous analyses that produce meaningful results.
The three tests covered here (the one sample t-test, the two sample t-test, and the paired samples t-test) can be applied to a wide range of research questions.
By setting null and alternative hypotheses, collecting data, calculating the mean and standard deviation, computing the t-value and its p-value, and comparing the p-value with the chosen significance level α, we can determine whether there is enough evidence to reject the null hypothesis. With these methods, scientists can reach better-informed conclusions about real-world problems and make critical decisions when needed.
Continued practice with Python's hypothesis testing tools helps researchers apply these methods with confidence.