Two Proportion Z Test: The Six Sigma Basics | Leanscape

The Two Proportion Z Test is a statistical method used to determine if there is a significant difference between the proportions of two groups. By comparing sample proportions, this test helps to ascertain whether observed differences could arise by chance or reflect actual disparities in the larger populations from which the samples are drawn. It relies on the Z statistic to evaluate the null hypothesis that there is no difference between the proportions. This tool is invaluable in fields such as marketing research, clinical trials, and social sciences, where understanding differences between groups can inform decision-making and strategies.

A Two Proportion Z Test, also known as a Two Proportions Test or Z-Test for Difference of Proportions, is a statistical method for checking whether the proportion of a given outcome differs between two groups. The test compares two independent groups, for example the proportion of people who support a particular policy in two different regions, or the proportion of males and females in a population who share some characteristic. It is a hypothesis test that uses the z-test statistic to determine whether the difference between the two proportions is statistically significant.

Requirements to Run a Two-Proportion Z Test

 

Before running a Two Proportion Z Test, you must meet the following requirements:

  • Independence of observations: The observations in each group must be independent of each other, meaning the outcome of one observation should not affect the outcome of another.

  • Random sampling: The samples must be randomly selected from the population, ideally as a simple random sample, to be representative and to minimize bias.

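  • Common proportion under the null hypothesis: The test does not require a separate equal-variance assumption. Under the null hypothesis the two groups share one proportion, and therefore one variance, which is why the pooled proportion is used to estimate the standard error.

  • Approximately normal sampling distribution: The difference between the two sample proportions must be approximately normally distributed, which holds when the sample sizes are large.

  • At least 10 successes and 10 failures in each group: Each group should contain at least 10 successes and at least 10 failures for the normal approximation to be reliable.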

These requirements are important for the Two Proportion Z Test to be valid and reliable.

Assumptions and Conditions

 

Before you run a Two Proportion Z Test, you must ensure that certain assumptions and conditions are met. These requirements are important for the test results to be valid and reliable.

First, the sampling distribution of the difference between the two sample proportions must be approximately normal. The underlying data are binary (success or failure), so they are not themselves normally distributed; the normal approximation holds only when the samples are large enough. As a rule of thumb, each sample should contain at least 10 successes and at least 10 failures.

Independence of observations is another requirement, meaning the outcome of one observation should not affect the outcome of another. Randomly selecting the samples from the population helps to achieve this independence and to minimize bias.

Also, unlike a z-test for means, the population standard deviation does not need to be known. The standard error is estimated from the pooled sample proportion, which is what allows us to calculate the test statistic and interpret the results.

In summary, to run a Two Proportion Z Test:

  • The samples must be independent and randomly selected from the population.

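  • Each sample must contain at least 10 successes and at least 10 failures.

  • The sampling distribution of the difference in proportions must be approximately normal, which the large-sample condition ensures.

  • Each observation must be a binary outcome (success or failure), so that the standard error can be estimated from the pooled proportion.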

By meeting these requirements you can proceed with the Two Proportion Z Test and be sure your results are valid and reliable.
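The large-sample requirement can be checked mechanically before running the test. The following is a minimal sketch that reads "values in every cell" as the counts of successes and failures in each group; the function name, the default threshold of 10, and the counts are illustrative assumptions, not part of any library.

```python
def meets_large_sample_condition(successes, n, minimum=10):
    """Return True if a group has at least `minimum` successes and failures.

    `minimum` is an assumed rule-of-thumb threshold and can be adjusted.
    """
    failures = n - successes
    return successes >= minimum and failures >= minimum


# Hypothetical counts: (successes, sample size) for each of the two groups.
groups = [(80, 100), (70, 100)]
ok = all(meets_large_sample_condition(x, n) for x, n in groups)
print(ok)  # True: both groups have at least 10 successes and 10 failures
```

Independence and random sampling cannot be verified from the counts alone; they depend on how the data were collected.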

How Two-Proportion Z Test Works

 

The 2 proportion z-test works by first calculating what is called the pooled proportion, an estimate of the single population proportion that both groups would share if the null hypothesis were true. This is done by adding the number of successes of both groups and dividing it by the total number of observations in both groups. For example, we might compare the proportion of late students in different classes or the proportion of residents who support a particular law in different counties.
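As a small illustration of the pooling step, the sketch below combines the successes and observations of two groups. The function name and the late-student counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def pooled_proportion(x1, n1, x2, n2):
    """Total successes in both groups divided by total observations."""
    return (x1 + x2) / (n1 + n2)


# Hypothetical data: 12 late students out of 40 in one class,
# 18 late students out of 50 in another.
p = pooled_proportion(12, 40, 18, 50)
print(round(p, 4))  # 30 / 90 = 0.3333
```

Note that the pooled proportion is not the simple average of the two sample proportions unless the sample sizes are equal.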

Now that we have the pooled proportion, we can calculate the standard error. The standard error tells us how much variation we should expect from our sample statistic; in this case, it tells us how much the difference between the two sample proportions is expected to vary from sample to sample if the null hypothesis is true.

Once we have both values, we can calculate the test statistic. The test statistic tells us how many standard errors the observed difference between the sample proportions is from the value expected under the null hypothesis, which is 0. Sometimes a continuity correction is applied to account for the discrete nature of the data and make the p-values more accurate.

The null hypothesis always states that there is no difference, while the alternative hypothesis can be directional or non-directional. If our alternative hypothesis is directional, we will look for the test statistic to fall into either the upper or lower tail depending on the direction we are testing for. For example, if we are testing if Group 1 has a higher success rate than Group 2, we will look for the test statistic to fall into the upper tail.

But if our alternative hypothesis is non-directional, we will look for the test statistic to fall into either extreme of the distribution. In other words, at the 5% significance level, we will look for the test statistic to be below -1.96 OR above 1.96.
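The critical values can be looked up in a z-table or computed from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_value` and its parameters are illustrative assumptions.

```python
from statistics import NormalDist


def critical_value(alpha=0.05, two_sided=True):
    """Positive critical z-value for significance level `alpha`.

    A two-sided test splits alpha across both tails; a one-sided test
    puts all of alpha in a single tail.
    """
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)


print(round(critical_value(0.05, two_sided=True), 2))   # 1.96
print(round(critical_value(0.05, two_sided=False), 2))  # 1.64
```

For a lower-tail directional test, the critical value is the negative of the one-sided value.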

The null hypothesis is a statement of no difference between the two groups. If the test statistic falls into the critical region, we reject the null hypothesis. This means we have enough evidence to say there is a significant difference between the two population proportions.

The test involves collecting sample data, stating the hypotheses, calculating the test statistic, calculating the p-value and drawing a conclusion based on the results.

The test statistic z is calculated using the formula z = (p1 – p2) / √[p(1-p)(1/n1 + 1/n2)], where p is the pooled proportion. The p-value is then obtained from the standard normal distribution, and we can draw a conclusion about the difference in proportions between the groups.

If the test statistic falls into the critical region (equivalently, if the p-value is below the significance level), we reject the null hypothesis; otherwise we fail to reject it.

Null Hypothesis


In a Two Proportion Z Test, the null hypothesis is a statement of no difference between the two population proportions. Mathematically it can be written as:

H0: p1 = p2

Here p1 and p2 are the population proportions of the two groups being compared. The null hypothesis is saying that any difference between the sample proportions is due to chance rather than a true difference in the population.

The alternative hypothesis states that there is a difference between the two population proportions:

H1: p1 ≠ p2

The alternative hypothesis can be directional or non-directional depending on the research question. A non-directional alternative hypothesis is simply p1 ≠ p2 while a directional alternative hypothesis is p1 > p2 or p1 < p2.

By stating the null and alternative hypothesis you set the stage for the hypothesis test and can decide whether the data is sufficient to reject the null hypothesis in favour of the alternative hypothesis.
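The choice of alternative hypothesis determines which tail, or tails, of the standard normal distribution the p-value is taken from. The sketch below assumes a z statistic has already been computed; the function name and the `alternative` labels are illustrative choices, not a fixed convention.

```python
from statistics import NormalDist


def p_value(z, alternative="two-sided"):
    """p-value of a z statistic under the standard normal distribution.

    alternative: "two-sided" (p1 != p2), "greater" (p1 > p2),
    or "less" (p1 < p2).
    """
    std_normal = NormalDist()
    if alternative == "greater":
        return 1 - std_normal.cdf(z)       # upper tail
    if alternative == "less":
        return std_normal.cdf(z)           # lower tail
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


# A z statistic of 1.96 sits right at the two-sided 5% boundary.
print(round(p_value(1.96, "two-sided"), 3))  # 0.05
print(round(p_value(1.96, "greater"), 3))    # 0.025
```

For the same z statistic, the two-sided p-value is twice the p-value of the one-sided test in the matching direction.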

Test Statistic


The test statistic for a Two Proportion Z Test is the key that determines the probability of observing the difference between the two sample proportions assuming the null hypothesis is true. The formula for the test statistic is:

z = (p1 – p2) / √[p(1-p)(1/n1 + 1/n2)]

In this formula:

  • p1 and p2 are the sample proportions of the two groups.

  • p is the pooled proportion which is a weighted mean of the two sample proportions.

  • n1 and n2 are the sample sizes of the two groups.

The pooled proportion (p) is the combination of the successes and total observations from both groups. It is a single proportion that represents the combined data of both samples.

Once the pooled proportion is calculated, the standard error is calculated using the formula √[p(1-p)(1/n1 + 1/n2)]. This standard error is the expected variation in the difference between the sample proportions.

The test statistic (z) is then calculated by dividing the difference between the sample proportions (p1 – p2) by the standard error. This z-value is how many standard deviations the observed difference is from the expected difference under the null hypothesis.

By comparing the test statistic to the critical values from the standard normal distribution you can decide whether to reject the null hypothesis and conclude that there is a significant difference between the two population proportions.
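The formula above translates directly into a few lines of code. This is a minimal sketch with no continuity correction; the function name and the input counts are hypothetical.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z = (p1 - p2) / sqrt(p(1-p)(1/n1 + 1/n2)) with pooled proportion p."""
    p1 = x1 / n1
    p2 = x2 / n2
    p = (x1 + x2) / (n1 + n2)                   # pooled proportion
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # standard error
    if se == 0:
        # Happens when every observation is a success or every one a failure.
        raise ValueError("standard error is zero; the test is undefined")
    return (p1 - p2) / se


# Hypothetical counts: 45 successes out of 150 versus 30 out of 150.
z = two_proportion_z(45, 150, 30, 150)
print(round(z, 2))  # 2.0
```

Here p1 = 0.3, p2 = 0.2 and p = 0.25, so the standard error is √0.0025 = 0.05 and the observed difference of 0.1 lies two standard errors from zero.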

Interpreting the Results of a Z Test


The results of a Two Proportion Z Test are interpreted by comparing the calculated z-score with the critical value from the z-table. For a two-sided test, if the absolute value of the z-score is greater than the critical value we reject the null hypothesis; otherwise we fail to reject it. For a one-sided test, we compare the z-score with the critical value in the tail that matches the direction of the alternative hypothesis.

The null hypothesis in this case is a statement that there is no difference between the two groups. By comparing the z-score to the critical value we can decide if the observed difference in proportions is due to random chance or if it is a true difference in the population.

Example of a 2 Proportion Test


Let’s do an example to illustrate how to do a Two Proportion Z Test. Suppose we want to compare the proportion of students who pass a certain exam in two different schools. We randomly select 100 students from each school and find that 80 students from School A pass the exam while 70 students from School B pass the exam.

First we calculate the sample proportions:

  • p1 = 80/100 = 0.8

  • p2 = 70/100 = 0.7

Next we calculate the pooled proportion:

  • p = (80 + 70) / (100 + 100) = 150/200 = 0.75

Then we calculate the standard error:

  • SE = √[0.75 * (1-0.75) * (1/100 + 1/100)] = √[0.75 * 0.25 * 0.02] = √0.00375 ≈ 0.0612

Now we calculate the test statistic:

  • z = (0.8 – 0.7) / 0.0612 ≈ 1.63

Using the z-table we find that the probability of observing a z-score greater than 1.63 is approximately 0.051. Because we are testing for a difference in either direction, the two-sided p-value is twice that, approximately 0.10. Since this p-value is greater than our chosen significance level (e.g. 0.05) we fail to reject the null hypothesis. This means we don’t have enough evidence to conclude that there is a significant difference between the proportions of students who pass the exam in the two schools.

This is an example of how to do a Two Proportion Z Test from start to finish.
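The same calculation can be reproduced in code as a check on the arithmetic. The sketch below uses only the Python standard library and the counts from the school example; the variable names are illustrative, and it reports both the upper-tail probability and the two-sided p-value.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the example: passes and sample size for each school.
x1, n1 = 80, 100  # School A
x2, n2 = 70, 100  # School B

p1 = x1 / n1
p2 = x2 / n2
pooled = (x1 + x2) / (n1 + n2)
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

upper_tail = 1 - NormalDist().cdf(z)  # P(Z > z)
two_sided = 2 * upper_tail            # valid here because z is positive

print(round(pooled, 2))  # 0.75
print(round(se, 4))      # 0.0612
print(round(z, 2))       # 1.63
print(round(upper_tail, 3), round(two_sided, 3))
```

Both probabilities are above 0.05, so the null hypothesis is not rejected at the 5% level under either reading of the test.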

Two Proportion Z Test Applications


The Two Proportion Z Test has applications in many fields:

  • Medicine: To compare the proportion of patients who respond to a treatment in two different groups. For example to see if a new drug is more effective than the standard treatment.

  • Social sciences: To compare the proportion of people who support a policy in two different areas. This will help policy makers understand regional differences in public opinion.

  • Business: To compare the proportion of customers who like a product in two different markets. Companies can use this to tailor their marketing.

  • Education: To compare the proportion of students who pass an exam in two different schools. This will help educators to identify the gaps and address them.

In each of these applications the Two Proportion Z Test is used to see if there is a statistically significant difference between two independent groups and to make decisions.

Common Errors to Watch Out For


When doing a Two Proportion Z Test you should be aware of the common errors that can affect the results. Here are some to watch out for:

  1. Not Checking Assumptions: Check that the large-sample condition (so that the normal approximation applies), independence and random sampling are met. If not the results will be invalid.

  2. Using the wrong formula: Use the correct formula for the test statistic. Miscalculations will lead to wrong conclusions.

  3. Incorrect Pooled Proportion: Make sure to combine the successes and total observations from both groups correctly. Errors here will affect the test statistic.

  4. Misinterpreting the results: Be careful when interpreting the p-value. A p-value greater than the significance level means you fail to reject the null hypothesis not that the null hypothesis is true.

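  5. Ignoring other explanations: Consider other factors that might explain the differences, like confounding variables or differences in how the two samples were collected. Keep in mind that sample size also affects the result: with very large samples even a tiny difference can be statistically significant.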

By avoiding these common errors you will do the Two Proportion Z Test correctly and get interpretable results.

Conclusion:


The two-proportion z-test is a useful statistical tool to compare two proportions and see if there is a significant difference between them. It can be used in many situations to help researchers see whether one proportion is significantly higher than the other, determine the statistical significance of the difference between the two groups, and make decisions.
