Chi-Square Goodness Of Fit Test ~ Formula & Examples

The chi-square goodness of fit test determines if the observed proportions drawn from a random sample follow the suggested theory. Within statistics, analysts use the chi-square goodness of fit test to determine the proportions to use in the test and if the outcomes follow a probability distribution. Discover how to use the hypothesis test, the important formulas and examples of applying the chi-square goodness of fit test.

Index

Inhaltsverzeichnis

1 Chi-Square Goodness of Fit Test – In a Nutshell
2 Definition: Chi-square goodness of fit test
3 When to use the chi-square goodness of fit test
4 Chi-square goodness of fit test: Hypotheses
5 How to calculate the chi-square goodness of fit test
6 FAQs

Chi-Square Goodness of Fit Test – In a Nutshell

The chi-square goodness of fit test determines if a variable comes from a distribution.
The test evaluates if sample data represents the entire population.
For the chi-square goodness of fit test, you need a hypothesis and one variable.
Applying the test to a data set requires categorical variables or nominal data.

Definition: Chi-square goodness of fit test

Chi-square goodness of fit test is a Pearson’s test that determines whether the observed distribution differs from the expectations. Since it is a statistical hypothesis test, you can use it to determine if the data groupings represent the full population. For instance, if the chi-square goodness of fit test is low, the expectations don’t match the observed values. When the chi-square goodness of fit test is high, the hypothesis is close to the observed values.

When to use the chi-square goodness of fit test

Most statistical tests are based on distributional assumptions. Unlike Anderson-Darling and Kolmogorov-Smirnov tests, which are restricted to continuous distributions, the chi-square goodness of fit test can be applied to discrete distributions like Poisson and binomials.

The test can evaluate how well the theoretical distribution matches the empirical distribution. You use it to check if your sample data matches your specific theoretical distribution. For instance, if you have a set of data values and an idea of how they are distributed, you use the chi-square goodness of fit test to determine if your hypothesis is true.

When performing a chi-square goodness of fit test, you must fulfil the following conditions:

You can only test categorical variables.
If the variable is continuous, convert it to a categorical variable.
The sample is randomly selected from the population.
The distribution should provide a minimum of five observations.

Chi-square goodness of fit test: Hypotheses

Since the chi-square goodness of fit test evaluates if a sample was drawn from the population, you need a hypothesis to test. A hypothesis allows you to draw conclusions about the population distribution and whether the goodness of fit is high enough to conclude that your hypothesis is right. A chi-square goodness of fit test has two types of hypotheses:

Null hypothesis(H₀): The hypothesis observes that the difference between the expected and observed values is insignificant.

Example

The population follows the categorical grouping or specified distribution.

Alternative hypothesis(H_a): The hypothesis reveals a significant difference between the expected and observed values.

Example

The population does not follow the categorical grouping or specified distribution.

Note: You can specify your hypothesis by describing the sample distribution or providing the proportions for the categorical groupings.

How to calculate the chi-square goodness of fit test

In the chi-square goodness of fit test, the distribution for the hypothesis is chi-square (X2). The formula for the goodness of fit test is as follows:

where

Σ = Is the summation symbol
O = Observed value
E = Expected value

If the p-value representing the X2 test statistic with n-1 degrees of freedom3 is lower than your significance level, the alternative hypothesis is true, and you can reject the null hypothesis. The degree of freedom you use depends on the distribution of the sample. For instance, if it is binomial distribution, the degree of freedom is n-1. Poisson distribution has a degree of freedom of n-2 while normal distribution is n-3.

In the example below, the shop owner projects that an equal number of customers visit the shop. From observation, the researcher found the actual number of customers in a given week. The information can be used to calculate the chi-square goodness of fit test.

Example

Step 1: Actual vs. Expected

Days of the week	Number of customers(O)	Expected number of customers(E)
Monday	40	50
Tuesday	50	50
Wednesday	60	50
Thursday	45	50
Friday	55	50

Example

Step 2: Observed vs. expected

Days of the week	Number of customers(O)	Expected number of customers(E)	Observed – Expected(0-E)
Monday	40	50	-10
Tuesday	50	50	0
Wednesday	60	50	10
Thursday	45	50	-5
Firday	55	50	5

Example

Step 3: Squared difference between observed vs. expected

Days of the week	Number of customers(O)	Expected number of customers(E)	Observed-Expected(0-E)	(0-E)²
Monday	40	50	-10	100
Tuesday	50	50	0	0
Wednesday	60	50	10	100
Thursday	45	50	-5	25
Friday	55	50	5	25

Example

Step 4: Calculation of the squared difference

Days of the week	Number of customers(O)	Expected number of customers(E)	Observed-Expected(O-E)	(O-E)²
Monday	40	50	-10	100	2
Tuesday	50	50	0	0	0
Wednesday	60	50	10	100	2
Thursday	45	50	-5	25	0.5
Friday	55	50	5	25	0.5

The chi-square statistic doesn’t tell you much without interpretation. For instance, you need to compare the chi-square value with the appropriate distribution to determine whether to reject the null hypothesis. For example, you need to determine the degree of freedom and significance level to get the critical value. From the critical value, you can reject or accept the null hypothesis. For example, if the chi-square figure is lower than the critical value, the observed values are not different from the expected values.

Printing Your Thesis With BachelorPrint

High-quality bindings with customizable embossing
3D live preview to check your work before ordering
Free express delivery

Configure your binding now!

to printing services

FAQs

How to apply the Chi-square goodness of fit test?

The chi-square goodness of fit test determines if your hypothesis is true. The test is applied to categorical variables from the population, and you want to determine whether the data is consistent with the hypothesized distribution. When applying the test, start by stating the null and alternative hypotheses. You will formulate an analysis plan, sample the data and interpret the results to reject or accept the hypothesis.

What conditions are necessary for performing the chi-square goodness of fit test?

You can perform a chi-square goodness of fit test if the distribution is a categorical variable. If the variable is continuous, convert it to a categorical variable by dividing the observations into intervals for easy interpretation. The sample should also be randomly selected from the population. Your sample should have a minimum of five observations expected.

What happens when you have two categorical variables?

Use the chi-square goodness of fit test with one categorical variable and would like to test the hypothesis of the distribution. If you have two categorical variables, you can use the chi-square test of independence. You also use the test to determine the hypothesis about the relationship. In both tests, you need to formulate the null and alternative hypotheses and test before accepting or rejecting the hypotheses.

Category

Your Steps to Success

Chi-Square Goodness Of Fit Test – Formula & Examples

How do you like this article? Cancel reply

Chi-Square Goodness of Fit Test – In a Nutshell

Definition: Chi-square goodness of fit test

When to use the chi-square goodness of fit test

Chi-square goodness of fit test: Hypotheses

How to calculate the chi-square goodness of fit test

FAQs

How to apply the Chi-square goodness of fit test?

What conditions are necessary for performing the chi-square goodness of fit test?

What happens when you have two categorical variables?

How do you like this article? Cancel reply