Description of analysis of variance

Aseen Saxena
3 min read · Dec 8, 2020

Analysis of variance (ANOVA) is mainly used to test statistical hypotheses or to estimate the components of variance attributable to different factors. More broadly, the goal of statistical analysis is to identify trends. For example, a retail business might use statistical analysis to find patterns in unstructured and semi-structured customer data that can be used to create a more positive customer experience and increase sales. ANOVA is an analytical tool that splits the observed aggregate variability in a data set into two parts:

1. Systematic factors: these have a statistical influence on the given data set.

2. Random factors: these do not have a statistical influence on the given data set.
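To make the split concrete, here is a minimal sketch in Python. The three groups and their scores are made-up numbers, and the variable names (`ss_total`, `ss_between`, `ss_within`) are chosen for illustration only. It shows that the total sum of squares equals the part explained by group membership (systematic) plus the part left inside the groups (random).

```python
from statistics import mean

# Hypothetical scores for three groups (made-up numbers for illustration).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0],
    "C": [8.0, 9.0, 10.0],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Total variability: squared distance of every observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Systematic part: squared distance of each group mean from the grand mean,
# weighted by the number of observations in the group.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

# Random part: squared distance of each observation from its own group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

print(ss_total, ss_between, ss_within)  # 30.0 24.0 6.0
```

In this toy data set most of the variability (24 of 30) comes from differences between the group means, and only a small part (6 of 30) is scatter within the groups.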

ANOVA is used to determine the influence that independent variables have on the dependent variable in a regression study. It is also known as the Fisher analysis of variance. The t-test and z-test methods, developed in the 20th century, were used for statistical analysis until 1918, when Ronald Fisher created the analysis of variance method. ANOVA extends the t-test and z-test to comparisons of more than two groups.

The formula of ANOVA is:

F = MST / MSE

where:

F = ANOVA coefficient

MST = Mean sum of squares due to treatment

MSE = Mean sum of squares due to error
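As a rough sketch of how these quantities are computed for a one-way design, the Python snippet below builds MST and MSE from raw group data and divides them. The helper name `f_ratio` and the sample values are hypothetical. The degrees of freedom used here (k − 1 for treatment and n − k for error, with k groups and n observations in total) are the standard ones for one-way ANOVA and are not spelled out in the text above.

```python
from statistics import mean

def f_ratio(groups):
    """Return (F, MST, MSE) for a one-way layout given a list of samples."""
    k = len(groups)                      # number of groups (treatments)
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Sum of squares due to treatment (between groups).
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Sum of squares due to error (within groups).
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)

    mst = sst / (k - 1)                  # mean sum of squares due to treatment
    mse = sse / (n - k)                  # mean sum of squares due to error
    return mst / mse, mst, mse

# Made-up sample data: three groups of three observations each.
samples = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
f_value, mst, mse = f_ratio(samples)
print(f_value, mst, mse)  # 12.0 12.0 1.0
```

A large F means the variation between group means is large relative to the variation within groups.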

There are two types of ANOVA:

1. One-way: it has one independent variable.

2. Two-way: it has two independent variables.

The null hypothesis for an ANOVA is that there is no difference among the group means. The alternative hypothesis is that at least one group mean differs from the others. After cleaning the data, the researcher must test the assumptions of ANOVA and then calculate the F-ratio and the associated probability value (p-value). In general, if the p-value associated with F is smaller than 0.05, the null hypothesis is rejected and the alternative hypothesis is supported. Rejecting the null hypothesis means concluding that the group means are not all equal; it does not say which groups differ. Post-hoc tests tell the researcher which groups are different from each other.
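The decision rule can be illustrated with a small Python sketch. Note the swap: instead of looking up the p-value from the F distribution, as a standard ANOVA does, this example approximates it with a permutation test, so it needs only the standard library. The function names (`f_ratio`, `permutation_p_value`), the sample data, the number of permutations, and the seed are all assumptions made for illustration.

```python
import random
from statistics import mean

def f_ratio(groups):
    """F = MST / MSE for a one-way layout given a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    mst, mse = sst / (k - 1), sse / (n - k)
    # Guard against a shuffle with no within-group scatter at all.
    return float("inf") if mse == 0 else mst / mse

def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Share of random regroupings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_ratio(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)              # break any link between value and group
        shuffled, start = [], 0
        for size in sizes:               # re-cut the pool into groups of the original sizes
            shuffled.append(pooled[start:start + size])
            start += size
        if f_ratio(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)

# Made-up sample data: three groups of three observations each.
samples = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
p_value = permutation_p_value(samples)

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: the group means are not all equal.")
else:
    print("Fail to reject the null hypothesis.")
```

Under the null hypothesis the group labels carry no information, so shuffling them should produce F values as large as the observed one fairly often. When that rarely happens, the p-value is small and the null hypothesis is rejected.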

Why Do We Use ANOVA?

Researchers use analysis of variance to test causal relationships in controlled experiments. In a controlled experiment, an experimenter manipulates an independent variable (a potential cause) and measures the effect on a dependent variable.

The goal of the experiment is to determine whether the independent variable has a causal effect on the dependent variable. Analysis of variance provides objective decision rules for determining whether observed differences (in mean scores) between groups are attributable to random chance or to the independent variable(s) manipulated by the experimenter.

When Can ANOVA Be Used?

Because analysis of variance compares mean scores between groups, it can be used with a compatible experimental design when the dependent variable in an experiment is measured on an interval scale or a ratio scale.

Analysis of variance would not be the right technique when the dependent variable in an experiment is measured on an ordinal scale or a categorical scale, because you cannot compute a mean score from ordinal or categorical data.
