

Measures of Relative Dispersion and Moments




Coefficient of Variation: Definition and Calculation


Need for Relative Dispersion

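Absolute measures of dispersion, such as the range, mean deviation, and standard deviation ($\sigma$), express the spread of data in the original units of measurement; the variance ($\sigma^2$) expresses it in squared units. While these measures are useful for understanding the variability within a single dataset, they pose challenges when comparing the variability of two or more datasets:

  • Different units: a standard deviation measured in cm cannot be compared directly with one measured in kg.
  • Different average magnitudes: the same absolute spread is relatively large for a dataset with a small mean and relatively small for a dataset with a large mean.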

To overcome these limitations and allow for a meaningful comparison of variability between datasets with different units or different average magnitudes, we use relative measures of dispersion. These measures express dispersion as a ratio or percentage, which is unitless and relates the measure of spread to a measure of central tendency.


Definition of Coefficient of Variation

The most widely used relative measure of dispersion is the Coefficient of Variation (CV). It is defined as the ratio of the standard deviation ($\sigma$) to the arithmetic mean ($\bar{x}$), usually expressed as a percentage. It essentially measures the standard deviation relative to the mean.

The Coefficient of Variation is a standardized measure of dispersion that allows us to compare the degree of variability between datasets that may have different means or units of measurement.


Calculation

The formula for the Coefficient of Variation (CV) is:

CV $= \frac{\sigma}{\bar{x}} \times 100\%$

... (1)

Where:

  • $\sigma$ is the standard deviation of the data.
  • $\bar{x}$ is the arithmetic mean of the data.

The CV is undefined when the mean ($\bar{x}$) is zero, and it becomes unstable when the mean is close to zero, because a small change in the mean then produces a large change in the ratio. The CV is typically used for data measured on a ratio scale (where zero indicates the complete absence of the quantity) and where the mean is positive. If the mean is negative, the ratio itself is negative and can no longer be read directly as a relative spread.

Steps to Calculate Coefficient of Variation:

  1. Calculate the arithmetic mean ($\bar{x}$) of the data using the appropriate method (ungrouped or grouped data).
  2. Calculate the standard deviation ($\sigma$) of the data using the appropriate method (ungrouped or grouped data), taking the positive square root of the variance.
  3. Divide the standard deviation ($\sigma$) by the mean ($\bar{x}$).
  4. Multiply the resulting ratio by 100 to express the Coefficient of Variation as a percentage.
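The four steps above can be sketched in Python as follows. This is a minimal illustration rather than a prescribed implementation: the function name `coefficient_of_variation` and the sample heights are hypothetical, and the population standard deviation (divisor $n$) is assumed.

```python
import math

def coefficient_of_variation(data):
    """Return the CV of `data` as a percentage (population standard deviation assumed)."""
    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")
    mean = sum(data) / n                                  # Step 1: arithmetic mean
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    variance = sum((x - mean) ** 2 for x in data) / n
    sd = math.sqrt(variance)                              # Step 2: positive square root of the variance
    return sd / mean * 100                                # Steps 3 and 4: ratio, then percentage

# Hypothetical sample of five heights in cm (not taken from the text)
heights = [152, 156, 160, 164, 168]
cv_heights = coefficient_of_variation(heights)
print(f"CV of heights: {cv_heights:.2f}%")
```

If the sample standard deviation (divisor $n - 1$) is wanted instead, only the variance line needs to change.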

Example

Example 1. The mean and standard deviation of the heights of a group of students are 160 cm and 8 cm, respectively. The mean and standard deviation of their weights are 55 kg and 5.5 kg, respectively. Calculate the coefficient of variation for both heights and weights.

Answer:

Given:

  • For Heights: Mean ($\bar{x}_H$) = 160 cm, Standard Deviation ($\sigma_H$) = 8 cm.
  • For Weights: Mean ($\bar{x}_W$) = 55 kg, Standard Deviation ($\sigma_W$) = 5.5 kg.

To Calculate: Coefficient of Variation (CV) for heights and weights.

Solution:

Using the formula CV $= \frac{\sigma}{\bar{x}} \times 100\%$:

For Heights:

CV (Height) $= \frac{\sigma_H}{\bar{x}_H} \times 100\%$

... (i)

CV (Height) $= \frac{8 \text{ cm}}{160 \text{ cm}} \times 100\%$

CV (Height) $= \frac{\cancel{8}^{1}}{\cancel{160}_{20}} \times 100\%$

(Cancelling the fraction)

CV (Height) $= \frac{1}{20} \times 100\%$

CV (Height) $= 0.05 \times 100\%$

CV (Height) $= 5\%$

... (ii)

For Weights:

CV (Weight) $= \frac{\sigma_W}{\bar{x}_W} \times 100\%$

... (iii)

CV (Weight) $= \frac{5.5 \text{ kg}}{55 \text{ kg}} \times 100\%$

CV (Weight) $= \frac{5.5}{55} \times 100\%$

CV (Weight) $= \frac{55/10}{55} \times 100\% = \frac{55}{10 \times 55} \times 100\%$

CV (Weight) $= \frac{1}{10} \times 100\%$

CV (Weight) $= 0.10 \times 100\%$

CV (Weight) $= 10\%$

... (iv)

The Coefficient of Variation for height is 5%, and for weight is 10%.



Comparing Variability using Coefficient of Variation


Principle of Comparison

The primary use of the Coefficient of Variation (CV) is to provide a standardized measure of dispersion that can be used to compare the relative variability of different datasets. Since the CV is a ratio of two quantities measured in the same units ($\sigma / \bar{x}$), the units cancel and the result is a unitless number, typically expressed as a percentage. This makes it possible to compare variability even when the datasets have different units of measurement (like height in cm and weight in kg) or when they have vastly different average values (like salaries of different professions).

Therefore, to compare the variability or consistency of two or more datasets, we calculate the CV for each dataset and compare these CV values. The dataset with the lowest CV is considered the most consistent or least variable in relative terms.
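The comparison rule can be sketched in Python when only the mean and standard deviation of each dataset are known. The helper names `cv_percent` and `most_consistent` are hypothetical; the figures are the height and weight summary statistics from the worked example in the previous section.

```python
def cv_percent(mean, sd):
    """CV as a percentage from summary statistics; a positive mean is assumed."""
    if mean <= 0:
        raise ValueError("mean must be positive for the CV to be meaningful")
    return sd / mean * 100

def most_consistent(summaries):
    """Return (name, CV) for the dataset with the lowest CV.

    `summaries` maps a dataset name to a (mean, standard deviation) pair.
    """
    cvs = {name: cv_percent(mean, sd) for name, (mean, sd) in summaries.items()}
    best = min(cvs, key=cvs.get)
    return best, cvs[best]

# Heights in cm and weights in kg: different units, yet the CVs are comparable
summaries = {"height": (160, 8), "weight": (55, 5.5)}
name, cv = most_consistent(summaries)
print(f"Most consistent: {name} (CV = {cv:.1f}%)")
```

Ties are resolved in favour of the first dataset listed, which is a side effect of `min` and not a statistical rule.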


Applications of CV for Comparison

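The Coefficient of Variation is used wherever relative variability has to be compared: between measurements in different units (such as height and weight), between groups with very different averages (such as salaries of different professions), and between production processes (such as the consistency of bulb lifetimes from two factories).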


Example

Example 1. Using the results from Example 1 in the previous section (Coefficient of Variation: Definition and Calculation), which measurement (height or weight) shows greater variability relative to its mean for the group of students?

Answer:

Given: Coefficient of Variation for Height = 5%, Coefficient of Variation for Weight = 10% (from Example 1 of the previous section).

To Compare: Relative variability of height vs. weight.

Solution:

To compare relative variability, we compare the Coefficients of Variation:

  • CV (Height) = 5%
  • CV (Weight) = 10%

Since $10\% > 5\%$, the Coefficient of Variation for weight is higher than that for height.

A higher CV indicates greater variability relative to the mean.

Therefore, weight shows greater variability relative to its mean than height does for this group of students.

Interpretation:

Although the standard deviation of height (8 cm) is numerically larger than the standard deviation of weight (5.5 kg), the standard deviation of weight is 10% of the average weight, whereas the standard deviation of height is only 5% of the average height. The weights of students in this group are therefore relatively more spread out around their average than their heights are.


Example 2. Two factories, A and B, produce electric bulbs. A sample of bulbs from Factory A has a mean lifetime of 2000 hours and a standard deviation of 200 hours. A sample of bulbs from Factory B has a mean lifetime of 1800 hours and a standard deviation of 144 hours. Which factory produces bulbs with greater consistency in lifetime?

Answer:

Given:

  • Factory A: $\bar{x}_A = 2000$ hours, $\sigma_A = 200$ hours.
  • Factory B: $\bar{x}_B = 1800$ hours, $\sigma_B = 144$ hours.

To Determine: Which factory produces bulbs with greater consistency (less relative variability).

Solution:

To compare consistency, we calculate the Coefficient of Variation (CV) for each factory.

Using the formula CV $= \frac{\sigma}{\bar{x}} \times 100\%$:

For Factory A:

CV$_A = \frac{\sigma_A}{\bar{x}_A} \times 100\%$

... (i)

CV$_A = \frac{200 \text{ hours}}{2000 \text{ hours}} \times 100\%$

CV$_A = \frac{\cancel{200}^{1}}{\cancel{2000}_{10}} \times 100\%$

CV$_A = \frac{1}{10} \times 100\%$

CV$_A = 10\%$

... (ii)

For Factory B:

CV$_B = \frac{\sigma_B}{\bar{x}_B} \times 100\%$

... (iii)

CV$_B = \frac{144 \text{ hours}}{1800 \text{ hours}} \times 100\%$

CV$_B = \frac{\cancel{144}^{1}}{\cancel{1800}_{12.5}} \times 100\%$

($1800 \div 144 = 12.5$)

CV$_B = \frac{1}{12.5} \times 100\%$

CV$_B = 0.08 \times 100\%$

($1/12.5 = 1/(25/2) = 2/25 = 0.08$)

CV$_B = 8\%$

... (iv)

Comparison:

CV$_A = 10\%$ and CV$_B = 8\%$.

Since $8\% < 10\%$, Factory B has a lower Coefficient of Variation than Factory A.

A lower CV indicates greater consistency (less relative variability).

Therefore, Factory B produces bulbs with greater consistency in lifetime compared to Factory A.



Moments (Raw and Central - Implicit Introduction)


Beyond Mean and Variance: Describing Shape

While measures of central tendency (like mean, median, mode) describe where the data is centered, and measures of dispersion (like variance, standard deviation, range) describe how spread out the data is, these two types of measures alone do not fully characterize a frequency distribution. Distributions can have the same mean and variance but differ in their shape – specifically, their asymmetry (skewness) and peakedness (kurtosis).

To describe these higher-order characteristics of a distribution's shape, statisticians use quantities called moments. Moments provide a more complete set of summary statistics that can describe various features of a probability distribution or a dataset.
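A small numerical sketch may help show why mean and variance alone are not enough. The two datasets below are made up for illustration and were chosen so that both have mean 10 and variance 2; the helper `mean_power_deviation` is a hypothetical name for the mean of the $k^{\text{th}}$ powers of the deviations from the mean (the central moment defined later in this section).

```python
def mean(data):
    """Arithmetic mean of the observations."""
    return sum(data) / len(data)

def mean_power_deviation(data, k):
    """Arithmetic mean of the k-th powers of the deviations from the mean."""
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)

# Two made-up datasets with the same mean and the same variance
symmetric = [8, 9, 9, 11, 11, 12]
skewed = [8, 8, 11, 11, 11, 11]

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    print(
        name,
        "mean =", mean(data),
        "variance =", mean_power_deviation(data, 2),
        "mean cubed deviation =", mean_power_deviation(data, 3),
    )
```

Both datasets report the same mean and variance, but the mean cubed deviation is zero for the first and negative for the second, which is the kind of asymmetry that the first two summary measures cannot detect.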


Raw Moments (Moments About the Origin)

The $k^{\text{th}}$ raw moment of a dataset is defined as the arithmetic mean of the $k^{\text{th}}$ powers of the observations. It is also called the moment about the origin because it is calculated relative to zero.

For a dataset of $n$ individual observations $x_1, x_2, \dots, x_n$, the $k^{\text{th}}$ raw moment, denoted by $m'_k$ or $\mu'_k$, is given by:

$m'_k = \frac{\sum\limits_{i=1}^{n} x_i^k}{n}$

... (1)

For a frequency distribution with distinct values or class marks $x_1, x_2, \dots, x_m$ and frequencies $f_1, f_2, \dots, f_m$, and total frequency $N = \sum f_i$, the $k^{\text{th}}$ raw moment is:

$m'_k = \frac{\sum\limits_{i=1}^{m} f_i x_i^k}{N}$

... (2)
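Formulas (1) and (2) can be sketched in Python as below. The function names and the small dataset are hypothetical; the same six observations are written once as a raw list and once as a frequency distribution, so both functions should agree.

```python
def raw_moment(data, k):
    """k-th raw moment of ungrouped data: the mean of x_i ** k (formula 1)."""
    return sum(x ** k for x in data) / len(data)

def raw_moment_grouped(values, freqs, k):
    """k-th raw moment of a frequency distribution (formula 2).

    `values` holds the distinct values or class marks, `freqs` their frequencies.
    """
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total = sum(freqs)  # N, the total frequency
    return sum(f * x ** k for x, f in zip(values, freqs)) / total

# Hypothetical data: the value 1 occurs once, 2 twice, 3 three times
data = [1, 2, 2, 3, 3, 3]
values, freqs = [1, 2, 3], [1, 2, 3]

for k in (1, 2, 3):
    print(k, raw_moment(data, k), raw_moment_grouped(values, freqs, k))
```

For class intervals, the class marks (midpoints) stand in for the individual observations, so the grouped result is then an approximation of the ungrouped one.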

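Special Cases:

  • $k = 0$: $m'_0 = 1$.
  • $k = 1$: $m'_1 = \bar{x}$, the arithmetic mean.
  • $k = 2$: $m'_2 = \frac{\sum\limits_{i=1}^{n} x_i^2}{n}$, the mean of the squared observations.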


Central Moments (Moments About the Mean)

The $k^{\text{th}}$ central moment of a dataset is defined as the arithmetic mean of the $k^{\text{th}}$ powers of the deviations of the observations from the mean ($\bar{x}$). Central moments are more informative about the shape of the distribution relative to its center.

For a dataset of $n$ individual observations $x_1, x_2, \dots, x_n$ with mean $\bar{x}$, the $k^{\text{th}}$ central moment, denoted by $m_k$ or $\mu_k$, is given by:

$m_k = \frac{\sum\limits_{i=1}^{n} (x_i - \bar{x})^k}{n}$

... (3)

For a frequency distribution with distinct values or class marks $x_i$ and frequencies $f_i$, and total frequency $N = \sum f_i$ and mean $\bar{x}$, the $k^{\text{th}}$ central moment is:

$m_k = \frac{\sum\limits_{i=1}^{m} f_i (x_i - \bar{x})^k}{N}$

... (4)
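Formulas (3) and (4) can be sketched in the same way. The function names and data are again hypothetical; the dataset was chosen to have mean 5 so that the deviations are whole numbers.

```python
def central_moment(data, k):
    """k-th central moment of ungrouped data (formula 3)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n

def central_moment_grouped(values, freqs, k):
    """k-th central moment of a frequency distribution (formula 4)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total = sum(freqs)  # N, the total frequency
    mean = sum(f * x for x, f in zip(values, freqs)) / total
    return sum(f * (x - mean) ** k for x, f in zip(values, freqs)) / total

# Hypothetical data with mean 5, written as a list and as a frequency table
data = [2, 4, 4, 4, 5, 5, 7, 9]
values, freqs = [2, 4, 5, 7, 9], [1, 3, 2, 1, 1]

for k in (1, 2, 3):
    print(k, central_moment(data, k), central_moment_grouped(values, freqs, k))
```

For this dataset the first central moment is 0 and the second is 4, which is the variance computed with divisor $n$.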

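Special Cases:

  • $k = 0$: $m_0 = 1$.
  • $k = 1$: $m_1 = 0$, because the deviations from the mean always sum to zero.
  • $k = 2$: $m_2 = \sigma^2$, the variance.
  • $k = 3$ and $k = 4$: $m_3$ and $m_4$ are the basis for measuring skewness and kurtosis, respectively.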

Relationship between Raw and Central Moments:

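Central moments can be expressed in terms of raw moments by expanding $(x_i - \bar{x})^k$ with the binomial theorem and substituting $\bar{x} = m'_1$. Some common relationships are:

$m_1 = 0$

$m_2 = m'_2 - (m'_1)^2$

$m_3 = m'_3 - 3 m'_2 m'_1 + 2 (m'_1)^3$

$m_4 = m'_4 - 4 m'_3 m'_1 + 6 m'_2 (m'_1)^2 - 3 (m'_1)^4$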
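The standard expressions for $m_2$, $m_3$ and $m_4$ in terms of raw moments (written out in the code comments) can be checked numerically. The sketch below uses a hypothetical dataset and hypothetical function names; it computes each central moment directly from its definition and again from the raw moments, so the two columns of output can be compared.

```python
def raw_moment(data, k):
    """k-th raw moment: the mean of x_i ** k."""
    return sum(x ** k for x in data) / len(data)

def central_moment(data, k):
    """k-th central moment: the mean of (x_i - mean) ** k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
r1, r2, r3, r4 = (raw_moment(data, k) for k in (1, 2, 3, 4))

from_raw = {
    2: r2 - r1 ** 2,                                         # m2 = m'2 - (m'1)^2
    3: r3 - 3 * r2 * r1 + 2 * r1 ** 3,                       # m3 = m'3 - 3 m'2 m'1 + 2 (m'1)^3
    4: r4 - 4 * r3 * r1 + 6 * r2 * r1 ** 2 - 3 * r1 ** 4,    # m4 = m'4 - 4 m'3 m'1 + 6 m'2 (m'1)^2 - 3 (m'1)^4
}
direct = {k: central_moment(data, k) for k in (2, 3, 4)}

for k in (2, 3, 4):
    print(f"m_{k}: direct = {direct[k]}, from raw moments = {from_raw[k]}")
```

With real-valued data the two results may differ by floating-point rounding, so a tolerance-based comparison such as `math.isclose` is safer than exact equality.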

Moments provide a more comprehensive mathematical description of a distribution's shape and properties. The lower-order moments (mean and variance) are related to location and spread, while higher-order moments (especially the third and fourth) are used to quantify asymmetry and peakedness, which are key aspects of describing the shape of a distribution.