
Chapter 15 Statistics

Welcome to the solutions for Chapter 15: Statistics. While previous encounters with statistics likely focused on summarizing data using measures of central tendency (like mean, median, and mode), which describe the 'typical' value within a dataset, this chapter delves into another crucial aspect of data analysis: understanding its dispersion or variability. Central tendency measures alone provide an incomplete picture. For instance, two datasets might have the exact same mean but differ vastly in how spread out their values are. One set might cluster tightly around the mean, while the other might have values scattered widely. Measuring this spread, or dispersion, is essential for comprehending the distribution's nature, assessing consistency, comparing different datasets reliably, and making informed inferences. This chapter introduces several key statistical tools designed specifically to quantify the extent to which data points deviate from the average or spread out across the range of observations. We will explore methods applicable to both ungrouped (raw) data and grouped data presented in frequency distributions.

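The solutions explore various measures of dispersion, starting with the simplest and progressing to more robust and widely used metrics: the range, the mean deviation (about the mean and about the median), the variance, and the standard deviation.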

Beyond calculation, the solutions emphasize the interpretation of these measures. For instance, they cover the analysis of frequency distributions that might share the same mean but exhibit different variances, illustrating how standard deviation effectively quantifies the consistency or spread within each dataset – a smaller $\sigma$ indicates data points are clustered more closely around the mean (more consistent), while a larger $\sigma$ signifies greater variability. For comparing the relative variability of two or more datasets, especially if they have different means or different units, the Coefficient of Variation (CV) is introduced. It's a unit-less measure calculated as $CV = \left(\frac{\sigma}{\bar{x}}\right) \times 100\%$. A lower CV indicates greater consistency relative to the mean. These tools provide a far more comprehensive understanding of data characteristics than central tendency alone.
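As a rough illustration of the comparison described above, the following Python sketch computes the CV for two made-up datasets that share a mean of 50. The data and the function name `coefficient_of_variation` are assumptions chosen for illustration, and $\sigma$ is taken to be the population standard deviation (`statistics.pstdev`).

```python
from statistics import mean, pstdev

def coefficient_of_variation(data):
    """CV as a percentage: population standard deviation over the mean."""
    return pstdev(data) / mean(data) * 100

# Hypothetical datasets with the same mean (50) but different spread.
tight = [48, 50, 52]
wide = [30, 50, 70]

print(round(coefficient_of_variation(tight), 2))  # 3.27
print(round(coefficient_of_variation(wide), 2))   # 32.66
```

The dataset with the smaller CV is the more consistent one relative to its mean.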



Example 1 to 7 (Before Exercise 15.1)

Example 1: Find the mean deviation about the mean for the following data:

6, 7, 10, 12, 13, 4, 8, 12

Answer:

The given data is: 6, 7, 10, 12, 13, 4, 8, 12.

The number of observations is $n = 8$.


First, we find the mean of the data.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$

Sum of observations = $6 + 7 + 10 + 12 + 13 + 4 + 8 + 12 = 72$

$\overline{x} = \frac{72}{8} = 9$

The mean of the data is 9.


Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}|$.

$|6 - 9| = |-3| = 3$

$|7 - 9| = |-2| = 2$

$|10 - 9| = |1| = 1$

$|12 - 9| = |3| = 3$

$|13 - 9| = |4| = 4$

$|4 - 9| = |-5| = 5$

$|8 - 9| = |-1| = 1$

$|12 - 9| = |3| = 3$


Now, we find the sum of the absolute deviations.

$\sum\limits |x_i - \overline{x}| = 3 + 2 + 1 + 3 + 4 + 5 + 1 + 3 = 22$


Finally, we calculate the mean deviation about the mean.

Mean Deviation about the mean = $\frac{\sum\limits |x_i - \overline{x}|}{n}$

MD($\overline{x}$) = $\frac{22}{8} = \frac{11}{4} = 2.75$

The mean deviation about the mean for the given data is 2.75.
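The steps above (mean, absolute deviations, their average) can be checked with a short Python sketch. The function name `mean_deviation_about_mean` is an illustrative choice, not part of the textbook.

```python
def mean_deviation_about_mean(data):
    """Average of |x - mean| over all observations in raw (ungrouped) data."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

data = [6, 7, 10, 12, 13, 4, 8, 12]
print(mean_deviation_about_mean(data))  # 2.75
```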

Example 2: Find the mean deviation about the mean for the following data :

12, 3, 18, 17, 4, 9, 17, 19, 20, 15,
8, 17, 2, 3, 16, 11, 3, 1, 0, 5

Answer:

The given data is:

12, 3, 18, 17, 4, 9, 17, 19, 20, 15, 8, 17, 2, 3, 16, 11, 3, 1, 0, 5.

The number of observations is $n$. By counting the data points, we have $n = 20$.


First, we find the mean of the data ($\overline{x}$).

$\overline{x} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum\limits x_i}{n}$

Sum of observations ($\sum\limits x_i$) = $12 + 3 + 18 + 17 + 4 + 9 + 17 + 19 + 20 + 15 + 8 + 17 + 2 + 3 + 16 + 11 + 3 + 1 + 0 + 5 = 200$

$\overline{x} = \frac{200}{20} = 10$

The mean of the data is 10.


Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 10|$.

$|12 - 10| = 2$

$|3 - 10| = 7$

$|18 - 10| = 8$

$|17 - 10| = 7$

$|4 - 10| = 6$

$|9 - 10| = 1$

$|17 - 10| = 7$

$|19 - 10| = 9$

$|20 - 10| = 10$

$|15 - 10| = 5$

$|8 - 10| = 2$

$|17 - 10| = 7$

$|2 - 10| = 8$

$|3 - 10| = 7$

$|16 - 10| = 6$

$|11 - 10| = 1$

$|3 - 10| = 7$

$|1 - 10| = 9$

$|0 - 10| = 10$

$|5 - 10| = 5$


Now, we find the sum of the absolute deviations.

$\sum\limits |x_i - \overline{x}| = 2 + 7 + 8 + 7 + 6 + 1 + 7 + 9 + 10 + 5 + 2 + 7 + 8 + 7 + 6 + 1 + 7 + 9 + 10 + 5 = 124$


Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum\limits |x_i - \overline{x}|}{n} = \frac{124}{20}$

MD($\overline{x}$) = $6.2$

The mean deviation about the mean for the given data is 6.2.

Example 3: Find the mean deviation about the median for the following data:

3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21

Answer:

The given data is: 3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21.

The number of observations is $n = 11$.


First, we need to arrange the data in ascending order to find the median.

Arranged data: 3, 3, 4, 5, 7, 9, 10, 12, 18, 19, 21.


Since the number of observations ($n = 11$) is odd, the median (M) is the value of the $\left(\frac{n+1}{2}\right)^{\text{th}}$ observation.

Median (M) = $\left(\frac{11+1}{2}\right)^{\text{th}}$ observation = $6^{\text{th}}$ observation.

From the arranged data, the $6^{\text{th}}$ observation is 9.

So, the median M = 9.


Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 9|$.

$|3 - 9| = |-6| = 6$

$|3 - 9| = |-6| = 6$

$|4 - 9| = |-5| = 5$

$|5 - 9| = |-4| = 4$

$|7 - 9| = |-2| = 2$

$|9 - 9| = |0| = 0$

$|10 - 9| = |1| = 1$

$|12 - 9| = |3| = 3$

$|18 - 9| = |9| = 9$

$|19 - 9| = |10| = 10$

$|21 - 9| = |12| = 12$


Now, we find the sum of the absolute deviations.

$\sum\limits |x_i - M| = 6 + 6 + 5 + 4 + 2 + 0 + 1 + 3 + 9 + 10 + 12 = 58$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits |x_i - M|}{n} = \frac{58}{11}$

MD(M) $\approx 5.27$

The mean deviation about the median for the given data is $\frac{58}{11}$ or approximately 5.27.
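The same procedure can be sketched in Python for raw data. This sketch leans on the standard library's `statistics.median`, which sorts the data and averages the two middle values when $n$ is even. The function name `mean_deviation_about_median` is an assumption for illustration.

```python
from statistics import median

def mean_deviation_about_median(data):
    """Average of |x - M| over all observations, where M is the median."""
    m = median(data)
    return sum(abs(x - m) for x in data) / len(data)

data = [3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21]
print(median(data))                                 # 9
print(round(mean_deviation_about_median(data), 2))  # 5.27
```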

Example 4: Find mean deviation about the mean for the following data :

$x_i$ 2 5 6 8 10 12
$f_i$ 2 8 10 7 8 5

Answer:

The given data is a discrete frequency distribution:

| $x_i$ | $f_i$ |
|---|---|
| 2 | 2 |
| 5 | 8 |
| 6 | 10 |
| 8 | 7 |
| 10 | 8 |
| 12 | 5 |

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

We calculate $f_i x_i$ for each class and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

| $x_i$ | $f_i$ | $f_i x_i$ |
|---|---|---|
| 2 | 2 | $2 \times 2 = 4$ |
| 5 | 8 | $8 \times 5 = 40$ |
| 6 | 10 | $10 \times 6 = 60$ |
| 8 | 7 | $7 \times 8 = 56$ |
| 10 | 8 | $8 \times 10 = 80$ |
| 12 | 5 | $5 \times 12 = 60$ |
| Total | $\sum\limits f_i = 40$ | $\sum\limits f_i x_i = 300$ |

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$

$\overline{x} = \frac{300}{40} = \frac{30}{4} = 7.5$

The mean of the data is 7.5.


Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 7.5|$, and the product $f_i |x_i - 7.5|$.

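| $x_i$ | $f_i$ | $\lvert x_i - 7.5 \rvert$ | $f_i \lvert x_i - 7.5 \rvert$ |
|---|---|---|---|
| 2 | 2 | $\lvert 2 - 7.5 \rvert = 5.5$ | $2 \times 5.5 = 11.0$ |
| 5 | 8 | $\lvert 5 - 7.5 \rvert = 2.5$ | $8 \times 2.5 = 20.0$ |
| 6 | 10 | $\lvert 6 - 7.5 \rvert = 1.5$ | $10 \times 1.5 = 15.0$ |
| 8 | 7 | $\lvert 8 - 7.5 \rvert = 0.5$ | $7 \times 0.5 = 3.5$ |
| 10 | 8 | $\lvert 10 - 7.5 \rvert = 2.5$ | $8 \times 2.5 = 20.0$ |
| 12 | 5 | $\lvert 12 - 7.5 \rvert = 4.5$ | $5 \times 4.5 = 22.5$ |
| Total | $\sum\limits f_i = 40$ | | $\sum\limits f_i \lvert x_i - 7.5 \rvert = 92.0$ |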

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$

MD($\overline{x}$) = $\frac{92.0}{40} = \frac{92}{40} = \frac{23}{10} = 2.3$

The mean deviation about the mean for the given data is 2.3.
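For a discrete frequency distribution, the two weighted sums in the tables above translate directly into code. The sketch below is one possible way to check the result. It uses `fractions.Fraction` so the arithmetic stays exact, and the function name `md_about_mean_freq` is hypothetical.

```python
from fractions import Fraction

def md_about_mean_freq(values, freqs):
    """Return (mean, mean deviation about the mean) for values x_i with frequencies f_i."""
    total = sum(freqs)                                         # sum of f_i
    mean = Fraction(sum(f * x for x, f in zip(values, freqs)), total)
    dev_sum = sum(f * abs(x - mean) for x, f in zip(values, freqs))
    return mean, dev_sum / total

mean, md = md_about_mean_freq([2, 5, 6, 8, 10, 12], [2, 8, 10, 7, 8, 5])
print(float(mean), float(md))  # 7.5 2.3
```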

Example 5: Find the mean deviation about the median for the following data:

$x_i$ 3 6 9 12 13 15 21 22
$f_i$ 3 4 5 2 4 5 4 3

Answer:

The given data is a discrete frequency distribution:

| $x_i$ | $f_i$ |
|---|---|
| 3 | 3 |
| 6 | 4 |
| 9 | 5 |
| 12 | 2 |
| 13 | 4 |
| 15 | 5 |
| 21 | 4 |
| 22 | 3 |

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the total frequency $N = \sum\limits f_i$ and the cumulative frequencies (c.f.).

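| $x_i$ | $f_i$ | Cumulative Frequency (c.f.) |
|---|---|---|
| 3 | 3 | 3 |
| 6 | 4 | $3 + 4 = 7$ |
| 9 | 5 | $7 + 5 = 12$ |
| 12 | 2 | $12 + 2 = 14$ |
| 13 | 4 | $14 + 4 = 18$ |
| 15 | 5 | $18 + 5 = 23$ |
| 21 | 4 | $23 + 4 = 27$ |
| 22 | 3 | $27 + 3 = 30$ |
| Total | $N = \sum\limits f_i = 30$ | |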

The total number of observations is $N = 30$, which is an even number.

For an even number of observations, the median is the average of the $\left(\frac{N}{2}\right)^{\text{th}}$ and $\left(\frac{N}{2} + 1\right)^{\text{th}}$ observations.

$\frac{N}{2} = \frac{30}{2} = 15^{\text{th}}$ observation.

$\frac{N}{2} + 1 = 15 + 1 = 16^{\text{th}}$ observation.

From the cumulative frequency table, the $15^{\text{th}}$ observation falls in the class where c.f. is 18, which corresponds to $x_i = 13$.

The $16^{\text{th}}$ observation also falls in the class where c.f. is 18, which corresponds to $x_i = 13$.

So, the median (M) = $\frac{13 + 13}{2} = 13$.

The median of the data is 13.


Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 13|$, and the product $f_i |x_i - 13|$.

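| $x_i$ | $f_i$ | $\lvert x_i - 13 \rvert$ | $f_i \lvert x_i - 13 \rvert$ |
|---|---|---|---|
| 3 | 3 | $\lvert 3 - 13 \rvert = 10$ | $3 \times 10 = 30$ |
| 6 | 4 | $\lvert 6 - 13 \rvert = 7$ | $4 \times 7 = 28$ |
| 9 | 5 | $\lvert 9 - 13 \rvert = 4$ | $5 \times 4 = 20$ |
| 12 | 2 | $\lvert 12 - 13 \rvert = 1$ | $2 \times 1 = 2$ |
| 13 | 4 | $\lvert 13 - 13 \rvert = 0$ | $4 \times 0 = 0$ |
| 15 | 5 | $\lvert 15 - 13 \rvert = 2$ | $5 \times 2 = 10$ |
| 21 | 4 | $\lvert 21 - 13 \rvert = 8$ | $4 \times 8 = 32$ |
| 22 | 3 | $\lvert 22 - 13 \rvert = 9$ | $3 \times 9 = 27$ |
| Total | $\sum\limits f_i = 30$ | | $\sum\limits f_i \lvert x_i - 13 \rvert = 149$ |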

Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$

MD(M) = $\frac{149}{30}$

MD(M) $\approx 4.97$

The mean deviation about the median for the given data is $\frac{149}{30}$ or approximately 4.97.
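The cumulative-frequency lookup used above can be sketched in Python as follows. The helper names (`value_at_position`, `median_freq`, `md_about_median_freq`) are illustrative assumptions, and the sketch assumes the values $x_i$ are already listed in ascending order.

```python
from itertools import accumulate

def value_at_position(values, cum_freqs, k):
    """Value of the k-th observation (1-indexed): first x_i whose c.f. reaches k."""
    for x, cf in zip(values, cum_freqs):
        if cf >= k:
            return x
    raise ValueError("position exceeds total frequency")

def median_freq(values, freqs):
    """Median of a discrete frequency distribution with ascending values."""
    cum = list(accumulate(freqs))   # cumulative frequencies
    n = cum[-1]                     # total frequency N
    if n % 2 == 1:
        return value_at_position(values, cum, (n + 1) // 2)
    lo = value_at_position(values, cum, n // 2)
    hi = value_at_position(values, cum, n // 2 + 1)
    return (lo + hi) / 2

def md_about_median_freq(values, freqs):
    """Mean deviation about the median: sum of f_i * |x_i - M|, divided by N."""
    m = median_freq(values, freqs)
    return sum(f * abs(x - m) for x, f in zip(values, freqs)) / sum(freqs)

values = [3, 6, 9, 12, 13, 15, 21, 22]
freqs = [3, 4, 5, 2, 4, 5, 4, 3]
print(median_freq(values, freqs))                     # 13.0
print(round(md_about_median_freq(values, freqs), 2))  # 4.97
```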

Example 6: Find the mean deviation about the mean for the following data

Marks obtained 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Number of students 2 3 8 14 8 3 2

Answer:

The given data is a grouped frequency distribution:

| Marks obtained (Class Interval) | Number of students ($f_i$) |
|---|---|
| 10-20 | 2 |
| 20-30 | 3 |
| 30-40 | 8 |
| 40-50 | 14 |
| 50-60 | 8 |
| 60-70 | 3 |
| 70-80 | 2 |

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

For grouped data, the mean is calculated using the midpoints of the class intervals.

Let $x_i$ be the midpoint of the $i$-th class interval and $f_i$ be the corresponding frequency.

Calculate the midpoints ($x_i$) and the product $f_i x_i$:

| Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$ |
|---|---|---|---|
| 10-20 | 2 | $\frac{10+20}{2} = 15$ | $2 \times 15 = 30$ |
| 20-30 | 3 | $\frac{20+30}{2} = 25$ | $3 \times 25 = 75$ |
| 30-40 | 8 | $\frac{30+40}{2} = 35$ | $8 \times 35 = 280$ |
| 40-50 | 14 | $\frac{40+50}{2} = 45$ | $14 \times 45 = 630$ |
| 50-60 | 8 | $\frac{50+60}{2} = 55$ | $8 \times 55 = 440$ |
| 60-70 | 3 | $\frac{60+70}{2} = 65$ | $3 \times 65 = 195$ |
| 70-80 | 2 | $\frac{70+80}{2} = 75$ | $2 \times 75 = 150$ |
| Total | $\sum\limits f_i = 40$ | | $\sum\limits f_i x_i = 1800$ |

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$

$\overline{x} = \frac{1800}{40} = 45$

The mean of the data is 45.


Next, we calculate the absolute deviation of each midpoint from the mean, $|x_i - \overline{x}| = |x_i - 45|$, and the product $f_i |x_i - 45|$.

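| Class Interval | $x_i$ | $f_i$ | $\lvert x_i - 45 \rvert$ | $f_i \lvert x_i - 45 \rvert$ |
|---|---|---|---|---|
| 10-20 | 15 | 2 | $\lvert 15 - 45 \rvert = \lvert -30 \rvert = 30$ | $2 \times 30 = 60$ |
| 20-30 | 25 | 3 | $\lvert 25 - 45 \rvert = \lvert -20 \rvert = 20$ | $3 \times 20 = 60$ |
| 30-40 | 35 | 8 | $\lvert 35 - 45 \rvert = \lvert -10 \rvert = 10$ | $8 \times 10 = 80$ |
| 40-50 | 45 | 14 | $\lvert 45 - 45 \rvert = \lvert 0 \rvert = 0$ | $14 \times 0 = 0$ |
| 50-60 | 55 | 8 | $\lvert 55 - 45 \rvert = \lvert 10 \rvert = 10$ | $8 \times 10 = 80$ |
| 60-70 | 65 | 3 | $\lvert 65 - 45 \rvert = \lvert 20 \rvert = 20$ | $3 \times 20 = 60$ |
| 70-80 | 75 | 2 | $\lvert 75 - 45 \rvert = \lvert 30 \rvert = 30$ | $2 \times 30 = 60$ |
| Total | | $\sum\limits f_i = 40$ | | $\sum\limits f_i \lvert x_i - 45 \rvert = 400$ |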

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$

MD($\overline{x}$) = $\frac{400}{40} = 10$

The mean deviation about the mean for the given data is 10.
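For grouped data the only extra step is replacing each class by its midpoint. A minimal Python sketch of that step, assuming classes are given as `(lower, upper)` pairs and using the hypothetical name `md_about_mean_grouped`:

```python
def md_about_mean_grouped(classes, freqs):
    """Return (mean, mean deviation about the mean) using class midpoints."""
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    md = sum(f * abs(x - mean) for x, f in zip(midpoints, freqs)) / n
    return mean, md

classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [2, 3, 8, 14, 8, 3, 2]
print(md_about_mean_grouped(classes, freqs))  # (45.0, 10.0)
```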

Example 7: Calculate the mean deviation about median for the following data :

Class 0-10 10-20 20-30 30-40 40-50 50-60
Frequency 6 7 15 16 4 2

Answer:

The given data is a grouped frequency distribution:

| Class Interval | Frequency ($f_i$) |
|---|---|
| 0-10 | 6 |
| 10-20 | 7 |
| 20-30 | 15 |
| 30-40 | 16 |
| 40-50 | 4 |
| 50-60 | 2 |

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum\limits f_i$.

| Class Interval | Frequency ($f_i$) | Cumulative Frequency (c.f.) |
|---|---|---|
| 0-10 | 6 | 6 |
| 10-20 | 7 | $6 + 7 = 13$ |
| 20-30 | 15 | $13 + 15 = 28$ |
| 30-40 | 16 | $28 + 16 = 44$ |
| 40-50 | 4 | $44 + 4 = 48$ |
| 50-60 | 2 | $48 + 2 = 50$ |
| Total | $N = \sum\limits f_i = 50$ | |

The total number of observations is $N = 50$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.

$\frac{N}{2} = \frac{50}{2} = 25^{\text{th}}$ observation.

The cumulative frequency just greater than or equal to 25 is 28, which corresponds to the class interval 20-30.

So, the median class is 20-30.

For the median class (20-30):

Lower boundary (L) = 20

Frequency of the median class (f) = 15

Cumulative frequency of the class preceding the median class (c.f.) = 13 (c.f. of 10-20 class)

Class size (h) = $30 - 20 = 10$

The median (M) is calculated using the formula:

$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$

$M = 20 + \frac{25 - 13}{15} \times 10$

$M = 20 + \frac{12}{15} \times 10$

$M = 20 + \frac{4}{5} \times 10$

$M = 20 + 4 \times 2$

$M = 20 + 8$

$M = 28$

The median of the data is 28.


Next, we calculate the midpoints ($x_i$) of each class interval, the absolute deviation from the median $|x_i - 28|$, and the product $f_i |x_i - 28|$.

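| Class Interval | $f_i$ | Midpoint ($x_i$) | $\lvert x_i - 28 \rvert$ | $f_i \lvert x_i - 28 \rvert$ |
|---|---|---|---|---|
| 0-10 | 6 | $\frac{0+10}{2} = 5$ | $\lvert 5 - 28 \rvert = \lvert -23 \rvert = 23$ | $6 \times 23 = 138$ |
| 10-20 | 7 | $\frac{10+20}{2} = 15$ | $\lvert 15 - 28 \rvert = \lvert -13 \rvert = 13$ | $7 \times 13 = 91$ |
| 20-30 | 15 | $\frac{20+30}{2} = 25$ | $\lvert 25 - 28 \rvert = \lvert -3 \rvert = 3$ | $15 \times 3 = 45$ |
| 30-40 | 16 | $\frac{30+40}{2} = 35$ | $\lvert 35 - 28 \rvert = \lvert 7 \rvert = 7$ | $16 \times 7 = 112$ |
| 40-50 | 4 | $\frac{40+50}{2} = 45$ | $\lvert 45 - 28 \rvert = \lvert 17 \rvert = 17$ | $4 \times 17 = 68$ |
| 50-60 | 2 | $\frac{50+60}{2} = 55$ | $\lvert 55 - 28 \rvert = \lvert 27 \rvert = 27$ | $2 \times 27 = 54$ |
| Total | $\sum\limits f_i = 50$ | | | $\sum\limits f_i \lvert x_i - 28 \rvert = 508$ |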

Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$

MD(M) = $\frac{508}{50} = 10.16$

The mean deviation about the median for the given data is 10.16.
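The median formula $M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$ and the deviation step can be sketched together in Python. The function names are illustrative assumptions, and the sketch assumes contiguous classes given as `(lower, upper)` pairs in ascending order.

```python
from itertools import accumulate

def grouped_median(classes, freqs):
    """Median of grouped data: M = L + ((N/2 - c.f.) / f) * h."""
    cum = list(accumulate(freqs))
    half = cum[-1] / 2                           # N / 2
    for i, cf in enumerate(cum):
        if cf >= half:                           # first class whose c.f. reaches N/2
            lower, upper = classes[i]
            prev_cf = cum[i - 1] if i > 0 else 0  # c.f. of the preceding class
            return lower + (half - prev_cf) / freqs[i] * (upper - lower)

def md_about_median_grouped(classes, freqs):
    """Return (median, mean deviation about the median) using class midpoints."""
    m = grouped_median(classes, freqs)
    mids = [(lo + hi) / 2 for lo, hi in classes]
    md = sum(f * abs(x - m) for x, f in zip(mids, freqs)) / sum(freqs)
    return m, md

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [6, 7, 15, 16, 4, 2]
m, md = md_about_median_grouped(classes, freqs)
print(round(m, 2), round(md, 2))  # 28.0 10.16
```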



Exercise 15.1

Find the mean deviation about the mean for the data in Exercises 1 and 2.

Question 1.

4, 7, 8, 9, 10, 12, 13, 17

Answer:

The given data is: 4, 7, 8, 9, 10, 12, 13, 17.

The number of observations is $n = 8$.


First, we find the mean of the data.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$

Sum of observations = $4 + 7 + 8 + 9 + 10 + 12 + 13 + 17 = 80$

$\overline{x} = \frac{80}{8} = 10$

The mean of the data is 10.


Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 10|$.

$|4 - 10| = |-6| = 6$

$|7 - 10| = |-3| = 3$

$|8 - 10| = |-2| = 2$

$|9 - 10| = |-1| = 1$

$|10 - 10| = |0| = 0$

$|12 - 10| = |2| = 2$

$|13 - 10| = |3| = 3$

$|17 - 10| = |7| = 7$


Now, we find the sum of the absolute deviations.

$\sum\limits_{i=1}^{8} |x_i - \overline{x}| = 6 + 3 + 2 + 1 + 0 + 2 + 3 + 7 = 24$


Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum\limits_{i=1}^{n} |x_i - \overline{x}|}{n}$

MD($\overline{x}$) = $\frac{24}{8} = 3$

The mean deviation about the mean for the given data is 3.

Question 2.

38, 70, 48, 40, 42, 55, 63, 46, 54, 44

Answer:

The given data is: 38, 70, 48, 40, 42, 55, 63, 46, 54, 44.

The number of observations is $n$. By counting the data points, we have $n = 10$.


First, we find the mean of the data ($\overline{x}$).

$\overline{x} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum\limits x_i}{n}$

Sum of observations ($\sum\limits x_i$) = $38 + 70 + 48 + 40 + 42 + 55 + 63 + 46 + 54 + 44 = 500$

$\overline{x} = \frac{500}{10} = 50$

The mean of the data is 50.


Next, we find the absolute deviation of each observation from the mean, i.e., $|x_i - \overline{x}| = |x_i - 50|$.

$|38 - 50| = |-12| = 12$

$|70 - 50| = |20| = 20$

$|48 - 50| = |-2| = 2$

$|40 - 50| = |-10| = 10$

$|42 - 50| = |-8| = 8$

$|55 - 50| = |5| = 5$

$|63 - 50| = |13| = 13$

$|46 - 50| = |-4| = 4$

$|54 - 50| = |4| = 4$

$|44 - 50| = |-6| = 6$


Now, we find the sum of the absolute deviations.

$\sum\limits |x_i - \overline{x}| = 12 + 20 + 2 + 10 + 8 + 5 + 13 + 4 + 4 + 6 = 84$


Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum\limits |x_i - \overline{x}|}{n} = \frac{84}{10}$

MD($\overline{x}$) = $8.4$

The mean deviation about the mean for the given data is 8.4.

Find the mean deviation about the median for the data in Exercises 3 and 4.

Question 3.

13, 17, 16, 14, 11, 13, 10, 16, 11, 18, 12, 17

Answer:

The given data is: 13, 17, 16, 14, 11, 13, 10, 16, 11, 18, 12, 17.

The number of observations is $n$. By counting the data points, we have $n = 12$.


To find the mean deviation about the median, we first need to calculate the median (M).

We arrange the data in ascending order:

10, 11, 11, 12, 13, 13, 14, 16, 16, 17, 17, 18.


Since the number of observations ($n = 12$) is even, the median is the average of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2} + 1\right)^{\text{th}}$ observations.

$\frac{n}{2} = \frac{12}{2} = 6^{\text{th}}$ observation.

$\frac{n}{2} + 1 = 6 + 1 = 7^{\text{th}}$ observation.

The $6^{\text{th}}$ observation in the arranged data is 13.

The $7^{\text{th}}$ observation in the arranged data is 14.

Median (M) = $\frac{6^{\text{th}} \text{ observation} + 7^{\text{th}} \text{ observation}}{2} = \frac{13 + 14}{2} = \frac{27}{2} = 13.5$

The median of the data is 13.5.


Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 13.5|$.

$|10 - 13.5| = 3.5$

$|11 - 13.5| = 2.5$

$|11 - 13.5| = 2.5$

$|12 - 13.5| = 1.5$

$|13 - 13.5| = 0.5$

$|13 - 13.5| = 0.5$

$|14 - 13.5| = 0.5$

$|16 - 13.5| = 2.5$

$|16 - 13.5| = 2.5$

$|17 - 13.5| = 3.5$

$|17 - 13.5| = 3.5$

$|18 - 13.5| = 4.5$


Now, we find the sum of the absolute deviations.

$\sum\limits_{i=1}^{12} |x_i - M| = 3.5 + 2.5 + 2.5 + 1.5 + 0.5 + 0.5 + 0.5 + 2.5 + 2.5 + 3.5 + 3.5 + 4.5 = 28$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits_{i=1}^{n} |x_i - M|}{n}$

MD(M) = $\frac{28}{12} = \frac{7}{3}$

MD(M) $\approx 2.33$

The mean deviation about the median for the given data is $\frac{7}{3}$ or approximately 2.33.

Question 4.

36, 72, 46, 42, 60, 45, 53, 46, 51, 49

Answer:

The given data is: 36, 72, 46, 42, 60, 45, 53, 46, 51, 49.

The number of observations is $n$. By counting the data points, we have $n = 10$.


To find the mean deviation about the median, we first need to calculate the median (M).

We arrange the data in ascending order:

36, 42, 45, 46, 46, 49, 51, 53, 60, 72.


Since the number of observations ($n = 10$) is even, the median is the average of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2} + 1\right)^{\text{th}}$ observations.

$\frac{n}{2} = \frac{10}{2} = 5^{\text{th}}$ observation.

$\frac{n}{2} + 1 = 5 + 1 = 6^{\text{th}}$ observation.

The $5^{\text{th}}$ observation in the arranged data is 46.

The $6^{\text{th}}$ observation in the arranged data is 49.

Median (M) = $\frac{5^{\text{th}} \text{ observation} + 6^{\text{th}} \text{ observation}}{2} = \frac{46 + 49}{2} = \frac{95}{2} = 47.5$

The median of the data is 47.5.


Next, we find the absolute deviation of each observation from the median, i.e., $|x_i - M| = |x_i - 47.5|$.

$|36 - 47.5| = |-11.5| = 11.5$

$|42 - 47.5| = |-5.5| = 5.5$

$|45 - 47.5| = |-2.5| = 2.5$

$|46 - 47.5| = |-1.5| = 1.5$

$|46 - 47.5| = |-1.5| = 1.5$

$|49 - 47.5| = |1.5| = 1.5$

$|51 - 47.5| = |3.5| = 3.5$

$|53 - 47.5| = |5.5| = 5.5$

$|60 - 47.5| = |12.5| = 12.5$

$|72 - 47.5| = |24.5| = 24.5$


Now, we find the sum of the absolute deviations.

$\sum\limits_{i=1}^{10} |x_i - M| = 11.5 + 5.5 + 2.5 + 1.5 + 1.5 + 1.5 + 3.5 + 5.5 + 12.5 + 24.5 = 70.0$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits_{i=1}^{n} |x_i - M|}{n}$

MD(M) = $\frac{70.0}{10} = 7.0$

The mean deviation about the median for the given data is 7.

Find the mean deviation about the mean for the data in Exercises 5 and 6.

Question 5.

$x_i$ 5 10 15 20 25
$f_i$ 7 4 6 3 5

Answer:

The given data is a discrete frequency distribution:

| $x_i$ | $f_i$ |
|---|---|
| 5 | 7 |
| 10 | 4 |
| 15 | 6 |
| 20 | 3 |
| 25 | 5 |

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

| $x_i$ | $f_i$ | $f_i x_i$ |
|---|---|---|
| 5 | 7 | $7 \times 5 = 35$ |
| 10 | 4 | $4 \times 10 = 40$ |
| 15 | 6 | $6 \times 15 = 90$ |
| 20 | 3 | $3 \times 20 = 60$ |
| 25 | 5 | $5 \times 25 = 125$ |
| Total | $\sum\limits f_i = 25$ | $\sum\limits f_i x_i = 350$ |

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$

$\overline{x} = \frac{350}{25} = 14$

The mean of the data is 14.


Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 14|$, and the product $f_i |x_i - 14|$.

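| $x_i$ | $f_i$ | $\lvert x_i - 14 \rvert$ | $f_i \lvert x_i - 14 \rvert$ |
|---|---|---|---|
| 5 | 7 | $\lvert 5 - 14 \rvert = 9$ | $7 \times 9 = 63$ |
| 10 | 4 | $\lvert 10 - 14 \rvert = 4$ | $4 \times 4 = 16$ |
| 15 | 6 | $\lvert 15 - 14 \rvert = 1$ | $6 \times 1 = 6$ |
| 20 | 3 | $\lvert 20 - 14 \rvert = 6$ | $3 \times 6 = 18$ |
| 25 | 5 | $\lvert 25 - 14 \rvert = 11$ | $5 \times 11 = 55$ |
| Total | $\sum\limits f_i = 25$ | | $\sum\limits f_i \lvert x_i - 14 \rvert = 158$ |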

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$

MD($\overline{x}$) = $\frac{158}{25}$

MD($\overline{x}$) = $6.32$

The mean deviation about the mean for the given data is 6.32.

Question 6.

$x_i$ 10 30 50 70 90
$f_i$ 4 24 28 16 8

Answer:

The given data is a discrete frequency distribution:

| $x_i$ | $f_i$ |
|---|---|
| 10 | 4 |
| 30 | 24 |
| 50 | 28 |
| 70 | 16 |
| 90 | 8 |

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

| $x_i$ | $f_i$ | $f_i x_i$ |
|---|---|---|
| 10 | 4 | $4 \times 10 = 40$ |
| 30 | 24 | $24 \times 30 = 720$ |
| 50 | 28 | $28 \times 50 = 1400$ |
| 70 | 16 | $16 \times 70 = 1120$ |
| 90 | 8 | $8 \times 90 = 720$ |
| Total | $\sum\limits f_i = 80$ | $\sum\limits f_i x_i = 4000$ |

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$

$\overline{x} = \frac{4000}{80} = 50$

The mean of the data is 50.


Next, we calculate the absolute deviation of each observation from the mean, $|x_i - \overline{x}| = |x_i - 50|$, and the product $f_i |x_i - 50|$.

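| $x_i$ | $f_i$ | $\lvert x_i - 50 \rvert$ | $f_i \lvert x_i - 50 \rvert$ |
|---|---|---|---|
| 10 | 4 | $\lvert 10 - 50 \rvert = 40$ | $4 \times 40 = 160$ |
| 30 | 24 | $\lvert 30 - 50 \rvert = 20$ | $24 \times 20 = 480$ |
| 50 | 28 | $\lvert 50 - 50 \rvert = 0$ | $28 \times 0 = 0$ |
| 70 | 16 | $\lvert 70 - 50 \rvert = 20$ | $16 \times 20 = 320$ |
| 90 | 8 | $\lvert 90 - 50 \rvert = 40$ | $8 \times 40 = 320$ |
| Total | $\sum\limits f_i = 80$ | | $\sum\limits f_i \lvert x_i - 50 \rvert = 1280$ |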

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$

MD($\overline{x}$) = $\frac{1280}{80} = 16$

The mean deviation about the mean for the given data is 16.

Find the mean deviation about the median for the data in Exercises 7 and 8.

Question 7.

$x_i$ 5 7 9 10 12 15
$f_i$ 8 6 2 2 2 6

Answer:

The given data is a discrete frequency distribution:

| $x_i$ | $f_i$ |
|---|---|
| 5 | 8 |
| 7 | 6 |
| 9 | 2 |
| 10 | 2 |
| 12 | 2 |
| 15 | 6 |

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the total frequency $N = \sum\limits f_i$ and the cumulative frequencies (c.f.).

| $x_i$ | $f_i$ | Cumulative Frequency (c.f.) |
|---|---|---|
| 5 | 8 | 8 |
| 7 | 6 | $8 + 6 = 14$ |
| 9 | 2 | $14 + 2 = 16$ |
| 10 | 2 | $16 + 2 = 18$ |
| 12 | 2 | $18 + 2 = 20$ |
| 15 | 6 | $20 + 6 = 26$ |
| Total | $N = \sum\limits f_i = 26$ | |

The total number of observations is $N = 26$, which is an even number.

For an even number of observations in a discrete frequency distribution, the median is the average of the values of the $\left(\frac{N}{2}\right)^{\text{th}}$ and $\left(\frac{N}{2} + 1\right)^{\text{th}}$ observations.

$\frac{N}{2} = \frac{26}{2} = 13^{\text{th}}$ observation.

$\frac{N}{2} + 1 = 13 + 1 = 14^{\text{th}}$ observation.

From the cumulative frequency table, the $13^{\text{th}}$ observation falls in the class where c.f. is 14, which corresponds to $x_i = 7$.

The $14^{\text{th}}$ observation also falls in the class where c.f. is 14, which corresponds to $x_i = 7$.

So, the median (M) = $\frac{7 + 7}{2} = 7$.

The median of the data is 7.


Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 7|$, and the product $f_i |x_i - 7|$.

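| $x_i$ | $f_i$ | $\lvert x_i - 7 \rvert$ | $f_i \lvert x_i - 7 \rvert$ |
|---|---|---|---|
| 5 | 8 | $\lvert 5 - 7 \rvert = 2$ | $8 \times 2 = 16$ |
| 7 | 6 | $\lvert 7 - 7 \rvert = 0$ | $6 \times 0 = 0$ |
| 9 | 2 | $\lvert 9 - 7 \rvert = 2$ | $2 \times 2 = 4$ |
| 10 | 2 | $\lvert 10 - 7 \rvert = 3$ | $2 \times 3 = 6$ |
| 12 | 2 | $\lvert 12 - 7 \rvert = 5$ | $2 \times 5 = 10$ |
| 15 | 6 | $\lvert 15 - 7 \rvert = 8$ | $6 \times 8 = 48$ |
| Total | $\sum\limits f_i = 26$ | | $\sum\limits f_i \lvert x_i - 7 \rvert = 84$ |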

Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$

MD(M) = $\frac{84}{26} = \frac{42}{13}$

MD(M) $\approx 3.23$

The mean deviation about the median for the given data is $\frac{42}{13}$ or approximately 3.23.

Question 8.

$x_i$ 15 21 27 30 35
$f_i$ 3 5 6 7 8

Answer:

The given data is a discrete frequency distribution:

| $x_i$ | $f_i$ |
|---|---|
| 15 | 3 |
| 21 | 5 |
| 27 | 6 |
| 30 | 7 |
| 35 | 8 |

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the total frequency $N = \sum\limits f_i$ and the cumulative frequencies (c.f.).

| $x_i$ | $f_i$ | Cumulative Frequency (c.f.) |
|---|---|---|
| 15 | 3 | 3 |
| 21 | 5 | $3 + 5 = 8$ |
| 27 | 6 | $8 + 6 = 14$ |
| 30 | 7 | $14 + 7 = 21$ |
| 35 | 8 | $21 + 8 = 29$ |
| Total | $N = \sum\limits f_i = 29$ | |

The total number of observations is $N = 29$, which is an odd number.

For an odd number of observations in a discrete frequency distribution, the median is the value of the $\left(\frac{N+1}{2}\right)^{\text{th}}$ observation.

$\frac{N+1}{2} = \frac{29+1}{2} = \frac{30}{2} = 15^{\text{th}}$ observation.

From the cumulative frequency table, the $15^{\text{th}}$ observation falls in the class where c.f. is 21, which corresponds to $x_i = 30$.

So, the median (M) = 30.

The median of the data is 30.


Next, we calculate the absolute deviation of each observation from the median, $|x_i - M| = |x_i - 30|$, and the product $f_i |x_i - 30|$.

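| $x_i$ | $f_i$ | $\lvert x_i - 30 \rvert$ | $f_i \lvert x_i - 30 \rvert$ |
|---|---|---|---|
| 15 | 3 | $\lvert 15 - 30 \rvert = 15$ | $3 \times 15 = 45$ |
| 21 | 5 | $\lvert 21 - 30 \rvert = 9$ | $5 \times 9 = 45$ |
| 27 | 6 | $\lvert 27 - 30 \rvert = 3$ | $6 \times 3 = 18$ |
| 30 | 7 | $\lvert 30 - 30 \rvert = 0$ | $7 \times 0 = 0$ |
| 35 | 8 | $\lvert 35 - 30 \rvert = 5$ | $8 \times 5 = 40$ |
| Total | $\sum\limits f_i = 29$ | | $\sum\limits f_i \lvert x_i - 30 \rvert = 148$ |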

Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$

MD(M) = $\frac{148}{29}$

MD(M) $\approx 5.103$

The mean deviation about the median for the given data is $\frac{148}{29}$ or approximately 5.103.

Find the mean deviation about the mean for the data in Exercises 9 and 10.

Question 9.

Income per day in ₹ 0-100 100-200 200-300 300-400 400-500 500-600 600-700 700-800
Number of persons 4 8 9 10 7 5 4 3

Answer:

The given data is a grouped frequency distribution:

| Income per day in $\textsf{₹}$ (Class Interval) | Number of persons ($f_i$) |
|---|---|
| 0-100 | 4 |
| 100-200 | 8 |
| 200-300 | 9 |
| 300-400 | 10 |
| 400-500 | 7 |
| 500-600 | 5 |
| 600-700 | 4 |
| 700-800 | 3 |

To find the mean deviation about the mean, we first need to calculate the mean ($\overline{x}$).

For grouped data, the mean is calculated using the midpoints of the class intervals.

Let $x_i$ be the midpoint of the $i$-th class interval and $f_i$ be the corresponding frequency.

Calculate the midpoints ($x_i$) and the product $f_i x_i$:

| Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$ |
|---|---|---|---|
| 0-100 | 4 | $\frac{0+100}{2} = 50$ | $4 \times 50 = 200$ |
| 100-200 | 8 | $\frac{100+200}{2} = 150$ | $8 \times 150 = 1200$ |
| 200-300 | 9 | $\frac{200+300}{2} = 250$ | $9 \times 250 = 2250$ |
| 300-400 | 10 | $\frac{300+400}{2} = 350$ | $10 \times 350 = 3500$ |
| 400-500 | 7 | $\frac{400+500}{2} = 450$ | $7 \times 450 = 3150$ |
| 500-600 | 5 | $\frac{500+600}{2} = 550$ | $5 \times 550 = 2750$ |
| 600-700 | 4 | $\frac{600+700}{2} = 650$ | $4 \times 650 = 2600$ |
| 700-800 | 3 | $\frac{700+800}{2} = 750$ | $3 \times 750 = 2250$ |
| Total | $\sum\limits f_i = 50$ | | $\sum\limits f_i x_i = 17900$ |

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$

$\overline{x} = \frac{17900}{50} = \frac{1790}{5} = 358$

The mean income per day is $\textsf{₹}$ 358.


Next, we calculate the absolute deviation of each midpoint from the mean, $|x_i - \overline{x}| = |x_i - 358|$, and the product $f_i |x_i - 358|$.

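| Class Interval | $x_i$ | $f_i$ | $\lvert x_i - 358 \rvert$ | $f_i \lvert x_i - 358 \rvert$ |
|---|---|---|---|---|
| 0-100 | 50 | 4 | $\lvert 50 - 358 \rvert = \lvert -308 \rvert = 308$ | $4 \times 308 = 1232$ |
| 100-200 | 150 | 8 | $\lvert 150 - 358 \rvert = \lvert -208 \rvert = 208$ | $8 \times 208 = 1664$ |
| 200-300 | 250 | 9 | $\lvert 250 - 358 \rvert = \lvert -108 \rvert = 108$ | $9 \times 108 = 972$ |
| 300-400 | 350 | 10 | $\lvert 350 - 358 \rvert = \lvert -8 \rvert = 8$ | $10 \times 8 = 80$ |
| 400-500 | 450 | 7 | $\lvert 450 - 358 \rvert = \lvert 92 \rvert = 92$ | $7 \times 92 = 644$ |
| 500-600 | 550 | 5 | $\lvert 550 - 358 \rvert = \lvert 192 \rvert = 192$ | $5 \times 192 = 960$ |
| 600-700 | 650 | 4 | $\lvert 650 - 358 \rvert = \lvert 292 \rvert = 292$ | $4 \times 292 = 1168$ |
| 700-800 | 750 | 3 | $\lvert 750 - 358 \rvert = \lvert 392 \rvert = 392$ | $3 \times 392 = 1176$ |
| Total | | $\sum\limits f_i = 50$ | | $\sum\limits f_i \lvert x_i - 358 \rvert = 7896$ |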

Finally, we calculate the mean deviation about the mean (MD($\overline{x}$)).

MD($\overline{x}$) = $\frac{\sum\limits f_i |x_i - \overline{x}|}{\sum\limits f_i}$

MD($\overline{x}$) = $\frac{7896}{50} = \frac{3948}{25} = 157.92$

The mean deviation about the mean for the given data is $\textsf{₹}$ 157.92.

Question 10.

Height in cms 95-105 105-115 115-125 125-135 135-145 145-155
Number of boys 9 13 26 30 12 10

Answer:

The given data is a grouped frequency distribution of height and number of boys:

| Height in cms (Class Interval) | Number of boys ($f_i$) |
|---|---|
| 95-105 | 9 |
| 105-115 | 13 |
| 115-125 | 26 |
| 125-135 | 30 |
| 135-145 | 12 |
| 145-155 | 10 |

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum\limits f_i$.

| Class Interval | Frequency ($f_i$) | Cumulative Frequency (c.f.) |
|---|---|---|
| 95-105 | 9 | 9 |
| 105-115 | 13 | $9 + 13 = 22$ |
| 115-125 | 26 | $22 + 26 = 48$ |
| 125-135 | 30 | $48 + 30 = 78$ |
| 135-145 | 12 | $78 + 12 = 90$ |
| 145-155 | 10 | $90 + 10 = 100$ |
| Total | $N = \sum\limits f_i = 100$ | |

The total number of observations is $N = 100$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.

$\frac{N}{2} = \frac{100}{2} = 50^{\text{th}}$ observation.

The cumulative frequency just greater than or equal to 50 is 78, which corresponds to the class interval 125-135.

So, the median class is 125-135.

For the median class (125-135):

Lower boundary (L) = 125

Frequency of the median class (f) = 30

Cumulative frequency of the class preceding the median class (c.f.) = 48 (c.f. of 115-125 class)

Class size (h) = $135 - 125 = 10$

The median (M) is calculated using the formula:

$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$

$M = 125 + \frac{50 - 48}{30} \times 10$

$M = 125 + \frac{2}{30} \times 10$

$M = 125 + \frac{1}{15} \times 10$

$M = 125 + \frac{10}{15} = 125 + \frac{2}{3}$

$M = \frac{125 \times 3 + 2}{3} = \frac{375 + 2}{3} = \frac{377}{3}$

The median of the data is $\frac{377}{3}$.


Next, we calculate the midpoints ($x_i$) of each class interval, the absolute deviation from the median $|x_i - \frac{377}{3}|$, and the product $f_i |x_i - \frac{377}{3}|$.

Class Interval | $f_i$ | Midpoint ($x_i$) | $\lvert x_i - \frac{377}{3} \rvert$ | $f_i \lvert x_i - \frac{377}{3} \rvert$
95-105 | 9 | 100 | $\lvert 100 - \frac{377}{3} \rvert = \lvert \frac{300 - 377}{3} \rvert = \frac{77}{3}$ | $9 \times \frac{77}{3} = 3 \times 77 = 231$
105-115 | 13 | 110 | $\lvert 110 - \frac{377}{3} \rvert = \lvert \frac{330 - 377}{3} \rvert = \frac{47}{3}$ | $13 \times \frac{47}{3} = \frac{611}{3}$
115-125 | 26 | 120 | $\lvert 120 - \frac{377}{3} \rvert = \lvert \frac{360 - 377}{3} \rvert = \frac{17}{3}$ | $26 \times \frac{17}{3} = \frac{442}{3}$
125-135 | 30 | 130 | $\lvert 130 - \frac{377}{3} \rvert = \lvert \frac{390 - 377}{3} \rvert = \frac{13}{3}$ | $30 \times \frac{13}{3} = 10 \times 13 = 130$
135-145 | 12 | 140 | $\lvert 140 - \frac{377}{3} \rvert = \lvert \frac{420 - 377}{3} \rvert = \frac{43}{3}$ | $12 \times \frac{43}{3} = 4 \times 43 = 172$
145-155 | 10 | 150 | $\lvert 150 - \frac{377}{3} \rvert = \lvert \frac{450 - 377}{3} \rvert = \frac{73}{3}$ | $10 \times \frac{73}{3} = \frac{730}{3}$
Total | $\sum\limits f_i = 100$ | | | $\sum\limits f_i \lvert x_i - \frac{377}{3} \rvert = 231 + \frac{611}{3} + \frac{442}{3} + 130 + 172 + \frac{730}{3}$

Sum of $f_i |x_i - \frac{377}{3}| = (231 + 130 + 172) + (\frac{611 + 442 + 730}{3})$

$= 533 + \frac{1783}{3} = \frac{533 \times 3 + 1783}{3} = \frac{1599 + 1783}{3} = \frac{3382}{3}$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$

MD(M) = $\frac{\frac{3382}{3}}{100} = \frac{3382}{3 \times 100} = \frac{3382}{300}$

MD(M) = $\frac{1691}{150}$

MD(M) $\approx 11.2733...$

The mean deviation about the median for the given data is $\frac{1691}{150}$ or approximately 11.27.
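
The median interpolation and the resulting mean deviation can be reproduced as a short Python sketch. The helper name `grouped_median` is hypothetical (it is not from the textbook), and the code assumes equal class widths with continuous class boundaries, as in this question.

```python
from fractions import Fraction

def grouped_median(lower_bounds, width, freqs):
    """Median of a continuous grouped distribution with equal class widths."""
    half = Fraction(sum(freqs), 2)
    cumulative = 0
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= half:
            # Interpolate inside the median class: L + (N/2 - c.f.) / f * h
            return lower + (half - cumulative) / f * width
        cumulative += f

lower_bounds = [95, 105, 115, 125, 135, 145]
freqs = [9, 13, 26, 30, 12, 10]
width = 10

median = grouped_median(lower_bounds, width, freqs)
midpoints = [lower + Fraction(width, 2) for lower in lower_bounds]
md = sum(f * abs(x - median) for x, f in zip(midpoints, freqs)) / sum(freqs)
print(median, md, round(float(md), 2))  # 377/3 1691/150 11.27
```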

Question 11. Find the mean deviation about median for the following data :

Marks 0-10 10-20 20-30 30-40 40-50 50-60
Number of Girls 6 8 14 16 4 2

Answer:

The given data is a grouped frequency distribution of marks obtained by girls:

Marks (Class Interval) | Number of Girls ($f_i$)
0-10 | 6
10-20 | 8
20-30 | 14
30-40 | 16
40-50 | 4
50-60 | 2

To find the mean deviation about the median, we first need to calculate the median (M).

We calculate the cumulative frequencies (c.f.) and the total frequency $N = \sum\limits f_i$.

Class Interval | Frequency ($f_i$) | Cumulative Frequency (c.f.)
0-10 | 6 | 6
10-20 | 8 | $6 + 8 = 14$
20-30 | 14 | $14 + 14 = 28$
30-40 | 16 | $28 + 16 = 44$
40-50 | 4 | $44 + 4 = 48$
50-60 | 2 | $48 + 2 = 50$
Total | $N = \sum\limits f_i = 50$ |

The total number of observations is $N = 50$. We need to find the class containing the $\left(\frac{N}{2}\right)^{\text{th}}$ observation.

$\frac{N}{2} = \frac{50}{2} = 25^{\text{th}}$ observation.

The cumulative frequency just greater than or equal to 25 is 28, which corresponds to the class interval 20-30.

So, the median class is 20-30.

For the median class (20-30):

Lower boundary (L) = 20

Frequency of the median class (f) = 14

Cumulative frequency of the class preceding the median class (c.f.) = 14 (c.f. of 10-20 class)

Class size (h) = $30 - 20 = 10$

The median (M) is calculated using the formula:

$M = L + \frac{\frac{N}{2} - c.f.}{f} \times h$

$M = 20 + \frac{25 - 14}{14} \times 10$

$M = 20 + \frac{11}{14} \times 10$

$M = 20 + \frac{110}{14} = 20 + \frac{55}{7}$

$M = \frac{20 \times 7 + 55}{7} = \frac{140 + 55}{7} = \frac{195}{7}$

The median of the data is $\frac{195}{7}$.


Next, we calculate the midpoints ($x_i$) of each class interval, the absolute deviation from the median $|x_i - \frac{195}{7}|$, and the product $f_i |x_i - \frac{195}{7}|$.

Note: $\frac{195}{7} \approx 27.857$

Class Interval | $f_i$ | Midpoint ($x_i$) | $\lvert x_i - \frac{195}{7} \rvert$ | $f_i \lvert x_i - \frac{195}{7} \rvert$
0-10 | 6 | 5 | $\lvert 5 - \frac{195}{7} \rvert = \lvert \frac{35 - 195}{7} \rvert = \frac{160}{7}$ | $6 \times \frac{160}{7} = \frac{960}{7}$
10-20 | 8 | 15 | $\lvert 15 - \frac{195}{7} \rvert = \lvert \frac{105 - 195}{7} \rvert = \frac{90}{7}$ | $8 \times \frac{90}{7} = \frac{720}{7}$
20-30 | 14 | 25 | $\lvert 25 - \frac{195}{7} \rvert = \lvert \frac{175 - 195}{7} \rvert = \frac{20}{7}$ | $14 \times \frac{20}{7} = 2 \times 20 = 40$
30-40 | 16 | 35 | $\lvert 35 - \frac{195}{7} \rvert = \lvert \frac{245 - 195}{7} \rvert = \frac{50}{7}$ | $16 \times \frac{50}{7} = \frac{800}{7}$
40-50 | 4 | 45 | $\lvert 45 - \frac{195}{7} \rvert = \lvert \frac{315 - 195}{7} \rvert = \frac{120}{7}$ | $4 \times \frac{120}{7} = \frac{480}{7}$
50-60 | 2 | 55 | $\lvert 55 - \frac{195}{7} \rvert = \lvert \frac{385 - 195}{7} \rvert = \frac{190}{7}$ | $2 \times \frac{190}{7} = \frac{380}{7}$
Total | $\sum\limits f_i = 50$ | | | $\sum\limits f_i \lvert x_i - \frac{195}{7} \rvert = \frac{960}{7} + \frac{720}{7} + 40 + \frac{800}{7} + \frac{480}{7} + \frac{380}{7}$

Sum of $f_i |x_i - \frac{195}{7}| = \frac{960 + 720 + 800 + 480 + 380}{7} + 40$

$= \frac{3340}{7} + 40 = \frac{3340 + 40 \times 7}{7} = \frac{3340 + 280}{7} = \frac{3620}{7}$


Finally, we calculate the mean deviation about the median (MD(M)).

MD(M) = $\frac{\sum\limits f_i |x_i - M|}{\sum\limits f_i}$

MD(M) = $\frac{\frac{3620}{7}}{50} = \frac{3620}{7 \times 50} = \frac{362}{7 \times 5} = \frac{362}{35}$

MD(M) $\approx 10.34$

The mean deviation about the median for the given data is $\frac{362}{35}$ or approximately 10.34.

Question 12. Calculate the mean deviation about median age for the age distribution of 100 persons given below:

Age (in years) 16-20 21-25 26-30 31-35 36-40 41-45 46-50 51-55
Number 5 6 12 14 26 12 16 9

[Hint: Convert the given data into continuous frequency distribution by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each class interval]

Answer:

To calculate the mean deviation about the median, we first need to find the median of the given data. The class intervals are in the inclusive form, so we must first convert them into a continuous (exclusive) form as suggested in the hint.


Step 1: Preparing the Frequency Distribution Table and Finding the Median Class

We convert the given class intervals to continuous intervals by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each class. We then prepare a table with the cumulative frequencies (c.f.).

Age (Continuous) | Number of Persons ($f_i$) | Cumulative Frequency (c.f.)
15.5 - 20.5 | 5 | 5
20.5 - 25.5 | 6 | 11
25.5 - 30.5 | 12 | 23
30.5 - 35.5 | 14 | 37
35.5 - 40.5 | 26 | 63
40.5 - 45.5 | 12 | 75
45.5 - 50.5 | 16 | 91
50.5 - 55.5 | 9 | 100
Total | $N = 100$ |

Here, the total number of observations is $N = 100$.

Now, we find the value of $\frac{N}{2} = \frac{100}{2} = 50$.

From the cumulative frequency column, we see that the cumulative frequency just greater than 50 is 63, which corresponds to the class interval 35.5 - 40.5. Therefore, this is the median class.


Step 2: Calculating the Median

The formula for the median (M) of a continuous frequency distribution is:

$M = l + \frac{\frac{N}{2} - C}{f} \times h$

Where:

  • $l$ = lower limit of the median class = 35.5
  • $N$ = total frequency = 100
  • $C$ = cumulative frequency of the class preceding the median class = 37
  • $f$ = frequency of the median class = 26
  • $h$ = class size = $40.5 - 35.5 = 5$

Substituting these values into the formula:

$M = 35.5 + \frac{50 - 37}{26} \times 5$

$M = 35.5 + \frac{13}{26} \times 5$

$M = 35.5 + 0.5 \times 5$

$M = 35.5 + 2.5 = 38$

So, the median age is 38 years.


Step 3: Calculating the Mean Deviation about the Median

The formula for the mean deviation about the median (M.D.(M)) is:

$M.D.(M) = \frac{1}{N} \sum\limits_{i=1}^{n} f_i |x_i - M|$

We now create a table to calculate the required values.

Age Class | Mid-point ($x_i$) | Frequency ($f_i$) | $\lvert x_i - M \rvert = \lvert x_i - 38 \rvert$ | $f_i \lvert x_i - M \rvert$
15.5 - 20.5 | 18 | 5 | $\lvert 18 - 38 \rvert = 20$ | $5 \times 20 = 100$
20.5 - 25.5 | 23 | 6 | $\lvert 23 - 38 \rvert = 15$ | $6 \times 15 = 90$
25.5 - 30.5 | 28 | 12 | $\lvert 28 - 38 \rvert = 10$ | $12 \times 10 = 120$
30.5 - 35.5 | 33 | 14 | $\lvert 33 - 38 \rvert = 5$ | $14 \times 5 = 70$
35.5 - 40.5 | 38 | 26 | $\lvert 38 - 38 \rvert = 0$ | $26 \times 0 = 0$
40.5 - 45.5 | 43 | 12 | $\lvert 43 - 38 \rvert = 5$ | $12 \times 5 = 60$
45.5 - 50.5 | 48 | 16 | $\lvert 48 - 38 \rvert = 10$ | $16 \times 10 = 160$
50.5 - 55.5 | 53 | 9 | $\lvert 53 - 38 \rvert = 15$ | $9 \times 15 = 135$
Total | | $N = 100$ | | $\sum\limits f_i \lvert x_i - M \rvert = 735$

From the table, we have $\sum\limits f_i |x_i - M| = 735$.

Now, we substitute the values into the formula for mean deviation:

$M.D.(M) = \frac{1}{100} \times 735$

$M.D.(M) = 7.35$


Answer:

The mean deviation about the median age for the given distribution is 7.35 years.
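
The three steps above (boundary correction, median, mean deviation) can be strung together in a short Python sketch. It is an illustration under stated assumptions rather than a prescribed method: the variable names are hypothetical, and the half-gap correction assumes that all inclusive classes are separated by the same gap (here 1 year).

```python
# Inclusive classes and frequencies from the question.
inclusive = [(16, 20), (21, 25), (26, 30), (31, 35),
             (36, 40), (41, 45), (46, 50), (51, 55)]
freqs = [5, 6, 12, 14, 26, 12, 16, 9]

# Step 1: widen each class by half the gap between consecutive classes.
gap = inclusive[1][0] - inclusive[0][1]          # 21 - 20 = 1
continuous = [(lo - gap / 2, hi + gap / 2) for lo, hi in inclusive]

# Step 2: locate the median class and interpolate within it.
n = sum(freqs)
cumulative = 0
for (lo, hi), f in zip(continuous, freqs):
    if cumulative + f >= n / 2:
        median = lo + (n / 2 - cumulative) / f * (hi - lo)
        break
    cumulative += f

# Step 3: mean deviation about the median, using class mid-points.
midpoints = [(lo + hi) / 2 for lo, hi in continuous]
md = sum(f * abs(x - median) for x, f in zip(midpoints, freqs)) / n
print(continuous[0], median, md)
```

Running it should give the first continuous class (15.5, 20.5), a median of 38 and a mean deviation of 7.35, in agreement with the tables.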



Example 8 to 12 (Before Exercise 15.2)

Example 8: Find the variance of the following data:

6, 8, 10, 12, 14, 16, 18, 20, 22, 24

Answer:

The given data is: 6, 8, 10, 12, 14, 16, 18, 20, 22, 24.

The number of observations is $n = 10$.


First, we find the mean of the data.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum\limits x_i}{n}$

Sum of observations ($\sum\limits x_i$) = $6 + 8 + 10 + 12 + 14 + 16 + 18 + 20 + 22 + 24 = 150$

$\overline{x} = \frac{150}{10} = 15$

The mean of the data is 15.


Next, we calculate the deviations from the mean ($x_i - \overline{x}$) and the squared deviations ($(x_i - \overline{x})^2$).

$x_i$ | $x_i - \overline{x} = x_i - 15$ | $(x_i - \overline{x})^2$
6 | $6 - 15 = -9$ | $(-9)^2 = 81$
8 | $8 - 15 = -7$ | $(-7)^2 = 49$
10 | $10 - 15 = -5$ | $(-5)^2 = 25$
12 | $12 - 15 = -3$ | $(-3)^2 = 9$
14 | $14 - 15 = -1$ | $(-1)^2 = 1$
16 | $16 - 15 = 1$ | $1^2 = 1$
18 | $18 - 15 = 3$ | $3^2 = 9$
20 | $20 - 15 = 5$ | $5^2 = 25$
22 | $22 - 15 = 7$ | $7^2 = 49$
24 | $24 - 15 = 9$ | $9^2 = 81$
Total | $\sum\limits (x_i - \overline{x}) = 0$ | $\sum\limits (x_i - \overline{x})^2 = 330$

The variance ($\sigma^2$) for ungrouped data is given by the formula:

$\sigma^2 = \frac{\sum\limits_{i=1}^{n} (x_i - \overline{x})^2}{n}$

$\sigma^2 = \frac{330}{10}$

$\sigma^2 = 33$

The variance of the given data is 33.
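
For readers who want to check the result numerically, here is a minimal Python sketch of the same definition. It divides by $n$, as the formula above does; Python's standard `statistics.pvariance` uses the same convention, whereas `statistics.variance` divides by $n - 1$ and would give a different number.

```python
import statistics

data = [6, 8, 10, 12, 14, 16, 18, 20, 22, 24]
n = len(data)

mean = sum(data) / n
variance = sum((x - mean) ** 2 for x in data) / n   # divide by n, not n - 1
print(mean, variance)  # 15.0 33.0

# Cross-check against the standard library's population variance.
print(statistics.pvariance(data) == variance)  # True
```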

Example 9: Find the variance and standard deviation for the following data:

$x_i$ 4 8 11 17 20 24 32
$f_i$ 3 5 9 5 4 3 1

Answer:

The given data is a discrete frequency distribution:

$x_i$ | $f_i$
4 | 3
8 | 5
11 | 9
17 | 5
20 | 4
24 | 3
32 | 1

To find the variance and standard deviation, we first need to calculate the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

$x_i$ | $f_i$ | $f_i x_i$
4 | 3 | $3 \times 4 = 12$
8 | 5 | $5 \times 8 = 40$
11 | 9 | $9 \times 11 = 99$
17 | 5 | $5 \times 17 = 85$
20 | 4 | $4 \times 20 = 80$
24 | 3 | $3 \times 24 = 72$
32 | 1 | $1 \times 32 = 32$
Total | $\sum\limits f_i = 30$ | $\sum\limits f_i x_i = 420$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i}$

$\overline{x} = \frac{420}{30} = 14$

The mean of the data is 14.


Next, we calculate the deviations from the mean ($x_i - \overline{x}$), the squared deviations ($(x_i - \overline{x})^2$), and the product $f_i (x_i - \overline{x})^2$.

$x_i$ | $f_i$ | $x_i - 14$ | $(x_i - 14)^2$ | $f_i (x_i - 14)^2$
4 | 3 | $4 - 14 = -10$ | $(-10)^2 = 100$ | $3 \times 100 = 300$
8 | 5 | $8 - 14 = -6$ | $(-6)^2 = 36$ | $5 \times 36 = 180$
11 | 9 | $11 - 14 = -3$ | $(-3)^2 = 9$ | $9 \times 9 = 81$
17 | 5 | $17 - 14 = 3$ | $3^2 = 9$ | $5 \times 9 = 45$
20 | 4 | $20 - 14 = 6$ | $6^2 = 36$ | $4 \times 36 = 144$
24 | 3 | $24 - 14 = 10$ | $10^2 = 100$ | $3 \times 100 = 300$
32 | 1 | $32 - 14 = 18$ | $18^2 = 324$ | $1 \times 324 = 324$
Total | $\sum\limits f_i = 30$ | | | $\sum\limits f_i (x_i - 14)^2 = 1374$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{\sum\limits f_i}$

$\sigma^2 = \frac{1374}{30} = \frac{137.4}{3} = 45.8$

The variance of the data is 45.8.


The standard deviation ($\sigma$) is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{45.8}$

Calculating the square root:

$\sqrt{45.8} \approx 6.76757$

The standard deviation is approximately 6.77.
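
The frequency-weighted calculation can be sketched in Python as follows. This is an illustrative check only; the variable names are not from the textbook, and exact fractions are used for the variance before taking a floating-point square root.

```python
from fractions import Fraction
import math

values = [4, 8, 11, 17, 20, 24, 32]
freqs = [3, 5, 9, 5, 4, 3, 1]
n = sum(freqs)

# Weighted mean, then weighted mean of squared deviations.
mean = Fraction(sum(f * x for x, f in zip(values, freqs)), n)
variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
sd = math.sqrt(variance)

print(mean, float(variance), round(sd, 2))  # 14 45.8 6.77
```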

Example 10: Calculate the mean, variance and standard deviation for the following distribution :

Class 30-40 40-50 50-60 60-70 70-80 80-90 90-100
Frequency 3 7 12 15 8 3 2

Answer:

The given data is a grouped frequency distribution:

Class Interval | Frequency ($f_i$)
30-40 | 3
40-50 | 7
50-60 | 12
60-70 | 15
70-80 | 8
80-90 | 3
90-100 | 2

First, we calculate the mean ($\overline{x}$).

We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$
30-40 | 3 | $\frac{30+40}{2} = 35$ | $3 \times 35 = 105$
40-50 | 7 | $\frac{40+50}{2} = 45$ | $7 \times 45 = 315$
50-60 | 12 | $\frac{50+60}{2} = 55$ | $12 \times 55 = 660$
60-70 | 15 | $\frac{60+70}{2} = 65$ | $15 \times 65 = 975$
70-80 | 8 | $\frac{70+80}{2} = 75$ | $8 \times 75 = 600$
80-90 | 3 | $\frac{80+90}{2} = 85$ | $3 \times 85 = 255$
90-100 | 2 | $\frac{90+100}{2} = 95$ | $2 \times 95 = 190$
Total | $N = \sum\limits f_i = 50$ | | $\sum\limits f_i x_i = 3100$

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum\limits f_i x_i}{N}$

$\overline{x} = \frac{3100}{50} = 62$

The mean of the distribution is 62.


Next, we calculate the variance ($\sigma^2$).

We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.

Class Interval | $x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 62$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$
30-40 | 35 | 3 | $35 - 62 = -27$ | $(-27)^2 = 729$ | $3 \times 729 = 2187$
40-50 | 45 | 7 | $45 - 62 = -17$ | $(-17)^2 = 289$ | $7 \times 289 = 2023$
50-60 | 55 | 12 | $55 - 62 = -7$ | $(-7)^2 = 49$ | $12 \times 49 = 588$
60-70 | 65 | 15 | $65 - 62 = 3$ | $3^2 = 9$ | $15 \times 9 = 135$
70-80 | 75 | 8 | $75 - 62 = 13$ | $13^2 = 169$ | $8 \times 169 = 1352$
80-90 | 85 | 3 | $85 - 62 = 23$ | $23^2 = 529$ | $3 \times 529 = 1587$
90-100 | 95 | 2 | $95 - 62 = 33$ | $33^2 = 1089$ | $2 \times 1089 = 2178$
Total | | $N = 50$ | $\sum\limits f_i (x_i - \overline{x}) = 0$ | | $\sum\limits f_i (x_i - \overline{x})^2 = 10050$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{10050}{50}$

$\sigma^2 = 201$

The variance of the distribution is 201.


Finally, we calculate the standard deviation ($\sigma$), which is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{201}$

Using a calculator, $\sqrt{201} \approx 14.177$

The standard deviation is approximately 14.18.

Example 11: Find the standard deviation for the following data :

$x_i$ 3 8 13 18 23
$f_i$ 7 10 15 10 6

Answer:

The given data is a discrete frequency distribution:

$x_i$ | $f_i$
3 | 7
8 | 10
13 | 15
18 | 10
23 | 6

First, we find the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $\sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

$x_i$ | $f_i$ | $f_i x_i$
3 | 7 | $7 \times 3 = 21$
8 | 10 | $10 \times 8 = 80$
13 | 15 | $15 \times 13 = 195$
18 | 10 | $10 \times 18 = 180$
23 | 6 | $6 \times 23 = 138$
Total | $\sum\limits f_i = 48$ | $\sum\limits f_i x_i = 614$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum\limits f_i x_i}{\sum\limits f_i} = \frac{614}{48} = \frac{307}{24}$

The mean of the data is $\frac{307}{24} \approx 12.79$.


Next, we calculate the variance ($\sigma^2$) and standard deviation ($\sigma$).

We can use the formula $\sigma^2 = \frac{1}{N} \sum\limits f_i x_i^2 - (\overline{x})^2$. For this, we need $x_i^2$ and $f_i x_i^2$.

$x_i$ | $f_i$ | $x_i^2$ | $f_i x_i^2$
3 | 7 | $3^2 = 9$ | $7 \times 9 = 63$
8 | 10 | $8^2 = 64$ | $10 \times 64 = 640$
13 | 15 | $13^2 = 169$ | $15 \times 169 = 2535$
18 | 10 | $18^2 = 324$ | $10 \times 324 = 3240$
23 | 6 | $23^2 = 529$ | $6 \times 529 = 3174$
Total | $N = \sum\limits f_i = 48$ | | $\sum\limits f_i x_i^2 = 9652$

The variance ($\sigma^2$) is:

$\sigma^2 = \frac{\sum\limits f_i x_i^2}{N} - (\overline{x})^2$

$\sigma^2 = \frac{9652}{48} - \left(\frac{614}{48}\right)^2$

$\sigma^2 = \frac{2413}{12} - \left(\frac{307}{24}\right)^2$

$\sigma^2 = \frac{2413}{12} - \frac{94249}{576}$

To combine the fractions, we find a common denominator, which is 576 ($12 \times 48 = 576$).

$\sigma^2 = \frac{2413 \times 48}{12 \times 48} - \frac{94249}{576}$

$\sigma^2 = \frac{115824}{576} - \frac{94249}{576}$

$\sigma^2 = \frac{115824 - 94249}{576} = \frac{21575}{576}$

The variance is $\frac{21575}{576}$.


The standard deviation ($\sigma$) is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{21575}{576}} = \frac{\sqrt{21575}}{\sqrt{576}} = \frac{\sqrt{21575}}{24}$

Calculating the square root of 21575:

$\sqrt{21575} \approx 146.8843$

$\sigma \approx \frac{146.8843}{24} \approx 6.1202$

The standard deviation is approximately 6.12.
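
The shortcut formula $\sigma^2 = \frac{1}{N} \sum\limits f_i x_i^2 - (\overline{x})^2$ used above is algebraically the same as the definition based on squared deviations. A small Python sketch (illustrative only, with hypothetical variable names) confirms that both routes give the same exact fraction for this data.

```python
from fractions import Fraction
import math

values = [3, 8, 13, 18, 23]
freqs = [7, 10, 15, 10, 6]
n = sum(freqs)

mean = Fraction(sum(f * x for x, f in zip(values, freqs)), n)

# Shortcut: mean of squares minus square of the mean.
shortcut = Fraction(sum(f * x * x for x, f in zip(values, freqs)), n) - mean ** 2
# Definition: weighted mean of squared deviations from the mean.
direct = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n

print(mean, shortcut, direct == shortcut)  # 307/24 21575/576 True
print(round(math.sqrt(shortcut), 2))       # 6.12
```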

Example 12: Calculate mean, variance and standard deviation for the following distribution.

Classes 30-40 40-50 50-60 60-70 70-80 80-90 90-100
Frequency 3 7 12 15 8 3 2

Answer:

To calculate the mean, variance, and standard deviation for the given grouped data, we will use the step-deviation method, which simplifies the calculations.


Step 1: Construct the Calculation Table

We create a table to organize the data and intermediate calculations. Let's choose an assumed mean (A) from the mid-points. A good choice is the mid-point of the class with the highest frequency. Here, the class 60-70 has the highest frequency (15), so we'll set the assumed mean $A = 65$. The class size ($h$) is 10.

Classes | Frequency ($f_i$) | Mid-point ($x_i$) | $y_i = \frac{x_i - A}{h} = \frac{x_i - 65}{10}$ | $f_i y_i$ | $y_i^2$ | $f_i y_i^2$
30-40 | 3 | 35 | -3 | -9 | 9 | 27
40-50 | 7 | 45 | -2 | -14 | 4 | 28
50-60 | 12 | 55 | -1 | -12 | 1 | 12
60-70 | 15 | 65 | 0 | 0 | 0 | 0
70-80 | 8 | 75 | 1 | 8 | 1 | 8
80-90 | 3 | 85 | 2 | 6 | 4 | 12
90-100 | 2 | 95 | 3 | 6 | 9 | 18
Total | $N = \sum\limits f_i = 50$ | | | $\sum\limits f_i y_i = -15$ | | $\sum\limits f_i y_i^2 = 105$

From the table, we have:

$N = 50$, $\sum\limits f_i y_i = -15$, $\sum\limits f_i y_i^2 = 105$, $A = 65$, $h = 10$.


Step 2: Calculate the Mean ($\bar{x}$)

The formula for the mean using the step-deviation method is:

$\bar{x} = A + \left( \frac{\sum\limits f_i y_i}{N} \right) \times h$

Substituting the values from our table:

$\bar{x} = 65 + \left( \frac{-15}{50} \right) \times 10$

$\bar{x} = 65 - \frac{150}{50}$

$\bar{x} = 65 - 3 = 62$

The mean of the distribution is 62.


Step 3: Calculate the Variance ($\sigma^2$)

The formula for the variance using the step-deviation method is:

$\sigma^2 = h^2 \left[ \frac{\sum\limits f_i y_i^2}{N} - \left( \frac{\sum\limits f_i y_i}{N} \right)^2 \right]$

Substituting the values from our table:

$\sigma^2 = 10^2 \left[ \frac{105}{50} - \left( \frac{-15}{50} \right)^2 \right]$

$\sigma^2 = 100 \left[ 2.1 - \left( -0.3 \right)^2 \right]$

$\sigma^2 = 100 [2.1 - 0.09]$

$\sigma^2 = 100 [2.01] = 201$

The variance of the distribution is 201.


Step 4: Calculate the Standard Deviation ($\sigma$)

The standard deviation is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{201}$

$\sigma \approx 14.177$

The standard deviation of the distribution is approximately 14.18.
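
The step-deviation formulas from Steps 2 and 3 can be sketched as a small Python function. The name `step_deviation_stats` is hypothetical and the code is only an illustration; it also shows that a different assumed mean (55 instead of 65, an arbitrary alternative) leads to the same mean and variance, which is why the choice of $A$ is purely a matter of convenience.

```python
from fractions import Fraction

def step_deviation_stats(midpoints, freqs, assumed_mean, h):
    """Return (mean, variance) via y_i = (x_i - A) / h."""
    n = sum(freqs)
    ys = [Fraction(x - assumed_mean, h) for x in midpoints]
    sum_fy = sum(f * y for y, f in zip(ys, freqs))
    sum_fy2 = sum(f * y * y for y, f in zip(ys, freqs))
    mean = assumed_mean + sum_fy / n * h
    variance = h ** 2 * (sum_fy2 / n - (sum_fy / n) ** 2)
    return mean, variance

midpoints = [35, 45, 55, 65, 75, 85, 95]
freqs = [3, 7, 12, 15, 8, 3, 2]

result_a65 = step_deviation_stats(midpoints, freqs, assumed_mean=65, h=10)
result_a55 = step_deviation_stats(midpoints, freqs, assumed_mean=55, h=10)
print(result_a65 == (62, 201), result_a55 == result_a65)  # True True
```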



Exercise 15.2

Find the mean and variance for each of the data in Exercises 1 to 5.

Question 1.

6, 7, 10, 12, 13, 4, 8, 12

Answer:

The given data is: 6, 7, 10, 12, 13, 4, 8, 12.

The number of observations is $n = 8$.


First, we find the mean of the data.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}}$

Sum of observations = $6 + 7 + 10 + 12 + 13 + 4 + 8 + 12 = 72$

$\overline{x} = \frac{72}{8} = 9$

The mean of the data is 9.


Next, we calculate the deviations from the mean ($x_i - \overline{x}$) and the squared deviations ($(x_i - \overline{x})^2$).

$x_i$ | $x_i - \overline{x} = x_i - 9$ | $(x_i - \overline{x})^2$
6 | $6 - 9 = -3$ | $(-3)^2 = 9$
7 | $7 - 9 = -2$ | $(-2)^2 = 4$
10 | $10 - 9 = 1$ | $1^2 = 1$
12 | $12 - 9 = 3$ | $3^2 = 9$
13 | $13 - 9 = 4$ | $4^2 = 16$
4 | $4 - 9 = -5$ | $(-5)^2 = 25$
8 | $8 - 9 = -1$ | $(-1)^2 = 1$
12 | $12 - 9 = 3$ | $3^2 = 9$
Total | $\sum\limits (x_i - \overline{x}) = 0$ | $\sum\limits (x_i - \overline{x})^2 = 74$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum\limits_{i=1}^{n} (x_i - \overline{x})^2}{n}$

$\sigma^2 = \frac{74}{8} = \frac{37}{4} = 9.25$

The mean of the data is 9 and the variance is 9.25.

Question 2. First n natural numbers

Answer:

The data consists of the first $n$ natural numbers: $1, 2, 3, \dots, n$.

The number of observations is $n$.


First, we find the mean of the data.

The sum of the first $n$ natural numbers is given by the formula $\sum\limits_{i=1}^{n} i = \frac{n(n+1)}{2}$.

Mean ($\overline{x}$) = $\frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{\sum\limits_{i=1}^{n} i}{n}$

$\overline{x} = \frac{\frac{n(n+1)}{2}}{n} = \frac{n(n+1)}{2n}$

$\overline{x} = \frac{n+1}{2}$

The mean of the first $n$ natural numbers is $\frac{n+1}{2}$.


Next, we find the variance ($\sigma^2$).

The variance can be calculated using the formula $\sigma^2 = \frac{\sum\limits_{i=1}^{n} x_i^2}{n} - (\overline{x})^2$.

The sum of the squares of the first $n$ natural numbers is given by the formula $\sum\limits_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$.

Substituting the values for $\sum\limits x_i^2$ and $\overline{x}$ into the variance formula:

$\sigma^2 = \frac{\frac{n(n+1)(2n+1)}{6}}{n} - \left(\frac{n+1}{2}\right)^2$

$\sigma^2 = \frac{n(n+1)(2n+1)}{6n} - \frac{(n+1)^2}{4}$

$\sigma^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4}$

To subtract the fractions, we find a common denominator, which is 12.

$\sigma^2 = \frac{2(n+1)(2n+1)}{12} - \frac{3(n+1)^2}{12}$

$\sigma^2 = \frac{(n+1)[2(2n+1) - 3(n+1)]}{12}$

$\sigma^2 = \frac{(n+1)[4n + 2 - 3n - 3]}{12}$

$\sigma^2 = \frac{(n+1)(n - 1)}{12}$

$\sigma^2 = \frac{n^2 - 1}{12}$

The variance of the first $n$ natural numbers is $\frac{n^2 - 1}{12}$.

The mean is $\frac{n+1}{2}$ and the variance is $\frac{n^2 - 1}{12}$.
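
The closed form $\frac{n^2 - 1}{12}$ can be spot-checked against the definition of variance for a few values of $n$. The sketch below is a numerical sanity check, not a proof; the function name `variance_first_n` and the sample values of $n$ are arbitrary choices.

```python
from fractions import Fraction

def variance_first_n(n):
    """Population variance of 1, 2, ..., n computed from the definition."""
    data = range(1, n + 1)
    mean = Fraction(sum(data), n)
    return sum((x - mean) ** 2 for x in data) / n

# Compare the definition with the closed form (n^2 - 1) / 12.
for n in (1, 2, 5, 10, 100):
    assert variance_first_n(n) == Fraction(n * n - 1, 12)

print(variance_first_n(10))  # 33/4
```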

Question 3. First 10 multiples of 3

Answer:

The data consists of the first 10 multiples of 3: 3, 6, 9, 12, 15, 18, 21, 24, 27, 30.

The number of observations is $n = 10$.


First, we find the mean of the data.

Sum of observations ($\sum\limits x_i$) = $3 + 6 + 9 + 12 + 15 + 18 + 21 + 24 + 27 + 30$

$\sum\limits x_i = 165$

Mean ($\overline{x}$) = $\frac{\sum\limits x_i}{n}$

$\overline{x} = \frac{165}{10} = 16.5$

The mean of the data is 16.5.


Next, we calculate the variance ($\sigma^2$). We will use the formula $\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$.

We calculate the square of each observation ($x_i^2$) and their sum ($\sum\limits x_i^2$).

$x_i$ | $x_i^2$
3 | $3^2 = 9$
6 | $6^2 = 36$
9 | $9^2 = 81$
12 | $12^2 = 144$
15 | $15^2 = 225$
18 | $18^2 = 324$
21 | $21^2 = 441$
24 | $24^2 = 576$
27 | $27^2 = 729$
30 | $30^2 = 900$
Total | $\sum\limits x_i^2 = 3465$

The mean squared is $(\overline{x})^2 = (16.5)^2 = 272.25$.

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$

$\sigma^2 = \frac{3465}{10} - 272.25$

$\sigma^2 = 346.5 - 272.25$

$\sigma^2 = 74.25$

The mean of the data is 16.5 and the variance is 74.25.

Question 4.

$x_i$ 6 10 14 18 24 28 30
$f_i$ 2 4 7 12 8 4 3

Answer:

The given data is a discrete frequency distribution:

$x_i$ | $f_i$
6 | 2
10 | 4
14 | 7
18 | 12
24 | 8
28 | 4
30 | 3

First, we find the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

$x_i$ | $f_i$ | $f_i x_i$
6 | 2 | $2 \times 6 = 12$
10 | 4 | $4 \times 10 = 40$
14 | 7 | $7 \times 14 = 98$
18 | 12 | $12 \times 18 = 216$
24 | 8 | $8 \times 24 = 192$
28 | 4 | $4 \times 28 = 112$
30 | 3 | $3 \times 30 = 90$
Total | $N = \sum\limits f_i = 40$ | $\sum\limits f_i x_i = 760$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum\limits f_i x_i}{N}$

$\overline{x} = \frac{760}{40} = 19$

The mean of the data is 19.


Next, we calculate the variance ($\sigma^2$).

We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.

$x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 19$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$
6 | 2 | $6 - 19 = -13$ | $(-13)^2 = 169$ | $2 \times 169 = 338$
10 | 4 | $10 - 19 = -9$ | $(-9)^2 = 81$ | $4 \times 81 = 324$
14 | 7 | $14 - 19 = -5$ | $(-5)^2 = 25$ | $7 \times 25 = 175$
18 | 12 | $18 - 19 = -1$ | $(-1)^2 = 1$ | $12 \times 1 = 12$
24 | 8 | $24 - 19 = 5$ | $5^2 = 25$ | $8 \times 25 = 200$
28 | 4 | $28 - 19 = 9$ | $9^2 = 81$ | $4 \times 81 = 324$
30 | 3 | $30 - 19 = 11$ | $11^2 = 121$ | $3 \times 121 = 363$
Total | $N = 40$ | $\sum\limits f_i (x_i - \overline{x}) = 0$ | | $\sum\limits f_i (x_i - \overline{x})^2 = 1736$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{1736}{40}$

$\sigma^2 = \frac{173.6}{4} = 43.4$

The mean of the data is 19 and the variance is 43.4.

Question 5.

$x_i$ 92 93 97 98 102 104 109
$f_i$ 3 2 3 2 6 3 3

Answer:

The given data is a discrete frequency distribution:

$x_i$ | $f_i$
92 | 3
93 | 2
97 | 3
98 | 2
102 | 6
104 | 3
109 | 3

First, we find the mean ($\overline{x}$).

We calculate $f_i x_i$ for each value and the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

$x_i$ | $f_i$ | $f_i x_i$
92 | 3 | $3 \times 92 = 276$
93 | 2 | $2 \times 93 = 186$
97 | 3 | $3 \times 97 = 291$
98 | 2 | $2 \times 98 = 196$
102 | 6 | $6 \times 102 = 612$
104 | 3 | $3 \times 104 = 312$
109 | 3 | $3 \times 109 = 327$
Total | $N = \sum\limits f_i = 22$ | $\sum\limits f_i x_i = 2200$

The mean ($\overline{x}$) is calculated as:

$\overline{x} = \frac{\sum\limits f_i x_i}{N}$

$\overline{x} = \frac{2200}{22} = 100$

The mean of the data is 100.


Next, we calculate the variance ($\sigma^2$).

We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.

$x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 100$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$
92 | 3 | $92 - 100 = -8$ | $(-8)^2 = 64$ | $3 \times 64 = 192$
93 | 2 | $93 - 100 = -7$ | $(-7)^2 = 49$ | $2 \times 49 = 98$
97 | 3 | $97 - 100 = -3$ | $(-3)^2 = 9$ | $3 \times 9 = 27$
98 | 2 | $98 - 100 = -2$ | $(-2)^2 = 4$ | $2 \times 4 = 8$
102 | 6 | $102 - 100 = 2$ | $2^2 = 4$ | $6 \times 4 = 24$
104 | 3 | $104 - 100 = 4$ | $4^2 = 16$ | $3 \times 16 = 48$
109 | 3 | $109 - 100 = 9$ | $9^2 = 81$ | $3 \times 81 = 243$
Total | $N = 22$ | $\sum\limits f_i (x_i - \overline{x}) = 0$ | | $\sum\limits f_i (x_i - \overline{x})^2 = 640$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{640}{22} = \frac{320}{11}$

$\sigma^2 \approx 29.09$

The mean of the data is 100 and the variance is $\frac{320}{11}$ or approximately 29.09.

Question 6. Find the mean and standard deviation using short-cut method.

$x_i$ 60 61 62 63 64 65 66 67 68
$f_i$ 2 1 12 29 25 12 10 4 5

Answer:

The given data is a discrete frequency distribution:

$x_i$ | $f_i$
60 | 2
61 | 1
62 | 12
63 | 29
64 | 25
65 | 12
66 | 10
67 | 4
68 | 5

We will use the short-cut method to find the mean and standard deviation.

Let the assumed mean be $A = 64$. We calculate the deviations $d_i = x_i - A = x_i - 64$, and the products $f_i d_i$ and $f_i d_i^2$. We also find the total frequency $N = \sum\limits f_i$.

$x_i$ | $f_i$ | $d_i = x_i - 64$ | $f_i d_i$ | $d_i^2$ | $f_i d_i^2$
60 | 2 | -4 | $2 \times (-4) = -8$ | 16 | $2 \times 16 = 32$
61 | 1 | -3 | $1 \times (-3) = -3$ | 9 | $1 \times 9 = 9$
62 | 12 | -2 | $12 \times (-2) = -24$ | 4 | $12 \times 4 = 48$
63 | 29 | -1 | $29 \times (-1) = -29$ | 1 | $29 \times 1 = 29$
64 | 25 | 0 | $25 \times 0 = 0$ | 0 | $25 \times 0 = 0$
65 | 12 | 1 | $12 \times 1 = 12$ | 1 | $12 \times 1 = 12$
66 | 10 | 2 | $10 \times 2 = 20$ | 4 | $10 \times 4 = 40$
67 | 4 | 3 | $4 \times 3 = 12$ | 9 | $4 \times 9 = 36$
68 | 5 | 4 | $5 \times 4 = 20$ | 16 | $5 \times 16 = 80$
Total | $N = \sum\limits f_i = 100$ | | $\sum\limits f_i d_i = 0$ | | $\sum\limits f_i d_i^2 = 286$

Mean ($\overline{x}$)

The mean is given by the formula: $\overline{x} = A + \frac{\sum\limits f_i d_i}{N}$

$\overline{x} = 64 + \frac{0}{100}$

$\overline{x} = 64 + 0 = 64$

The mean of the data is 64.


Variance ($\sigma^2$)

The variance is given by the formula: $\sigma^2 = \frac{\sum\limits f_i d_i^2}{N} - \left(\frac{\sum\limits f_i d_i}{N}\right)^2$

$\sigma^2 = \frac{286}{100} - \left(\frac{0}{100}\right)^2$

$\sigma^2 = 2.86 - 0^2 = 2.86$

The variance of the data is 2.86.


Standard Deviation ($\sigma$)

The standard deviation is the square root of the variance.

$\sigma = \sqrt{\sigma^2} = \sqrt{2.86}$

Using a calculator, $\sqrt{2.86} \approx 1.691$

The standard deviation is approximately 1.691.
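
The short-cut (assumed mean) formulas used above can be sketched in Python. The function name `assumed_mean_stats` is hypothetical, and the second call with $A = 63$ is an arbitrary alternative added only to illustrate that the result does not depend on the assumed mean.

```python
from fractions import Fraction
import math

def assumed_mean_stats(values, freqs, a):
    """Return (mean, variance) using deviations d_i = x_i - a."""
    n = sum(freqs)
    sum_fd = sum(f * (x - a) for x, f in zip(values, freqs))
    sum_fd2 = sum(f * (x - a) ** 2 for x, f in zip(values, freqs))
    mean = a + Fraction(sum_fd, n)
    variance = Fraction(sum_fd2, n) - Fraction(sum_fd, n) ** 2
    return mean, variance

values = [60, 61, 62, 63, 64, 65, 66, 67, 68]
freqs = [2, 1, 12, 29, 25, 12, 10, 4, 5]

mean, variance = assumed_mean_stats(values, freqs, a=64)
print(mean, float(variance), round(math.sqrt(variance), 3))  # 64 2.86 1.691

# A different assumed mean gives the same answer.
print(assumed_mean_stats(values, freqs, a=63) == (mean, variance))  # True
```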

Find the mean and variance for the following frequency distributions in Exercises 7 and 8.

Question 7.

Classes 0-30 30-60 60-90 90-120 120-150 150-180 180-210
Frequencies 2 3 5 10 3 5 2

Answer:

The given data is a grouped frequency distribution:

Class Interval | Frequency ($f_i$)
0-30 | 2
30-60 | 3
60-90 | 5
90-120 | 10
120-150 | 3
150-180 | 5
180-210 | 2

First, we calculate the mean ($\overline{x}$).

We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$
0-30 | 2 | $\frac{0+30}{2} = 15$ | $2 \times 15 = 30$
30-60 | 3 | $\frac{30+60}{2} = 45$ | $3 \times 45 = 135$
60-90 | 5 | $\frac{60+90}{2} = 75$ | $5 \times 75 = 375$
90-120 | 10 | $\frac{90+120}{2} = 105$ | $10 \times 105 = 1050$
120-150 | 3 | $\frac{120+150}{2} = 135$ | $3 \times 135 = 405$
150-180 | 5 | $\frac{150+180}{2} = 165$ | $5 \times 165 = 825$
180-210 | 2 | $\frac{180+210}{2} = 195$ | $2 \times 195 = 390$
Total | $N = \sum\limits f_i = 30$ | | $\sum\limits f_i x_i = 3210$

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum\limits f_i x_i}{N}$

$\overline{x} = \frac{3210}{30} = \frac{321}{3} = 107$

The mean of the distribution is 107.


Next, we calculate the variance ($\sigma^2$).

We can use the formula $\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$ or $\sigma^2 = \frac{\sum\limits f_i x_i^2}{N} - (\overline{x})^2$. Let's use the first formula by calculating the deviations from the mean $(x_i - \overline{x})$ and their squares.

Class Interval | $x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 107$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$
0-30 | 15 | 2 | $15 - 107 = -92$ | $(-92)^2 = 8464$ | $2 \times 8464 = 16928$
30-60 | 45 | 3 | $45 - 107 = -62$ | $(-62)^2 = 3844$ | $3 \times 3844 = 11532$
60-90 | 75 | 5 | $75 - 107 = -32$ | $(-32)^2 = 1024$ | $5 \times 1024 = 5120$
90-120 | 105 | 10 | $105 - 107 = -2$ | $(-2)^2 = 4$ | $10 \times 4 = 40$
120-150 | 135 | 3 | $135 - 107 = 28$ | $28^2 = 784$ | $3 \times 784 = 2352$
150-180 | 165 | 5 | $165 - 107 = 58$ | $58^2 = 3364$ | $5 \times 3364 = 16820$
180-210 | 195 | 2 | $195 - 107 = 88$ | $88^2 = 7744$ | $2 \times 7744 = 15488$
Total | | $N = 30$ | $\sum\limits f_i (x_i - \overline{x}) = 0$ | | $\sum\limits f_i (x_i - \overline{x})^2 = 68280$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{68280}{30}$

$\sigma^2 = \frac{6828}{3} = 2276$

The mean of the distribution is 107 and the variance is 2276.
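
The solution mentions two equivalent variance formulas but works through only the first. A brief Python sketch (illustrative, with hypothetical variable names) evaluates both on the class mid-points and shows that they agree.

```python
from fractions import Fraction

midpoints = [15, 45, 75, 105, 135, 165, 195]
freqs = [2, 3, 5, 10, 3, 5, 2]
n = sum(freqs)

mean = Fraction(sum(f * x for x, f in zip(midpoints, freqs)), n)

# Formula 1: weighted mean of squared deviations from the mean.
var_definition = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n
# Formula 2: weighted mean of squares minus the square of the mean.
var_shortcut = Fraction(sum(f * x * x for x, f in zip(midpoints, freqs)), n) - mean ** 2

print(mean, var_definition, var_shortcut)  # 107 2276 2276
```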

Question 8.

Classes 0-10 10-20 20-30 30-40 40-50
Frequencies 5 8 15 16 6

Answer:

The given data is a grouped frequency distribution:

Class Interval | Frequency ($f_i$)
0-10 | 5
10-20 | 8
20-30 | 15
30-40 | 16
40-50 | 6

First, we calculate the mean ($\overline{x}$).

We find the midpoints ($x_i$) of each class interval and the product $f_i x_i$. We also find the total frequency $N = \sum\limits f_i$ and the sum $\sum\limits f_i x_i$.

Class Interval | Frequency ($f_i$) | Midpoint ($x_i$) | $f_i x_i$
0-10 | 5 | $\frac{0+10}{2} = 5$ | $5 \times 5 = 25$
10-20 | 8 | $\frac{10+20}{2} = 15$ | $8 \times 15 = 120$
20-30 | 15 | $\frac{20+30}{2} = 25$ | $15 \times 25 = 375$
30-40 | 16 | $\frac{30+40}{2} = 35$ | $16 \times 35 = 560$
40-50 | 6 | $\frac{40+50}{2} = 45$ | $6 \times 45 = 270$
Total | $N = \sum\limits f_i = 50$ | | $\sum\limits f_i x_i = 1350$

The mean ($\overline{x}$) is given by the formula:

$\overline{x} = \frac{\sum\limits f_i x_i}{N}$

$\overline{x} = \frac{1350}{50} = \frac{135}{5} = 27$

The mean of the distribution is 27.


Next, we calculate the variance ($\sigma^2$).

We calculate the deviations from the mean $(x_i - \overline{x})$, the squared deviations $(x_i - \overline{x})^2$, and the product $f_i (x_i - \overline{x})^2$.

Class Interval | $x_i$ | $f_i$ | $x_i - \overline{x} = x_i - 27$ | $(x_i - \overline{x})^2$ | $f_i (x_i - \overline{x})^2$
0-10 | 5 | 5 | $5 - 27 = -22$ | $(-22)^2 = 484$ | $5 \times 484 = 2420$
10-20 | 15 | 8 | $15 - 27 = -12$ | $(-12)^2 = 144$ | $8 \times 144 = 1152$
20-30 | 25 | 15 | $25 - 27 = -2$ | $(-2)^2 = 4$ | $15 \times 4 = 60$
30-40 | 35 | 16 | $35 - 27 = 8$ | $8^2 = 64$ | $16 \times 64 = 1024$
40-50 | 45 | 6 | $45 - 27 = 18$ | $18^2 = 324$ | $6 \times 324 = 1944$
Total | | $N = 50$ | $\sum\limits f_i (x_i - \overline{x}) = 0$ | | $\sum\limits f_i (x_i - \overline{x})^2 = 6600$

The variance ($\sigma^2$) is given by the formula:

$\sigma^2 = \frac{\sum\limits f_i (x_i - \overline{x})^2}{N}$

$\sigma^2 = \frac{6600}{50}$

$\sigma^2 = \frac{660}{5} = 132$

The mean of the distribution is 27 and the variance is 132.
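The two tables above can be cross-checked with a short Python sketch. It is a minimal illustration, assuming each class midpoint stands in for every observation in that class; the function name `grouped_mean_variance` is hypothetical and not part of the source.

```python
def grouped_mean_variance(classes, freqs):
    """Return (mean, variance) of a grouped frequency distribution.

    classes: list of (lower, upper) class limits
    freqs:   list of class frequencies, in the same order
    """
    # The midpoint of each class represents all observations in that class.
    mids = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    # Population variance: frequency-weighted mean of squared deviations.
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n
    return mean, variance


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 15, 16, 6]
mean, variance = grouped_mean_variance(classes, freqs)
print(mean, variance)
```

The function divides by $N$ rather than $N - 1$, matching the variance formula used in this chapter.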

Question 9. Find the mean, variance and standard deviation using short-cut method

Height in cms 70-75 75-80 80-85 85-90 90-95 95-100 100-105 105-110 110-115
No. of children 3 4 7 7 15 9 6 6 3

Answer:

To find the mean, variance, and standard deviation using the short-cut method (also known as the step-deviation method), we will first construct a calculation table. The step-deviation method is ideal here as the class sizes are uniform.


Step 1: Construct the Calculation Table

We choose an assumed mean (A) to simplify calculations. A good choice is the mid-point of the class with the highest frequency. The class 90-95 has the highest frequency (15), so we'll set the assumed mean $A = 92.5$. The class size ($h$) is $75 - 70 = 5$.

Height in cms No. of children ($f_i$) Mid-point ($x_i$) $y_i = \frac{x_i - A}{h} = \frac{x_i - 92.5}{5}$ $f_i y_i$ $y_i^2$ $f_i y_i^2$
70-75 3 72.5 -4 -12 16 48
75-80 4 77.5 -3 -12 9 36
80-85 7 82.5 -2 -14 4 28
85-90 7 87.5 -1 -7 1 7
90-95 15 92.5 0 0 0 0
95-100 9 97.5 1 9 1 9
100-105 6 102.5 2 12 4 24
105-110 6 107.5 3 18 9 54
110-115 3 112.5 4 12 16 48
Total $N = \sum\limits f_i = 60$ $\sum\limits f_i y_i = 6$ $\sum\limits f_i y_i^2 = 254$

From the table, we have:

$N = 60$, $\sum\limits f_i y_i = 6$, $\sum\limits f_i y_i^2 = 254$, $A = 92.5$, $h = 5$.


Step 2: Calculate the Mean ($\bar{x}$)

The formula for the mean using the short-cut method is:

$\bar{x} = A + \left( \frac{\sum\limits f_i y_i}{N} \right) \times h$

Substituting the values from our table:

$\bar{x} = 92.5 + \left( \frac{6}{60} \right) \times 5$

$\bar{x} = 92.5 + (0.1) \times 5$

$\bar{x} = 92.5 + 0.5 = 93$

The mean height is 93 cm.


Step 3: Calculate the Variance ($\sigma^2$)

The formula for the variance using the short-cut method is:

$\sigma^2 = h^2 \left[ \frac{\sum\limits f_i y_i^2}{N} - \left( \frac{\sum\limits f_i y_i}{N} \right)^2 \right]$

Substituting the values from our table:

$\sigma^2 = 5^2 \left[ \frac{254}{60} - \left( \frac{6}{60} \right)^2 \right]$

$\sigma^2 = 25 \left[ \frac{254}{60} - (0.1)^2 \right]$

$\sigma^2 = 25 \left[ 4.2333... - 0.01 \right]$

$\sigma^2 = 25 [4.2233...]$

$\sigma^2 \approx 105.58$

The variance is approximately 105.58 cm².


Step 4: Calculate the Standard Deviation ($\sigma$)

The standard deviation is the square root of the variance.

$\sigma = \sqrt{\sigma^2} \approx \sqrt{105.58}$

$\sigma \approx 10.275$

The standard deviation is approximately 10.28 cm.
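The four steps above can be sketched in Python. This is an illustrative sketch, assuming equal class widths; the function name `step_deviation_stats` and its parameter names are hypothetical, not taken from the source.

```python
import math


def step_deviation_stats(mids, freqs, assumed_mean, h):
    """Mean, variance and standard deviation by the step-deviation method."""
    n = sum(freqs)
    # Coded values y_i = (x_i - A) / h keep the arithmetic small.
    ys = [(x - assumed_mean) / h for x in mids]
    sum_fy = sum(f * y for f, y in zip(freqs, ys))
    sum_fy2 = sum(f * y * y for f, y in zip(freqs, ys))
    mean = assumed_mean + (sum_fy / n) * h
    variance = h ** 2 * (sum_fy2 / n - (sum_fy / n) ** 2)
    return mean, variance, math.sqrt(variance)


# Midpoints 72.5, 77.5, ..., 112.5 for the classes 70-75, ..., 110-115.
mids = [72.5 + 5 * k for k in range(9)]
freqs = [3, 4, 7, 7, 15, 9, 6, 6, 3]
mean, variance, sd = step_deviation_stats(mids, freqs, assumed_mean=92.5, h=5)
print(round(mean, 2), round(variance, 2), round(sd, 2))
```

Any other midpoint could be passed as `assumed_mean`; the results do not depend on that choice, only the size of the intermediate numbers does.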

Question 10. The diameters of circles (in mm) drawn in a design are given below:

Diameters 33-36 37-40 41-44 45-48 49-52
No. of circles 15 17 21 22 25

Calculate the standard deviation and mean diameter of the circles.

[Hint: First make the data continuous by making the classes as 32.5-36.5, 36.5-40.5, 40.5-44.5, 44.5 - 48.5, 48.5 - 52.5 and then proceed.]

Answer:

To calculate the mean and standard deviation, we will use the short-cut (step-deviation) method. First, we need to convert the discontinuous class intervals into continuous ones, as suggested in the hint.


Step 1: Preparing the Data and Calculation Table

The class intervals are made continuous by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit of each class. We then set up a table for calculations. Let's choose an assumed mean (A) from the mid-points. The class 41-44 (or 40.5-44.5) is in the middle of the distribution, so we will set the assumed mean $A = 42.5$. The class size ($h$) is $36.5 - 32.5 = 4$.

Diameters (mm) No. of circles ($f_i$) Mid-point ($x_i$) $y_i = \frac{x_i - A}{h} = \frac{x_i - 42.5}{4}$ $f_i y_i$ $y_i^2$ $f_i y_i^2$
32.5-36.5 15 34.5 -2 -30 4 60
36.5-40.5 17 38.5 -1 -17 1 17
40.5-44.5 21 42.5 0 0 0 0
44.5-48.5 22 46.5 1 22 1 22
48.5-52.5 25 50.5 2 50 4 100
Total $N = \sum\limits f_i = 100$ $\sum\limits f_i y_i = 25$ $\sum\limits f_i y_i^2 = 199$

From the table, we have:

$N = 100$, $\sum\limits f_i y_i = 25$, $\sum\limits f_i y_i^2 = 199$, $A = 42.5$, $h = 4$.


Step 2: Calculate the Mean Diameter ($\bar{x}$)

The formula for the mean using the step-deviation method is:

$\bar{x} = A + \left( \frac{\sum\limits f_i y_i}{N} \right) \times h$

Substituting the values from our table:

$\bar{x} = 42.5 + \left( \frac{25}{100} \right) \times 4$

$\bar{x} = 42.5 + (0.25) \times 4$

$\bar{x} = 42.5 + 1 = 43.5$

The mean diameter of the circles is 43.5 mm.


Step 3: Calculate the Variance ($\sigma^2$) and Standard Deviation ($\sigma$)

The formula for the variance using the step-deviation method is:

$\sigma^2 = h^2 \left[ \frac{\sum\limits f_i y_i^2}{N} - \left( \frac{\sum\limits f_i y_i}{N} \right)^2 \right]$

Substituting the values from our table:

$\sigma^2 = 4^2 \left[ \frac{199}{100} - \left( \frac{25}{100} \right)^2 \right]$

$\sigma^2 = 16 \left[ 1.99 - (0.25)^2 \right]$

$\sigma^2 = 16 [1.99 - 0.0625]$

$\sigma^2 = 16 [1.9275] = 30.84$

The standard deviation is the square root of the variance.

$\sigma = \sqrt{30.84}$

$\sigma \approx 5.55$

The standard deviation of the diameters is approximately 5.55 mm.
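The continuity adjustment from the hint can also be sketched in Python. This is a minimal illustration, assuming every pair of adjacent classes is separated by the same gap; the helper name `make_continuous` is hypothetical. The variance is computed with the equivalent formula $\sigma^2 = \frac{\sum f_i x_i^2}{N} - \overline{x}^2$ instead of step deviations, as an independent check.

```python
def make_continuous(classes):
    """Close the gaps between inclusive classes such as 33-36, 37-40."""
    # Half of the gap between one upper limit and the next lower limit.
    gap = (classes[1][0] - classes[0][1]) / 2
    return [(lo - gap, hi + gap) for lo, hi in classes]


classes = [(33, 36), (37, 40), (41, 44), (45, 48), (49, 52)]
freqs = [15, 17, 21, 22, 25]

continuous = make_continuous(classes)
mids = [(lo + hi) / 2 for lo, hi in continuous]
n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, mids)) / n
# Mean of squares minus square of the mean.
variance = sum(f * x * x for f, x in zip(freqs, mids)) / n - mean ** 2
sd = variance ** 0.5
print(continuous[0], mean, round(variance, 2), round(sd, 2))
```

The midpoints are the same before and after the adjustment; making the classes continuous matters for the class size $h$ used in the step-deviation table.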



Example 13 to 15 (Before Exercise 15.3)

Example 13: Two plants A and B of a factory show following results about the number of workers and the wages paid to them

A B
No. of workers 5000 6000
Average monthly wages ₹ 2500 ₹ 2500
Variance of distribution of wages 81 100

In which plant, A or B is there greater variability in individual wages?

Answer:

Given information for Plant A and Plant B:

Plant A Plant B
Number of workers ($N$) 5000 6000
Average monthly wages ($\overline{x}$) $\textsf{₹}$ 2500 $\textsf{₹}$ 2500
Variance of distribution of wages ($\sigma^2$) 81 100

To compare the variability in individual wages, we need to calculate the Coefficient of Variation (C.V.) for each plant.

The formula for Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$, where $\sigma$ is the standard deviation and $\overline{x}$ is the mean (average wages).

First, calculate the standard deviation ($\sigma$) from the variance ($\sigma^2 = \text{Variance}$).

For Plant A:

Standard Deviation ($\sigma_A$) = $\sqrt{\text{Variance}_A} = \sqrt{81} = 9$

Coefficient of Variation ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100$

$C.V._A = \frac{9}{2500} \times 100 = \frac{900}{2500} = \frac{9}{25} = 0.36$

Coefficient of Variation for Plant A is 0.36%.


For Plant B:

Standard Deviation ($\sigma_B$) = $\sqrt{\text{Variance}_B} = \sqrt{100} = 10$

Coefficient of Variation ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100$

$C.V._B = \frac{10}{2500} \times 100 = \frac{1000}{2500} = \frac{10}{25} = 0.4$

Coefficient of Variation for Plant B is 0.4%.


Comparing the Coefficients of Variation:

$C.V._A = 0.36$

$C.V._B = 0.4$

Since $C.V._B > C.V._A$ ($0.4 > 0.36$), there is greater variability in the wages in Plant B compared to Plant A.

Conclusion: There is greater variability in individual wages in Plant B.
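The comparison can be reproduced with a few lines of Python. This is a small sketch; the function name `coefficient_of_variation` is an illustrative choice and not from the source.

```python
import math


def coefficient_of_variation(mean, variance):
    """C.V. in percent, computed from the mean and the variance."""
    return math.sqrt(variance) / mean * 100


cv_a = coefficient_of_variation(2500, 81)
cv_b = coefficient_of_variation(2500, 100)
# The plant with the larger C.V. has the greater relative variability.
more_variable = "B" if cv_b > cv_a else "A"
print(cv_a, cv_b, more_variable)
```

Because both plants have the same mean wage, comparing the standard deviations (9 and 10) leads to the same conclusion.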

Example 14: Coefficients of variation of two distributions are 60 and 70, and their standard deviations are 21 and 16, respectively. What are their arithmetic means?

Answer:

The formula for the Coefficient of Variation (C.V.) is:

$C.V. = \frac{\sigma}{\overline{x}} \times 100$

where $\sigma$ is the standard deviation and $\overline{x}$ is the arithmetic mean.

We can rearrange this formula to find the arithmetic mean:

$\overline{x} = \frac{\sigma}{C.V.} \times 100$


For the first distribution:

Given: $C.V._1 = 60$, $\sigma_1 = 21$

Arithmetic Mean ($\overline{x}_1$) = $\frac{\sigma_1}{C.V._1} \times 100$

$\overline{x}_1 = \frac{21}{60} \times 100$

$\overline{x}_1 = \frac{\cancel{21}^{7}}{\cancel{60}_{20}} \times \cancel{100}^{5}$

$\overline{x}_1 = 7 \times 5 = 35$

The arithmetic mean of the first distribution is 35.


For the second distribution:

Given: $C.V._2 = 70$, $\sigma_2 = 16$

Arithmetic Mean ($\overline{x}_2$) = $\frac{\sigma_2}{C.V._2} \times 100$

$\overline{x}_2 = \frac{16}{70} \times 100$

$\overline{x}_2 = \frac{160}{7}$

$\overline{x}_2 \approx 22.857$

The arithmetic mean of the second distribution is $\frac{160}{7}$ or approximately 22.86.
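The rearranged formula can be checked with a short Python sketch. The function name `mean_from_cv` is hypothetical; the zero check is an added safeguard, not something the source discusses.

```python
def mean_from_cv(sd, cv_percent):
    """Invert C.V. = (sd / mean) * 100 to recover the mean."""
    if cv_percent == 0:
        raise ValueError("a C.V. of zero does not determine the mean")
    return sd * 100 / cv_percent


mean_1 = mean_from_cv(21, 60)
mean_2 = mean_from_cv(16, 70)
print(mean_1, round(mean_2, 2))
```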

Example 15: The following values are calculated in respect of heights and weights of the students of a section of Class XI :

Height Weight
Mean 162.6 cm 52.36 kg
Variance 127.69 cm$^2$ 23.1361 kg$^2$

Can we say that the weights show greater variation than the heights?

Answer:

The given information about the heights and weights of the students is:

For Height:

Mean ($\overline{x}_{\text{Height}}$) = 162.6 cm

Variance ($\sigma^2_{\text{Height}}$) = 127.69 cm$^2$

For Weight:

Mean ($\overline{x}_{\text{Weight}}$) = 52.36 kg

Variance ($\sigma^2_{\text{Weight}}$) = 23.1361 kg$^2$


To compare the variability of two distributions when they are measured in different units (cm and kg) or have significantly different means, we use the Coefficient of Variation (C.V.). A higher Coefficient of Variation indicates greater relative variability.

The formula for the Coefficient of Variation is given by:

$C.V. = \frac{\sigma}{\overline{x}} \times 100$

where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.


First, we need to calculate the standard deviation ($\sigma$) for both height and weight from their respective variances ($\sigma = \sqrt{\sigma^2}$).

For Height:

Standard Deviation ($\sigma_{\text{Height}}$) = $\sqrt{\text{Variance}_{\text{Height}}} = \sqrt{127.69}$

$\sigma_{\text{Height}} = 11.3$ cm

For Weight:

Standard Deviation ($\sigma_{\text{Weight}}$) = $\sqrt{\text{Variance}_{\text{Weight}}} = \sqrt{23.1361}$

$\sigma_{\text{Weight}} = 4.81$ kg


Now, we calculate the Coefficient of Variation for both height and weight.

For Height:

$C.V._{\text{Height}} = \frac{\sigma_{\text{Height}}}{\overline{x}_{\text{Height}}} \times 100$

$C.V._{\text{Height}} = \frac{11.3}{162.6} \times 100$

$C.V._{\text{Height}} \approx 0.069495 \times 100 \approx 6.95\%$

For Weight:

$C.V._{\text{Weight}} = \frac{\sigma_{\text{Weight}}}{\overline{x}_{\text{Weight}}} \times 100$

$C.V._{\text{Weight}} = \frac{4.81}{52.36} \times 100$

$C.V._{\text{Weight}} \approx 0.091864 \times 100 \approx 9.19\%$


Comparing the Coefficients of Variation:

$C.V._{\text{Height}} \approx 6.95\%$

$C.V._{\text{Weight}} \approx 9.19\%$

Since $C.V._{\text{Weight}} > C.V._{\text{Height}}$ ($9.19\% > 6.95\%$), the weights show greater relative variation than the heights.

Conclusion: Yes, the weights show greater variation than the heights because the Coefficient of Variation for weights is greater than that for heights.



Exercise 15.3

Question 1. From the data given below state which group is more variable, A or B?

Marks 10-20 20-30 30-40 40-50 50-60 60-70 70-80
Group A 9 17 32 33 40 10 9
Group B 10 20 30 25 43 15 7

Answer:

To compare the variability of the two groups, A and B, we will calculate the Coefficient of Variation (C.V.) for each group. The group with the higher Coefficient of Variation is considered more variable.

The formula for the Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$, where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.

First, we calculate the mean ($\overline{x}$) and standard deviation ($\sigma$) for each group using the grouped frequency distribution method. The classes are continuous. The class size is $h = 20 - 10 = 10$. We calculate the midpoints ($x_i$) for each class.

Midpoints ($x_i$): 15, 25, 35, 45, 55, 65, 75.

We will use the step-deviation method with assumed mean $A = 45$ and $h = 10$.

Let $u_i = \frac{x_i - A}{h} = \frac{x_i - 45}{10}$.

$u_i$ values: -3, -2, -1, 0, 1, 2, 3.

$u_i^2$ values: 9, 4, 1, 0, 1, 4, 9.


For Group A:

Frequencies ($f_{iA}$): 9, 17, 32, 33, 40, 10, 9

Total frequency $N_A = \sum\limits f_{iA} = 9 + 17 + 32 + 33 + 40 + 10 + 9 = 150$

Class $x_i$ $f_{iA}$ $u_i = \frac{x_i - 45}{10}$ $f_{iA} u_i$ $u_i^2$ $f_{iA} u_i^2$
10-20 15 9 -3 -27 9 81
20-30 25 17 -2 -34 4 68
30-40 35 32 -1 -32 1 32
40-50 45 33 0 0 0 0
50-60 55 40 1 40 1 40
60-70 65 10 2 20 4 40
70-80 75 9 3 27 9 81
Total $N_A = 150$ $\sum\limits f_{iA} u_i = -6$ $\sum\limits f_{iA} u_i^2 = 342$

Mean for Group A ($\overline{x}_A$) = $A + \frac{\sum\limits f_{iA} u_i}{N_A} \times h = 45 + \frac{-6}{150} \times 10 = 45 - \frac{60}{150} = 45 - 0.4 = 44.6$

Variance for Group A ($\sigma_A^2$) = $h^2 \left[ \frac{\sum\limits f_{iA} u_i^2}{N_A} - \left(\frac{\sum\limits f_{iA} u_i}{N_A}\right)^2 \right] = 10^2 \left[ \frac{342}{150} - \left(\frac{-6}{150}\right)^2 \right] = 100 \left[ 2.28 - \left(-\frac{1}{25}\right)^2 \right] = 100 [2.28 - 0.0016] = 100 [2.2784] = 227.84$

Standard Deviation for Group A ($\sigma_A$) = $\sqrt{227.84} \approx 15.09437$

Coefficient of Variation for Group A ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100 = \frac{15.09437}{44.6} \times 100 \approx 33.84\%$


For Group B:

Frequencies ($f_{iB}$): 10, 20, 30, 25, 43, 15, 7

Total frequency $N_B = \sum\limits f_{iB} = 10 + 20 + 30 + 25 + 43 + 15 + 7 = 150$

Class $x_i$ $f_{iB}$ $u_i = \frac{x_i - 45}{10}$ $f_{iB} u_i$ $u_i^2$ $f_{iB} u_i^2$
10-20 15 10 -3 -30 9 90
20-30 25 20 -2 -40 4 80
30-40 35 30 -1 -30 1 30
40-50 45 25 0 0 0 0
50-60 55 43 1 43 1 43
60-70 65 15 2 30 4 60
70-80 75 7 3 21 9 63
Total $N_B = 150$ $\sum\limits f_{iB} u_i = -6$ $\sum\limits f_{iB} u_i^2 = 366$

Mean for Group B ($\overline{x}_B$) = $A + \frac{\sum\limits f_{iB} u_i}{N_B} \times h = 45 + \frac{-6}{150} \times 10 = 45 - 0.4 = 44.6$

Variance for Group B ($\sigma_B^2$) = $h^2 \left[ \frac{\sum\limits f_{iB} u_i^2}{N_B} - \left(\frac{\sum\limits f_{iB} u_i}{N_B}\right)^2 \right] = 10^2 \left[ \frac{366}{150} - \left(\frac{-6}{150}\right)^2 \right] = 100 \left[ 2.44 - \left(-\frac{1}{25}\right)^2 \right] = 100 [2.44 - 0.0016] = 100 [2.4384] = 243.84$

Standard Deviation for Group B ($\sigma_B$) = $\sqrt{243.84} \approx 15.61537$

Coefficient of Variation for Group B ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100 = \frac{15.61537}{44.6} \times 100 \approx 35.01\%$


Comparing the Coefficients of Variation:

$C.V._A \approx 33.84\%$

$C.V._B \approx 35.01\%$

Since the Coefficient of Variation for Group B ($35.01\%$) is greater than the Coefficient of Variation for Group A ($33.84\%$), Group B is more variable.

Conclusion: Group B is more variable than Group A.
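As an independent check that does not rely on step deviations, the grouped data can be expanded into a list of midpoints and passed to Python's standard `statistics` module. This is a sketch under the usual assumption that every observation in a class sits at its midpoint; the helper names `expand` and `cv_percent` are hypothetical.

```python
import statistics


def expand(mids, freqs):
    """Repeat each midpoint as many times as its frequency."""
    return [x for x, f in zip(mids, freqs) for _ in range(f)]


def cv_percent(data):
    # pstdev is the population standard deviation (divides by N, as in the text).
    return statistics.pstdev(data) / statistics.mean(data) * 100


mids = [15, 25, 35, 45, 55, 65, 75]
group_a = expand(mids, [9, 17, 32, 33, 40, 10, 9])
group_b = expand(mids, [10, 20, 30, 25, 43, 15, 7])

cv_a, cv_b = cv_percent(group_a), cv_percent(group_b)
print(round(cv_a, 2), round(cv_b, 2))
```

Since both groups have the same mean (44.6), the group with the larger standard deviation is also the one with the larger C.V.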

Question 2. From the prices of shares X and Y below, find out which is more stable in value:

X 35 54 52 53 56 58 52 50 51 49
Y 108 107 105 105 106 107 104 103 104 101

Answer:

To determine which share is more stable in value, we need to compare their variability. Since the mean prices of the two shares are different, the best measure for comparison is the Coefficient of Variation (C.V.). The share with the lower C.V. is considered more stable.

The formula for the Coefficient of Variation is:

$C.V. = \frac{\sigma}{\bar{x}} \times 100$

Where $\sigma$ is the standard deviation and $\bar{x}$ is the mean.


Analysis for Share X

The prices for share X are: 35, 54, 52, 53, 56, 58, 52, 50, 51, 49.

Number of observations, $n = 10$.

1. Calculate the Mean ($\bar{x}_X$)

$\sum\limits x_i = 35+54+52+53+56+58+52+50+51+49 = 510$

$\bar{x}_X = \frac{\sum\limits x_i}{n} = \frac{510}{10} = 51$

2. Calculate the Standard Deviation ($\sigma_X$)

We'll first find the variance ($\sigma_X^2$). The formula is $\sigma_X^2 = \frac{\sum\limits (x_i - \bar{x}_X)^2}{n}$.

$x_i$ $x_i - \bar{x}_X = x_i - 51$ $(x_i - \bar{x}_X)^2$
35 -16 256
54 3 9
52 1 1
53 2 4
56 5 25
58 7 49
52 1 1
50 -1 1
51 0 0
49 -2 4
Total $\sum\limits (x_i - \bar{x}_X)^2 = 350$

Variance, $\sigma_X^2 = \frac{350}{10} = 35$.

Standard Deviation, $\sigma_X = \sqrt{35} \approx 5.916$.

3. Calculate the Coefficient of Variation ($C.V._X$)

$C.V._X = \frac{5.916}{51} \times 100 \approx 11.6$


Analysis for Share Y

The prices for share Y are: 108, 107, 105, 105, 106, 107, 104, 103, 104, 101.

Number of observations, $n = 10$.

1. Calculate the Mean ($\bar{x}_Y$)

$\sum\limits y_i = 108+107+105+105+106+107+104+103+104+101 = 1050$

$\bar{x}_Y = \frac{\sum\limits y_i}{n} = \frac{1050}{10} = 105$

2. Calculate the Standard Deviation ($\sigma_Y$)

The formula is $\sigma_Y^2 = \frac{\sum\limits (y_i - \bar{x}_Y)^2}{n}$.

$y_i$ $y_i - \bar{x}_Y = y_i - 105$ $(y_i - \bar{x}_Y)^2$
108 3 9
107 2 4
105 0 0
105 0 0
106 1 1
107 2 4
104 -1 1
103 -2 4
104 -1 1
101 -4 16
Total $\sum\limits (y_i - \bar{x}_Y)^2 = 40$

Variance, $\sigma_Y^2 = \frac{40}{10} = 4$.

Standard Deviation, $\sigma_Y = \sqrt{4} = 2$.

3. Calculate the Coefficient of Variation ($C.V._Y$)

$C.V._Y = \frac{2}{105} \times 100 \approx 1.90$


Conclusion

We compare the coefficients of variation for both shares:

  • C.V. for Share X $\approx 11.6$
  • C.V. for Share Y $\approx 1.90$

Since the Coefficient of Variation for Share Y is smaller than the Coefficient of Variation for Share X ($1.90 < 11.6$), the prices of Share Y are more stable in value.
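The deviation tables and the comparison can be reproduced with a short Python sketch. The helper names `deviation_table` and `cv_percent` are illustrative and not part of the source.

```python
import math


def deviation_table(data):
    """Return (mean, rows), where each row is (x, x - mean, (x - mean)^2)."""
    mean = sum(data) / len(data)
    rows = [(x, x - mean, (x - mean) ** 2) for x in data]
    return mean, rows


def cv_percent(data):
    mean, rows = deviation_table(data)
    # Population variance: mean of the squared deviations.
    variance = sum(sq for _, _, sq in rows) / len(data)
    return math.sqrt(variance) / mean * 100


share_x = [35, 54, 52, 53, 56, 58, 52, 50, 51, 49]
share_y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]

cv_x, cv_y = cv_percent(share_x), cv_percent(share_y)
# The share with the smaller C.V. is the more stable one.
more_stable = "Y" if cv_y < cv_x else "X"
print(round(cv_x, 2), round(cv_y, 2), more_stable)
```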

Question 3. An analysis of monthly wages paid to workers in two firms A and B, belonging to the same industry, gives the following results:

(i) Which firm A or B pays larger amount as monthly wages?

(ii) Which firm, A or B, shows greater variability in individual wages?

Firm A Firm B
No. of wage earners 586 648
Mean of monthly wages 5253 5253
Variance of the distribution of wages 100 121

Answer:

Given Data:

Firm A Firm B
No. of wage earners ($n$) $n_A = 586$ $n_B = 648$
Mean of monthly wages ($\bar{x}$) $\bar{x}_A = \textsf{₹ } 5253$ $\bar{x}_B = \textsf{₹ } 5253$
Variance of the distribution of wages ($\sigma^2$) $\sigma_A^2 = 100$ $\sigma_B^2 = 121$

(i) Which firm A or B pays a larger amount as monthly wages?

To find the total amount paid as monthly wages by each firm, we multiply the number of wage earners by the mean monthly wage.

Total monthly wages = Number of wage earners $\times$ Mean monthly wage

For Firm A:

Total Wages$_A = n_A \times \bar{x}_A$

Total Wages$_A = 586 \times 5253 = \textsf{₹ } 30,78,258$

For Firm B:

Total Wages$_B = n_B \times \bar{x}_B$

Total Wages$_B = 648 \times 5253 = \textsf{₹ } 34,03,944$

Comparing the total wages, we see that $\textsf{₹ } 34,03,944 > \textsf{₹ } 30,78,258$.

Therefore, Firm B pays a larger amount as monthly wages.


(ii) Which firm, A or B, shows greater variability in individual wages?

Variability is measured by variance or standard deviation. Since the mean monthly wages for both firms are the same ($\bar{x}_A = \bar{x}_B = \textsf{₹ } 5253$), we can directly compare their variances to determine which has greater variability. The firm with the higher variance has greater variability.

Alternatively, we can calculate the standard deviation ($\sigma$) for each firm.

Standard Deviation = $\sqrt{\text{Variance}}$

For Firm A:

$\sigma_A = \sqrt{100} = 10$

For Firm B:

$\sigma_B = \sqrt{121} = 11$

Comparing the standard deviations, we have $\sigma_B (11) > \sigma_A (10)$.

Since the standard deviation of wages for Firm B is greater than that for Firm A, Firm B shows greater variability in individual wages.

Question 4. The following is the record of goals scored by team A in a football session:

No. of goals scored 0 1 2 3 4
No. of matches 1 9 7 5 3

For the team B, mean number of goals scored per match was 2 with a standard deviation 1.25 goals. Find which team may be considered more consistent?

Answer:

Given:

Data for Team A goals scored:

No. of goals scored ($x_i$) No. of matches ($f_i$)
0 1
1 9
2 7
3 5
4 3

Data for Team B:

Mean number of goals scored ($\overline{x}_B$) = 2

Standard deviation ($\sigma_B$) = 1.25


To Find: Which team is more consistent.


Solution:

Consistency is the opposite of variability: the lower the Coefficient of Variation (C.V.), the lower the variability and the higher the consistency. We will calculate the C.V. for both teams and compare them.

The formula for Coefficient of Variation is $C.V. = \frac{\sigma}{\overline{x}} \times 100$.


Calculations for Team A:

We need to calculate the mean ($\overline{x}_A$) and standard deviation ($\sigma_A$) from the frequency distribution.

Total number of matches ($N_A$) = $\sum\limits f_i = 1 + 9 + 7 + 5 + 3 = 25$

Calculate $\sum\limits f_i x_i$:

$x_i$ $f_i$ $f_i x_i$
0 1 $1 \times 0 = 0$
1 9 $9 \times 1 = 9$
2 7 $7 \times 2 = 14$
3 5 $5 \times 3 = 15$
4 3 $3 \times 4 = 12$
Total $N_A = 25$ $\sum\limits f_i x_i = 50$

Mean ($\overline{x}_A$) = $\frac{\sum\limits f_i x_i}{N_A} = \frac{50}{25} = 2$

The mean number of goals scored per match for Team A is 2.

Calculate variance ($\sigma_A^2$). We use the formula $\sigma_A^2 = \frac{\sum\limits f_i x_i^2}{N_A} - (\overline{x}_A)^2$.

Calculate $\sum\limits f_i x_i^2$:

$x_i$ $f_i$ $x_i^2$ $f_i x_i^2$
0 1 0 $1 \times 0 = 0$
1 9 1 $9 \times 1 = 9$
2 7 4 $7 \times 4 = 28$
3 5 9 $5 \times 9 = 45$
4 3 16 $3 \times 16 = 48$
Total $N_A = 25$ $\sum\limits f_i x_i^2 = 130$

Variance ($\sigma_A^2$) = $\frac{130}{25} - (2)^2 = 5.2 - 4 = 1.2$

Standard Deviation ($\sigma_A$) = $\sqrt{1.2} \approx 1.0954$

Coefficient of Variation for Team A ($C.V._A$) = $\frac{\sigma_A}{\overline{x}_A} \times 100 = \frac{1.0954}{2} \times 100 \approx 54.77\%$


Calculations for Team B:

Mean ($\overline{x}_B$) = 2

Standard Deviation ($\sigma_B$) = 1.25

Coefficient of Variation for Team B ($C.V._B$) = $\frac{\sigma_B}{\overline{x}_B} \times 100 = \frac{1.25}{2} \times 100 = 0.625 \times 100 = 62.5\%$


Comparison of Coefficients of Variation:

$C.V._A \approx 54.77\%$

$C.V._B = 62.5\%$

Since $C.V._A < C.V._B$ ($54.77\% < 62.5\%$), Team A has lower relative variability in the number of goals scored per match compared to Team B. Therefore, Team A is more consistent.

Conclusion: Team A may be considered more consistent.
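The Team A calculation can be verified exactly with Python's `fractions` module, which avoids rounding until the square root is taken. This is a sketch only; the variable names are illustrative.

```python
from fractions import Fraction
import math

goals = [0, 1, 2, 3, 4]
matches = [1, 9, 7, 5, 3]

n = sum(matches)
mean_a = Fraction(sum(f * x for f, x in zip(matches, goals)), n)
mean_sq = Fraction(sum(f * x * x for f, x in zip(matches, goals)), n)
# Variance = mean of squares minus square of the mean (exact fraction).
var_a = mean_sq - mean_a ** 2

cv_a = math.sqrt(var_a) / mean_a * 100
cv_b = 1.25 / 2 * 100  # Team B: given standard deviation and mean
more_consistent = "A" if cv_a < cv_b else "B"
print(var_a, round(cv_a, 2), cv_b, more_consistent)
```

Both teams have the same mean of 2 goals per match, so the comparison reduces to comparing the standard deviations, roughly 1.095 for Team A against 1.25 for Team B.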

Question 5. The sum and sum of squares corresponding to length x (in cm) and weight y (in gm) of 50 plant products are given below:

$\sum\limits_{i=1}^{50} x_i = 212, \quad \sum\limits_{i=1}^{50} x_i^2 = 902.8, \quad \sum\limits_{i=1}^{50} y_i = 261, \quad \sum\limits_{i=1}^{50} y_i^2 = 1457.6$

Which is more varying, the length or weight?

Answer:

Given:

Number of plant products, $n = 50$.

Sum of lengths: $\sum\limits_{i=1}^{50} x_i = 212$ cm

Sum of squares of lengths: $\sum\limits_{i=1}^{50} x_i^2 = 902.8$ cm$^2$

Sum of weights: $\sum\limits_{i=1}^{50} y_i = 261$ gm

Sum of squares of weights: $\sum\limits_{i=1}^{50} y_i^2 = 1457.6$ gm$^2$


To Find: Which is more varying, the length or weight.


Solution:

To compare the variability of two distributions that are measured in different units (cm and gm), we calculate the Coefficient of Variation (C.V.) for each distribution. The distribution with the higher C.V. is considered more varying.

The Coefficient of Variation is given by the formula:

$C.V. = \frac{\sigma}{\overline{x}} \times 100$

where $\sigma$ is the standard deviation and $\overline{x}$ is the mean.

The standard deviation is the square root of the variance ($\sigma = \sqrt{\sigma^2}$). The variance is calculated as $\sigma^2 = \frac{\sum\limits z_i^2}{n} - (\overline{z})^2$, where $z$ represents the variable (either $x$ or $y$) and $\overline{z} = \frac{\sum\limits z_i}{n}$.


Calculations for Length (x):

Mean of length ($\overline{x}_x$) = $\frac{\sum\limits x_i}{n} = \frac{212}{50} = 4.24$ cm

Variance of length ($\sigma_x^2$) = $\frac{\sum\limits x_i^2}{n} - (\overline{x}_x)^2$

$\sigma_x^2 = \frac{902.8}{50} - (4.24)^2$

$\sigma_x^2 = 18.056 - 17.9776 = 0.0784$ cm$^2$

Standard Deviation of length ($\sigma_x$) = $\sqrt{0.0784} = 0.28$ cm

Coefficient of Variation for length ($C.V._x$) = $\frac{\sigma_x}{\overline{x}_x} \times 100$

$C.V._x = \frac{0.28}{4.24} \times 100 = \frac{28}{424} \times 100 = \frac{7}{106} \times 100 = \frac{700}{106} = \frac{350}{53} \approx 6.60\%$


Calculations for Weight (y):

Mean of weight ($\overline{x}_y$) = $\frac{\sum\limits y_i}{n} = \frac{261}{50} = 5.22$ gm

Variance of weight ($\sigma_y^2$) = $\frac{\sum\limits y_i^2}{n} - (\overline{x}_y)^2$

$\sigma_y^2 = \frac{1457.6}{50} - (5.22)^2$

$\sigma_y^2 = 29.152 - 27.2484 = 1.9036$ gm$^2$

Standard Deviation of weight ($\sigma_y$) = $\sqrt{1.9036} \approx 1.38$ gm

Coefficient of Variation for weight ($C.V._y$) = $\frac{\sigma_y}{\overline{x}_y} \times 100$

$C.V._y \approx \frac{1.38}{5.22} \times 100 = \frac{138}{522} \times 100 = \frac{23}{87} \times 100 = \frac{2300}{87} \approx 26.44\%$


Comparison of Coefficients of Variation:

$C.V._x \approx 6.60\%$

$C.V._y \approx 26.44\%$

Since $C.V._y > C.V._x$ ($26.44\% > 6.60\%$), the weight shows greater relative variability than the length.

Conclusion: The weight is more varying than the length.
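When only the totals $\sum z_i$ and $\sum z_i^2$ are known, the C.V. follows directly from them. The sketch below is a minimal illustration; the function name `cv_from_sums` is hypothetical.

```python
import math


def cv_from_sums(n, total, total_sq):
    """C.V. in percent from n, the sum of values and the sum of squares."""
    mean = total / n
    # Variance = (sum of squares) / n - mean^2.
    variance = total_sq / n - mean ** 2
    return math.sqrt(variance) / mean * 100


cv_length = cv_from_sums(50, 212, 902.8)
cv_weight = cv_from_sums(50, 261, 1457.6)
print(round(cv_length, 2), round(cv_weight, 2))
```

Without rounding the standard deviation of the weights to 1.38, the C.V. for weight comes out near 26.43% rather than 26.44%; the difference is only a rounding effect and does not change the conclusion.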



Example 16 to 19 - Miscellaneous Examples

Example 16: The variance of 20 observations is 5. If each observation is multiplied by 2, find the new variance of the resulting observations

Answer:

Given:

Number of observations, $n = 20$.

Variance of the original observations ($\sigma^2$) = 5.


Let the original observations be $x_1, x_2, \dots, x_{20}$.

The mean of the original observations is $\overline{x} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i$.

The variance of the original observations is given by:

$\sigma^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \overline{x})^2$

We are given $\sigma^2 = 5$.


Each observation is multiplied by 2. Let the new observations be $y_i$.

$y_i = 2x_i$, for $i = 1, 2, \dots, 20$.

The number of new observations is still $n = 20$.


Let the mean of the new observations be $\overline{y}$.

$\overline{y} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i = \frac{1}{20} \sum\limits_{i=1}^{20} (2x_i)$

$\overline{y} = \frac{2}{20} \sum\limits_{i=1}^{20} x_i = 2 \left(\frac{1}{20} \sum\limits_{i=1}^{20} x_i\right)$

$\overline{y} = 2\overline{x}$

The new mean is twice the original mean.


The variance of the new observations ($\sigma_{\text{new}}^2$) is given by:

$\sigma_{\text{new}}^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \overline{y})^2$

Substitute $y_i = 2x_i$ and $\overline{y} = 2\overline{x}$ into the formula:

$\sigma_{\text{new}}^2 = \frac{1}{20} \sum\limits_{i=1}^{20} (2x_i - 2\overline{x})^2$

Factor out 2 from the term inside the square:

$\sigma_{\text{new}}^2 = \frac{1}{20} \sum\limits_{i=1}^{20} (2(x_i - \overline{x}))^2$

Square the term $2(x_i - \overline{x})$:

$\sigma_{\text{new}}^2 = \frac{1}{20} \sum\limits_{i=1}^{20} 4(x_i - \overline{x})^2$

Factor out the constant 4 from the summation:

$\sigma_{\text{new}}^2 = 4 \left( \frac{1}{20} \sum\limits_{i=1}^{20} (x_i - \overline{x})^2 \right)$

The expression in the parenthesis is the original variance, $\sigma^2$.

$\sigma_{\text{new}}^2 = 4 \times \sigma^2$


Substitute the given value of the original variance ($\sigma^2 = 5$):

$\sigma_{\text{new}}^2 = 4 \times 5$

$\sigma_{\text{new}}^2 = 20$

The new variance of the resulting observations is 20.
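The scaling rule $\sigma_{\text{new}}^2 = k^2 \sigma^2$ can be checked numerically. The sketch below uses a small made-up data set (an assumption for illustration only, not the 20 observations of the example) and exact fractions, so the ratio comes out as a whole number.

```python
from fractions import Fraction


def pvariance(data):
    """Population variance (divides by n), computed exactly."""
    n = len(data)
    mean = Fraction(sum(data), n)
    return sum((x - mean) ** 2 for x in data) / n


original = [3, 7, 8, 12, 15]  # hypothetical observations
for k in (2, 3, -5):
    scaled = [k * x for x in original]
    ratio = pvariance(scaled) / pvariance(original)
    print(k, ratio)  # the ratio equals k squared

ratio_for_doubling = pvariance([2 * x for x in original]) / pvariance(original)
```

With $k = 2$ and an original variance of 5, the same rule gives $4 \times 5 = 20$, as found above.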

Example 17: The mean of 5 observations is 4.4 and their variance is 8.24. If three of the observations are 1, 2 and 6, find the other two observations.

Answer:

Given:

Number of observations, $n = 5$.

Mean of the observations, $\bar{x} = 4.4$.

Variance of the observations, $\sigma^2 = 8.24$.

Three of the five observations are 1, 2, and 6.


To Find:

The other two observations.


Solution:

Let the two unknown observations be $x$ and $y$.

The five observations are 1, 2, 6, $x$, and $y$.

Step 1: Use the Mean to form the first equation

The formula for the mean is $\bar{x} = \frac{\sum\limits x_i}{n}$.

Substitute the given values:

$4.4 = \frac{1 + 2 + 6 + x + y}{5}$

$4.4 \times 5 = 9 + x + y$

$22 = 9 + x + y$

$x + y = 22 - 9$

$x + y = 13$ ... (i)

Step 2: Use the Variance to form the second equation

The formula for variance is $\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\bar{x})^2$.

Substitute the given values:

$8.24 = \frac{1^2 + 2^2 + 6^2 + x^2 + y^2}{5} - (4.4)^2$

$8.24 = \frac{1 + 4 + 36 + x^2 + y^2}{5} - 19.36$

$8.24 + 19.36 = \frac{41 + x^2 + y^2}{5}$

$27.6 = \frac{41 + x^2 + y^2}{5}$

$27.6 \times 5 = 41 + x^2 + y^2$

$138 = 41 + x^2 + y^2$

$x^2 + y^2 = 138 - 41$

$x^2 + y^2 = 97$ ... (ii)

Step 3: Solve the system of two equations

From equation (i), we have $y = 13 - x$.

Substitute this expression for $y$ into equation (ii):

$x^2 + (13 - x)^2 = 97$

Expand the squared term:

$x^2 + (169 - 26x + x^2) = 97$

Combine like terms and set the equation to zero:

$2x^2 - 26x + 169 - 97 = 0$

$2x^2 - 26x + 72 = 0$

Divide the entire equation by 2 to simplify it:

$x^2 - 13x + 36 = 0$

Factor the quadratic equation:

$(x - 4)(x - 9) = 0$

This gives two possible values for $x$: $x = 4$ or $x = 9$.

Now, find the corresponding values for $y$ using $y = 13 - x$:

  • If $x = 4$, then $y = 13 - 4 = 9$.
  • If $x = 9$, then $y = 13 - 9 = 4$.

In either case, the two unknown observations are 4 and 9.


Answer:

The other two observations are 4 and 9.
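The three steps can be condensed into a short Python sketch. It assumes exactly two observations are missing and that the discriminant is non-negative; the function name `two_missing_observations` is hypothetical. Decimal inputs are passed as strings to `Fraction` so that 4.4 and 8.24 are represented exactly.

```python
from fractions import Fraction
import math


def two_missing_observations(n, mean, variance, known):
    """Recover two unknown observations from the mean and variance."""
    s = n * mean - sum(known)                                   # x + y
    q = n * (variance + mean ** 2) - sum(k * k for k in known)  # x^2 + y^2
    p = (s * s - q) / 2                                         # xy
    # x and y are the roots of t^2 - s*t + p = 0.
    root = math.sqrt(s * s - 4 * p)  # raises ValueError if no real solution
    return sorted(((s - root) / 2, (s + root) / 2))


x, y = two_missing_observations(5, Fraction("4.4"), Fraction("8.24"), [1, 2, 6])
print(x, y)
```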

Example 18: If each of the observations $x_1, x_2, \dots, x_n$ is increased by 'a', where a is a negative or positive number, show that the variance remains unchanged.

Answer:

Given:

Let the original set of observations be $x_1, x_2, \dots, x_n$.

The mean of these observations is $\bar{x} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i$.

The variance of these observations is $\sigma_x^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \bar{x})^2$.

A new set of observations, $y_1, y_2, \dots, y_n$, is formed by adding a constant 'a' to each original observation, such that $y_i = x_i + a$ for all $i=1, 2, \dots, n$.


To Prove:

The variance of the new set of observations ($\sigma_y^2$) is equal to the variance of the original set of observations ($\sigma_x^2$). That is, we need to show that $\sigma_y^2 = \sigma_x^2$.


Proof:

Step 1: Find the mean of the new observations ($\bar{y}$)

The mean of the new observations is the sum of all new observations divided by their count, $n$.

$\bar{y} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i$

Substitute $y_i = x_i + a$:

$\bar{y} = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i + a)$

Separate the terms in the summation:

$\bar{y} = \frac{1}{n} \left( \sum\limits_{i=1}^{n} x_i + \sum\limits_{i=1}^{n} a \right)$

The sum of a constant 'a' repeated 'n' times is $na$.

$\bar{y} = \frac{1}{n} \left( \sum\limits_{i=1}^{n} x_i + na \right)$

$\bar{y} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i + \frac{na}{n}$

Since $\frac{1}{n} \sum\limits_{i=1}^{n} x_i = \bar{x}$, we have:

$\bar{y} = \bar{x} + a$ ... (i)

This shows that the new mean is the old mean increased by 'a'.

Step 2: Calculate the variance of the new observations ($\sigma_y^2$)

The formula for the variance of the new observations is:

$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \bar{y})^2$

Now, substitute $y_i = x_i + a$ and $\bar{y} = \bar{x} + a$ into this formula:

$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} ((x_i + a) - (\bar{x} + a))^2$

Simplify the term inside the square:

$(x_i + a) - (\bar{x} + a) = x_i + a - \bar{x} - a = x_i - \bar{x}$

The constant 'a' cancels out. Now, substitute this simplified term back into the variance formula:

$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \bar{x})^2$

This expression is the exact formula for the variance of the original observations, $\sigma_x^2$.

Therefore, we can conclude that:

$\sigma_y^2 = \sigma_x^2$

This shows that adding a constant 'a' to each observation does not change the variance. The spread or dispersion of the data points relative to their mean remains the same.


Conclusion:

We have shown that if each observation is increased by a constant 'a', the variance remains unchanged. Hence Proved.
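A numerical spot check does not replace the proof, but it can illustrate the result. The sketch below uses a made-up data set (an assumption for illustration) and exact fractions, and tries a negative, a positive and a fractional shift.

```python
from fractions import Fraction


def pvariance(data):
    """Population variance (divides by n), computed exactly."""
    n = len(data)
    mean = Fraction(sum(data), n)
    return sum((x - mean) ** 2 for x in data) / n


original = [2, 4, 4, 5, 10]  # hypothetical observations
shifts = [-7, 3, Fraction(5, 2)]

# Adding the same constant to every observation leaves the variance as it was.
unchanged = all(
    pvariance([x + a for x in original]) == pvariance(original)
    for a in shifts
)
print(pvariance(original), unchanged)
```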

Example 19: The mean and standard deviation of 100 observations were calculated as 40 and 5.1, respectively by a student who took by mistake 50 instead of 40 for one observation. What are the correct mean and standard deviation?

Answer:

Given:

Number of observations, $n = 100$.

Incorrect mean, $\bar{x}_{\text{incorrect}} = 40$.

Incorrect standard deviation, $\sigma_{\text{incorrect}} = 5.1$.

Incorrect observation = 50.

Correct observation = 40.


To Find:

The correct mean ($\bar{x}_{\text{correct}}$) and the correct standard deviation ($\sigma_{\text{correct}}$).


Solution:

Step 1: Calculate the Correct Mean

First, we find the incorrect sum of all observations using the incorrect mean.

Incorrect sum = $n \times \bar{x}_{\text{incorrect}} = 100 \times 40 = 4000$

Next, we find the correct sum by subtracting the incorrect value and adding the correct value.

Correct sum = Incorrect sum - Incorrect value + Correct value

Correct sum = $4000 - 50 + 40 = 3990$

Now, we can calculate the correct mean.

$\bar{x}_{\text{correct}} = \frac{\text{Correct sum}}{n} = \frac{3990}{100} = 39.9$

So, the correct mean is 39.9.

Step 2: Calculate the Correct Standard Deviation

To find the correct standard deviation, we first need to find the correct variance. We start by using the formula for the incorrect variance to find the incorrect sum of squares ($\sum\limits x_i^2$).

The formula for variance is $\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\bar{x})^2$.

From this, the sum of squares is $\sum\limits x_i^2 = n(\sigma^2 + (\bar{x})^2)$.

Using the incorrect values:

Incorrect variance, $\sigma_{\text{incorrect}}^2 = (5.1)^2 = 26.01$.

Incorrect $\sum\limits x_i^2 = 100 (26.01 + (40)^2)$

Incorrect $\sum\limits x_i^2 = 100 (26.01 + 1600)$

Incorrect $\sum\limits x_i^2 = 100 (1626.01) = 162601$

Now, we find the correct sum of squares by subtracting the square of the incorrect value and adding the square of the correct value.

Correct $\sum\limits x_i^2 = \text{Incorrect } \sum\limits x_i^2 - (\text{Incorrect value})^2 + (\text{Correct value})^2$

Correct $\sum\limits x_i^2 = 162601 - (50)^2 + (40)^2$

Correct $\sum\limits x_i^2 = 162601 - 2500 + 1600$

Correct $\sum\limits x_i^2 = 161701$

Now, we can calculate the correct variance using the correct sum of squares and the correct mean.

$\sigma_{\text{correct}}^2 = \frac{\text{Correct } \sum\limits x_i^2}{n} - (\bar{x}_{\text{correct}})^2$

$\sigma_{\text{correct}}^2 = \frac{161701}{100} - (39.9)^2$

$\sigma_{\text{correct}}^2 = 1617.01 - 1592.01 = 25$

Finally, the correct standard deviation is the square root of the correct variance.

$\sigma_{\text{correct}} = \sqrt{25} = 5$

So, the correct standard deviation is 5.


Answer:

The correct mean is 39.9 and the correct standard deviation is 5.
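The two steps above can be sketched in a few lines of Python. The function name `correct_after_replacement` and its parameter names are hypothetical, introduced only for this illustration; the arithmetic follows the solution, using $\sum x_i = n\bar{x}$ and $\sum x_i^2 = n(\sigma^2 + \bar{x}^2)$.

```python
from math import sqrt

def correct_after_replacement(n, mean, sd, wrong, right):
    """Recompute mean and SD when one recorded value is replaced."""
    total = n * mean - wrong + right                        # corrected sum
    total_sq = n * (sd**2 + mean**2) - wrong**2 + right**2  # corrected sum of squares
    new_mean = total / n
    new_var = total_sq / n - new_mean**2
    return new_mean, sqrt(new_var)

mean, sd = correct_after_replacement(100, 40, 5.1, wrong=50, right=40)
print(round(mean, 2), round(sd, 2))  # approximately 39.9 and 5.0
```

Because the inputs are floating-point numbers, the results agree with 39.9 and 5 only up to rounding error, which is why the sketch rounds before printing.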



Miscellaneous Exercise On Chapter 15

Question 1. The mean and variance of eight observations are 9 and 9.25, respectively. If six of the observations are 6, 7, 10, 12, 12 and 13, find the remaining two observations.

Answer:

Given:

Number of observations, $n = 8$.

Mean of observations ($\overline{x}$) = 9.

Variance of observations ($\sigma^2$) = 9.25.

Six of the observations are 6, 7, 10, 12, 12, and 13.


To Find: The remaining two observations.


Solution:

Let the two remaining observations be $a$ and $b$. The eight observations are 6, 7, 10, 12, 12, 13, $a$, and $b$.

The mean of the observations is given by the formula:

$\overline{x} = \frac{\sum\limits x_i}{n}$

The sum of the eight observations is:

$\sum\limits x_i = 6 + 7 + 10 + 12 + 12 + 13 + a + b = 60 + a + b$

We are given $\overline{x} = 9$ and $n = 8$. Substitute these values into the mean formula:

$9 = \frac{60 + a + b}{8}$

Multiply both sides by 8:

$9 \times 8 = 60 + a + b$

$72 = 60 + a + b$

Subtract 60 from both sides:

$a + b = 12$

... (i)


The variance of the observations is given by the formula:

$\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$

The sum of the squares of the eight observations is:

$\sum\limits x_i^2 = 6^2 + 7^2 + 10^2 + 12^2 + 12^2 + 13^2 + a^2 + b^2$

$\sum\limits x_i^2 = 36 + 49 + 100 + 144 + 144 + 169 + a^2 + b^2$

$\sum\limits x_i^2 = 642 + a^2 + b^2$

We are given $\sigma^2 = 9.25$ and $\overline{x} = 9$. Substitute these values into the variance formula:

$9.25 = \frac{642 + a^2 + b^2}{8} - (9)^2$

$9.25 = \frac{642 + a^2 + b^2}{8} - 81$

Add 81 to both sides:

$9.25 + 81 = \frac{642 + a^2 + b^2}{8}$

$90.25 = \frac{642 + a^2 + b^2}{8}$

Multiply both sides by 8:

$90.25 \times 8 = 642 + a^2 + b^2$

$722 = 642 + a^2 + b^2$

Subtract 642 from both sides:

$a^2 + b^2 = 80$

... (ii)


Now we have a system of two equations with two variables $a$ and $b$:

(i) $a + b = 12$

(ii) $a^2 + b^2 = 80$

From equation (i), we can express $b$ in terms of $a$: $b = 12 - a$.

Substitute this expression for $b$ into equation (ii):

$a^2 + (12 - a)^2 = 80$

Expand $(12 - a)^2$:

$a^2 + (144 - 24a + a^2) = 80$

Combine like terms:

$2a^2 - 24a + 144 = 80$

Subtract 80 from both sides:

$2a^2 - 24a + 144 - 80 = 0$

$2a^2 - 24a + 64 = 0$

Divide the entire equation by 2:

$a^2 - 12a + 32 = 0$


This is a quadratic equation in $a$. We can factor this equation. We look for two numbers that multiply to 32 and add up to -12. These numbers are -4 and -8.

So, we can factor the quadratic equation as:

$(a - 4)(a - 8) = 0$

This gives two possible values for $a$:

$a - 4 = 0 \implies a = 4$

or

$a - 8 = 0 \implies a = 8$


Case 1: If $a = 4$, substitute this into equation (i) to find $b$:

$4 + b = 12 \implies b = 12 - 4 = 8$

In this case, the other two observations are 4 and 8.

Case 2: If $a = 8$, substitute this into equation (i) to find $b$:

$8 + b = 12 \implies b = 12 - 8 = 4$

In this case, the other two observations are 8 and 4.

Both cases result in the same pair of numbers for the remaining observations.

Let's verify if these values satisfy equation (ii): $a^2 + b^2 = 80$.

If $a=4$ and $b=8$, then $4^2 + 8^2 = 16 + 64 = 80$. This is correct.

The remaining two observations are 4 and 8.
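The same answer can be reached programmatically. The sketch below is an assumption-laden illustration: the helper name `remaining_two` is hypothetical, and instead of substituting $b = 12 - a$ it uses the equivalent identity $ab = \frac{(a+b)^2 - (a^2+b^2)}{2}$, so that $a$ and $b$ are the roots of $t^2 - (a+b)t + ab = 0$. This is the same quadratic $a^2 - 12a + 32 = 0$ obtained above.

```python
from math import sqrt

def remaining_two(known, n, mean, variance):
    """Return the two missing observations (smaller first).

    Hypothetical helper: assumes exactly two values are missing and
    that real solutions exist.
    """
    s = n * mean - sum(known)                                 # a + b
    q = n * (variance + mean**2) - sum(x * x for x in known)  # a^2 + b^2
    p = (s * s - q) / 2                                       # ab
    root = sqrt(s * s - 4 * p)                                # |a - b|
    return (s - root) / 2, (s + root) / 2

print(remaining_two([6, 7, 10, 12, 12, 13], n=8, mean=9, variance=9.25))
```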

Question 2. The mean and variance of 7 observations are 8 and 16, respectively. If five of the observations are 2, 4, 10, 12, 14, find the remaining two observations.

Answer:

Given:

Number of observations, $n = 7$.

Mean of observations ($\overline{x}$) = 8.

Variance of observations ($\sigma^2$) = 16.

Five of the observations are 2, 4, 10, 12, and 14.


To Find: The remaining two observations.


Solution:

Let the two remaining observations be $a$ and $b$. The seven observations are 2, 4, 10, 12, 14, $a$, and $b$.

The mean of the observations is given by the formula:

$\overline{x} = \frac{\sum\limits x_i}{n}$

The sum of the seven observations is:

$\sum\limits x_i = 2 + 4 + 10 + 12 + 14 + a + b = 42 + a + b$

We are given $\overline{x} = 8$ and $n = 7$. Substitute these values into the mean formula:

$8 = \frac{42 + a + b}{7}$

Multiply both sides by 7:

$8 \times 7 = 42 + a + b$

$56 = 42 + a + b$

Subtract 42 from both sides:

$a + b = 14$

... (i)


The variance of the observations is given by the formula:

$\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$

The sum of the squares of the seven observations is:

$\sum\limits x_i^2 = 2^2 + 4^2 + 10^2 + 12^2 + 14^2 + a^2 + b^2$

$\sum\limits x_i^2 = 4 + 16 + 100 + 144 + 196 + a^2 + b^2$

$\sum\limits x_i^2 = 460 + a^2 + b^2$

We are given $\sigma^2 = 16$ and $\overline{x} = 8$. Substitute these values into the variance formula:

$16 = \frac{460 + a^2 + b^2}{7} - (8)^2$

$16 = \frac{460 + a^2 + b^2}{7} - 64$

Add 64 to both sides:

$16 + 64 = \frac{460 + a^2 + b^2}{7}$

$80 = \frac{460 + a^2 + b^2}{7}$

Multiply both sides by 7:

$80 \times 7 = 460 + a^2 + b^2$

$560 = 460 + a^2 + b^2$

Subtract 460 from both sides:

$a^2 + b^2 = 100$

... (ii)


Now we have a system of two equations with two variables $a$ and $b$:

(i) $a + b = 14$

(ii) $a^2 + b^2 = 100$

From equation (i), we can express $b$ in terms of $a$: $b = 14 - a$.

Substitute this expression for $b$ into equation (ii):

$a^2 + (14 - a)^2 = 100$

Expand $(14 - a)^2$ using the formula $(p - q)^2 = p^2 - 2pq + q^2$:

$a^2 + (14^2 - 2 \times 14 \times a + a^2) = 100$

$a^2 + 196 - 28a + a^2 = 100$

Combine like terms:

$2a^2 - 28a + 196 = 100$

Subtract 100 from both sides:

$2a^2 - 28a + 196 - 100 = 0$

$2a^2 - 28a + 96 = 0$

Divide the entire equation by 2:

$a^2 - 14a + 48 = 0$


This is a quadratic equation in $a$. We can solve it by factoring. We look for two numbers that multiply to 48 and add up to -14. These numbers are -6 and -8.

So, we can factor the quadratic equation as:

$(a - 6)(a - 8) = 0$

This gives two possible values for $a$:

$a - 6 = 0 \implies a = 6$

or

$a - 8 = 0 \implies a = 8$


Case 1: If $a = 6$, substitute this into equation (i) to find $b$:

$6 + b = 14 \implies b = 14 - 6 = 8$

In this case, the other two observations are 6 and 8.

Case 2: If $a = 8$, substitute this into equation (i) to find $b$:

$8 + b = 14 \implies b = 14 - 8 = 6$

In this case, the other two observations are 8 and 6.

Both cases result in the same pair of numbers for the remaining observations.

Let's verify if these values satisfy equation (ii): $a^2 + b^2 = 100$.

If $a=6$ and $b=8$, then $6^2 + 8^2 = 36 + 64 = 100$. This is correct.

The remaining two observations are 6 and 8.
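A direct way to double-check the answer is to put all seven observations together and recompute the mean and variance. The short sketch below does this with Python's standard `statistics` module; the variable name `observations` is only illustrative.

```python
from statistics import fmean, pvariance

# The five given observations followed by the two values found above.
observations = [2, 4, 10, 12, 14, 6, 8]

print(fmean(observations))      # should match the given mean, 8
print(pvariance(observations))  # should match the given variance, 16
```

Note that `pvariance` divides by $n$, which is the convention used throughout this chapter; `statistics.variance` divides by $n - 1$ and would give a different value.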

Question 3. The mean and standard deviation of six observations are 8 and 4, respectively. If each observation is multiplied by 3, find the new mean and new standard deviation of the resulting observations.

Answer:

Given:

Number of observations, $n = 6$.

Mean of original observations ($\overline{x}_{\text{original}}$) = 8.

Standard deviation of original observations ($\sigma_{\text{original}}$) = 4.

Each observation is multiplied by 3.


To Find:

The new mean and new standard deviation.


Solution:

Let the original observations be $x_1, x_2, \dots, x_6$.

The mean of the original observations is $\overline{x}_{\text{original}} = \frac{1}{6} \sum\limits_{i=1}^{6} x_i = 8$.

The standard deviation of the original observations is $\sigma_{\text{original}} = \sqrt{\frac{\sum\limits_{i=1}^{6} (x_i - \overline{x}_{\text{original}})^2}{6}} = 4$.

The new observations $y_i$ are obtained by multiplying each $x_i$ by a constant $k=3$. So, $y_i = 3x_i$ for $i = 1, 2, \dots, 6$.


New Mean:

The new mean ($\overline{y}_{\text{new}}$) is given by:

$\overline{y}_{\text{new}} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i$

Substitute $y_i = 3x_i$ and $n=6$:

$\overline{y}_{\text{new}} = \frac{1}{6} \sum\limits_{i=1}^{6} (3x_i)$

Using the property of summation $\sum\limits k z_i = k \sum\limits z_i$:

$\overline{y}_{\text{new}} = 3 \left( \frac{1}{6} \sum\limits_{i=1}^{6} x_i \right)$

The term in parentheses is the original mean $\overline{x}_{\text{original}}$.

$\overline{y}_{\text{new}} = 3 \times \overline{x}_{\text{original}}$

Substitute the given value of $\overline{x}_{\text{original}} = 8$:

$\overline{y}_{\text{new}} = 3 \times 8 = 24$

The new mean is 24.


New Standard Deviation:

The variance of the original observations is $\sigma_{\text{original}}^2 = (\sigma_{\text{original}})^2 = 4^2 = 16$.

The variance of the new observations ($\sigma_{\text{new}}^2$) is given by:

$\sigma_{\text{new}}^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \overline{y}_{\text{new}})^2$

Substitute $y_i = 3x_i$ and $\overline{y}_{\text{new}} = 3\overline{x}_{\text{original}}$:

$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} (3x_i - 3\overline{x}_{\text{original}})^2$

Factor out 3 from the term inside the square:

$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} (3(x_i - \overline{x}_{\text{original}}))^2$

Square the term $3(x_i - \overline{x}_{\text{original}})$:

$\sigma_{\text{new}}^2 = \frac{1}{6} \sum\limits_{i=1}^{6} 9(x_i - \overline{x}_{\text{original}})^2$

Factor out the constant 9 from the summation:

$\sigma_{\text{new}}^2 = 9 \left( \frac{1}{6} \sum\limits_{i=1}^{6} (x_i - \overline{x}_{\text{original}})^2 \right)$

The expression in parentheses is the original variance $\sigma_{\text{original}}^2$.

$\sigma_{\text{new}}^2 = 9 \times \sigma_{\text{original}}^2$

Substitute the value of $\sigma_{\text{original}}^2 = 16$:

$\sigma_{\text{new}}^2 = 9 \times 16 = 144$

The new variance is 144.

The new standard deviation ($\sigma_{\text{new}}$) is the square root of the new variance:

$\sigma_{\text{new}} = \sqrt{\sigma_{\text{new}}^2} = \sqrt{144} = 12$

Alternate Method using Property:

If each observation $x_i$ is multiplied by a constant $k$, the new mean is $\overline{y} = k\overline{x}$ and the new standard deviation is $\sigma_y = |k|\sigma_x$.

Here, $k=3$.

New mean = $3 \times \text{Original Mean} = 3 \times 8 = 24$.

New standard deviation = $|3| \times \text{Original Standard Deviation} = 3 \times 4 = 12$.

Both methods yield the same result.

The new mean of the resulting observations is 24 and the new standard deviation is 12.
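The question does not list the six observations, so a numerical check needs an assumed data set. The sketch below uses one hypothetical sample, 4, 4, 4, 12, 12, 12, chosen only because its mean is 8 and its standard deviation is 4, and then multiplies every value by 3.

```python
from statistics import fmean, pstdev

# Assumed sample (not given in the question) with mean 8 and SD 4.
x = [4, 4, 4, 12, 12, 12]
y = [3 * v for v in x]  # each observation multiplied by 3

print(fmean(x), pstdev(x))  # original mean and standard deviation
print(fmean(y), pstdev(y))  # new mean and standard deviation
```

Any other six values with the same mean and standard deviation would lead to the same new mean and standard deviation, since the derivation above never uses the individual observations.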

Question 4. Given that $\overline{x}$ is the mean and $\sigma^2$ is the variance of $n$ observations $x_1, x_2, \dots, x_n$. Prove that the mean and variance of the observations $ax_1, ax_2, ax_3, \dots, ax_n$ are $a\overline{x}$ and $a^2 \sigma^2$, respectively, $(a \neq 0)$.

Answer:

Given:

A set of $n$ observations: $x_1, x_2, \dots, x_n$.

Mean of these observations = $\overline{x}$.

Variance of these observations = $\sigma^2$.

A new set of observations is created by multiplying each original observation by a non-zero constant 'a', resulting in $y_1 = ax_1, y_2 = ax_2, \dots, y_n = ax_n$.


To Prove:

The mean of the new observations is $a\overline{x}$.

The variance of the new observations is $a^2 \sigma^2$.


Proof for the Mean:

The mean of the original observations is defined as:

$\overline{x} = \frac{1}{n} \sum\limits_{i=1}^{n} x_i$

Let the mean of the new observations be $\overline{y}$. By definition:

$\overline{y} = \frac{1}{n} \sum\limits_{i=1}^{n} y_i$

Substitute $y_i = ax_i$:

$\overline{y} = \frac{1}{n} \sum\limits_{i=1}^{n} (ax_i)$

Using the property of summation $\sum\limits k z_i = k \sum\limits z_i$:

$\overline{y} = a \left( \frac{1}{n} \sum\limits_{i=1}^{n} x_i \right)$

The expression in parentheses is the original mean $\overline{x}$.

$\overline{y} = a\overline{x}$

Thus, the mean of the observations $ax_1, ax_2, \dots, ax_n$ is $a\overline{x}$.


Proof for the Variance:

The variance of the original observations is defined as:

$\sigma^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \overline{x})^2$

Let the variance of the new observations be $\sigma_y^2$. By definition:

$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (y_i - \overline{y})^2$

Substitute $y_i = ax_i$ and the new mean $\overline{y} = a\overline{x}$ (proved above):

$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (ax_i - a\overline{x})^2$

Factor out 'a' from the term inside the square:

$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} (a(x_i - \overline{x}))^2$

Square the term $a(x_i - \overline{x})$:

$\sigma_y^2 = \frac{1}{n} \sum\limits_{i=1}^{n} a^2 (x_i - \overline{x})^2$

Since $a^2$ is a constant (and $a \neq 0$), we can factor it out from the summation:

$\sigma_y^2 = a^2 \left( \frac{1}{n} \sum\limits_{i=1}^{n} (x_i - \overline{x})^2 \right)$

The expression in parentheses is the original variance $\sigma^2$.

$\sigma_y^2 = a^2 \sigma^2$

Thus, the variance of the observations $ax_1, ax_2, \dots, ax_n$ is $a^2 \sigma^2$.

We have shown that the mean and variance of the observations $ax_1, ax_2, \dots, ax_n$ are $a\overline{x}$ and $a^2 \sigma^2$, respectively, given that $a \neq 0$.
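The result also holds for a negative or fractional constant, which a small exact-arithmetic check can illustrate. In the sketch below the sample values, the constant $a = -\frac{5}{2}$, and the helper name `scaled_stats` are all assumptions made for the demonstration; `Fraction` is used so that the comparisons are exact rather than subject to rounding.

```python
from fractions import Fraction
from statistics import mean, pvariance

def scaled_stats(data, a):
    """Mean and population variance of the observations a*x_i."""
    y = [a * x for x in data]
    return mean(y), pvariance(y)

# Assumed sample and constant, chosen only for illustration.
x = [Fraction(v) for v in (3, 5, 8, 12)]
a = Fraction(-5, 2)

m, v = mean(x), pvariance(x)
m_y, v_y = scaled_stats(x, a)

print(m_y == a * m)      # new mean equals a times the old mean
print(v_y == a**2 * v)   # new variance equals a^2 times the old variance
```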

Question 5. The mean and standard deviation of 20 observations are found to be 10 and 2, respectively. On rechecking, it was found that an observation 8 was incorrect. Calculate the correct mean and standard deviation in each of the following cases:

(i) If wrong item is omitted.

(ii) If it is replaced by 12.

Answer:

Given:

Incorrect number of observations ($n_{\text{incorrect}}$) = 20.

Incorrect mean ($\overline{x}_{\text{incorrect}}$) = 10.

Incorrect standard deviation ($\sigma_{\text{incorrect}}$) = 2.

Incorrect observation recorded = 8.


To Find: The correct mean and standard deviation for two cases.


Solution:

From the incorrect mean, we can find the incorrect sum of observations:

$\overline{x}_{\text{incorrect}} = \frac{\sum\limits x_{\text{incorrect}}}{n_{\text{incorrect}}}$

... (A)

$\sum\limits x_{\text{incorrect}} = \overline{x}_{\text{incorrect}} \times n_{\text{incorrect}}$

$\sum\limits x_{\text{incorrect}} = 10 \times 20 = 200$

The incorrect sum of observations is 200.

From the incorrect standard deviation, we can find the incorrect variance:

$\sigma_{\text{incorrect}}^2 = (\sigma_{\text{incorrect}})^2 = 2^2 = 4$

The formula for variance is $\sigma^2 = \frac{\sum\limits x_i^2}{n} - (\overline{x})^2$.

Using this, we find the incorrect sum of squares:

$\sigma_{\text{incorrect}}^2 = \frac{\sum\limits x^2_{\text{incorrect}}}{n_{\text{incorrect}}} - (\overline{x}_{\text{incorrect}})^2$

... (B)

$4 = \frac{\sum\limits x^2_{\text{incorrect}}}{20} - (10)^2$

$4 = \frac{\sum\limits x^2_{\text{incorrect}}}{20} - 100$

$\frac{\sum\limits x^2_{\text{incorrect}}}{20} = 100 + 4 = 104$

$\sum\limits x^2_{\text{incorrect}} = 104 \times 20 = 2080$

The incorrect sum of squares is 2080.


Case (i) If wrong item is omitted:

The incorrect observation (8) is removed from the data.

New number of observations ($n_{\text{new}}$) = $n_{\text{incorrect}} - 1 = 20 - 1 = 19$

Correct sum of observations ($\sum\limits x_{\text{correct}}$) = Incorrect sum of observations - Incorrect observation

$\sum\limits x_{\text{correct}} = 200 - 8 = 192$

Calculate the correct mean:

$\overline{x}_{\text{correct}} = \frac{\sum\limits x_{\text{correct}}}{n_{\text{new}}}$

$\overline{x}_{\text{correct}} = \frac{192}{19}$

The correct mean is $\frac{192}{19}$.

Correct sum of squares ($\sum\limits x^2_{\text{correct}}$) = Incorrect sum of squares - (Incorrect observation)$^2$

$\sum\limits x^2_{\text{correct}} = 2080 - 8^2 = 2080 - 64 = 2016$

Calculate the correct variance:

$\sigma_{\text{correct}}^2 = \frac{\sum\limits x^2_{\text{correct}}}{n_{\text{new}}} - (\overline{x}_{\text{correct}})^2$

$\sigma_{\text{correct}}^2 = \frac{2016}{19} - \left(\frac{192}{19}\right)^2$

$\sigma_{\text{correct}}^2 = \frac{2016}{19} - \frac{36864}{361}$

To combine the fractions, find a common denominator (361):

$\sigma_{\text{correct}}^2 = \frac{2016 \times 19}{361} - \frac{36864}{361} = \frac{38304}{361} - \frac{36864}{361} = \frac{1440}{361}$

Calculate the correct standard deviation:

$\sigma_{\text{correct}} = \sqrt{\sigma_{\text{correct}}^2} = \sqrt{\frac{1440}{361}} = \frac{\sqrt{144 \times 10}}{\sqrt{361}} = \frac{12\sqrt{10}}{19}$

Using $\sqrt{10} \approx 3.162$: $\sigma_{\text{correct}} \approx \frac{12 \times 3.162}{19} \approx \frac{37.944}{19} \approx 1.997$

The correct mean is $\frac{192}{19} \approx 10.11$ and the correct standard deviation is $\frac{12\sqrt{10}}{19} \approx 1.997$ if the wrong item is omitted.


Case (ii) If it is replaced by 12:

The incorrect observation (8) is replaced by the correct observation (12).

New number of observations ($n_{\text{new}}$) = $n_{\text{incorrect}} = 20$ (since one item is replaced, the number of observations remains the same).

Correct sum of observations ($\sum\limits x_{\text{correct}}$) = Incorrect sum of observations - Incorrect observation + Correct observation

$\sum\limits x_{\text{correct}} = 200 - 8 + 12 = 204$

Calculate the correct mean:

$\overline{x}_{\text{correct}} = \frac{\sum\limits x_{\text{correct}}}{n_{\text{new}}}$

$\overline{x}_{\text{correct}} = \frac{204}{20} = 10.2$

The correct mean is 10.2.

Correct sum of squares ($\sum\limits x^2_{\text{correct}}$) = Incorrect sum of squares - (Incorrect observation)$^2$ + (Correct observation)$^2$

$\sum\limits x^2_{\text{correct}} = 2080 - 8^2 + 12^2 = 2080 - 64 + 144 = 2080 + 80 = 2160$

Calculate the correct variance:

$\sigma_{\text{correct}}^2 = \frac{\sum\limits x^2_{\text{correct}}}{n_{\text{new}}} - (\overline{x}_{\text{correct}})^2$

$\sigma_{\text{correct}}^2 = \frac{2160}{20} - (10.2)^2$

$\sigma_{\text{correct}}^2 = 108 - 104.04$

$\sigma_{\text{correct}}^2 = 3.96$

Calculate the correct standard deviation:

$\sigma_{\text{correct}} = \sqrt{\sigma_{\text{correct}}^2} = \sqrt{3.96}$

Using a calculator, $\sqrt{3.96} \approx 1.98997$

The correct standard deviation is approximately 1.99.

The correct mean is 10.2 and the correct standard deviation is $\sqrt{3.96} \approx 1.99$ if the wrong item is replaced by 12.
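Both cases follow the same pattern: adjust the sum, the sum of squares, and the count, then recompute. The sketch below captures that pattern in one hypothetical helper, `adjusted_stats`, whose name and parameters are assumptions made for illustration and do not come from the textbook.

```python
from math import sqrt

def adjusted_stats(n, mean, sd, removed=(), added=()):
    """Mean and SD after removing and/or adding observations."""
    total = n * mean - sum(removed) + sum(added)
    total_sq = (n * (sd**2 + mean**2)
                - sum(x * x for x in removed)
                + sum(x * x for x in added))
    count = n - len(removed) + len(added)
    new_mean = total / count
    new_var = total_sq / count - new_mean**2
    return new_mean, sqrt(new_var)

# Case (i): the wrong item 8 is omitted.
omit = adjusted_stats(20, 10, 2, removed=[8])
# Case (ii): the wrong item 8 is replaced by 12.
replace = adjusted_stats(20, 10, 2, removed=[8], added=[12])

print(omit)
print(replace)
```

The printed values are floating-point approximations of $\frac{192}{19}$, $\frac{12\sqrt{10}}{19}$, $10.2$ and $\sqrt{3.96}$ respectively.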

Question 6. The mean and standard deviation of marks obtained by 50 students of a class in three subjects, Mathematics, Physics and Chemistry are given below:

Subject Mathematics Physics Chemistry
Mean 42 32 40.9
Standard deviation 12 15 20

Which of the three subjects shows the highest variability in marks and which shows the lowest?

Answer:

Given:

The mean and standard deviation for the marks of 50 students in three subjects are given in the table:

Subject Mean ($\overline{x}$) Standard Deviation ($\sigma$)
Mathematics 42 12
Physics 32 15
Chemistry 40.9 20

To Find:

We need to find which subject shows the highest variability in marks and which shows the lowest.


Solution:

To compare the variability of two or more data sets with different means, we use the Coefficient of Variation (CV).

The formula for Coefficient of Variation is:

$CV = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100\%$

Let's calculate the Coefficient of Variation for each subject:

Coefficient of Variation for Mathematics:

$CV_{Math} = \frac{\sigma_{Math}}{\overline{x}_{Math}} \times 100\%$

$CV_{Math} = \frac{12}{42} \times 100\%$

$CV_{Math} = \frac{2}{7} \times 100\%$

$CV_{Math} \approx 0.2857 \times 100\%$

$CV_{Math} \approx 28.57\%$

Coefficient of Variation for Physics:

$CV_{Physics} = \frac{\sigma_{Physics}}{\overline{x}_{Physics}} \times 100\%$

$CV_{Physics} = \frac{15}{32} \times 100\%$

$CV_{Physics} = 0.46875 \times 100\%$

$CV_{Physics} = 46.875\%$

$CV_{Physics} \approx 46.88\%$

Coefficient of Variation for Chemistry:

$CV_{Chemistry} = \frac{\sigma_{Chemistry}}{\overline{x}_{Chemistry}} \times 100\%$

$CV_{Chemistry} = \frac{20}{40.9} \times 100\%$

$CV_{Chemistry} = \frac{2000}{40.9}\%$

$CV_{Chemistry} \approx 48.90\%$


Comparing the Coefficients of Variation:

$CV_{Math} \approx 28.57\%$

$CV_{Physics} \approx 46.88\%$

$CV_{Chemistry} \approx 48.90\%$

A higher Coefficient of Variation indicates greater variability.

The highest Coefficient of Variation is for Chemistry ($48.90\%$).

The lowest Coefficient of Variation is for Mathematics ($28.57\%$).

Therefore, Chemistry shows the highest variability in marks, and Mathematics shows the lowest variability.
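The comparison can be reproduced with a short script. The function name `coefficient_of_variation` and the dictionary layout below are illustrative choices, not part of the question; the numbers are the means and standard deviations from the table.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation as a percentage."""
    return sd / mean * 100

# (mean, standard deviation) for each subject, taken from the table.
subjects = {
    "Mathematics": (42, 12),
    "Physics": (32, 15),
    "Chemistry": (40.9, 20),
}

cv = {name: coefficient_of_variation(m, s) for name, (m, s) in subjects.items()}
highest = max(cv, key=cv.get)
lowest = min(cv, key=cv.get)

for name, value in cv.items():
    print(f"{name}: {value:.2f}%")
print("Highest variability:", highest)
print("Lowest variability:", lowest)
```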

Question 7. The mean and standard deviation of a group of 100 observations were found to be 20 and 3, respectively. Later on it was found that three observations were incorrect, which were recorded as 21, 21 and 18. Find the mean and standard deviation if the incorrect observations are omitted.

Answer:

Given:

Number of observations, $n_{old} = 100$

Old Mean, $\overline{x}_{old} = 20$

Old Standard Deviation, $\sigma_{old} = 3$

Incorrect observations are 21, 21, and 18.


To Find:

The new mean and standard deviation after omitting the incorrect observations.


Solution:

We know that the mean is given by $\overline{x} = \frac{\sum\limits x_i}{n}$.

The sum of the old observations is $\sum\limits x_{old} = n_{old} \times \overline{x}_{old}$.

$\sum\limits x_{old} = 100 \times 20 = 2000$

The incorrect observations are 21, 21, and 18.

The sum of incorrect observations $= 21 + 21 + 18 = 60$.

The correct sum of the remaining observations is $\sum\limits x_{new} = \sum\limits x_{old} - \text{Sum of incorrect observations}$.

$\sum\limits x_{new} = 2000 - 60 = 1940$

The new number of observations is $n_{new} = n_{old} - \text{Number of incorrect observations}$.

$n_{new} = 100 - 3 = 97$

The new mean is $\overline{x}_{new} = \frac{\sum\limits x_{new}}{n_{new}}$.

$\overline{x}_{new} = \frac{1940}{97} = 20$

... (i)

Now, we need to find the new standard deviation. The formula for standard deviation is $\sigma = \sqrt{\frac{\sum\limits x_i^2}{n} - \overline{x}^2}$.

Squaring the standard deviation, we get the variance: $\sigma^2 = \frac{\sum\limits x_i^2}{n} - \overline{x}^2$.

From the old data, we have $\sigma_{old}^2 = \frac{\sum\limits x_{old}^2}{n_{old}} - \overline{x}_{old}^2$.

$3^2 = \frac{\sum\limits x_{old}^2}{100} - 20^2$

$9 = \frac{\sum\limits x_{old}^2}{100} - 400$

$9 + 400 = \frac{\sum\limits x_{old}^2}{100}$

$409 = \frac{\sum\limits x_{old}^2}{100}$

$\sum\limits x_{old}^2 = 409 \times 100 = 40900$

The sum of squares of incorrect observations is $21^2 + 21^2 + 18^2 = 441 + 441 + 324 = 1206$.

The correct sum of squares of the remaining observations is $\sum\limits x_{new}^2 = \sum\limits x_{old}^2 - \text{Sum of squares of incorrect observations}$.

$\sum\limits x_{new}^2 = 40900 - 1206 = 39694$

Now we can calculate the new variance, $\sigma_{new}^2$, using the new sum of squares and the new mean.

$\sigma_{new}^2 = \frac{\sum\limits x_{new}^2}{n_{new}} - \overline{x}_{new}^2$

$\sigma_{new}^2 = \frac{39694}{97} - 20^2$

$\sigma_{new}^2 = \frac{39694}{97} - 400$

$\sigma_{new}^2 = \frac{39694 - 400 \times 97}{97}$

$400 \times 97 = 38800$

$\sigma_{new}^2 = \frac{39694 - 38800}{97}$

$\sigma_{new}^2 = \frac{894}{97}$

... (ii)

Finally, the new standard deviation is $\sigma_{new} = \sqrt{\sigma_{new}^2}$.

$\sigma_{new} = \sqrt{\frac{894}{97}}$

$\sigma_{new} \approx \sqrt{9.216494845}$

$\sigma_{new} \approx 3.035866$

Rounding to two decimal places, $\sigma_{new} \approx 3.04$.

... (iii)

Thus, the new mean is 20 and the new standard deviation is approximately 3.04.
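To confirm the exact fraction $\frac{894}{97}$ before taking the square root, the calculation can be repeated with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction`; the variable names are illustrative only.

```python
from fractions import Fraction
from math import sqrt

n, mean, sd = 100, Fraction(20), Fraction(3)
omitted = [21, 21, 18]  # the three incorrect observations

total = n * mean - sum(omitted)                                 # sum of remaining values
total_sq = n * (sd**2 + mean**2) - sum(x * x for x in omitted)  # sum of their squares
count = n - len(omitted)

new_mean = total / count
new_var = total_sq / count - new_mean**2  # stays an exact Fraction
new_sd = sqrt(new_var)                    # converted to float only here

print(new_mean, new_var, round(new_sd, 2))
```

Keeping the variance as a fraction until the final step avoids accumulating rounding error in the intermediate subtraction.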