



Normal Distribution




Normal Distribution: Definition and Properties (Symmetry, Bell Shape)


Definition

The **Normal Distribution**, often referred to as the **Gaussian Distribution** (named after Carl Friedrich Gauss) or colloquially as the "bell curve," is the most significant and widely used continuous probability distribution. Its importance stems from its ability to model numerous naturally occurring phenomena and its central role in statistical theory, particularly due to the Central Limit Theorem.

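A **continuous random variable** $X$ is said to follow a normal distribution if its probability distribution exhibits a characteristic symmetric, bell-shaped curve. The shape and position of this curve are entirely determined by two parameters:

  • The **mean** ($\mu$), which fixes the location of the centre of the curve.
  • The **standard deviation** ($\sigma$), or equivalently the variance ($\sigma^2$), which fixes the spread of the curve.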

A normal distribution with mean $\mu$ and variance $\sigma^2$ is written $X \sim N(\mu, \sigma^2)$. Some texts and software instead put the standard deviation in the second position, e.g., $N(\mu, \sigma)$, so always check whether the second parameter is the variance or the standard deviation.
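As one concrete illustration of this notation caveat, Python's standard-library `statistics.NormalDist` takes the standard deviation, not the variance, as its second argument. The sketch below uses illustrative values ($\mu = 70$, $\sigma^2 = 64$) that are assumptions chosen for the demonstration.

```python
from statistics import NormalDist

# Illustrative (assumed) parameters: X ~ N(mu, sigma^2) with variance 64.
mu = 70.0
variance = 64.0
sigma = variance ** 0.5  # standard deviation = sqrt(variance) = 8.0

# NormalDist is parameterized by the standard deviation, so pass sigma here.
X = NormalDist(mu=mu, sigma=sigma)

print(X.mean, X.stdev, X.variance)
```

Passing the variance (64) where the standard deviation (8) is expected would silently describe a much wider distribution, which is why the convention matters.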


Properties of the Normal Distribution

The normal distribution has several key properties that contribute to its importance and ease of use:

  1. Bell Shape:

    The graphical representation of the normal distribution's probability density function is a distinctive, symmetrical, bell-shaped curve.

    Graph of a normal distribution bell curve
  2. Symmetry:

    The normal curve is perfectly symmetric about its vertical axis passing through the mean ($\mu$). This symmetry implies that the distribution is not skewed.

    Due to this symmetry, the mean, median, and mode of a normal distribution are all located at the same point:

    Mean = Median = Mode = $\mu$

    ... (i)

  3. Unimodal:

    The distribution has a single peak, which occurs at the mean, median, and mode ($\mu$).

  4. Asymptotic Tails:

    The tails of the normal curve extend indefinitely in both directions, approaching the horizontal axis asymptotically. The curve gets arbitrarily close to the x-axis but never touches it, so technically any real number is a possible value for a normally distributed variable, although values far from the mean have extremely low probabilities.

  5. Total Area Under the Curve:

    As with any continuous probability distribution, the total area under the probability density curve and above the horizontal axis is equal to 1. This represents the total probability of all possible outcomes, which must sum to 1 (or 100%).

    $$\int_{-\infty}^{\infty} f(x) dx = 1$$

    ... (ii)

    (Where $f(x)$ is the probability density function of the normal distribution).

  6. Empirical Rule (68-95-99.7 Rule):

    A very useful property for interpreting normal distributions is the empirical rule, which states the approximate percentage of data that falls within certain standard deviations of the mean:

    • Approximately **68%** of the data falls within **one** standard deviation of the mean ($\mu \pm \sigma$).
    • Approximately **95%** of the data falls within **two** standard deviations of the mean ($\mu \pm 2\sigma$).
    • Approximately **99.7%** of the data falls within **three** standard deviations of the mean ($\mu \pm 3\sigma$).

    This rule provides a quick way to understand the spread of data in a normal distribution.

    Diagram illustrating the Empirical Rule for a normal distribution

These properties make the normal distribution mathematically tractable and applicable to a wide range of statistical problems.
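The percentages in the Empirical Rule can be checked numerically. The sketch below relies on the standard identity $P(|X - \mu| \le k\sigma) = \operatorname{erf}(k/\sqrt{2})$, which holds for any normal distribution; the helper name `prob_within` is hypothetical.

```python
import math

def prob_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

The result does not depend on $\mu$ or $\sigma$, which is why the rule applies to every normal distribution.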



Probability Density Function of Normal Distribution (Implicit)


Probability for Continuous Random Variables

For a **continuous random variable** $X$, the probability that $X$ takes any single specific value is zero, because $X$ can take uncountably many values. Probability for a continuous random variable is therefore defined over intervals.

The distribution of a continuous random variable is described by a **Probability Density Function (PDF)**, typically denoted by $f(x)$. The PDF does not give probabilities directly, but its value at any given point indicates the relative likelihood of the variable taking a value around that point.

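Key characteristics of a Probability Density Function $f(x)$ for a continuous random variable:

  • $f(x) \ge 0$ for all $x$.
  • The total area under the curve is 1: $\int_{-\infty}^{\infty} f(x) \, dx = 1$.
  • The probability that $X$ lies in an interval is the area under the curve over that interval: $P(a \le X \le b) = \int_{a}^{b} f(x) \, dx$.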


The Normal PDF Formula

The normal distribution $N(\mu, \sigma^2)$ is defined by a specific mathematical formula for its Probability Density Function, $f(x)$. This formula dictates the precise shape of the normal curve (bell shape, symmetry, etc.) for any given values of the mean ($\mu$) and standard deviation ($\sigma$).

The formula for the PDF of a normal distribution is:

$$f(x \, | \, \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2}$$

... (iii)

for $-\infty < x < \infty$, $\mu \in \mathbb{R}$, and $\sigma > 0$.

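Where:

  • $x$ is the value of the random variable.
  • $\mu$ is the mean of the distribution.
  • $\sigma$ is the standard deviation of the distribution.
  • $\pi \approx 3.14159$ and $e \approx 2.71828$ are mathematical constants.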

The integral of this function has no closed form in terms of elementary functions, so probabilities are computed numerically, using statistical software or tables. The formula nevertheless confirms that the distribution is completely determined by $\mu$ and $\sigma$: the basic shape is fixed, $\mu$ shifts the curve along the horizontal axis, and $\sigma$ stretches or compresses it. The constant $\frac{1}{\sigma \sqrt{2\pi}}$ is a normalizing constant that ensures the total area under the curve is 1.

In most introductory statistics applications, probabilities for a normal distribution are found using the Standard Normal Distribution (Z-scores) and Z-tables rather than direct integration of this formula.
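To make the formula concrete, here is a minimal sketch that evaluates the normal PDF directly and checks numerically that the area under the curve is close to 1. The function names and the parameter values ($\mu = 70$, $\sigma = 8$) are assumptions made for illustration, and the trapezoidal rule is just one simple way to approximate the integral.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x, written term by term from the PDF formula."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def trapezoid_area(f, a: float, b: float, n: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

mu, sigma = 70.0, 8.0  # assumed illustrative parameters

# Integrate over mu +/- 8 sigma; the tails beyond that are negligible.
area = trapezoid_area(lambda x: normal_pdf(x, mu, sigma), mu - 8 * sigma, mu + 8 * sigma)
peak = normal_pdf(mu, mu, sigma)  # height at the mean equals 1 / (sigma * sqrt(2*pi))

print(f"area under curve ~ {area:.6f}")
print(f"peak height      ~ {peak:.6f}")
```

The peak height equals the normalizing constant, since the exponential factor is 1 at $x = \mu$.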




Standard Normal Distribution and Z-scores


The Standard Normal Distribution ($Z$)

Calculating probabilities for every possible normal distribution $N(\mu, \sigma^2)$ by integrating its probability density function (PDF) formula directly would be impractical. To simplify probability calculations for normal distributions, we use a single, standardized normal distribution as a universal reference. This is called the **Standard Normal Distribution**.

Definition: The Standard Normal Distribution is a special case of the normal distribution where the **mean ($\mu$) is 0** and the **standard deviation ($\sigma$) is 1**. Consequently, its variance ($\sigma^2$) is also 1.

A random variable that follows the standard normal distribution is conventionally denoted by the letter $Z$. Thus, $Z \sim N(\mu=0, \sigma^2=1)$.

Graph of the Standard Normal Distribution curve

The properties of the standard normal distribution are the same as any normal distribution (bell shape, symmetry, unimodal, asymptotic tails, total area = 1), but its centering at 0 and spread of 1 make it convenient for standardization.


Standardization and Z-scores

The process of converting any value from an arbitrary normal distribution $X \sim N(\mu, \sigma^2)$ into a corresponding value on the standard normal distribution $Z \sim N(0, 1)$ is called **standardization**. The transformed value is known as a **Z-score** or a standard score.

The formula for standardizing an observation $x$ from a normal distribution with mean $\mu$ and standard deviation $\sigma$ is:

$$Z = \frac{x - \mu}{\sigma}$$

... (1)

Interpretation of a Z-score:

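A Z-score tells you exactly how many standard deviations an original observation $x$ is away from the mean $\mu$ of its distribution. The sign of the Z-score indicates whether the observation is above or below the mean:

  • $Z > 0$: the observation is above the mean.
  • $Z < 0$: the observation is below the mean.
  • $Z = 0$: the observation is equal to the mean.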

For example, a Z-score of $Z=2$ means the value $x$ is 2 standard deviations above the mean. A Z-score of $Z=-0.5$ means the value $x$ is half a standard deviation below the mean.

Purpose of Standardization:

The primary reason for standardization is to be able to use a single table (the Standard Normal or Z-table) or standard statistical functions in calculators/software to find probabilities for ANY normal distribution. Once an $X$ value is converted to its corresponding $Z$-score, the probability associated with that $X$ value (e.g., the probability of getting a value less than $x$) is the same as the probability associated with the $Z$-score in the standard normal distribution.
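The equivalence described above can be checked with a short sketch: the cumulative probability at $x$ under $N(\mu, \sigma^2)$ should match the cumulative probability at the corresponding Z-score under $N(0, 1)$. The parameter values ($\mu = 100$, $\sigma = 15$, $x = 120$) and the helper name `z_score` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x: number of standard deviations between x and the mean."""
    return (x - mu) / sigma

X = NormalDist(mu=100.0, sigma=15.0)  # assumed example distribution
Z = NormalDist()                      # standard normal: mean 0, standard deviation 1

x = 120.0
z = z_score(x, X.mean, X.stdev)

p_x = X.cdf(x)  # P(X <= x) computed on the original scale
p_z = Z.cdf(z)  # P(Z <= z) computed on the standardized scale

print(f"z = {z:.4f}")
print(f"P(X <= {x}) = {p_x:.6f}")
print(f"P(Z <= z)   = {p_z:.6f}")
```

The two probabilities agree up to floating-point rounding, which is what allows a single Z-table to serve every normal distribution.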


Example

Example 1. Scores on a statistics test are normally distributed with a mean of 70 and a standard deviation of 8. Find the Z-scores for students who scored:

(a) 78

(b) 62

(c) 70

Answer:

Given: Normal distribution with mean $\mu = 70$ and standard deviation $\sigma = 8$.

To Find: Z-scores for given scores.

Solution:

We use the standardization formula $Z = \frac{X - \mu}{\sigma}$ (Formula 1) with $\mu=70$ and $\sigma=8$.

**(a) Score $X = 78$:**

$$Z = \frac{78 - 70}{8}$$

... (ii)

$$Z = \frac{8}{8} = 1$$

... (iii)

The Z-score for a score of 78 is 1. This means a score of 78 is exactly 1 standard deviation above the mean.

**(b) Score $X = 62$:**

$$Z = \frac{62 - 70}{8}$$

... (iv)

$$Z = \frac{-8}{8} = -1$$

... (v)

The Z-score for a score of 62 is -1. This means a score of 62 is exactly 1 standard deviation below the mean.

**(c) Score $X = 70$:**

$$Z = \frac{70 - 70}{8}$$

... (vi)

$$Z = \frac{0}{8} = 0$$

... (vii)

The Z-score for a score of 70 is 0. This means a score of 70 is exactly at the mean.



Area Under the Normal Curve and its Interpretation (using Z-tables)


Area Represents Probability

For any continuous random variable, the probability that the variable falls within a specific range of values is represented by the **area under the probability density function (PDF) curve** over that range. For a normally distributed variable $X \sim N(\mu, \sigma^2)$:

The probability that $X$ is between values $a$ and $b$, denoted $P(a \le X \le b)$, is equal to the area under the normal curve between $x=a$ and $x=b$.

Shaded area under normal curve between a and b

Since the normal distribution is continuous, the probability of $X$ being exactly equal to any single value is zero ($P(X=x) = 0$). Therefore, for any constants $a$ and $b$, $P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$. The inclusion or exclusion of the endpoints does not affect the probability (area).


Using the Standard Normal (Z) Distribution for Probabilities

Because any normal distribution can be transformed into the standard normal distribution using $Z = (X - \mu) / \sigma$, we can calculate probabilities for any $X \sim N(\mu, \sigma^2)$ by finding the corresponding area under the standard normal curve $Z \sim N(0, 1)$.

The probability statement involving $X$ can be converted into an equivalent probability statement involving $Z$. For example, to find $P(a \le X \le b)$, we standardize the values $a$ and $b$ to obtain Z-scores $z_1 = (a-\mu)/\sigma$ and $z_2 = (b-\mu)/\sigma$. Then, the probability $P(a \le X \le b)$ is equal to the area under the standard normal curve between $z_1$ and $z_2$, i.e., $P(z_1 \le Z \le z_2)$.

$$P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right) = P(z_1 \le Z \le z_2)$$

... (iii)


Standard Normal (Z) Tables

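Standard Normal tables, commonly known as **Z-tables**, provide pre-calculated areas under the standard normal curve $Z \sim N(0, 1)$ for various Z-scores. The type of table used is important:

  • A **cumulative (area-to-the-left) table** gives $P(Z \le z)$, the area under the curve to the left of $z$.
  • Some tables instead give the area between the mean and $z$, i.e., $P(0 \le Z \le z)$ for $z \ge 0$; with such a table, add 0.5 to obtain $P(Z \le z)$.

Assuming we use a Z-table that provides $P(Z \le z)$ (area to the left):

  • $P(Z < z) = P(Z \le z)$ is read directly from the table.
  • $P(Z > z) = 1 - P(Z \le z)$, by the complement rule.
  • $P(z_1 \le Z \le z_2) = P(Z \le z_2) - P(Z \le z_1)$.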

Modern statistical calculators and software can compute normal probabilities directly given $\mu$, $\sigma$, and the interval, without requiring manual Z-score conversion or table lookups.
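As a rough sketch of what such software does internally, the area to the left of $z$ can be written in terms of the error function, $P(Z \le z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The helper names below (`phi`, `right_tail`, `between`) are hypothetical and simply mirror the area-to-the-left table convention.

```python
import math

def phi(z: float) -> float:
    """Area to the left of z under the standard normal curve, P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def right_tail(z: float) -> float:
    """P(Z > z), obtained with the complement rule."""
    return 1.0 - phi(z)

def between(z1: float, z2: float) -> float:
    """P(z1 <= Z <= z2), obtained as a difference of two left-tail areas."""
    return phi(z2) - phi(z1)

# Reproduce a few Z-table entries (area to the left).
for z in (-1.0, 0.0, 0.5, 1.0, 1.96):
    print(f"z = {z:5.2f}   P(Z <= z) = {phi(z):.4f}")
# z = -1.00   P(Z <= z) = 0.1587
# z =  0.00   P(Z <= z) = 0.5000
# z =  0.50   P(Z <= z) = 0.6915
# z =  1.00   P(Z <= z) = 0.8413
# z =  1.96   P(Z <= z) = 0.9750
```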


Example

Example 1. Using the test score data from Example 1 in the section on Z-scores (normally distributed scores with $\mu = 70$ and $\sigma = 8$, i.e., $X \sim N(70, 8^2)$), find the probability that a randomly selected student scored:

(a) Less than 78

(b) More than 62

(c) Between 62 and 78

Answer:

Given: Test scores are normally distributed with mean $\mu=70$ and standard deviation $\sigma=8$.

To Find: Probabilities for specific score ranges.

Solution:

We convert the given X scores to Z-scores using the formula $Z = (X - \mu) / \sigma$. As computed in Example 1 of the section on Z-scores:

  • A score of $X=78$ corresponds to $Z = \frac{78 - 70}{8} = \frac{8}{8} = 1$.
  • A score of $X=62$ corresponds to $Z = \frac{62 - 70}{8} = \frac{-8}{8} = -1$.
  • A score of $X=70$ corresponds to $Z = \frac{70 - 70}{8} = \frac{0}{8} = 0$.

**(a) Probability of scoring less than 78:** $P(X < 78)$.

Convert the inequality to Z-scores:

$$P(X < 78) = P\left(Z < \frac{78 - 70}{8}\right) = P(Z < 1)$$

... (iv)

Using a standard normal (Z) table that gives area to the left (or a calculator function):

$$P(Z < 1) = P(Z \le 1) \approx 0.8413$$

(From Z-table for Z=1.00) ... (v)

The probability of scoring less than 78 is approximately 0.8413.

**(b) Probability of scoring more than 62:** $P(X > 62)$.

Convert the inequality to Z-scores:

$$P(X > 62) = P\left(Z > \frac{62 - 70}{8}\right) = P(Z > -1)$$

... (vi)

Using the complement rule $P(Z > -1) = 1 - P(Z \le -1)$. Look up $Z=-1.00$ in the table:

$$P(Z \le -1) \approx 0.1587$$

(From Z-table for Z=-1.00) ... (vii)

$$P(X > 62) = 1 - P(Z \le -1) \approx 1 - 0.1587 = 0.8413$$

... (viii)

The probability of scoring more than 62 is approximately 0.8413. (Note the symmetry: for a distribution symmetric about $\mu$, $P(X > \mu - c) = P(X < \mu + c)$.)

**(c) Probability of scoring between 62 and 78:** $P(62 \le X \le 78)$.

Convert the interval to Z-scores:

$$P(62 \le X \le 78) = P\left(\frac{62 - 70}{8} \le Z \le \frac{78 - 70}{8}\right) = P(-1 \le Z \le 1)$$

... (ix)

Using the property for area between two Z-scores: $P(-1 \le Z \le 1) = P(Z \le 1) - P(Z \le -1)$.

We already found $P(Z \le 1) \approx 0.8413$ (from v) and $P(Z \le -1) \approx 0.1587$ (from vii).

$$P(62 \le X \le 78) \approx 0.8413 - 0.1587 = 0.6826$$

... (x)

The probability of scoring between 62 and 78 is approximately 0.6826.

This result is consistent with the Empirical Rule, which states that approximately 68% of data in a normal distribution falls within one standard deviation of the mean (here $\mu \pm \sigma = 70 \pm 8$, i.e., the interval from 62 to 78).

Summary of results:

(a) $P(X < 78) \approx 0.8413$

(b) $P(X > 62) \approx 0.8413$

(c) $P(62 \le X \le 78) \approx 0.6826$
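These results can be cross-checked without a table. The sketch below uses Python's standard-library `statistics.NormalDist` as one possible tool; the variable names are hypothetical.

```python
from statistics import NormalDist

scores = NormalDist(mu=70, sigma=8)  # test scores: mean 70, standard deviation 8

p_less_78 = scores.cdf(78)                   # (a) P(X < 78)
p_more_62 = 1 - scores.cdf(62)               # (b) P(X > 62), complement rule
p_between = scores.cdf(78) - scores.cdf(62)  # (c) P(62 <= X <= 78)

print(round(p_less_78, 4))  # 0.8413
print(round(p_more_62, 4))  # 0.8413
print(round(p_between, 4))  # 0.6827
```

The last value comes out as 0.6827 rather than the table-based 0.6826 because the table entries 0.8413 and 0.1587 are each rounded to four decimal places before subtracting.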