

Content On This Page
  • Histograms: Construction and Interpretation (for Grouped Data)
  • Frequency Polygon: Construction and Interpretation
  • Graphic Representation of Data (Histograms & Frequency Polygon for Frequency Distributions)


Graphical Representation: Frequency Distributions




Histograms: Construction and Interpretation (for Grouped Data)


Definition and Characteristics

A histogram is a powerful graphical tool used to represent the distribution of a continuous numerical variable. It is constructed from a grouped frequency distribution. Unlike bar graphs, which are used for discrete or categorical data and have spaces between bars, histograms feature adjacent rectangular bars, symbolizing the continuous nature of the data.

Key characteristics of a histogram:


Construction of a Histogram (for Equal Class Widths)

To construct a histogram when the class intervals have equal width, follow these steps:

  1. Prepare the Frequency Distribution Table:

    Ensure you have a grouped frequency distribution table where the class intervals are continuous. Ideally, use the exclusive method (e.g., $0-10, 10-20, 20-30$). If your data is in inclusive intervals (e.g., $0-9, 10-19$), convert them into continuous class boundaries before plotting. The lower boundary is obtained by subtracting the adjustment factor (half the gap between upper limit of a class and lower limit of the next) from the lower limit, and the upper boundary is obtained by adding the adjustment factor to the upper limit. For $0-9, 10-19$, the boundaries are $0-0.5 = -0.5$ to $9+0.5 = 9.5$, and $10-0.5 = 9.5$ to $19+0.5 = 19.5$, and so on.

  2. Draw and Label Axes:

    Draw two perpendicular lines: the horizontal axis (x-axis) and the vertical axis (y-axis). Label the x-axis with the variable being measured (e.g., Weight in kg, Marks, Height in cm). Mark the class boundaries on the x-axis.

    Label the y-axis as "Frequency" or "Number of...".

  3. Choose a Scale:

    Select a suitable scale for the y-axis such that the highest frequency fits comfortably on the graph. The scale must start from 0 and use equal increments.

    Select an appropriate scale for the x-axis based on the range of class boundaries. If the first class interval does not start from 0, and there is a significant gap between 0 and the first lower boundary, you can indicate this break in the axis using a "kink" or a jagged line near the origin on the x-axis. This is done to avoid wasting space while still accurately representing the relative positions of the class intervals.

  4. Draw the Bars:

    For each class interval listed in your table, draw a rectangular bar:

    • The base of the bar lies on the x-axis and extends from the lower class boundary to the upper class boundary of that interval.
    • The height of the bar equals the frequency of that class interval: draw the bar vertically from the x-axis up to the level that corresponds to this frequency on the y-axis scale.

    Since the class boundaries of consecutive intervals meet, the bars will be drawn adjacent to each other with no gaps.

  5. Add a Title:

    Give the histogram a clear, descriptive title that summarizes the data being displayed.
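The conversion in step 1, from inclusive class limits to continuous class boundaries, can be sketched in Python. This is a minimal illustration, not part of the original procedure: the function name `class_boundaries` is hypothetical, the third class $20-29$ is added only to continue the "and so on" pattern, and the sketch assumes the gap between consecutive classes is the same throughout.

```python
def class_boundaries(limits):
    """Convert inclusive class limits to continuous class boundaries.

    `limits` is a list of (lower, upper) pairs in ascending order.
    Assumes the gap between consecutive classes is constant.
    """
    if len(limits) < 2:
        raise ValueError("need at least two classes to measure the gap")
    # Gap between the upper limit of one class and the lower limit of the next.
    gap = limits[1][0] - limits[0][1]
    adjustment = gap / 2  # the adjustment factor described in step 1
    return [(lower - adjustment, upper + adjustment) for lower, upper in limits]


inclusive = [(0, 9), (10, 19), (20, 29)]
boundaries = class_boundaries(inclusive)
print(boundaries)  # [(-0.5, 9.5), (9.5, 19.5), (19.5, 29.5)]
```

Each upper boundary coincides with the next lower boundary, which is what lets the bars touch.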

Note: Constructing a histogram for data with unequal class widths requires calculating "Frequency Density" (Frequency / Class Width) for each class and plotting it on the y-axis, ensuring the area of the bar remains proportional to the frequency. However, for basic histograms, equal class widths are commonly used.
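The frequency-density calculation mentioned in the note can be sketched as follows. The data are hypothetical (the text gives no example with unequal widths), and `frequency_densities` is an illustrative name rather than an established function.

```python
def frequency_densities(classes):
    """Return (lower, upper, density) for each (lower, upper, frequency)."""
    result = []
    for lower, upper, freq in classes:
        width = upper - lower
        if width <= 0:
            raise ValueError("class width must be positive")
        # Frequency density = frequency / class width.
        result.append((lower, upper, freq / width))
    return result


# Hypothetical distribution with unequal class widths (not from the text).
classes = [(0, 10, 4), (10, 20, 6), (20, 40, 10), (40, 70, 6)]
densities = frequency_densities(classes)
for lower, upper, density in densities:
    print(f"{lower}-{upper}: density = {density:.2f}")

# The area of each bar (width * density) recovers the class frequency.
areas = [(upper - lower) * density for lower, upper, density in densities]
```

In this sketch the class $20-40$ has the highest frequency (10) but not the tallest bar: its density is 0.5, below the 0.6 of the class $10-20$. This is why bar heights alone would mislead if raw frequencies were plotted over unequal widths.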


Example

Example 1. Draw a histogram for the following frequency distribution of student weights:

| Weight (kg) | Frequency (f) |
| --- | --- |
| 40 - 45 | 2 |
| 45 - 50 | 5 |
| 50 - 55 | 5 |
| 55 - 60 | 7 |
| 60 - 65 | 6 |
| 65 - 70 | 4 |
| 70 - 75 | 1 |
| Total | 30 |

Answer:

Given: Grouped frequency distribution table for student weights.

To Construct: A histogram.

Solution:

The class intervals ($40-45, 45-50, \ldots, 70-75$) are continuous (exclusive method) and have equal width (5 kg). The class boundaries are the limits themselves (40, 45, 50, ..., 75).

We set up the histogram:

  • Horizontal Axis (x-axis): Represents "Weight (kg)". We mark the class boundaries 40, 45, 50, 55, 60, 65, 70, 75 on this axis. Since the first class starts at 40 rather than 0, we either show the origin and mark a kink between 0 and 40, or start the axis at or just before 40 so that the shape of the distribution is easier to see.
  • Vertical Axis (y-axis): Represents "Frequency" or "Number of Students". The highest frequency is 7. We can choose a scale from 0 up to 8 or 10 with increments of 1 or 2. Let's use increments of 1: 0, 1, 2, 3, 4, 5, 6, 7, 8.
  • Bars: We draw rectangular bars for each class interval:
    • For 40-45, base from 40 to 45 on x-axis, height 2.
    • For 45-50, base from 45 to 50, height 5.
    • For 50-55, base from 50 to 55, height 5.
    • For 55-60, base from 55 to 60, height 7.
    • For 60-65, base from 60 to 65, height 6.
    • For 65-70, base from 65 to 70, height 4.
    • For 70-75, base from 70 to 75, height 1.
  • Title: "Distribution of Student Weights".
Figure: Histogram showing the distribution of student weights (title: "Distribution of Student Weights").
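As a rough check of the construction, the bars of this example can be listed and drawn as text. This is only a sketch: the bars are drawn sideways (one row per class, with the row length standing in for the bar height), and `histogram_rows` is a hypothetical helper name.

```python
# Class boundaries and frequencies from the student-weights example.
boundaries = [40, 45, 50, 55, 60, 65, 70, 75]
frequencies = [2, 5, 5, 7, 6, 4, 1]


def histogram_rows(boundaries, frequencies):
    """Return one text row per class: 'lower-upper |####... frequency'."""
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need exactly one more boundary than frequencies")
    rows = []
    for i, freq in enumerate(frequencies):
        label = f"{boundaries[i]}-{boundaries[i + 1]}"
        rows.append(f"{label:>7} |{'#' * freq} {freq}")
    return rows


rows = histogram_rows(boundaries, frequencies)
print("Distribution of Student Weights")
for row in rows:
    print(row)

# The tallest bar identifies the class with the highest frequency.
tallest = frequencies.index(max(frequencies))
modal_class = (boundaries[tallest], boundaries[tallest + 1])
print("Highest bar:", modal_class)  # Highest bar: (55, 60)
```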


Interpretation of a Histogram

A histogram provides valuable insights into the characteristics of the data distribution:

Histograms are invaluable for the initial exploration of continuous data, helping statisticians and analysts get a feel for the data's underlying distribution before performing more detailed numerical analysis.



Frequency Polygon: Construction and Interpretation


Definition and Purpose

A frequency polygon is another graphical method used to represent frequency distributions, particularly for grouped data. It is essentially a line graph created by plotting points that represent the frequency of each class interval at its midpoint (class mark) and then connecting these points with straight line segments.

Frequency polygons serve similar purposes to histograms but can be more advantageous for:


Construction of a Frequency Polygon

A frequency polygon can be constructed using two common methods:

Method 1: Using a Histogram

  1. First, construct the histogram for the given grouped frequency distribution.
  2. Mark a point at the midpoint of the top of each rectangular bar in the histogram.
  3. Connect these marked midpoints consecutively with straight line segments.
  4. To "close" the polygon and bring the line down to the x-axis, imagine a class interval immediately before the first actual class and one immediately after the last actual class, each having a frequency of 0. Find the class marks of these two imaginary classes. Connect the midpoint of the top of the first bar to the class mark of the preceding imaginary class on the x-axis. Similarly, connect the midpoint of the top of the last bar to the class mark of the succeeding imaginary class on the x-axis.

Method 2: Using the Frequency Table Directly

  1. Calculate Class Marks:

    For each class interval in your grouped frequency distribution, calculate the class mark (midpoint) using the formula:

    $\text{Class Mark} (x_i) = \frac{\text{Lower Limit} + \text{Upper Limit}}{2}$

    If you are using class boundaries, use the boundaries instead of limits in the formula.

  2. Add Imaginary Classes:

    To close the polygon on the x-axis, add two imaginary class intervals, one before the first class and one after the last class. The width of these imaginary classes should be the same as the actual classes. Assign a frequency of 0 to both these imaginary classes and calculate their class marks.

    For example, if the first class is $40-45$ (class mark 42.5) and the width is 5, the preceding imaginary class would be $35-40$ with class mark 37.5 and frequency 0.

  3. Draw and Label Axes:

    Draw two perpendicular axes. Label the horizontal axis (x-axis) with the variable and mark the class marks on this axis. Label the vertical axis (y-axis) as "Frequency" and choose an appropriate scale starting from 0.

  4. Plot Points:

    Plot points corresponding to the class mark and frequency of each class interval. Each point will have coordinates $(x_i, f)$, where $x_i$ is the class mark and $f$ is the frequency. Remember to plot the points for the two imaginary classes as well; these points will lie on the x-axis since their frequency is 0.

  5. Connect Points:

    Connect the plotted points consecutively from left to right using straight line segments.

  6. Add a Title:

    Give the frequency polygon a clear and informative title.
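Steps 1, 2 and 4 of Method 2 can be sketched in Python using the student-weights distribution from this page. The function name `polygon_points` is illustrative, and the sketch assumes equal class widths, as the method does.

```python
def polygon_points(classes):
    """Return the (class mark, frequency) points of a closed frequency polygon.

    `classes` is a list of (lower, upper, frequency) in ascending order,
    assumed to have equal class widths.
    """
    width = classes[0][1] - classes[0][0]
    first_lower = classes[0][0]
    last_upper = classes[-1][1]
    # Step 2: imaginary classes with frequency 0 before and after the data.
    padded = (
        [(first_lower - width, first_lower, 0)]
        + list(classes)
        + [(last_upper, last_upper + width, 0)]
    )
    # Steps 1 and 4: class mark = (lower + upper) / 2, paired with frequency.
    return [((lower + upper) / 2, freq) for lower, upper, freq in padded]


weights = [
    (40, 45, 2), (45, 50, 5), (50, 55, 5), (55, 60, 7),
    (60, 65, 6), (65, 70, 4), (70, 75, 1),
]
points = polygon_points(weights)
print(points[0], points[1], points[-1])  # (37.5, 0) (42.5, 2) (77.5, 0)
```

Joining consecutive points with straight segments (step 5) gives the polygon. The first and last points lie on the x-axis because the imaginary classes have frequency 0.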


Example

Example 1. Construct a frequency polygon for the following frequency distribution of student weights (using Method 2):

| Weight (kg) | Frequency (f) |
| --- | --- |
| 40 - 45 | 2 |
| 45 - 50 | 5 |
| 50 - 55 | 5 |
| 55 - 60 | 7 |
| 60 - 65 | 6 |
| 65 - 70 | 4 |
| 70 - 75 | 1 |
| Total | 30 |

Answer:

Given: Grouped frequency distribution table for student weights.

To Construct: A frequency polygon using Method 2.

Solution:

The class intervals are $40-45, 45-50, \ldots, 70-75$, with a constant width of 5.

1. Calculate Class Marks:

| Weight (kg) | Frequency (f) | Class Mark ($x_i$) |
| --- | --- | --- |
| 40 - 45 | 2 | $\frac{40+45}{2} = 42.5$ |
| 45 - 50 | 5 | $\frac{45+50}{2} = 47.5$ |
| 50 - 55 | 5 | $\frac{50+55}{2} = 52.5$ |
| 55 - 60 | 7 | $\frac{55+60}{2} = 57.5$ |
| 60 - 65 | 6 | $\frac{60+65}{2} = 62.5$ |
| 65 - 70 | 4 | $\frac{65+70}{2} = 67.5$ |
| 70 - 75 | 1 | $\frac{70+75}{2} = 72.5$ |

2. Add Imaginary Classes and their Class Marks:

  • Class before 40-45: Width is 5. Lower limit $= 40 - 5 = 35$. Upper limit $= 40$. Imaginary class: $35-40$. Class Mark $= \frac{35+40}{2} = 37.5$. Frequency = 0.
  • Class after 70-75: Width is 5. Lower limit $= 75$. Upper limit $= 75 + 5 = 80$. Imaginary class: $75-80$. Class Mark $= \frac{75+80}{2} = 77.5$. Frequency = 0.

The points to plot are (37.5, 0), (42.5, 2), (47.5, 5), (52.5, 5), (57.5, 7), (62.5, 6), (67.5, 4), (72.5, 1), and (77.5, 0).

3. Draw and Label Axes: X-axis: Weight (kg), Y-axis: Frequency (Number of Students). Mark the class marks (37.5, 42.5, ..., 77.5) on the x-axis. Scale the y-axis from 0 to 8 or 10, similar to the histogram.

4. Plot Points: Plot the nine points calculated above on the graph paper.

5. Connect Points: Join the points with straight line segments in the order they appear on the x-axis, starting from (37.5, 0) and ending at (77.5, 0).

6. Title: "Frequency Polygon of Student Weights".

Figure: Frequency polygon showing the distribution of student weights (title: "Frequency Polygon of Student Weights").


Interpretation of a Frequency Polygon

Frequency polygons offer similar insights to histograms regarding the data distribution, but with a different visual emphasis:




Graphic Representation of Data (Histograms & Frequency Polygon for Frequency Distributions)

Both histograms and frequency polygons are powerful graphical methods used to represent the distribution of data, particularly when dealing with grouped frequency distributions derived from continuous or large datasets. While they depict the same underlying data, they do so in different visual formats, each emphasizing slightly different aspects of the distribution.


Comparing Histograms and Frequency Polygons

Understanding the distinctions and similarities between histograms and frequency polygons is key to choosing the appropriate graph for a given purpose and for interpreting them correctly. Here is a comparison of their main features:

| Feature | Histogram | Frequency Polygon |
| --- | --- | --- |
| Visual Form | Consists of adjacent rectangular bars. The base of each bar represents a class interval, and the height represents the frequency (or frequency density). | A line graph formed by plotting points at the class marks (midpoints) of the intervals against their corresponding frequencies, and connecting these points with straight lines. The polygon is typically closed by connecting to the x-axis. |
| Data Point Representation | Shows the frequency of data falling within the entire range of a class interval (represented by the bar's area/height). | Represents the frequency of the class interval at a single point, its class mark. The line segments connect these representative points. |
| Emphasis | Emphasizes the frequency or count of observations within specific intervals. The area of each bar is key for understanding the proportion within each bin. | Emphasizes the continuous nature and the overall shape, trend, and smoothness of the distribution. Facilitates identifying the mode(s) and seeing how frequency changes gradually. |
| Continuity | The absence of gaps between bars explicitly indicates the continuous nature of the underlying variable (when using class boundaries). | The connecting line segments imply a continuous change in frequency across the range of the variable. |
| Comparison of Multiple Distributions | Overlaying multiple histograms on the same graph can become visually cluttered and difficult to read, especially with many classes or datasets. | Overlaying multiple frequency polygons is much clearer, making it easier to compare the shapes, centers, and spreads of several distributions on the same graph. |
| Construction Basis | Constructed directly from the class intervals or class boundaries and their frequencies. | Requires calculating the class mark (midpoint) for each class interval and uses these midpoints along with frequencies for plotting. |
| Area Under the Graph | The total area of all bars in a histogram is proportional to the total frequency (total number of observations). If using frequency density, the area equals the total frequency. | When the class widths are equal and the polygon is closed on the x-axis, the total area under the frequency polygon equals the total area of the corresponding histogram and thus represents the total frequency. |
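The area comparison can be checked numerically for the student-weights data used in the examples, which has equal class widths of 5 kg. The sketch below sums the bar areas of the histogram and applies the trapezoid rule to the closed frequency polygon. The variable names are illustrative.

```python
# Student-weights example: class boundaries, frequencies, and polygon points.
boundaries = [40, 45, 50, 55, 60, 65, 70, 75]
frequencies = [2, 5, 5, 7, 6, 4, 1]
points = [
    (37.5, 0), (42.5, 2), (47.5, 5), (52.5, 5), (57.5, 7),
    (62.5, 6), (67.5, 4), (72.5, 1), (77.5, 0),
]

# Histogram: sum of (class width * frequency) over all bars.
histogram_area = sum(
    (boundaries[i + 1] - boundaries[i]) * freq
    for i, freq in enumerate(frequencies)
)

# Frequency polygon: trapezoid rule over consecutive plotted points.
polygon_area = sum(
    (x2 - x1) * (y1 + y2) / 2
    for (x1, y1), (x2, y2) in zip(points, points[1:])
)

print(histogram_area, polygon_area)  # 150 150.0
```

Both areas equal the class width times the total frequency, $5 \times 30 = 150$. Each segment of the polygon cuts a triangle off one bar and adds a triangle of the same size to its neighbour.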

Choosing the Appropriate Graph

The choice between using a histogram and a frequency polygon depends on the specific goal of the data visualization:

In many analyses, both graphs can be complementary. A frequency polygon can even be drawn on top of a histogram by connecting the midpoints of the tops of the bars (and extending to the x-axis), illustrating their close relationship and providing both the interval-specific view and the overall shape trend.