AP Statistics
Sections: 1. Density Curves | 2. Normal Distributions | 3. Normal Distribution Calculations | 4. Assessing Normality

  Density Curves

Having worked through the first unit, we now have some basic tools for describing distributions. We should have a clear strategy for exploring data on a single quantitative variable (univariate data). Below are a few reminders.

Always plot your data: make a graph, usually a histogram or a stemplot.

Look for the overall pattern (shape, center, spread) and for striking deviations such as outliers.

Calculate a numerical summary to briefly describe the center and spread.

Here is one more step to add to the strategy:

Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve.

Below is a histogram with a smooth curve superimposed. Don't worry about the source; the graph is used only to illustrate the idea above. The smooth curve is a mathematical model for the distribution. A mathematical model is an idealized description of the data: it gives a picture of the overall pattern while ignoring slight irregularities.

 

Figure 2.1

The width chosen for the classes of a histogram affects its final shape. Here is another example with wider class intervals.

Figure 2.2
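
To see the effect of class width for ourselves, here is a minimal Python sketch. It does not use the data behind Figures 2.1 and 2.2, which we do not have; the 1000 values are made up (simulated with an assumed center of 50 and spread of 10), and the function name histogram_counts is our own. The same data are counted twice, once with classes of width 5 and once with classes of width 20.

```python
import random

def histogram_counts(data, start, width, n_classes):
    """Count the observations in each class [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        k = int((x - start) // width)  # index of the class that contains x
        if 0 <= k < n_classes:         # ignore values outside the plotted range
            counts[k] += 1
    return counts

random.seed(1)  # fixed seed so the sketch is repeatable
data = [random.gauss(50, 10) for _ in range(1000)]  # hypothetical data, not the course data

narrow = histogram_counts(data, start=0, width=5, n_classes=20)
wide = histogram_counts(data, start=0, width=20, n_classes=5)

# Print a rough text histogram for each choice of class width.
for label, counts, width in (("width 5", narrow, 5), ("width 20", wide, 20)):
    print(label)
    for k, c in enumerate(counts):
        print(f"{k * width:3d}-{(k + 1) * width:3d} | " + "#" * (c // 10))
```

Both versions cover the same range and count the same observations, yet the narrow classes show a detailed, slightly ragged bell shape while the wide classes show only a few tall blocks.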

Any curve that is always on or above the horizontal axis and has total area underneath equal to one is a density curve.

 

 A density curve is a smooth curve that:                  

  • is always on or above the horizontal axis

  • has an area of exactly 1 underneath it.
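
The two conditions can be checked numerically for a specific curve. The sketch below is only an illustration under assumptions: it uses the standard formula for a bell-shaped normal curve, which this lesson has not introduced, and the helper names normal_density and area_under are our own. The area is approximated by adding up many thin rectangles.

```python
import math

def normal_density(x, mu=0.0, sigma=1.0):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under(curve, lo, hi, steps=100_000):
    """Approximate the area under curve between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(curve(lo + (i + 0.5) * width) for i in range(steps)) * width

# Condition 1: the curve is never below the horizontal axis (checked on a grid).
never_negative = all(normal_density(-10 + 0.01 * i) >= 0 for i in range(2001))

# Condition 2: the total area is 1 (the tails beyond -10 and 10 are negligible).
total_area = area_under(normal_density, -10, 10)

print(f"never below the axis: {never_negative}")
print(f"total area: {total_area:.6f}")
```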
     

The density curves in Figures 2.1 and 2.2 are normal curves. Notice that they are not identical; that is, there are any number of normal curves. For now, "normal" means that the curve is symmetric about its center and lies on or above the x-axis.

In the first unit we looked at shapes of graphs, i.e., symmetry and skewness. Figure 2.3 shows three density curves: a symmetric normal density curve, a left-skewed curve, and a right-skewed curve, with the relative placement of the mean and median marked on each curve. A density curve gives an approximate shape and is often an adequate description of the overall pattern of the distribution.

Figure 2.3(a)                         Figure 2.3(b)                     Figure 2.3(c)

Density curves do not show outliers, since the "smoothing" absorbs individual points into the curve. No set of real data is exactly described by a density curve; the curve is an approximation that is easy to use for practical calculations, as we will see shortly.

The median and mean of a density curve.

Our measures of center and spread apply to density curves as well as to actual sets of observations. The median and quartiles are easy. Areas under a density curve represent proportions of the observations, so the median is the equal-areas point, with half the area under the curve on either side, and the quartiles divide the area under the curve into quarters. One-fourth of the area under the curve is to the left of the first quartile, and three-fourths of the area is to the left of the third quartile. You can roughly locate the median and quartiles of any density curve by eye by dividing the area under the curve into four equal parts.
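
The equal-areas idea can be turned into a small calculation. The sketch below assumes a made-up left-skewed density, with height 2x for x between 0 and 1 (a triangle with total area 1); it is not one of the curves in Figure 2.3, and the function names are our own. Starting from the left, it adds up thin strips of area until one-fourth, one-half, and three-fourths of the total has been collected.

```python
def density(x):
    """Hypothetical left-skewed density: height 2x for 0 <= x <= 1, zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def point_with_area_to_left(target, lo=0.0, hi=1.0, steps=100_000):
    """Walk left to right, adding thin strips, until the target area is reached."""
    width = (hi - lo) / steps
    area = 0.0
    for i in range(steps):
        area += density(lo + (i + 0.5) * width) * width
        if area >= target:
            return lo + (i + 1) * width  # right edge of the strip that got us there
    return hi

q1 = point_with_area_to_left(0.25)
median = point_with_area_to_left(0.50)
q3 = point_with_area_to_left(0.75)

print(f"Q1 = {q1:.3f}, median = {median:.3f}, Q3 = {q3:.3f}")
# prints: Q1 = 0.500, median = 0.707, Q3 = 0.866
```

For this particular curve the area to the left of a point m is m times m, so the exact answers are the square roots of 0.25, 0.50, and 0.75, which agree with the strip-by-strip results.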

Because density curves are idealized patterns, a symmetric density curve is exactly symmetric. The median of a symmetric density curve is therefore at its center. Figure 2.3(a) shows the median of a symmetric curve. It isn’t so easy to spot the equal-areas point on a skewed curve. There are mathematical ways of finding the median for any density curve. We did that to mark the median on the skewed curves in Figure 2.3(b) and Figure 2.3(c).

What about the mean? The mean of a set of observations is their arithmetic average. If we think of the observations as weights strung out along a thin rod, the mean is the point at which the rod would balance. This fact is also true of density curves. The mean is the point at which the curve would balance if made of solid material. Figure 2.4 illustrates this fact about the mean. A symmetric curve balances at its center because the two sides are identical. The mean and median of a symmetric density curve are equal, as in Figure 2.3(a). We know that the mean of a skewed distribution is pulled toward the long tail. Figures 2.3(b) and 2.3(c) show how the mean of a skewed density curve is pulled toward the long tail more than the median is. It’s hard to locate the balance point by eye on a skewed curve. There are mathematical ways of calculating the mean for any density curve, so we are able to mark the mean as well as the median in Figure 2.3.

Figure 2.4
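
One way to compute a balance point is to weight each position by the thin strip of area sitting above it and add the results. The sketch below does this for a made-up left-skewed density with height 2x between 0 and 1; this curve and the function names are assumptions for illustration, not the curves drawn in Figures 2.3 and 2.4.

```python
import math

def density(x):
    """Hypothetical left-skewed density: height 2x for 0 <= x <= 1, zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def balance_point(curve, lo, hi, steps=100_000):
    """Approximate the mean: add up (position) x (thin strip of area at that position)."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += x * curve(x) * width
    return total

mean = balance_point(density, 0.0, 1.0)
median = math.sqrt(0.5)  # equal-areas point: the area to the left of m is m*m = 0.5

print(f"mean = {mean:.3f}, median = {median:.3f}")
# prints: mean = 0.667, median = 0.707
```

The long tail of this curve is on the left, and the mean (about 0.667) falls to the left of the median (about 0.707), as the text describes.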

Median and mean of a density curve

Median: The equal-areas point with 50% of the “mass” on either side.

Mean: The balancing point of the curve, if it were a solid mass.

The mean and median of a symmetric density curve are equal.

The mean of a skewed curve is pulled away from the median in the direction of the long tail.

We can roughly locate the mean, median, and quartiles of any density curve by eye. This is not true of the standard deviation. When necessary, we can once again call on more advanced mathematics to learn the value of the standard deviation. The study of mathematical methods for doing calculations with density curves is part of theoretical statistics. Though we are concentrating on statistical practice, we often make use of the results of mathematical study.
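
As a glimpse of what that mathematics does, here is a hedged numerical sketch. It assumes the usual definition of the standard deviation of a density curve, the square root of the area-weighted average of squared distances from the mean, which this lesson does not state. It again uses a made-up density with height 2x between 0 and 1, and the helper name integrate is our own.

```python
import math

def density(x):
    """Hypothetical left-skewed density: height 2x for 0 <= x <= 1, zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(g, lo, hi, steps=100_000):
    """Approximate the area under g between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * width) for i in range(steps)) * width

# Mean: positions weighted by the density.
mu = integrate(lambda x: x * density(x), 0.0, 1.0)

# Variance: squared distances from the mean, weighted by the density.
variance = integrate(lambda x: (x - mu) ** 2 * density(x), 0.0, 1.0)
sigma = math.sqrt(variance)

print(f"mean = {mu:.4f}, standard deviation = {sigma:.4f}")
# prints: mean = 0.6667, standard deviation = 0.2357
```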

Because a density curve is an idealized description of the distribution of data, we need to distinguish between the mean and standard deviation of the density curve and the mean x̄ (read "x-bar") and standard deviation s computed from the actual observations. The usual notation for the mean of an idealized distribution is μ (the Greek letter mu). We write the standard deviation of a density curve as σ (the Greek letter sigma).
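
The distinction shows up clearly in a simulation. In the sketch below the values 100 and 15 for the curve's mean and standard deviation are made up, as is the sample size of 500; the observations are simulated from a normal curve with Python's random.gauss rather than taken from real data.

```python
import random
import statistics

MU, SIGMA = 100.0, 15.0  # hypothetical mean and standard deviation of the density curve

random.seed(2004)  # fixed seed so the sketch is repeatable
sample = [random.gauss(MU, SIGMA) for _ in range(500)]  # simulated observations

x_bar = statistics.mean(sample)  # mean of the actual observations
s = statistics.stdev(sample)     # sample standard deviation of the observations

print(f"density curve: mu = {MU}, sigma = {SIGMA}")
print(f"observations:  x-bar = {x_bar:.2f}, s = {s:.2f}")
```

The values of x-bar and s should land close to μ and σ but will not match them exactly, and they change whenever a different sample is drawn, while μ and σ stay fixed.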

Try Self-Check 8

Proceed to Statistics Assignment 4: Data Distributions

© 2004 Aventa Learning. All rights reserved.