AP Statistics

Simulating Experiments

Toss a coin 10 times. What is the likelihood of a run of 3 or more consecutive heads or tails? A couple plans to have children until they have a girl or until they have four children, whichever comes first. What are the chances that they will have a girl among their children? An airline knows from past experience that a certain percentage of customers who have purchased tickets will not show up to board the airplane. If the airline “overbooks” a particular flight (i.e., sells more tickets than they have seats), what are the chances that the airline will encounter more ticketed passengers than they have seats for? There are three methods we can use to answer questions involving chance like these:

1. Try to estimate the likelihood of a result of interest by actually carrying out the experiment many times and calculating the result’s relative frequency. That’s slow, sometimes costly, and often impractical or logistically difficult.

2. Develop a probability model and use it to calculate a theoretical answer. This requires that we know something about the rules of probability and therefore may not be feasible. (We will develop a probability model in the next unit.)

3. Start with a model that, in some fashion, reflects the truth about the experiment, and then develop a procedure for imitating—or simulating—a number of repetitions of the experiment. This is quicker than repeating the real experiment, especially if we can use the TI-83/89 or a computer, and it allows us to do problems that are hard when done with formal mathematical analysis.

Here is an example of a simulation.

 A GIRL IN THE FAMILY

Suppose we are interested in estimating the likelihood of a couple’s having a girl among their first four children. Let a flip of a fair coin represent a birth, with heads corresponding to a girl and tails a boy. Since girls and boys are equally likely to occur on any birth, the coin flip is an accurate imitation of the situation. Flip the coin until a head appears or until the coin has been flipped 4 times, whichever comes first. The appearance of a head within the first 4 flips corresponds to the couple’s having a girl among their first four children.

If this coin-flipping procedure is repeated many times, to represent the births in a large number of families, then the proportion of times that a head appears within the first 4 flips should be a good estimate of the true likelihood of the couple’s having a girl. A single die (one of a pair of dice) could also be used to simulate the birth of a son or daughter. Let an even number of spots (called pips) represent a girl, and let an odd number of spots represent a boy.

 

 

SIMULATION

The imitation of chance behavior, based on a model that accurately reflects the experiment under consideration, is called a simulation.

Simulation is an effective tool for finding likelihoods of complex results once we have a trustworthy model. In particular, we can use random digits from a table, graphing calculator, or computer software to simulate many repetitions quickly. The proportion of repetitions on which a result occurs will eventually be close to its true likelihood, so simulation can give good estimates of probabilities. The art of random digit simulation can be illustrated by a series of examples.

Example:

 

SIMULATION STEPS

Step 1: State the problem or describe the experiment. Toss a coin 10 times. What is the likelihood of a run of at least 3 consecutive heads or 3 consecutive tails?

Step 2: State the assumptions. There are two:

• A head or a tail is equally likely to occur on each toss.

• Tosses are independent of each other (i.e., what happens on one toss will not influence the next toss).

Step 3: Assign digits to represent outcomes. In a table of random digits, such as the Random Number Table, the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 occur with the same long-term relative frequency (1/10). We also know that the successive digits in the table are independent. It follows that even digits and odd digits occur with the same long-term relative frequency, 50%. Here is one assignment of digits for coin tossing:

• One digit simulates one toss of the coin.

• Odd digits represent heads; even digits represent tails.

• Successive digits in the table simulate independent tosses.

Step 4: Simulate many repetitions. Looking at 10 consecutive digits in the Random Number Table simulates one repetition. Read many groups of 10 digits from the table to simulate many repetitions. Be sure to keep track of whether or not the event we want (a run of 3 heads or 3 tails) occurs on each repetition.

Here are the first three repetitions, starting at line 31 in the Random Number Table. The run of 3 or more heads or tails found in each repetition is shown in the last row.

Digits        4 1 6 9 2 4 0 5 8 1 | 9 3 0 5 0 4 8 7 3 4 | 3 4 6 5 2 4 1 5 7 7
Heads/tails   T H T H T T T H T H | H H T H T T T H H T | H T T H T T H H H H
Run of 3      YES (T T T)         | YES (T T T)         | YES (H H H H)

Twenty-two additional repetitions were done for a total of 25 repetitions; 23 of them did have a run of 3 or more heads or tails.

Step 5: State your conclusions. We estimate the probability of a run by the proportion

estimated probability = 23/25 = 0.92

Of course, 25 repetitions are not enough to be confident that our estimate is accurate. Now that we understand how to do the simulation, we can tell a computer to do many thousands of repetitions. A long simulation (or mathematical analysis) finds that the true probability is about 0.826.
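The five steps above can be sketched on a computer. The following is a minimal illustration in Python, not part of the original course materials; the function names, the seed, and the number of repetitions are our own choices. It estimates the probability by simulation and also checks all 1,024 equally likely sequences of 10 tosses directly.

```python
import itertools
import random


def has_run(tosses, length=3):
    """Return True if `tosses` contains `length` or more equal outcomes in a row."""
    streak = 1
    for previous, current in zip(tosses, tosses[1:]):
        streak = streak + 1 if current == previous else 1
        if streak >= length:
            return True
    return False


def estimate_run_probability(repetitions, n_tosses=10, seed=0):
    """Estimate P(run of 3 or more) from `repetitions` simulated sets of fair tosses."""
    rng = random.Random(seed)  # fixed seed (our assumption) so the sketch is repeatable
    hits = sum(
        has_run([rng.choice("HT") for _ in range(n_tosses)])
        for _ in range(repetitions)
    )
    return hits / repetitions


def exact_run_probability(n_tosses=10):
    """Check every one of the 2**n_tosses equally likely sequences for a run."""
    sequences = itertools.product("HT", repeat=n_tosses)
    hits = sum(has_run(seq) for seq in sequences)
    return hits / 2 ** n_tosses


print(estimate_run_probability(100_000))   # simulated estimate
print(round(exact_run_probability(), 3))   # 0.826
```

The exhaustive check is only practical because 10 tosses give just 1,024 sequences; for longer experiments the simulated estimate is the realistic option.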

 

Once you have gained some experience in simulation, establishing a correspondence between random numbers and outcomes in the experiment is usually the hardest part, and must be done carefully. Although coin tossing may not fascinate you, the model in the example above is typical of many probability problems because it consists of independent trials (the tosses) all having the same possible outcomes and probabilities. The coin tosses are said to be independent because the result of one toss has no effect or influence over the next coin toss. Shooting 10 free throws and observing the sexes of 10 children have similar models and are simulated in much the same way.

The idea is to state the basic structure of the random phenomenon and then use simulation to move from this model to the probabilities of more complicated events. The model is based on opinion and past experience. If it does not correctly describe the random phenomenon, the probabilities derived from it by simulation will also be incorrect.

Step 3 (assigning digits) can usually be done in several different ways, but some assignments are more efficient than others. Here are some examples of this step.

Example:

 

ASSIGNING DIGITS

(a) Choose a person at random from a group of which 70% are employed. One digit simulates one person:

0, 1, 2, 3, 4, 5, 6 = employed

7, 8, 9 = not employed

The following correspondence is also satisfactory:

00, 01, . . . , 69 = employed

70, 71, . . . , 99 = not employed

This assignment is less efficient, however, because it requires twice as many digits and ten times as many numbers.

(b) Choose one person at random from a group of which 73% are employed. Now two digits simulate one person:

00, 01, 02, . . . , 72 = employed

73, 74, 75, . . . , 99 = not employed

We assigned 73 of the 100 two-digit pairs to “employed” to get probability 0.73. Representing “employed” by 01, 02, . . . , 73 would also be correct.

(c) Choose one person at random from a group of which 50% are employed, 20% are unemployed, and 30% are not in the labor force. There are now three possible outcomes, but the principle is the same. One digit simulates one person:

0, 1, 2, 3, 4 = employed

5, 6 = unemployed

7, 8, 9 = not in the labor force

Another valid assignment of digits might be

0, 1 = unemployed

2, 3, 4 = not in the labor force

5, 6, 7, 8, 9 = employed

What is important is the number of digits assigned to each outcome, not the order of  the digits.
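To make this point concrete, here is a small Python sketch (our own illustration; the dictionary and function names are hypothetical) that stores the two assignments from part (c) and confirms that they give each outcome the same probability, even though a given digit may decode to different outcomes.

```python
# The two digit assignments for example (c), copied from the text.
assignment_1 = {
    "employed": "01234",
    "unemployed": "56",
    "not in the labor force": "789",
}
assignment_2 = {
    "unemployed": "01",
    "not in the labor force": "234",
    "employed": "56789",
}


def outcome_probabilities(assignment):
    """Probability of each outcome when every digit 0-9 has probability 1/10."""
    return {outcome: len(digits) / 10 for outcome, digits in assignment.items()}


def decode(digits, assignment):
    """Translate a string of random digits into outcomes, one digit per person."""
    lookup = {d: outcome for outcome, ds in assignment.items() for d in ds}
    return [lookup[d] for d in digits]


print(outcome_probabilities(assignment_1))
print(outcome_probabilities(assignment_2))
print(decode("057", assignment_1))
print(decode("057", assignment_2))
```

Both assignments report 0.5, 0.2, and 0.3 for the three outcomes, which is all that a correct assignment requires.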

 

As the last example shows, simulation methods work just as easily when outcomes are not equally likely. Consider the following slightly more complicated example.

Example:

 

FROZEN YOGURT SALES

Orders of frozen yogurt flavors (based on sales) have the following relative frequencies: 38% chocolate, 42% vanilla, and 20% strawberry. The experiment consists of customers entering the store and ordering yogurt. The task is to simulate 10 frozen yogurt sales based on this recent history. Instead of considering the random number table to be made up of single digits, we now consider it to be made up of pairs of digits. This is because the relative frequencies of interest have a maximum of two significant digits. The range of the pairs of digits is 00 to 99, and since all the pairs are equally likely to occur, the pairs 00, 01, 02, . . . , 99 all have relative frequency 0.01.

Thus we may assign the numbers in the random number table as follows:

• 00 to 37 to correspond to the outcome chocolate (C)

• 38 to 79 to correspond to the outcome vanilla (V)

• 80 to 99 to correspond to the outcome strawberry (S)

The sequence of random numbers (starting at row 36 of the Random Number Table) is as follows:

24028 03405 01178 06316

This yields the following two-digit numbers:

24 02 80 34 05 01 17 80 63 16

which correspond to the outcomes

C C S C C C C S V C

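This small sample is not representative of the population: chocolate makes up 7 of the 10 simulated sales (70%) compared with 38% of actual orders, and vanilla only 1 of 10 (10%) compared with 42%. Many more trials would be needed for the simulated proportions to approach the population distribution.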
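The decoding in this example can be replayed with a short Python sketch. This is our own illustration, and the names DIGITS, flavor, and simulate_sales are hypothetical; the digits and the assignment of pairs to flavors are the ones given in the example.

```python
from collections import Counter

# Random digits quoted in the example (row 36 of the Random Number Table).
DIGITS = "24028 03405 01178 06316"


def flavor(pair):
    """Map a two-digit number 00-99 to a flavor using the assignment in the example."""
    if pair <= 37:
        return "C"  # 00-37: chocolate, 38 of the 100 pairs
    if pair <= 79:
        return "V"  # 38-79: vanilla, 42 of the 100 pairs
    return "S"      # 80-99: strawberry, 20 of the 100 pairs


def simulate_sales(digits):
    """Split a digit string into consecutive pairs and decode each pair as one sale."""
    stream = digits.replace(" ", "")
    pairs = [int(stream[i:i + 2]) for i in range(0, len(stream) - 1, 2)]
    return [flavor(p) for p in pairs]


sales = simulate_sales(DIGITS)
print(" ".join(sales))   # C C S C C C C S V C
print(Counter(sales))    # tally of the 10 simulated sales
```

Feeding the function a much longer string of random digits would show the tallies settling toward 38%, 42%, and 20%.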

 

Example:

 

A GIRL OR FOUR

A couple plans to have children until they have a girl or until they have four children, whichever comes first. We will show how to use random digits to estimate the likelihood that they will have a girl.

The model is the same as for coin tossing. We will assume that each child has probability 0.5 of being a girl and 0.5 of being a boy, and the sexes of successive children are independent.

Assigning digits is also easy. One digit simulates the sex of one child:

0, 1, 2, 3, 4 = girl

5, 6, 7, 8, 9 = boy

To simulate one repetition of this child-bearing strategy, read digits from the Random Number Table until the couple has either a girl or four children. Notice that the number of digits needed to simulate one repetition depends on how quickly the couple gets a girl. Here is the simulation, using line 38 of the Random Number Table.

To interpret the digits, G for girl and B for boy are written under them, space separates repetitions, and under each repetition “+” indicates if a girl was born and “–” indicates one was not.

7854  54  92  0  1  0  53  2  91  4  1  82  1  0  971
BBBG  BG  BG  G  G  G  BG  G  BG  G  G  BG  G  G  BBG
+     +   +   +  +  +  +   +  +   +  +  +   +  +  +

90  4  72  4  4  682  3  93  0  4  1  981  9557
BG  G  BG  G  G  BBG  G  BG  G  G  G  BBG  BBBB
+   +  +   +  +  +    +  +   +  +  +  +    –

In these 28 repetitions, a girl was born 27 times. Our estimate of the probability that this strategy will produce a girl is therefore

estimated probability = 27/28 = 0.964

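Some mathematics shows that if our probability model is correct, the true likelihood of having a girl is 0.938: the strategy fails only when all four children are boys, which has probability (0.5)^4 = 0.0625, so the likelihood of a girl is 1 − 0.0625 = 0.9375. Our simulated estimate of 0.964 is reasonably close. Unless the couple is unlucky, they will succeed in having a girl.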
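The hand simulation in this example can be replayed in a few lines of Python. The sketch below is our own illustration (the names ROWS and decode_families are hypothetical). It reads the same digits one child at a time, applies the stopping rule, and compares the result with the theoretical value.

```python
# Digits quoted in the example (line 38 of the Random Number Table).
# The spacing is dropped below; the stopping rule regroups the digits by itself.
ROWS = [
    "7854 54 92 0 1 0 53 2 91 4 1 82 1 0 971",
    "90 4 72 4 4 682 3 93 0 4 1 981 9557",
]


def decode_families(digits, max_children=4):
    """Read digits one child at a time; 0-4 is a girl, 5-9 is a boy.

    A family stops at its first girl or after `max_children` children.
    Returns one string such as "BBG" per completed family.
    """
    families, current = [], ""
    for d in digits:
        current += "G" if d in "01234" else "B"
        if current.endswith("G") or len(current) == max_children:
            families.append(current)
            current = ""
    return families


digits = "".join(ROWS).replace(" ", "")
families = decode_families(digits)
with_girl = sum("G" in f for f in families)

print(len(families), with_girl)              # 28 27
print(round(with_girl / len(families), 3))   # 0.964
print(1 - 0.5 ** 4)                          # 0.9375, the theoretical value
```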

 

Simulations with the calculator or computer

The calculator and computer can be extremely useful in conducting simulations because they can be easily programmed to quickly perform a large number of repetitions.
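As one possible illustration, here is a minimal Python sketch of the "girl or four" strategy that lets the computer generate its own random numbers instead of reading a table. The function names, the seed, and the repetition counts are our own assumptions, not part of the course materials.

```python
import random


def family_has_girl(rng, max_children=4, p_girl=0.5):
    """Simulate one couple: stop at the first girl or after `max_children` births."""
    for _ in range(max_children):
        if rng.random() < p_girl:
            return True
    return False


def estimate_girl_probability(repetitions, seed=0):
    """Proportion of simulated couples who have a girl among their children."""
    rng = random.Random(seed)  # fixed seed (our assumption) so runs are repeatable
    hits = sum(family_has_girl(rng) for _ in range(repetitions))
    return hits / repetitions


# More repetitions give estimates that settle closer to the theoretical 0.9375.
for n in (28, 1_000, 100_000):
    print(n, estimate_girl_probability(n))
```

With only 28 repetitions the estimate can wander noticeably, just as the table-based estimate did; with 100,000 it is typically within a fraction of a percentage point of the theoretical value.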

Review the content of Unit 5 and proceed to the Unit 5 Exam.