
DIGMATH: Dynamic Investigatory Graphical Displays of Mathematics:
Graphical Simulations for Statistics and Probability in Excel
Sheldon P. Gordon and Florence S. Gordon

Most
of the following graphical simulations require the use of macros to
operate. In order to use these spreadsheets, Excel must be set to accept
macros. To change the security setting on macros:
- When you open any of the spreadsheets, a new bar
appears near the top of the window that says something like:
"Some Active content has been disabled", depending
on the version of Excel you are using.
- Click on Options.
- Click on "Enable the Content" and then
click OK.
The
following are the 88 DIGMath simulations for probability and statistics that are currently (December 2024)
completed and ready for use. (Several others are under development.) Please
feel free to download (with Chrome or Edge, simply click on any of the links below; with Firefox, right click on any of the links and then select Save Link As ...) and use any or all of these files. If you want all
of them, you can click on the link: statistics.7z or you can send an e-mail to
gordonsp@retiree.farmingdale.edu and we will try sending you a zip file with all
files. If you have any problems
downloading or running any of these Excel files, please contact us at gordonsp@retiree.farmingdale.edu
or flogo@optonline.net. If you have any
suggestions for improvements or for new topics, please pass them on also.
- Fair Coin
A coin flipping simulation in which the user has the choice of the number
of repetitions when flipping a fair coin.
The spreadsheet randomly generates the chosen number of flips, displays the results both graphically in a bar chart and numerically, and compares the outcomes to what is expected based on probability theory.
- Dice Roll
A dice rolling simulation in which the user can choose the number of rolls
of a pair of fair dice.
- Coin Flips A coin
flipping simulation in which the user has the choice of the number of fair
coins being flipped and the number of repetitions.
- Binomial Distribution This
DIGMath module lets you investigate the binomial distribution based on n trials with a probability p of success. You can select values
of n and p using sliders and the program draws the histogram for the
corresponding binomial distribution. It also shows the mean and standard
deviation of the distribution.
- Binomial Simulation
The user has the choice of the number of coins, the probability of
success, and the number of repetitions.
- Binomial
Probabilities This DIGMath module lets you investigate the
probabilities associated with a binomial distribution. The spreadsheet
covers six different cases, covering virtually all the standard kinds of
problems with binomial probabilities. It will calculate and display the
binomial probabilities of getting (1) exactly x successes in a binomial process with n trials with a probability p of success; (2) between two given numbers of successes; (3)
at least a given number of successes; (4) more than a given number of
successes; (5) at most a given number of successes; and (6) less than a
given number of successes. In each case, you select values of n, p, and x
using sliders and the program draws the histogram to show the
corresponding binomial probability.
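The six cases above can be sketched outside Excel as follows. This is a minimal illustration, not the spreadsheet's own macro code: the function names are hypothetical, and the "between" case is assumed to include both endpoints.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_range(n, p, lo, hi):
    """P(lo <= X <= hi), with the endpoints clipped to 0..n."""
    lo, hi = max(lo, 0), min(hi, n)
    return sum(binom_pmf(n, p, k) for k in range(lo, hi + 1))

def six_cases(n, p, x, x2=None):
    """The six probabilities for x successes; x2 is the upper end of
    the 'between' case (assumed inclusive, defaults to x)."""
    x2 = x if x2 is None else x2
    return {
        "exactly x": binom_pmf(n, p, x),
        "between x and x2": binom_range(n, p, x, x2),
        "at least x": binom_range(n, p, x, n),
        "more than x": binom_range(n, p, x + 1, n),
        "at most x": binom_range(n, p, 0, x),
        "less than x": binom_range(n, p, 0, x - 1),
    }

# Example values chosen only for illustration.
for label, prob in six_cases(n=10, p=0.5, x=3, x2=6).items():
    print(f"{label:20s} {prob:.4f}")
```

Note that "at most x" and "more than x" are complements, as are "at least x" and "less than x", which gives a quick check on any set of results.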
- Law of Large Numbers The user has
the choice of the desired probability of success and the number of
repetitions to see the pattern of successes over the long run.
- Chaos
of Small Numbers The user has the choice of the desired
probability of success and the number of repetitions (up to 25) to see the
actual (simulated) outcomes and the cumulative frequency of success to
demonstrate the unpredictable nature of the outcomes in the short run.
- The Effects of an
Extra Point on the Mean and Standard Deviation This DIGMath module
lets you investigate the effect that changing an additional point has on
the values for the mean and the standard deviation (both graphically and
numerically). The spreadsheet starts with a set of either 5, 10, or 20
points and you can change the value for an additional point to see how it
affects the calculations and the extent of the changes depending on the
number of points.
- Effect
of an Extra Point on Statistics This DIGMath module lets you
investigate the effects of an extra point on the mean, the median, the
standard deviation, and the InterQuartile Range from two perspectives:
depending on how close to or far from the center an extra point lies and
depending on the size of the dataset.
- The Normal Distribution
This DIGMath module lets you investigate the normal distribution
with mean μ and standard deviation σ. You
enter both parameters using sliders and the program draws the
corresponding normal distribution curve.
You can watch the effects of changing the parameter values on the
resulting curve.
- Normal Probabilities This DIGMath spreadsheet lets you investigate
the probabilities associated with a normal distribution. You can enter,
via sliders, values for the mean μ and standard deviation σ and then an interval of x-values from xL to xR to visualize the probability that x lies between these two values. The program draws the graph of the normal distribution, and
highlights graphically the region under the normal curve between xL and xR; it also shows numerically the probability that x lies between these two values.
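The numerical probability described above is the difference of two cumulative normal probabilities. A minimal sketch using Python's standard library (the function name and the example values are hypothetical, not taken from the spreadsheet):

```python
from statistics import NormalDist

def normal_between(mean, sd, x_left, x_right):
    """P(x_left < X < x_right) for X ~ Normal(mean, sd)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(x_right) - dist.cdf(x_left)

# One standard deviation either side of the mean: about 0.6827.
print(round(normal_between(100, 15, 85, 115), 4))
```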
- Simulating
the Normal Distribution This DIGMath spreadsheet lets you investigate
the normal distribution in terms of a random simulation. You can enter,
via sliders, values for the mean μ and standard deviation
σ and then an interval of x-values from x-Min to x-Max. You can also select the number of random points
you want in this normal distribution and the program will generate those
points, plot them along with the graph of the normal distribution, and
display graphically, with different colors, those that fall under the
designated portion of the normal distribution curve and those that do not.
The results are also shown numerically and compared to the theoretical
values for the area under the normal curve.
- Normal
Approximation to the Binomial Distribution This
DIGMath program lets you investigate how well a normal distribution
approximates the binomial distribution based on the parameters n and p. You enter the values for n and p via sliders
and the program draws the histogram for the binomial distribution and the
corresponding normal distribution curve using μ = np
and σ = √(np(1-p)) to compare the two
distributions.
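As a rough numerical counterpart to the visual comparison, the sketch below measures the largest gap between each binomial probability and the height of the normal curve with mean np and standard deviation √(np(1-p)). The function names and the choice of "largest gap" as the measure are assumptions made for illustration.

```python
from math import comb, exp, pi, sqrt

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x, mean, sd):
    """Height of the normal curve at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))

def max_abs_gap(n, p):
    """Largest |binomial probability - normal curve height| over k = 0..n."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return max(abs(binom_pmf(n, p, k) - normal_pdf(k, mean, sd))
               for k in range(n + 1))

# Illustrative parameter choices only.
for n, p in [(10, 0.1), (10, 0.5), (100, 0.5)]:
    print(n, p, round(max_abs_gap(n, p), 4))
```

Small n with p far from 0.5 gives a visibly poor match, while larger n with p near 0.5 gives a close one.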
- The Poisson
Distribution This DIGMath spreadsheet lets you investigate two different
aspects of the Poisson distribution that expresses the probability of a
number of events occurring in a fixed period of time if these events occur
with a known average rate and independently of the time since the last
event. (1) The first looks at the shape of the Poisson distribution
depending on its two parameters, the expected number of outcomes of an
event in a given time period and the number of occurrences. You use a
slider to vary the first parameter and see the effects on the shape of the
distribution. (2) The second aspect is based on the idea that the Poisson
distribution can be used to approximate the binomial distribution with
probability of success p and number of trials n. Using sliders to change n and p, you can observe which combinations make for a
good approximation and which do not.
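The second aspect can also be explored numerically. The sketch below compares binomial probabilities with Poisson probabilities whose expected count is np; the function names are hypothetical, and the sketch is only intended for modest n (it uses plain factorials).

```python
from math import comb, exp, factorial

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson with expected count lam."""
    return exp(-lam) * lam**k / factorial(k)

def max_abs_gap(n, p):
    """Largest |binomial - Poisson| probability over k = 0..n."""
    lam = n * p
    return max(abs(binom_pmf(n, p, k) - poisson_pmf(lam, k))
               for k in range(n + 1))

# Illustrative combinations: large n with small p versus small n with large p.
for n, p in [(100, 0.02), (50, 0.1), (10, 0.5)]:
    print(n, p, round(max_abs_gap(n, p), 4))
```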
- Central
Limit Theorem Simulation The user can choose any of four underlying
populations (normal, uniformly distributed, skewed, and bimodal), the
sample size, and the number of random samples. The simulation
randomly generates the samples and plots the means of each sample.
From the graphical display and the associated numerical displays, it
becomes apparent that (1) the distribution of sample means is centered
very close to the mean of the underlying population, that (2) the spread
in the sample means is a fraction of the standard deviation of the
underlying population (about one-half as large when n = 4, about
one-third as large when n = 9, about one-quarter as large when n
= 16, etc.), so that students quickly conjecture that the formula for the
standard deviation of the distribution of sample means is σ/√n, and that (3) as the sample size increases, the
sampling distribution looks more and more like a normal distribution.
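A small simulation along the same lines, written as a sketch rather than a reproduction of the spreadsheet: it draws repeated samples from a uniform population on [0, 1] (one assumed stand-in for the "uniformly distributed" choice) and compares the spread of the sample means with the population standard deviation divided by √n.

```python
import random
import statistics

def sample_means(draw, n, num_samples, rng):
    """Means of num_samples random samples, each of size n."""
    return [statistics.fmean(draw(rng) for _ in range(n))
            for _ in range(num_samples)]

def uniform_population(rng):
    """One draw from a uniform population on [0, 1]."""
    return rng.uniform(0, 1)

POP_SD = (1 / 12) ** 0.5  # standard deviation of the uniform [0, 1] population

rng = random.Random(0)  # fixed seed so runs are repeatable
for n in (4, 9, 16):
    means = sample_means(uniform_population, n, 2000, rng)
    print(n,
          round(statistics.fmean(means), 3),   # should be near 0.5
          round(statistics.stdev(means), 4),   # simulated spread
          round(POP_SD / n ** 0.5, 4))         # population SD / sqrt(n)
```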
- Visualizing the Sample Mean and the Sample Standard Deviation This DIGMath program helps you visualize the sample means and the sample standard deviations drawn from each of the four underlying populations used in the Central Limit Theorem Simulation. You can select your choice of the population and select, using sliders, the sample size and the number of desired samples. The program calculates and displays graphically the average of the sample means and the sample standard deviations.
- t-Distributions This DIGMath module lets you investigate the
properties of the t-distribution
based on various numbers of degrees of freedom from 2 up to 31. You enter
the desired number of degrees of freedom and the program draws the
corresponding t-distribution
curve as well as the curves for d.f. = 1, d.f. = 11, and d.f. = 21 and the
limiting normal distribution curve when d.f. = 31.
- Distribution
of Sample Proportions The user chooses the probability of success p, the sample size n, and the number
of random samples. The simulation randomly generates the samples,
displays the corresponding proportion of successes, and displays the summary
statistics. From these displays, the students quickly conjecture
that (1) the mean of the distribution of sample proportions is equal to
the proportion of successes in the underlying population, that (2) the
simulated results with different values of the sample size n agree
with the formula for the standard deviation of this sampling distribution,
and that (3) the sampling distribution becomes more and more normal in
appearance as the sample size increases.
- Sample
Medians This DIGMath simulation is similar to the Central Limit
Theorem Simulation, but with sample medians instead of sample means.
- Sample
Midranges This DIGMath simulation is similar to the Central
Limit Theorem Simulation, but with sample midranges instead of sample
means.
- Sample
Modes This DIGMath simulation is similar to the Central Limit
Theorem Simulation, but with sample modes instead of sample means.
- Distribution
of Sample Variances This DIGMath simulation is similar to
the Central Limit Theorem Simulation, but instead of simulating sample
means from a population, the program now generates and displays the sample
variances.
- Distribution
of Sample Standard Deviations This DIGMath simulation is similar to
the Central Limit Theorem Simulation, but instead of simulating sample
means from a population, the program now generates and displays the sample
standard deviations.
- Sample IQR's
This DIGMath simulation is similar to the Central Limit Theorem
Simulation, but instead of simulating sample means from a population, the
program now generates and displays the sample InterQuartile Ranges. The
IQR is the difference between the first and third quartiles of a set of
data and so represents a measure of the spread in the data.
- Distribution
of Sample Skewnesses This is similar to the Central Limit Theorem
Simulation, but instead of simulating sample means from a population, the
program now generates and displays the sample skewnesses -- a measure of
how far a set of values deviates from a symmetric distribution.
- Standard Deviations of Sample Proportions This
DIGMath program simulates the distribution of the standard deviation of sample proportions p. You enter the probability of success p, the sample size and the number of random samples and the program randomly generates, calculates, and displays the standard deviation of those sample proportions.
- Simulating Confidence
Intervals The user has the choice of the same four underlying
populations as in the Central Limit Theorem simulation (to see that the
population does not affect the results) and the confidence level (90%,
95%, 98%, or 99%). The simulation generates a fixed number of
samples from the selected population, calculates and plots the
corresponding confidence interval, and summarizes the number and
percentage of confidence intervals that actually contain the mean of the
underlying population. Students see that the actual (simulated) percentage
is typically close to the selected value for the confidence level.
They also see that typically the higher the confidence level, the longer
the lines are that represent the actual confidence interval. They also see
that typically those confidence intervals that do not contain the
population mean are near-misses.
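The coverage idea can be sketched in a few lines. The assumptions here are loud ones: a large-sample z-interval built from the sample standard deviation, a uniform population on [0, 10], and samples of size 50. The spreadsheet's own sample size and interval formula are not stated above, and all names are hypothetical.

```python
import random
from statistics import NormalDist, fmean, stdev

def z_critical(level):
    """Two-sided critical value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

def coverage(draw, true_mean, n, num_samples, level, rng):
    """Fraction of simulated intervals that contain the true mean."""
    z = z_critical(level)
    hits = 0
    for _ in range(num_samples):
        sample = [draw(rng) for _ in range(n)]
        center = fmean(sample)
        half_width = z * stdev(sample) / n ** 0.5
        if center - half_width <= true_mean <= center + half_width:
            hits += 1
    return hits / num_samples

def uniform_0_10(rng):
    """One draw from the assumed population; its true mean is 5."""
    return rng.uniform(0, 10)

rng = random.Random(2024)  # fixed seed so runs are repeatable
for level in (0.90, 0.95, 0.98, 0.99):
    print(level, coverage(uniform_0_10, 5.0, 50, 1000, level, rng))
```

The printed fractions should land near the chosen confidence levels, though any single run will wobble by a percentage point or two.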
- Constructing
Confidence Intervals for Means This DIGMath spreadsheet assists you in
constructing a confidence interval for the mean of a population. You enter
the sample data -- the sample size n, the sample mean, and the
sample standard deviation, and select the level of confidence (90%, 95%,
98%, or 99%) you want. The spreadsheet constructs the corresponding
confidence interval and displays it, as well as compares it in size, to
the confidence intervals with other levels of confidence.
- Simulating
Confidence Intervals for Proportions The user controls the
choice of the population proportion p for the underlying population, the confidence
level (90%, 95%, 98%, 99%), and the sample size n. The simulation generates a fixed number of random samples
from that population, calculates and plots the corresponding confidence
interval, and summarizes the number and percentage of confidence intervals
that actually contain the proportion p of the underlying population. Students see that
the actual (simulated) percentage is typically close to the selected value
for the confidence level. They also see that typically the higher
the confidence level, the longer the lines are that represent the actual
confidence interval. They also see that, as the sample size increases, the
lengths of the sample confidence intervals decrease and as the sample size
decreases, the lengths of the confidence intervals increase. They also see
that typically those confidence intervals that do not contain the
population proportion are near-misses.
- Constructing Confidence
Intervals for Proportions This DIGMath spreadsheet assists you in
constructing a confidence interval for the proportion of a population. You
enter the sample data -- the sample size n and the number of "successes" in that
sample -- and select the level of confidence (90%, 95%, 98%, or 99%) you want. The spreadsheet constructs the
corresponding confidence interval and displays it, as well as compares it
in size, to the confidence intervals with other levels of confidence.
- Visualizing
Confidence Intervals for the Mean This DIGMath module lets you investigate
the ideas associated with confidence intervals for the mean of a
population in a dynamic fashion using sliders. You can control the values
for the sample data -- the sample size n, the sample mean, and the sample standard deviation. The spreadsheet
constructs the corresponding 90%, 95%, 98%, and 99% confidence intervals
and displays all of them, both graphically and numerically, so you can
compare the lengths of each as you change the input values.
- Visualizing
Confidence Intervals for Proportions This DIGMath module lets you
investigate the ideas associated with confidence intervals for the
proportion p of a
population in a dynamic fashion using sliders. You can control the values
for the sample data -- the sample size n and the number of successes x. The spreadsheet constructs the corresponding 90%, 95%, 98%, and
99% confidence intervals and displays all of them, both graphically and
numerically, so you can compare the lengths of each as you change the
input values.
- Confidence
Intervals for the Difference of Means This DIGMath module helps in
constructing a confidence interval for the difference of means based on
summary sample data from two samples: the size of the samples, the sample
means, and the sample standard deviations. The user can choose the
confidence level desired -- 90%, 95%, 98%, or 99% -- and the resulting
intervals are shown graphically and numerically.
- Confidence
Intervals for the Difference of Proportions This DIGMath module helps
in constructing a confidence interval for the difference in population
proportions based on summary sample data from two samples: the size of the
samples and the number of "successes" in each sample. The user
can choose the confidence level desired -- 90%, 95%, 98%, or 99% -- and
the resulting intervals are shown graphically and numerically.
- Simulating Hypothesis
Testing The user has the choice of the same four underlying
populations (again, to see that the population does not affect the
results) and the level of significance (10%, 5%, 2%, 1%). The simulation
generates a fixed number of samples from the selected population, plots
the mean of each sample with a vertical line at the appropriate location,
and summarizes the number and percentage of sample means that fall into
the rejection region. The height of each line is equal
to the standard deviation of that sample. Students see that the simulated
percentage of sample means that fall in the rejection region is typically
close to the selected level of significance. They also see that most
of the sample means that fall into the rejection region tend to be quite
close to the critical values. They also see that the lines
representing the samples whose means are close to the population mean tend
to be very tightly clustered compared to those that are near the extreme
ends, which are sparsely distributed.
- Visualizing Hypothesis Testing for the Mean This DIGMath spreadsheet helps you
visualize the fundamental ideas related to testing a hypothesis for the
mean of a population. You enter the sample data -- the sample size n,
the sample mean, and the sample standard deviation, the choice of a
two-tailed test or a one-tailed test with the tail on the right or the
left, and select the significance level α you want highlighted. The
spreadsheet displays all four corresponding critical values for a
one-tailed test or all eight critical values for a two-tailed test and
shows the position of the sample mean for the data. It also displays
the associated z- or t-value and the conclusion of
whether you can Reject or Fail to Reject the null hypothesis at the
selected significance level.
- Hypothesis Tests for the
Mean This DIGMath spreadsheet assists you in testing a hypothesis for
the mean of a population. You enter the null hypothesis for the supposed
value of μ and select the test you want -- either two-tailed
or one-tailed with either tail. You then enter the sample data -- the
sample size n, the sample mean, and the sample standard deviation
-- and the level of significance α. The spreadsheet displays the corresponding normal or t-distribution,
the location of the critical value(s), and the location of the sample
mean. It also shows the associated z- or t-value, as
well as the corresponding P-value, and the conclusion as to
whether you Reject or Fail to Reject the null hypothesis.
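The calculation behind such a test can be sketched as follows for the z case only. Python's standard library has no t-distribution, so the small-sample t case is deliberately left out, and the function name, argument names, and example numbers are all hypothetical.

```python
from statistics import NormalDist

def z_test_mean(mu0, n, sample_mean, sample_sd, alpha, tail="two"):
    """Return (z, P-value, conclusion) for a test of the mean.

    tail is "two", "right", or "left". Uses the normal distribution,
    so this is only a sketch of the large-sample case.
    """
    z = (sample_mean - mu0) / (sample_sd / n ** 0.5)
    std_normal = NormalDist()
    if tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    conclusion = "Reject" if p_value < alpha else "Fail to Reject"
    return z, p_value, conclusion

# Made-up sample summary for illustration.
z, p_value, conclusion = z_test_mean(mu0=100, n=64, sample_mean=103,
                                     sample_sd=8, alpha=0.05)
print(round(z, 3), round(p_value, 4), conclusion)
```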
- Hypothesis Tests for the
Proportion This DIGMath spreadsheet assists you in testing a
hypothesis for the proportion of a population. You enter the null
hypothesis for the supposed value of p and select the test you want -- either two-tailed
or one-tailed with either tail. You then enter the sample data -- the
sample size n and the number of successes x in that
sample -- and the level of significance α. The spreadsheet displays the corresponding normal distribution
(when appropriate), the location of the critical value(s), and the
location of the sample proportion p. It also shows the associated
z-value, as well as the corresponding P-value, and the conclusion
as to whether you Reject or Fail to Reject the null hypothesis.
- Simulating the P-Values for Hypothesis Tests on the Mean This DIGMath module lets you visualize the P-values associated with sample means when conducting a hypothesis test on the population mean. The program randomly generates 100 samples of size n = 50 from your choice of the usual four underlying populations, calculates the P-value associated with each sample mean, and draws the P-values as a series of vertical lines. You can choose the significance level α for a two-tailed test and the program also shows the vertical line associated with the P-value for that critical value. You can then see the number, and percentage, of the sample means whose P-values are small enough to indicate that you should Reject the null hypothesis. The spreadsheet also displays the scatterplot of the 100 sample P-values plotted against the values of the 100 sample means to demonstrate that the pattern in the points typically looks like a normal distribution pattern. A horizontal line is also included at the height corresponding to the P-value associated with the critical value for the hypothesis test on the population mean at the selected level of significance α.
- Simulating the P-Values for Hypothesis Tests on the Proportion This DIGMath module lets you visualize the P-values associated with sample proportions when conducting a hypothesis test on the population proportion p. The program randomly generates 100 samples of size 100, calculates the P-value associated with each sample proportion, and draws the P-values as a series of vertical lines. You can choose the significance level α for a two-tailed test and the program also shows the vertical line associated with the P-value for that critical value. You can then see the number, and percentage, of the sample proportions whose P-values are small enough to indicate that you should Reject the null hypothesis. The spreadsheet also displays the scatterplot of the 100 sample P-values plotted against the values of the 100 sample proportions to demonstrate that the pattern in the points typically looks like a normal distribution pattern. A horizontal line is also included at the height corresponding to the P-value associated with the critical value for the hypothesis test on the population proportion at the selected level of significance α.
- Hypothesis Test on
the Difference of Means This DIGMath spreadsheet assists you in
testing a hypothesis for the difference in means of two populations. The
null hypothesis is that the two means are equal, and you have to select
the alternate hypothesis test you want -- either two-tailed or one-tailed
with either tail. You then enter the sample data for the two samples --
the sample size n, the sample mean, and the sample standard
deviation -- and the level of significance α. The spreadsheet displays the corresponding
normal or t-distribution for the distribution of differences of
sample means, the location of the critical value(s), and the location of
the difference in the two sample means. It also shows the associated
z- or t-value, as well as the corresponding P-value,
and the conclusion as to whether you Reject or Fail to Reject the null
hypothesis.
- Hypothesis Test on
the Difference of Proportions This DIGMath spreadsheet assists you in
testing a hypothesis for the difference in proportions of two populations.
The null hypothesis is that the two proportions are equal, and you have to
select the alternate hypothesis test you want -- either two-tailed or one-tailed with either tail. You then enter the sample data for the two samples
-- the sample size n and the number of successes in each sample
-- and the level of significance α. The spreadsheet displays the corresponding normal distribution
(if appropriate) for the distribution of differences of sample
proportions, the location of the critical value(s), and the location of
the difference in the two sample proportions. It also shows the associated
z-value, as well as the corresponding P-value, and the
conclusion as to whether you Reject or Fail to Reject the null hypothesis.
- The Distribution of
the Difference of Means This DIGMath module lets you investigate the
distribution of the difference of means based on summary sample data from
two samples drawn from the choice of four underlying populations
(to see that the population does not affect the results). The user can
choose the sample size (from n = 2 to n = 50) from each
sample and the number (from 50 to 250) of samples. The simulation
generates that number of samples from the selected populations, plots the
difference in the sample means of each sample, and displays the
mean and standard deviation of the differences in the sample means
compared to the theoretical predictions based on the population of
differences of means of all possible samples.
- The Distribution
of the Difference of Sample Proportions This DIGMath module lets you investigate the
distribution of the difference of proportions based on summary sample data
from two samples drawn from two binomial populations. For each
population, you can choose the probability p of success and the sample size
(from n = 2 to n = 100) from each sample, as well as the
number of random samples (between 50 and 300). The simulation generates
that number of samples from the two populations, plots the difference in
the sample proportions of each set of samples, and displays the mean and
standard deviation of the differences in the sample proportions compared
to the theoretical predictions based on the population of differences of
proportions of all possible samples.
- Linear Regression:
Fitting a Line to Data This DIGMath module performs a linear
regression analysis on any set of up to 50 (x, y) data points. It
shows graphically the points and the associated regression line and also
displays the equation of the regression line, the value for the
correlation coefficient r, and the value for the Sum of the
Squares that measures how close the line comes to all the data points.
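For readers who want to check the displayed values by hand, the standard least-squares formulas can be sketched as below. The function name and the small data set are made up for illustration, and the sum of the squares is taken to mean the sum of squared vertical distances from the points to the line.

```python
def linear_fit(xs, ys):
    """Least-squares line y = slope*x + intercept, plus r and the
    sum of squared vertical deviations from the line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    sum_squares = sum((y - (slope * x + intercept)) ** 2
                      for x, y in zip(xs, ys))
    return slope, intercept, r, sum_squares

# Made-up data points, not taken from the spreadsheet.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r, sum_squares = linear_fit(xs, ys)
print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.4f}, SS = {sum_squares:.4f}")
```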
- Sum of the Squares This DIGMath module allows you to investigate dynamically how the sum of the squares measures how well a line fits a set of data. You can enter a set of data and select the number of data points you want to use. You also enter the values you want for the slope and the vertical intercept of a line. The display shows the data points with the line based on those parameters and also shows the value for the sum of the squares associated with that linear fit.
- Regression
Simulation The user has the choice of the sample size (n >
2) and the number of samples. The simulation generates the random
samples, calculates the equation of and plots the corresponding sample
regression line, and also draws the population regression line. The
students quickly see that, with small sample sizes, the likelihood of the
sample regression line being close to the population regression line may be
very small with widely varying slopes for many of the sample lines.
As the sample size increases, the sample regression lines become ever more
closely matched to the population line.
- Simulating
the Correlation Coefficient This DIGMath spreadsheet lets you
investigate the sample distribution for the correlation coefficient r based on repeated random samples drawn from a bivariate
population. You can choose between n
= 3 and n = 50 random points for
each sample and between 50 and 250 such samples from the underlying
population. For each sample, it then calculates the correlation
coefficient and displays a histogram showing the values of r from the samples. It also calculates and displays the mean of the
sample correlation coefficients and compares it to the correlation
coefficient for the underlying bivariate population.
- Simulating the
Regression Coefficients This DIGMath module lets you investigate
the sample distributions for the two regression coefficients a
and b in the regression equation y
= ax + b based on repeated random samples drawn from a
bivariate population. You can choose between n = 3 and n = 40
random points for each sample and between 50 and 250 such samples from the
underlying population. For each sample, the program calculates the
regression equation and displays the various regression lines along with
the regression line for the underlying bivariate population. It then draws
two histograms -- one showing the distribution of the values of the slope a
from the random samples and the other showing the distribution of the
values of the vertical intercepts b from those
samples. It also calculates and displays the mean of each of the sample
regression coefficients and compares it to the regression coefficients for
the underlying bivariate population.
- The
Effects of an Extra Point on the Regression Line and the Correlation
Coefficient This DIGMath module lets you investigate the effect that
changing an additional point has on the regression line (both graphically
and numerically) and on the correlation coefficient. You have the choice
of 5, 10, or 20 fixed points and can move an additional point using
sliders to see how it affects the calculations and the extent of the
changes depending on the number of points.
- Fitting a
Median-Median Line to Data This DIGMath module fits a
median-median line to any set of up to 50 (x, y) data points. It
shows graphically the points and the associated median-median line and
also displays the equation of the median-median line and the value for the
Sum of the Squares that measures how close the line comes to all the data
points.
- Simulating the Median-Median Line This spreadsheet lets you investigate the median-median line that fits a set of data via a simulation. You have the choice of the sample size and the number of samples that will be drawn from an underlying population. The spreadsheet generates random samples and draws the corresponding median-median lines to help you see the effect of sample size on the consistency of the lines produced.
- Simulating
the Quartile-Quartile Line This DIGMath module lets you investigate
the quartile-quartile line that fits a set of data via a simulation. The
quartile-quartile line is based on finding the 1st and 3rd quartiles for
both the x and the y values in a set of data and then
creating the line that passes through those two points. As such, it is a
viable alternative to the usual least-squares regression line that is
conceptually and computationally simpler. You have a choice of the sample
size and the number of samples that will be drawn from an underlying
population. The spreadsheet generates the random samples and draws all the
corresponding quartile-quartile lines to help you see the effect of sample
size on the consistency of the lines.
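The construction described above can be sketched directly. Quartiles can be computed under several conventions; this sketch uses the default method of Python's `statistics.quantiles`, which is an assumption and may differ slightly from the convention the spreadsheet uses. The function name and data are hypothetical.

```python
import statistics

def quartile_quartile_line(xs, ys):
    """Line through (Q1 of x, Q1 of y) and (Q3 of x, Q3 of y).

    Returns (slope, intercept). Quartiles use the default convention
    of statistics.quantiles; other conventions give slightly
    different lines.
    """
    q1_x, _, q3_x = statistics.quantiles(xs, n=4)
    q1_y, _, q3_y = statistics.quantiles(ys, n=4)
    slope = (q3_y - q1_y) / (q3_x - q1_x)
    intercept = q1_y - slope * q1_x
    return slope, intercept

# Made-up data roughly following y = 2x + 1.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [3.2, 4.9, 7.1, 8.8, 11.3, 12.9, 15.2, 16.8, 19.1]
slope, intercept = quartile_quartile_line(xs, ys)
print(f"y = {slope:.3f}x + {intercept:.3f}")
```

Only two quartile computations per variable are needed, which is why the method is computationally lighter than least squares.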
- Comparing Lines
that Fit Data This DIGMath program lets you compare how well the least-squares
line, the median-median line (that is built into many calculators), and
the quartile-quartile line fit sets of data. You can choose the number of
random data points from an underlying population and the spreadsheet
generates a random sample and displays the three lines, along with the
data points, so that you can compare how well the three lines fit the data
and how they compare to one another, particularly as the sample size
increases.
- DataFit: Fitting Linear,
Exponential, and Power Functions to Data This DIGMath spreadsheet
is provided as a visual and computational tool for investigating the issue
of fitting linear, exponential, and power functions to data and the
underlying transformations used to create the nonlinear functions. You can
enter a set of data and the spreadsheet displays six graphs:
(1) For a linear fit: the regression line superimposed over the original (x,
y) data;
(2) For an exponential fit: the regression line superimposed over the transformed
(x, log y) data values;
(3) The exponential function superimposed over the original (x, y)
data;
(4) For a power fit: the regression line superimposed over the transformed
(log x, log y) data values;
(5) The power function superimposed over the original (x, y)
data;
(6) All three functions superimposed over the original (x, y)
data.
The spreadsheet also shows the values for the correlation coefficients
associated with all three linear fits and the values
for the sums of the squares associated with each of the three fits to the
original data. On a separate page, the spreadsheet also shows the
residual plots associated with each of the three function fits.
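The transformations mentioned above can be sketched in Python as follows. This is an illustrative sketch, not the spreadsheet's own code: the function names and the sample data are hypothetical, and the use of base-10 logarithms is an assumption (any base gives the same fitted curve).

```python
import math

def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line for the points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_exponential(xs, ys):
    """Fit y = a * b**x by regressing log y on x."""
    slope, intercept = least_squares(xs, [math.log10(y) for y in ys])
    return 10 ** intercept, 10 ** slope

def fit_power(xs, ys):
    """Fit y = a * x**p by regressing log y on log x."""
    p, intercept = least_squares([math.log10(x) for x in xs],
                                 [math.log10(y) for y in ys])
    return 10 ** intercept, p

# Hypothetical data generated from known functions.
xs = [1, 2, 3, 4, 5]
a_exp, b_exp = fit_exponential(xs, [3 * 2 ** x for x in xs])
a_pow, p_pow = fit_power(xs, [5 * x ** 1.5 for x in xs])
print(a_exp, b_exp, a_pow, p_pow)
```

Note that both nonlinear fits require positive y values (and the power fit requires positive x values) because of the logarithms.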
- Multivariate
Linear Regression This DIGMath spreadsheet lets you perform
multivariate linear regression when the dependent variable Y is a
function of two independent variables X1 and X2 or a
function of three independent variables X1, X2, and X3.
You enter the number of data points (up to a maximum of 50) and then the
values for the dependent and independent variables in the appropriate
columns. The spreadsheet responds with the associated
linear regression equation, the value for the sum of the squares, and the
value for the coefficient of determination, R²;
note that this value tells you the percentage of the variation that is
explained by the linear function.
- Comparing Moving Averages This DIGMath program lets you compare different moving averages to one another as well as to the underlying set of data. Moving averages are used widely in many different fields to identify patterns and trends in data where there are wild fluctuations on a day-to-day basis. This spreadsheet uses some actual data on the spread of the COVID pandemic to illustrate the concept by plotting the underlying data, the 3-day moving average, the 20-day moving average, and a third moving average whose length you can choose, anywhere from 4 days through 19 days.
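For readers who want to see the calculation itself, here is a minimal Python sketch of a moving average. It is not taken from the spreadsheet: the function name and the sample values are hypothetical, and it assumes a trailing average (each value is averaged with the values just before it), which may differ from the spreadsheet's convention.

```python
def moving_average(values, window):
    """Return the trailing moving averages of the given window length."""
    return [sum(values[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(values))]

# Hypothetical daily counts with large day-to-day fluctuations.
daily = [12, 30, 9, 27, 15, 33, 18, 36, 21, 39]
print(moving_average(daily, 3))
print(moving_average(daily, 5))
```

A longer window smooths the data more but produces fewer values, since the first average cannot be computed until a full window of data is available.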
- Fitting Functions to Moving Averages This DIGMath spreadsheet lets you investigate how moving averages can be used to identify the pattern in a set of data that has extreme daily fluctuations. The program creates charts for the 5-day moving average, the 10-day moving average, the 15-day moving average, and the 20-day moving average. It also allows you to use a slider to select the type of function you would like to fit -- linear, exponential, power, quadratic, or cubic -- and displays the resulting functions that fit both the underlying data and the corresponding moving average. It also displays the resulting equations of the functions, the values of the corresponding correlation coefficients r (for the linear, exponential, and power fits), and the coefficient of multiple correlation R (for the polynomial fits) to assess how well each function fits the data.
- The Kolmogorov-Smirnov Test for Normality This DIGMath program uses a hypothesis test to determine whether or not a set of data might be normally distributed when you can safely assume that the underlying population is normally distributed. You enter a relatively small set of data (up to 30 numbers) in ascending numerical order and the spreadsheet performs the test at the 5% significance level while providing a graphical interpretation of the procedure. It displays the associated test statistic and reports whether you can reject the claim that the data are normally distributed or whether you fail to reject the claim.
- The Lilliefors Test for Normality This DIGMath program uses a hypothesis test to determine whether or not a set of data might be normally distributed. It is a special case of the Kolmogorov-Smirnov Test for use when you don't know whether you can assume that the underlying population is itself normally distributed. You enter a relatively small set of data (up to 30 numbers) in ascending numerical order and the spreadsheet performs the test at the 5% significance level while providing a graphical interpretation of the procedure. It displays the associated test statistic and reports whether you can reject the claim that the data are normally distributed or whether you fail to reject the claim.
- The
Birthday Problem This DIGMath module lets you investigate the Birthday
Problem, which asks for the probability that at least two people in a group of n people will have the same birthday. The program lets you decide
on the number of people in a group (between 1 and 100) and displays the
graph of the probability of a match versus the number of people in the
group.
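The probability that is graphed can be computed directly. The following Python sketch is an illustration rather than the spreadsheet's own code; the function name is hypothetical, and it assumes 365 equally likely birthdays (no leap years).

```python
def birthday_match_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_no_match = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_no_match *= (days - k) / days
    return 1.0 - p_no_match

for n in (10, 23, 50):
    print(n, round(birthday_match_probability(n), 4))
```

With these assumptions, the probability of a match first exceeds one half when there are 23 people in the group.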
- Simulating the Birthday
Problem This DIGMath module lets you investigate the Birthday Problem
from the point of view of a random simulation. You have the choice of the
number of people in a group (from 2 to 50). The program then generates a
random sample of birthdates for each of the people and displays the list,
including highlighting those that match. It presents the results,
including the theoretical probability of a match and the number of
matches.
- Number of Boys vs.
Girls Born in a Family This DIGMath spreadsheet lets you investigate the
number of Boys and Girls born into a family based on the fact that 51.2%
of all live births are Boys. You can choose the number of children (1-10)
in a family and the number of such families. The program simulates this
and displays the outcomes graphically in a histogram and numerically with
a table of outcomes and the mean and standard deviation of the results.
- The
Drunkard's (or Random) Walk Simulation This DIGMath module lets you
investigate the notion of a random walk in the plane. You have the choice
of the number of random steps (between 1 and 1000) and the length of each
step. The program then generates a random collection of steps and displays
the results graphically, as well as some numerical analysis on the actual
distance covered from the starting point compared to the theoretical
predictions.
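A random walk of this kind can be sketched in a few lines of Python. This is not the spreadsheet's code: the names are hypothetical, and the sketch assumes that every step has the same length and a direction chosen uniformly at random. Under those assumptions, the root-mean-square distance from the starting point after n steps is the step length times the square root of n.

```python
import math
import random

def random_walk_distance(num_steps, step_length, rng):
    """Walk num_steps steps in random directions; return the final distance."""
    x = y = 0.0
    for _ in range(num_steps):
        angle = rng.uniform(0, 2 * math.pi)
        x += step_length * math.cos(angle)
        y += step_length * math.sin(angle)
    return math.hypot(x, y)

rng = random.Random(0)  # fixed seed so the run is repeatable
num_steps, step_length, num_walks = 100, 1.0, 2000
distances = [random_walk_distance(num_steps, step_length, rng)
             for _ in range(num_walks)]
rms_simulated = math.sqrt(sum(d * d for d in distances) / num_walks)
rms_theory = step_length * math.sqrt(num_steps)
print(rms_simulated, rms_theory)
```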
- Buffon Needle Problem
This DIGMath module lets you experiment with a graphical simulation of
Buffon's Needle Problem -- the probability that a needle of length L
lands on the seam between parallel strips of flooring of width W
when it falls to the floor. You can select the number of random
"needles" that fall, the width of the strip of flooring, and the
length of the needles and see the results graphically and numerically.
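The needle experiment can be imitated with a short Python sketch. It is offered only as an illustration under stated assumptions, not as the spreadsheet's method: the function name is hypothetical, the needle's center is assumed to land uniformly between seams, and its angle is assumed to be uniformly random. For a needle no longer than the strip width, the theoretical probability of crossing a seam is 2L/(πW).

```python
import math
import random

def buffon_needle(num_needles, length, width, seed=None):
    """Estimate the probability that a dropped needle crosses a seam."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_needles):
        # Distance from the needle's center to the nearest seam.
        center = rng.uniform(0, width / 2)
        # Acute angle between the needle and the seams.
        angle = rng.uniform(0, math.pi / 2)
        if center <= (length / 2) * math.sin(angle):
            hits += 1
    return hits / num_needles

estimate = buffon_needle(100_000, length=1.0, width=2.0, seed=1)
theory = 2 * 1.0 / (math.pi * 2.0)  # 2L/(pi W), valid when L <= W
print(estimate, theory)
```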
- Buffon Needle Problem on Square Tiles This DIGMath module lets you experiment with a graphical simulation of the Laplace/Buffon's Needle Problem -- the probability that a needle of length L lands on the seam between square tiles of width W when it falls to the floor. You can select the number of random "needles" that fall, the width of the tile, and the length of the needles and see the results graphically and numerically.
- Buffon's
Disk Problem This DIGMath spreadsheet lets you experiment with a
graphical simulation of a variation of Buffon's Needle Problem -- the
probability that a circular disk of radius r lands on the seam between parallel strips
of flooring of width W when
it falls to the floor. You can select the number of random
"disks" that fall, the width of the strip of flooring, and the
radius of the disks and see the results graphically and numerically.
- Buffon's
Disk in a Circle Problem This DIGMath spreadsheet lets you experiment with a
graphical simulation of a variation of Buffon's Needle Problem -- the
probability that a circular disk of radius r lands entirely within a larger circle of radius R when
it falls to the floor, rather than crossing the boundary circle. You can select the number of random
"disks" that fall, the radius r of each disk, and the radius R of the circle, and see the results graphically and numerically.
- Buffon's Disks on a Square Tile Problem This DIGMath spreadsheet lets you experiment with a graphical simulation of a variation of Buffon's Needle Problem -- the probability that a circular disk of radius r lands on the seam between square tiles of width W when it falls to the floor. You can select the number of random "disks" that fall, the width of the tile, and the radius of the disks and see the results graphically and numerically.
- Buffon's
Needle Problem in Concentric Circles This DIGMath spreadsheet
lets you experiment with a graphical simulation of a variation of Buffon's
Needle Problem -- the probability that a circular disk of radius r lands on the seam between a group of
concentric circles when it falls to the floor. You can select the
number of random "disks" that fall, the fixed difference in the
radii of the concentric circles on the floor, and the length of the
needles and see the results graphically and numerically.
- Buffon Problem for Square Coins on a Square Tile This DIGMath spreadsheet lets you experiment with a graphical simulation of a variation on Buffon's Needle Problem -- the probability that a square coin of length L lands on the edge of a square tile of width W. You can select the number of random "coins" that fall, the width of the flooring tile, and the length of the coins and see the results graphically and numerically.
- Simulation of Gambler's Ruin This DIGMath simulation lets you investigate the notion of Gambler's Ruin in which a person enters a game of chance with a fixed amount to bet (the stake) and repeatedly bets a fixed amount on one particular outcome until he or she runs out of money or reaches a certain amount of winnings. Using sliders, you can enter the stake, the amount of the bet, the amount won on each successful bet, the fixed probability of winning on each bet, and the number of bets (up to 1000) that will be displayed. In most realistic situations, the "house" sets the payoff amount low enough to ensure that the gambler will eventually run out of money -- that is, lose his or her shirt. This is why it is known as Gambler's Ruin.
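The betting process described above can be sketched in Python as follows. This is an illustrative sketch rather than the spreadsheet's code: the function and parameter names are hypothetical, and it assumes that play simply stops when the gambler can no longer cover a bet or the chosen number of bets has been made.

```python
import random

def expected_gain_per_bet(bet, payoff, p_win):
    """Average change in the gambler's money on a single bet."""
    return p_win * payoff - (1 - p_win) * bet

def simulate_gambler(stake, bet, payoff, p_win, num_bets, seed=None):
    """Return the gambler's bankroll after each bet, starting with the stake."""
    rng = random.Random(seed)
    bankroll = stake
    history = [bankroll]
    for _ in range(num_bets):
        if bankroll < bet:  # cannot cover the next bet: ruined
            break
        bankroll += payoff if rng.random() < p_win else -bet
        history.append(bankroll)
    return history

history = simulate_gambler(stake=20, bet=1, payoff=1, p_win=0.47,
                           num_bets=1000, seed=3)
print(len(history) - 1, "bets made; final bankroll:", history[-1])
print(expected_gain_per_bet(bet=1, payoff=1, p_win=0.47))
```

When the expected gain per bet is negative, as in this hypothetical example, the bankroll drifts downward on average, which is the situation the description refers to.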
- Simulation of Dart
Throwing This DIGMath module lets you investigate the process of
throwing random darts at a dartboard. You can select between 100 and 1000
random darts and the spreadsheet shows the position that each dart lands
and displays the breakdown of how many, and what percentage, of the darts
fall into each of the rings in the dartboard.
- Product of the Faces of Two
Dice This DIGMath simulation lets you investigate the product of the
faces on a pair of dice. You can choose up to 720 rolls of the two dice
and the spreadsheet shows the distribution of the outcomes and a list of
the number and percentage of each possible outcome compared to the
theoretical predictions.
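The theoretical predictions for the product can be found by listing all 36 equally likely outcomes for a pair of fair dice. The Python sketch below does this; it is an illustration only, and the variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (first, second) rolls give each product.
counts = Counter(a * b for a, b in product(range(1, 7), repeat=2))
probs = {value: Fraction(count, 36) for value, count in sorted(counts.items())}

# Expected number of occurrences of each product in 720 rolls.
expected_in_720 = {value: 720 * p for value, p in probs.items()}

for value, p in probs.items():
    print(value, p, expected_in_720[value])
```

Since 720 is a multiple of 36, every expected count in 720 rolls is a whole number, which makes the comparison with simulated counts easy to read.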
- Difference of the Faces of Two
Dice This
DIGMath simulation lets you investigate the differences of the faces on a
pair of fair dice. You can choose up to 720 rolls of the two dice and the
spreadsheet shows the distribution of the outcomes and a list of the
number and percentage of each possible outcome compared to the theoretical
predictions.
- Simulation
of Rolling 4-Sided Dice This DIGMath spreadsheet lets you investigate
the probability experiment of rolling a pair of fair 4-sided dice, instead
of the usual 6-sided dice, so that the possible sums are now 2, 3, ..., 8.
The simulation shows the results of repeated trials both graphically in a
histogram and numerically in terms of the number of times each of the
possible outcomes arises.
- Simulation
of Rolling 8-Sided Dice This DIGMath spreadsheet lets you investigate
the probability experiment of rolling a pair of fair 8-sided dice, instead
of the usual 6-sided dice, so that the possible sums are now 2, 3, ...,
16. The simulation shows the results of repeated trials both graphically
in a histogram and numerically in terms of the number of times each of the
possible outcomes arises.
- Yahtzee: Rolling
Five Dice Simulation The game of Yahtzee™ involves rolling a set of five fair dice.
This DIGMath module lets you investigate this experiment by simulating
repeated random rolls (up to 720 times) of five dice. It displays the
results in a histogram as well as a table showing the simulated outcomes
and the expected theoretical outcomes.
- Visualizing
Conditional Probability This
DIGMath spreadsheet helps you visualize the idea of conditional
probability where the usual sample space for a probability experiment is
reduced by knowing some other detail of the event. The spreadsheet looks at the sum of the
faces of two dice and allows you to investigate what happens if either (1)
you know the result of the second die or (2) you know the product of the
two faces. The spreadsheet displays
the corresponding histogram and the associated numerical outcomes.
- Waiting
Time Simulation This DIGMath module lets you investigate the length of
time a car will wait at a red light. You can select the total length of
the cycle and the length of time that the light is red. The results -- the
number of times that the wait is 0, 1, 2, ... seconds -- are shown graphically
in a histogram and in a table listing the outcomes. The average wait over
all repetitions is also shown.
- Hypergeometric
Probabilities This DIGMath
module helps you visualize the probabilities associated with a
hypergeometric distribution, which is based on selecting a sample of size n
from a population having N elements of which K are
considered successes. The standard
probability problem asks for the probability of having exactly x
successes in that sample. You can
enter the three values N, K, and n and the desired number of
successes x in that sample. The
spreadsheet draws the corresponding hypergeometric distribution and
highlights the desired outcome, as well as the numerical results.
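The hypergeometric probability itself is a ratio of counts of combinations: C(K, x) C(N - K, n - x) / C(N, n). A minimal Python sketch follows, using the same symbols N, K, n, and x as above; the function name and the card example are hypothetical and are not taken from the spreadsheet.

```python
from math import comb

def hypergeom_pmf(x, N, K, n):
    """Probability of exactly x successes in a sample of n drawn without
    replacement from N items of which K are successes."""
    if x < 0 or x > n:
        return 0.0
    # comb returns 0 when asked to choose more items than are available.
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

# Hypothetical example: number of hearts in a 5-card hand from a 52-card deck.
distribution = [hypergeom_pmf(x, N=52, K=13, n=5) for x in range(6)]
for x, p in enumerate(distribution):
    print(x, round(p, 4))
```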
- Trinomial Probabilities This DIGMath spreadsheet is a computational tool for calculating the probability of success for trinomial probabilities in which there is a probability of success and probabilities for each of two different kinds of failures. (Picture the arrival times of airplanes -- success is arriving on time, one kind of failure is arriving late, and the other kind of failure is arriving early.) You select the probability of success p, the desired number of successes x, the probability of the first kind of failure q, the number of such failures y, and the probability of the second kind of failure, r. The program calculates and displays the associated probability of getting x successes.
- Chi-Square
Analysis This DIGMath spreadsheet is designed to perform a complete
chi-square analysis on many different sized contingency tables, including
2 by 2, 2 by 3, 2 by 4, 3 by 2, 3 by 3, and 3 by 4. On the Set-Up screen,
the user first enters the number of rows and the number of columns of the
desired contingency table and is then instructed to click on an
appropriate tab to go to the corresponding input screen. On that screen,
the user then enters the observed values into the various positions in the
contingency table. The spreadsheet displays the resulting table of
expected frequencies, the number of degrees of freedom, and the value of
the chi-square statistic based on the values in the table. It also draws
the graph of the corresponding chi-square distribution and indicates the
location of the critical value, based on the desired level of
significance, separating the rejection region from the region where one
cannot reject the null hypothesis. In addition, the program indicates the
location of the chi-square statistic corresponding to the data in the
contingency table. Finally, the program indicates whether or not one can
reject the null hypothesis at that significance level.
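The core calculations described above (expected frequencies, degrees of freedom, and the chi-square statistic) can be sketched in Python. This is an illustration, not the spreadsheet's code; the function name and the sample table are hypothetical. Finding the critical value requires a chi-square table or a statistics library and is left out here.

```python
def chi_square(observed):
    """Return (expected table, chi-square statistic, degrees of freedom)
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected frequency = (row total)(column total) / (grand total).
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(observed, expected)
                    for o, e in zip(obs_row, exp_row))
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, dof

# Hypothetical 2 by 2 table of observed values.
expected, statistic, dof = chi_square([[10, 20], [20, 10]])
print(expected, statistic, dof)
```

For this hypothetical table there is 1 degree of freedom, for which the 5% critical value is 3.841, so a statistic of about 6.67 would fall in the rejection region.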
- The
Chi-Square Distributions This DIGMath module lets you explore the
behavior of various chi-square distributions, which depend on the number n of degrees of freedom. The user can enter any desired number of degrees
of freedom from 2 to 31 and the program draws the graphs of that
chi-square distribution as well as those with 3, 7, 11, 15, ..., 27
degrees of freedom. Because the chi-square distributions become more
normal in shape as n
increases, the
program also draws the standard normal distribution with mean μ = 0 and standard deviation σ = 1 for comparison.
- Chi-Square
Simulation This DIGMath spreadsheet lets you investigate the variation
in the values that can arise for the chi-square statistic via a random
simulation based on a two by three contingency table. You define the table
by entering the column and row totals and the number of random samples
drawn from that population and the spreadsheet generates and graphs the
corresponding values of the chi-square statistic.
- One-Way
Analysis of Variance (ANOVA) This DIGMath spreadsheet is designed to
perform a complete one-way analysis of variance (ANOVA) to test whether
two or more sample means (up to 5) may come from populations
with the same mean (the null hypothesis) or from populations with
different means (the alternative hypothesis). Each sample can contain up to
10 entries. The spreadsheet displays the resulting ANOVA table, including
the value of the F-statistic. It also draws
the graph of the corresponding F-distribution and indicates the location of the
critical value, based on the 5% level of significance, separating the
rejection region from the region where one cannot reject the null
hypothesis. In addition, the program indicates the location of the F-statistic corresponding to the data in the table.
Finally, the program indicates whether or not one can reject the null
hypothesis at the 5% significance level.
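The entries of a one-way ANOVA table can be computed with a short Python sketch. It is offered as an illustration, not as the spreadsheet's code; the function name and the sample data are hypothetical. The critical value of the F-distribution is not computed here, since that requires a table or a statistics library.

```python
def one_way_anova(samples):
    """Return (SS between, SS within, df between, df within, F)."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    # Variation of the sample means about the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2
                     for s, m in zip(samples, means))
    # Variation of the data about their own sample means.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

# Hypothetical data: three samples of three values each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(result)
```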
- Simulating the Runs
Test This DIGMath spreadsheet lets you investigate the Runs Test both
graphically and numerically. When there is a collection of outcomes
consisting of A's and B's, the object is to see the
number of runs that occur. The module lets you select the total number of A's
and B's, the number of A's, and the number of random
samples. The distribution of the number of runs is drawn as well as
numerical measures for the mean and standard deviation.
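Counting runs, and the standard formulas for the mean and standard deviation of the number of runs when the A's and B's are arranged at random, can be sketched in Python. The function names are hypothetical and the code is an illustration rather than the spreadsheet's own.

```python
import math

def count_runs(sequence):
    """Count the maximal blocks of identical consecutive symbols."""
    if not sequence:
        return 0
    return 1 + sum(1 for prev, cur in zip(sequence, sequence[1:])
                   if prev != cur)

def runs_mean_sd(n_a, n_b):
    """Theoretical mean and standard deviation of the number of runs
    in a random arrangement of n_a A's and n_b B's."""
    n = n_a + n_b
    mean = 2 * n_a * n_b / n + 1
    variance = 2 * n_a * n_b * (2 * n_a * n_b - n) / (n ** 2 * (n - 1))
    return mean, math.sqrt(variance)

print(count_runs("AABBBA"))   # runs are AA, BBB, A
print(runs_mean_sd(10, 10))
```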
All
of these files were developed with support from a variety of grants from
the National Science Foundation, for which the authors are very grateful.
