# Fun with coin flips

November 21, 2012
Categorized as: R, R-Bloggers

We all know that an unbiased coin lands heads 50% of the time and tails 50% of the time. But what happens if you flip it many times? Do you expect the same number of heads and tails? What if we took a cumulative sum where heads = +1 and tails = -1? What would that sum be? Here is a function that will do this n times and plot it.

    library(ggplot2)

    probPlot <- function(n=1000) {
        # Simulate n fair coin flips: 1 = heads, 0 = tails
        vals <- rbinom(n=n, size=1, prob=.5)
        # Recode tails as -1 so the cumulative sum moves up and down
        vals[vals==0] <- -1
        df <- data.frame(x=1:length(vals), y=cumsum(vals))
        # Symmetric y-axis limits around zero
        yrange <- c(-max(abs(df$y)), max(abs(df$y)))
        ggplot(df, aes(x=x, y=y)) +
            geom_hline(yintercept=0, colour='blue') +
            geom_line() +
            ylim(yrange) +
            ylab('Cumulative Sum') +
            xlab(paste('Point in sequence 1:n coin flips for n=',
                       prettyNum(n, big.mark=',', scientific=FALSE), sep='')) +
            ggtitle(paste('Cumulative sums for succession of ',
                          prettyNum(n, big.mark=',', scientific=FALSE),
                          ' coin flips\nwhere Heads = +1 & Tails = -1', sep=''))
    }
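
To see what the function computes before any plotting, here is a minimal sketch using a fixed, made-up sequence of six flips. The vector `flips` is purely illustrative and is not output from `rbinom`.

```r
# Hypothetical fixed sequence: 1 = heads, 0 = tails
flips <- c(1, 0, 0, 1, 1, 1)

# Recode tails from 0 to -1, the same way probPlot does
flips[flips == 0] <- -1

# Running total after each flip
running <- cumsum(flips)
print(running)
# values: 1 0 -1 0 1 2
```

The sum returns to 0 whenever the counts of heads and tails are equal, and drifts away from it otherwise.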

The result of probPlot(n=10000) is:

But if we run it again we get a different plot:

But after 10,000 coin flips, the cumulative sum could be anywhere from -10,000 (all tails) to 10,000 (all heads). If we set the y-axis to that full range, the cumulative sum is indeed pretty close to 0.
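
How close is "pretty close"? Each flip contributes +1 or -1 with mean 0 and variance 1, so the final sum of n independent flips has mean 0 and standard deviation sqrt(n), which is 100 for n = 10,000. The sketch below checks this by simulation. The seed and the number of repetitions (`reps`) are arbitrary choices made for this illustration only.

```r
set.seed(1)      # arbitrary seed, used only for reproducibility
n <- 10000       # flips per sequence
reps <- 2000     # number of simulated sequences (an assumption)

# The number of heads in n fair flips is Binomial(n, 0.5);
# the final cumulative sum is heads minus tails = 2 * heads - n
heads <- rbinom(reps, size = n, prob = 0.5)
final_sums <- 2 * heads - n

mean(final_sums)   # should be near 0
sd(final_sums)     # should be near sqrt(n) = 100
```

So a final sum of a few hundred in either direction is entirely ordinary, yet it is tiny next to the possible range of 20,000.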

Using the animation package, we can quickly create an animation zooming in from the full range to see the variation in the cumulative sum.

    library(animation)

    set.seed(2112)
    n <- 10000
    p <- probPlot(n) + ggtitle(NULL)

    # Build y-axis limits by halving from n until the next halving would clip the data
    steps <- c(n)
    while (steps[length(steps)] / 2 > max(abs(p$data$y))) {
        steps <- c(steps, steps[length(steps)] / 2)
    }

    # Draw one frame per zoom level and write them out as an HTML animation
    saveHTML({
        for (i in steps) {
            print(p + ylim(c(-i, i)))
        }
    },
    interval = 0.5, htmlfile = 'cumulativesum.html', autobrowse = FALSE, outdir = getwd(),
    title = paste('Cumulative sums for succession of ',
                  prettyNum(n, big.mark=',', scientific=FALSE), ' coin flips', sep=''),
    description = paste('Cumulative sums for succession of ',
                        prettyNum(n, big.mark=',', scientific=FALSE),
                        ' coin flips where Heads = +1 and Tails = -1', sep='')
    )
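
The zoom levels come from repeatedly halving the y-axis limit and stopping before a halving would cut off the data. Here is a small sketch of that loop on its own, using a made-up maximum of 150 (`max_abs` is a hypothetical stand-in for `max(abs(p$data$y))`, not a value from the simulation above).

```r
n <- 10000
max_abs <- 150   # hypothetical largest absolute cumulative sum

steps <- c(n)
# Keep halving while the next limit would still sit above the data
while (steps[length(steps)] / 2 > max_abs) {
  steps <- c(steps, steps[length(steps)] / 2)
}
print(steps)
# values: 10000 5000 2500 1250 625 312.5 156.25
```

Each value becomes the y-axis limit for one frame, so the animation moves from the full possible range down to a view that just contains the walk.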