Disclaimer: All of the numbers in this lab are entirely made up.

Last Week Review:

Let event A be that a potato is a Yukon Gold potato.
Suppose the probability that a potato is a Yukon Gold is 1/3.
Let event B be that a potato is used to make mashed potatoes. Suppose the probability that a potato is mashed, given that it was Yukon Gold, is 3/4.
Suppose the probability that a potato is mashed, given that it was not Yukon Gold, is 1/2.
What is the probability that a potato is Yukon Gold, given that it is mashed?
What is the probability that a potato is mashed?
What is the probability that a potato is both mashed AND Yukon Gold?
What is the probability that a potato is mashed OR Yukon Gold?
Assuming each pick is independent of the others, what is the probability that I pick two Yukon Golds in a row?
What is the probability that, of 3 potatoes picked, exactly 2 are Yukon Golds?

Binomial Distribution

From lecture, we know that when each of n trials has two possible outcomes (an event either occurs or doesn’t), the number of ways for the event to occur exactly k times is \(\frac{n!}{k!(n-k)!}\).
We also know that, given independence, the probability of any one particular sequence of k occurrences and n-k non-occurrences is \(p^k(1-p)^{n-k}\), where p is the probability that the event occurs on a single trial.
Combining these, we get the formula for the binomial distribution:

\(\frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}\)

Let’s now revisit that last problem from the review. We can calculate this probability using the formula in R:

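n <- 3    # number of potatoes picked
k <- 2    # number of Yukon Golds among them
p <- 1/3  # probability that a potato is a Yukon Gold
factorial(n) / (factorial(k) * factorial(n - k)) * p^k * (1 - p)^(n - k)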
## [1] 0.2222222

We can also use R’s built-in function to answer this question:

dbinom(x=2,size=3,prob=1/3)
## [1] 0.2222222
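
As a quick sanity check, the sketch below wraps the formula in a small helper and compares it with dbinom() for every possible value of k. The name binom_pmf is made up for this example and is not something from lecture. It also checks that the probabilities over k = 0, ..., n add up to 1.

```r
# Hypothetical helper: the binomial formula written out by hand.
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

n <- 3
p <- 1/3
k <- 0:n

by_hand  <- binom_pmf(k, n, p)
built_in <- dbinom(k, size = n, prob = p)

all.equal(by_hand, built_in)  # do the two approaches agree?
sum(by_hand)                  # probabilities over all k should total 1
```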

R’s built-in functions can also help us answer other questions. For example, let us now consider picking 10 potatoes and getting 5 Yukon Golds. We can find the probability of this event just like we did earlier:

dbinom(x=5,size=10,prob=1/3)
## [1] 0.1365645

However, we may also be interested in finding the probability of seeing an event as or more extreme than the one we observed. (This should sound very familiar.)
Since the probability of picking a Yukon Gold is 1/3 and we picked a total of 10 potatoes, we would expect to see about 3.33 Yukon Golds. What we observed (5) is 1.67 greater than what we’d expect, so a result is as or more extreme if it is at least 1.67 away from 3.33 in either direction: greater than or equal to 5, or less than or equal to 3.33 - 1.67 = 1.67. Since the data are discrete, this is the same thing as \(P(x\leq1 \cup x\geq5)\). We can calculate this using pbinom(), which calculates the probability of being less than or equal to a value. To find the probability of being greater than or equal to a number, we tell R to calculate 1-pbinom() of one less than the number we’re interested in; for example, \(P(x\geq5) = 1 - P(x\leq4)\).

pbinom(1,size=10,prob=1/3)
## [1] 0.1040492
1-pbinom(4,size=10,prob=1/3)
## [1] 0.2131281
# What we want:
pbinom(1,size=10,prob=1/3) + (1-pbinom(4,size=10,prob=1/3))
## [1] 0.3171773
# Equivalently (as seen in lecture):
binom.test(x=5,n=10,p=1/3)$p.value
## [1] 0.3171773
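
To see which counts end up in the "as or more extreme" set, here is a rough sketch that lists the probability of every possible count and adds up the ones that are no more likely than what we observed. This is one common way to define a two-sided p-value for the binomial, and for this example it picks out the same counts as the calculation above. The variable names are only for illustration. The last lines show that pbinom() with lower.tail=FALSE gives the upper tail directly, without the "1 -".

```r
n <- 10
p <- 1/3
observed <- 5

counts <- 0:n
probs  <- dbinom(counts, size = n, prob = p)

# Counts that are no more likely than the observed count
# (the tiny tolerance guards against floating-point ties).
p_obs   <- dbinom(observed, size = n, prob = p)
extreme <- counts[probs <= p_obs * (1 + 1e-7)]
extreme  # which counts are "as or more extreme"

p_two_sided <- sum(probs[counts %in% extreme])
p_two_sided

# Upper tail directly: P(X > 4) = P(X >= 5)
upper <- pbinom(4, size = n, prob = p, lower.tail = FALSE)
upper
```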

Quiz Review

Question 1

Suppose that the probability of a potato rotting after a month in the pantry is 1/3 (and that one potato rotting is independent of the others). Let’s say we have 5 potatoes in the pantry.
Part a If we didn’t get around to eating our delicious potatoes for a month, what is the probability that all 5 are still good?
Part b What is the probability that at least 3 potatoes are still good?
Part c What is the probability that exactly two are still good?

Question 2

## Mean of X: 11.33101
## Mean of Y: 15.87852
## Std Dev of X: 2.188988
## Std Dev of Y: 2.073249
## Correlation: 0.8727247

Part a Find the equation of the regression line.
Part b If we have an X-value of 14.33, what would we predict the Y-value to be?
Part c If we have a Y-value of 11.73, what would we expect the X-value to be?

Question 3

What can you say about the distribution? (center, shape, spread)

Question 4

Compare the two plots. (center, shape, spread)

Question 5

Suppose 1000 people take a medical screening test. 270 people get a positive test result, 1/3 of whom actually have the disease. If the prevalence of the disease is .1, what are the sensitivity and specificity of the test?

Answers

Last Week Review

What is the probability that a potato is Yukon Gold, given that it is mashed?
\(P(A|B) = 3/7\)
What is the probability that a potato is mashed?
\(P(B) = 7/12\)
What is the probability that a potato is both mashed and Yukon Gold?
\(P(A \cap B) = 1/4\)
What is the probability that a potato is either mashed or Yukon Gold?
\(P(A \cup B) = 2/3\)
What is the probability that I pick two Yukon Golds in a row?
\(P(A) \cdot P(A) = 1/9\)
What is the probability that, of 3 potatoes picked, exactly 2 are Yukon Golds?
\(P(\text{2 Yukon Golds in 3 picks}) = 2/9\)
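
These answers can be reproduced in R from the three probabilities given in the review. The sketch below is one way to do it; the variable names are made up for this example.

```r
p_A      <- 1/3  # P(Yukon Gold)
p_B_A    <- 3/4  # P(mashed | Yukon Gold)
p_B_notA <- 1/2  # P(mashed | not Yukon Gold)

p_B          <- p_B_A * p_A + p_B_notA * (1 - p_A)  # law of total probability
p_A_and_B    <- p_B_A * p_A                         # multiplication rule
p_A_given_B  <- p_A_and_B / p_B                     # Bayes' rule
p_A_or_B     <- p_A + p_B - p_A_and_B               # addition rule
two_in_a_row <- p_A * p_A                           # independent picks
two_of_three <- dbinom(2, size = 3, prob = p_A)     # binomial

c(p_A_given_B, p_B, p_A_and_B, p_A_or_B, two_in_a_row, two_of_three)
```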

Question 1

Part a Each potato is still good with probability 1 - 1/3 = 2/3, so \((2/3)^5 \approx 0.1316872\)
Part b 0.7901235
Part c 0.1646091
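
These can be checked with dbinom(). The sketch below takes the question at its word that 1/3 is the probability of rotting, so each potato is still good with probability 2/3. The variable names are only for illustration.

```r
p_rot  <- 1/3
p_good <- 1 - p_rot  # assumption: a potato that has not rotted is "still good"
n <- 5

part_a <- dbinom(5, size = n, prob = p_good)         # all 5 still good
part_b <- sum(dbinom(3:5, size = n, prob = p_good))  # at least 3 still good
part_c <- dbinom(2, size = n, prob = p_good)         # exactly 2 still good

c(part_a, part_b, part_c)
```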

Question 2

Part a

## (Intercept)           X 
##   6.5125162   0.8265808

Part b
18.3574192

Part c
7.5083875
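
One way to get these numbers from the summary statistics alone is sketched below, with made-up variable names. The slope of the regression of Y on X is the correlation times sd(Y)/sd(X), and the line passes through the point of means. Note that Part c predicts X from Y, so it uses the regression of X on Y rather than solving the Part a line backwards. Because the summary statistics are rounded, the results may differ from the answers above in the last few digits.

```r
mean_x <- 11.33101
mean_y <- 15.87852
sd_x   <- 2.188988
sd_y   <- 2.073249
r      <- 0.8727247

# Part a: regression of Y on X
slope_yx     <- r * sd_y / sd_x
intercept_yx <- mean_y - slope_yx * mean_x
c(intercept_yx, slope_yx)

# Part b: predict Y at X = 14.33
pred_y <- intercept_yx + slope_yx * 14.33
pred_y

# Part c: regression of X on Y, then predict X at Y = 11.73
slope_xy     <- r * sd_x / sd_y
intercept_xy <- mean_x - slope_xy * mean_y
pred_x <- intercept_xy + slope_xy * 11.73
pred_x
```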

Question 3

The distribution has one peak and is centered around 17. It appears slightly skewed to the left, and ranges from about 10 to about 20.

Question 4

Both distributions appear centered around 45-50. The one on the left appears right-skewed, and ranges from about 25 to about 80. The one on the right appears left-skewed, and ranges from about 25 to about 60.

Question 5

Sensitivity: .9
Specificity: .8
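
The arithmetic behind these two numbers can be laid out as counts. This is a sketch using only the figures given in the question; the variable names are made up for this example.

```r
n_total    <- 1000
n_positive <- 270
ppv        <- 1/3  # share of positive tests that truly have the disease
prevalence <- 0.1

true_pos  <- n_positive * ppv       # positive test and diseased
false_pos <- n_positive - true_pos  # positive test but not diseased
diseased  <- n_total * prevalence   # everyone with the disease
healthy   <- n_total - diseased     # everyone without the disease

sensitivity <- true_pos / diseased               # P(positive | diseased)
specificity <- (healthy - false_pos) / healthy   # P(negative | not diseased)
c(sensitivity, specificity)
```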