Objectives

  1. Analyze studies that have two unmatched/unpaired groups with continuous outcomes
  2. Practice calculating and interpreting a two sample t-test and confidence intervals
  3. Discuss assumptions for two sample t-test and use the Welch’s test

Infant Diarrhea Study

Diarrhea is a major public health problem in underdeveloped countries, especially for babies. Diarrhea leads to dehydration, which results in millions of deaths each year worldwide.

Researchers in Peru conducted a double-blind randomized controlled trial, published in The New England Journal of Medicine, to determine whether Bismuth salicylate would improve the outcomes of infants suffering from diarrhea.

In their study, all infants received the standard therapy for diarrhea: oral re-hydration. In addition to the rehydration, 85 babies received bismuth salicylate, while 84 babies received a placebo. The total stool volumes for all infants over the course of their illness was measured. To adjust for body size, the researchers divided by body weight to obtain their outcome of interest: stool output per kilogram of body weight. The results of their study are available on the course website.

DiaData <- read.delim("http://myweb.uiowa.edu/pbreheny/data/diarrhea.txt")

Explore the data

Let’s examine the distribution of stool output by group using a boxplot of the data:

boxplot(DiaData$Stool ~ DiaData$Group, col = c("burlywood4", "darkgoldenrod3"), main = "Effect of bismuth salicylate", horizontal = T)

Summarize the distribution and describe any differences you notice about the two groups

Two Sample t-tests

Student’s two sample t-test

To do a students two sample t-test, we must be able to find the following values:

  • Standard deviation using a pooled variance
  • Standard error
  • The t-statistic.

These three can be found using the equations below (which are also in your lecture slides):

Test statistic:

\(SD_(pooled) = \sqrt{ \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 -2} }\)

\(SE_d = SD_(pooled) \sqrt{ \frac{1}{n_1} + \frac{1}{n_2}}\)

\(t = \frac{\bar{x_1} - \bar{x_2}}{SE_d}\) with:

\(df = n_1 + n_2 - 2\)

Confidence interval:

\(\bar{x_1} - \bar{x_2} \pm t_{\alpha/2, df}*SE_d\)

So as you can tell we need to compute several statistics from the data in order to do this t-test. We need to compute the mean of continuous outcomes for both groups and those will be out \(\bar{x_1}\) and \(\bar{x_2}\) as well as the standard error above.

What is our null hypothesis for this data?

Calculate the test statistic and 95% confidence interval by hand. Summarize your results.

As a reminder, here are some helpful functions for finding summary statistics:

Recall the ‘by’ function we learned in Lab 4: allows us to run a function over data set into groups.

  • The first parameter is the data of interest
  • The second parameter is the data that defines the groups
  • The third parameter is the function to run over the data
#Find n in each group
table(DiaData$Group)

#Find mean of each group
by(DiaData$Stool, DiaData$Group, mean)

#Find standard deviation of each group 
by(DiaData$Stool, DiaData$Group, sd)

Hypothesis Test:

x1 <- 260.2976 # Mean of control group
x2 <- 181.8706 # Mean of treatment group

sd1 <- 253.6135 # SD of control group
sd2 <- 197.3636 # SD of treatment group

n1 <- 84 # n for control group
n2 <- 85 # n for treatment group

sd_pooled <- sqrt(((n1 - 1)*sd1^2 + (n2 - 1)*sd2^2) / (n1 + n2 - 2))
SE <- sd_pooled * sqrt((1 / n1) + (1 / n2))
df <- n1 + n2 - 2

t <- (x1 - x2) / (SE)
2*pt(t, df = df, lower.tail = F)
## [1] 0.02608141

Confidence Interval:

(x1 - x2) - qt(0.975, df = df)*SE
## [1] 9.45734
(x1 - x2) + qt(0.975, df = df)*SE
## [1] 147.3967

Check with R

Comments about t.test code:

Note that we can use the tilde (~) to specify that you want to look at stool output by group. We use the tilde because all stool outputs are in one column and group is given in another column. If you were given the stool outputs in two columns (Control & Treatment), then you would simply give R the two columns separated by a comma. It would look something like: ‘t.test(Treatment, Control, var.equal = TRUE)’. See last lab 10 ‘Anorexia’ data for an example of data for each group given by column.

Note also that we no longer have to say ‘paired=TRUE’ in the function. The default is ‘paired=FALSE’ and that is exactly the test we want.

We have to specify that we are assuming the population variances are equal using the argument ‘var.equal=TRUE’, otherwise the results will not be the same as the results we computed by hand.

t.test(DiaData$Stool~DiaData$Group, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  DiaData$Stool by DiaData$Group
## t = 2.245, df = 167, p-value = 0.02608
## alternative hypothesis: true difference in means between group Control and group Treatment is not equal to 0
## 95 percent confidence interval:
##    9.457362 147.396700
## sample estimates:
##   mean in group Control mean in group Treatment 
##                260.2976                181.8706

Welch’s two sample t-test

When we calculated the two sample t-test by hand above, we had to assume the population variances (square of the standard deviation) were equal.

Review the standard deviations of the two groups and describe whether you think the assumption is met

by(DiaData$Stool, DiaData$Group, sd)
## DiaData$Group: Control
## [1] 253.6135
## ------------------------------------------------------------ 
## DiaData$Group: Treatment
## [1] 197.3636

If the population variances were not similar, we would consider using the Welch’s two sample t-test. The Welch’s test accounts for data that do not have equal population variances, providing us with a more accurate test result if the assumption of equal variances is violated. To do this test by hand is quite computational so we will learn how to compute it in R.

To conduct a Welch’s test, we simply switch the variance argument to var.equal= FALSE. This is also the default, so you can leave out the argument entirely and get the same answer.

t.test(DiaData$Stool~DiaData$Group, var.equal = FALSE)
## 
##  Welch Two Sample t-test
## 
## data:  DiaData$Stool by DiaData$Group
## t = 2.2417, df = 156.64, p-value = 0.02638
## alternative hypothesis: true difference in means between group Control and group Treatment is not equal to 0
## 95 percent confidence interval:
##    9.323083 147.530979
## sample estimates:
##   mean in group Control mean in group Treatment 
##                260.2976                181.8706

Here our p-value is very close to the one we obtained using the Student’s test. This will generally be the case when the population variances are reasonably close.

Practice Problems: Heart Study

Some infants are born with congenital heart defects and require surgery very early in life. One approach to performing this surgery is known as “circulatory arrest.” A downside of this procedure, however, is that it cuts off the flow of blood to the brain, possibly resulting in brain damage. An alternative procedure, “low-flow bypass” maintains circulation to the brain, but does so with an external pump that potentially causes other sorts of injuries to the brain.

To investigate the treatments, surgeons at Harvard Medical School conducted a randomized controlled trial. In the trial, 70 infants received low-flow bypass surgery and 73 received the circulatory arrest approach. The researchers looked at two outcomes: the Psychomotor Development Index (PDI), which measures physiological development, and the Mental Development Index (MDI), which measures mental development. For both indices, higher scores indicate greater levels of development. The results of their study are on the course website.

Use:

heart <- read.delim("http://myweb.uiowa.edu/pbreheny/data/infant-heart.txt")
  1. Calculate the standard deviations of PDI and MDI for each group. Based on these values, should we do a Student’s test or a Welch’s test?

  2. Conduct a t-test (whichever one you decided was most appropriate) to determine whether the difference in physiological development between the infants in the circulatory arrest group and the low-flow bypass group is statistically significant.

  3. Conduct a t-test (whichever one you decided was most appropriate) to determine whether the difference in mental development between the infants in the circulatory arrest group and the low-flow bypass group is statistically significant.

  4. Calculate the 95% confidence intervals for these two tests. Interpret these in terms of the context of the study. What does the confidence interval mean? Which group did better?

  5. If your child had to have open-heart surgery as an infant, which treatment option would you prefer? Why?