Get Data

Data may be imported from a local file or downloaded from the web. For this example we will use a CSV file downloaded from the web and data entered by hand.

  htwt = read.csv("http://facweb1.redlands.edu/fac/jim_bentley/downloads/math111/htwt.csv")
  head(htwt)
##   Height Weight Group
## 1     64    159     1
## 2     63    155     2
## 3     67    157     2
## 4     60    125     1
## 5     52    103     2
## 6     58    122     2
  bp = c(87,67,55,66,88,75,84,78,64,73,84,55,72,83,75,55,83,63)
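
Reading from a local file works the same way as reading from a URL. The sketch below is only an illustration: it writes the six rows shown by head() to a temporary file and reads them back. The names local_path, first_rows, and htwt_local are hypothetical and are not part of the original example.

```r
# Hypothetical local copy: write the six rows shown by head() to a temporary CSV file.
local_path <- tempfile(fileext = ".csv")
first_rows <- data.frame(
  Height = c(64, 63, 67, 60, 52, 58),
  Weight = c(159, 155, 157, 125, 103, 122),
  Group  = c(1, 2, 2, 1, 2, 2)
)
write.csv(first_rows, local_path, row.names = FALSE)

# read.csv() takes a file path in exactly the same way it takes a URL.
htwt_local <- read.csv(local_path)
head(htwt_local)
```

In practice local_path would simply be the path to a CSV file already on disk.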

For now, we will focus on the weight (Weight) data in the htwt data frame.

  names(htwt)
## [1] "Height" "Weight" "Group"
  htwt$Weight
##  [1] 159 155 157 125 103 122 101  82 228 199 195 110 191 151 119 119 112  87 190
## [20]  87

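We can create a quick stemplot using stem(), which ships with base R in the graphics package. The second argument is scale, which controls how many stems are used.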

  stem(htwt$Weight, scale = 2)
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##    8 | 277
##   10 | 130299
##   12 | 25
##   14 | 1579
##   16 | 
##   18 | 0159
##   20 | 
##   22 | 8
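
In this plot each stem covers two tens (the stem 8 holds weights from 80 to 99) and each leaf is the units digit, so 101 and 110 both land on stem 10. One way to check the row lengths is to count the weights in 20-pound bins. This is a sketch only: the vector is retyped from the output above, and the names weight, bins, and bin_counts are illustrative.

```r
# Weights retyped from the htwt$Weight output above.
weight <- c(159, 155, 157, 125, 103, 122, 101, 82, 228, 199,
            195, 110, 191, 151, 119, 119, 112, 87, 190, 87)

# Left-closed bins [80,100), [100,120), ..., [220,240), one per stem.
bins <- cut(weight, breaks = seq(80, 240, by = 20), right = FALSE)
bin_counts <- table(bins)
print(bin_counts)
```

Each count should equal the number of leaves on the corresponding stem, including the two empty stems at 16 and 20.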

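We can create back-to-back stemplots using the aplpack package. We first make Group a factor variable and then generate the plots. The output below shows a single stem-and-leaf display of all 20 weights, followed by the back-to-back display for males and females.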
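
The calls that produced the output below are not shown. The following is a minimal sketch of what they might look like, run here on only the six rows shown by head() so that it is self-contained; its output will therefore differ from the display below. The labels assume that Group 1 is "Male" and Group 2 is "Female", which agrees with the leaf counts in the back-to-back display but is not stated in the original. The name htwt6 is hypothetical.

```r
# Six rows copied from the head(htwt) output (not the full data set).
htwt6 <- data.frame(
  Height = c(64, 63, 67, 60, 52, 58),
  Weight = c(159, 155, 157, 125, 103, 122),
  Group  = c(1, 2, 2, 1, 2, 2)
)

# Assumption: 1 = Male, 2 = Female.
htwt6$Group <- factor(htwt6$Group, levels = c(1, 2), labels = c("Male", "Female"))

# Only draw the plots if aplpack is installed.
if (requireNamespace("aplpack", quietly = TRUE)) {
  aplpack::stem.leaf(htwt6$Weight)
  aplpack::stem.leaf.backback(htwt6$Weight[htwt6$Group == "Male"],
                              htwt6$Weight[htwt6$Group == "Female"])
}
```

With the full htwt data frame, the same calls would be applied to htwt$Weight and htwt$Group.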

## 1 | 2: represents 120
##  leaf unit: 10
##             n: 20
##    3    0. | 888
##    9    1* | 001111
##   (2)    t | 22
##    9     f | 5555
##          s | 
##    5    1. | 9999
##         2* | 
##    1     t | 2
## ____________________________
##   1 | 2: represents 120, leaf unit: 10 
## htwt$Weight[htwt$Group == "Male"]
##                  htwt$Weight[htwt$Group == "Female"]
## ____________________________
##            | 0* |           
##            |  t |           
##            |  f |           
##            |  s |           
##    1      8| 0. |88     2   
##    3     10| 1* |0111  (4)  
##    4      2|  t |2      5   
##   (1)     5|  f |555    4   
##            |  s |           
##    4    999| 1. |9      1   
##            | 2* |           
##    1      2|  t |           
##            |  f |           
##            |  s |           
##            | 2. |           
##            | 3* |           
## ____________________________
## n:        9      11     
## ____________________________

Stem Splitting

How stems are split can greatly affect the way we view the data. We use the blood pressure data to show this.

  # Too few stems
  stem(bp)
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   5 | 555
##   6 | 3467
##   7 | 23558
##   8 | 334478
  # Too many stems
  stem(bp, scale = 5)
## 
##   The decimal point is at the |
## 
##   54 | 000
##   56 | 
##   58 | 
##   60 | 
##   62 | 0
##   64 | 0
##   66 | 00
##   68 | 
##   70 | 
##   72 | 00
##   74 | 00
##   76 | 
##   78 | 0
##   80 | 
##   82 | 00
##   84 | 00
##   86 | 0
##   88 | 0
  # Just right stems
  stem(bp, scale = 2)
## 
##   The decimal point is 1 digit(s) to the right of the |
## 
##   5 | 555
##   6 | 34
##   6 | 67
##   7 | 23
##   7 | 558
##   8 | 3344
##   8 | 78
  # Strangely, stem.leaf() from aplpack uses the just-right split by default
  stem.leaf(bp)
## 1 | 2: represents 12
##  leaf unit: 1
##             n: 18
##    3    5. | 555
##    5    6* | 34
##    7    6. | 67
##   (2)   7* | 23
##   (3)   7. | 558
##    6    8* | 3344
##    2    8. | 78
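
The effect of the split can also be checked numerically. The sketch below counts the observations that fall on each stem for stem widths of 10, 5, and 2, which correspond to the three stem() displays above. The helper stem_counts() is hypothetical and written only for this illustration; it is not part of base R or aplpack.

```r
bp <- c(87, 67, 55, 66, 88, 75, 84, 78, 64, 73, 84, 55, 72, 83, 75, 55, 83, 63)

# Count observations per stem for a given stem width (in the units of x).
stem_counts <- function(x, width) {
  lower <- floor(min(x) / width) * width
  upper <- (floor(max(x) / width) + 1) * width
  table(cut(x, breaks = seq(lower, upper, by = width), right = FALSE))
}

# Report how many stems each width produces and how many of them are empty.
for (width in c(10, 5, 2)) {
  counts <- stem_counts(bp, width)
  cat("width", width, ":", length(counts), "stems,",
      sum(counts == 0), "empty\n")
}
```

A width of 10 packs the 18 readings into 4 stems, a width of 2 spreads them over 18 stems of which 7 are empty, and a width of 5 gives 7 stems with none empty.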