% Basic Statistical Concepts
% R Bootcamp HTML Slides
% Jared Knowles
We will review the following statistical concepts here through the lens of R:
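```r
library(ggplot2)  # provides qplot() and the diamonds data
# Bar chart of car models by cylinder count
qplot(factor(cyl), data = mtcars) + labs(x = "Cylinders", title = "Car Models by Cylinder Count")
data(diamonds)
# Bar chart of diamonds by cut quality
qplot(factor(cut), data = diamonds) + labs(x = "Cut", title = "Diamonds by Cut Quality")
# Bar charts of color, one panel per clarity grade
qplot(factor(color), data = diamonds) + labs(x = "Color", title = "Diamonds by Color and Clarity") +
  facet_wrap(~clarity, nrow = 2)
# Price against carat by color; group = 1 fits one smoother across all colors
qplot(carat, price, data = diamonds, color = color) + geom_smooth(aes(group = 1))
```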

| Level of measurement | Appropriate statistics |
|---|---|
| Nominal | mode, chi-squared |
| Ordinal | median, percentile, plus above |
| Interval | mean, standard deviation, correlation, ANOVA, plus above |
| Ratio | geometric mean, harmonic mean, coefficient of variation, logarithms, plus above |
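
The statistics in the table can be sketched in base R. This is a minimal illustration, not a prescribed workflow: treating `cyl` as nominal and `gear` as ordinal in `mtcars`, and using `airquality$Temp` as the interval-scaled variable, are assumptions made here for the example, and all variable names are hypothetical.

```r
# Nominal: cylinder count treated as unordered categories
cyl_counts <- table(mtcars$cyl)
cyl_mode <- names(cyl_counts)[which.max(cyl_counts)]  # most frequent category
cyl_chisq <- chisq.test(cyl_counts)  # goodness of fit against equal frequencies

# Ordinal: number of gears as an ordered factor
gear_ord <- factor(mtcars$gear, ordered = TRUE)
# type = 1 returns an observed value, so the result maps back to a level
gear_code <- unname(quantile(as.integer(gear_ord), probs = 0.5, type = 1))
gear_median <- levels(gear_ord)[gear_code]

# Interval: temperature in degrees Fahrenheit (no true zero)
temp_mean <- mean(airquality$Temp)
temp_sd <- sd(airquality$Temp)
temp_wind_cor <- cor(airquality$Temp, airquality$Wind)

# Last row of the table (data with a true zero): miles per gallon
mpg_geo_mean <- exp(mean(log(mtcars$mpg)))   # geometric mean
mpg_harm_mean <- 1 / mean(1 / mtcars$mpg)    # harmonic mean
mpg_cv <- sd(mtcars$mpg) / mean(mtcars$mpg)  # coefficient of variation
```

Each level inherits the statistics of the levels above it, which is why the median is computed on the rank codes of the ordered factor rather than on the labels themselves.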

```r
library(ggplot2)
# Density of highway mpg with the median (blue), mean (gold), and mode (orange) marked
qplot(hwy, data = mpg, geom = "density") +
  geom_vline(xintercept = median(mpg$hwy), color = "blue", size = 1.1) +
  geom_vline(xintercept = mean(mpg$hwy), color = "gold", size = 1.1) +
  geom_vline(xintercept = 26, color = "orange", size = 1.1) +  # 26 is the most frequent value
  # annotate() draws each label once; geom_text(aes(...)) would redraw it for every row
  annotate("text", x = median(mpg$hwy) + 1.5, y = 0.08, label = "Median", size = 4.5) +
  annotate("text", x = mean(mpg$hwy) - 1.5, y = 0.06, label = "Mean", size = 4.5) +
  annotate("text", x = 26 + 1.5, y = 0.05, label = "Mode", size = 4.5)
```
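
The three reference lines in the plot correspond to values that can be computed directly. The sketch below assumes the `mpg` data from ggplot2; `stat_mode()` is a hypothetical helper written for this example, since base R's `mode()` reports an object's storage type rather than its most frequent value.

```r
library(ggplot2)  # provides the mpg data set

# Hypothetical helper: most frequent value(s) of a numeric vector
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])  # returns every tied value
}

hwy_mean   <- mean(mpg$hwy)
hwy_median <- median(mpg$hwy)
hwy_mode   <- stat_mode(mpg$hwy)
c(mean = hwy_mean, median = hwy_median, mode = hwy_mode)
```

With the frequencies tabulated in this document, the mode is 26, the median is 24, and the mean is about 23.44, so the mean sits slightly below the median and both sit below the mode.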
```r
library(ggplot2)  # for the mpg data
library(xtable)
# Frequency table of highway mpg, printed as HTML
print(xtable(table(mpg$hwy)), type = "html")
```

<!-- html table generated in R 2.15.1 by xtable 1.7-0 package -->
<!-- Wed Sep 26 16:57:47 2012 -->

| hwy | Count |
|---:|---:|
| 12 | 5 |
| 14 | 2 |
| 15 | 10 |
| 16 | 7 |
| 17 | 31 |
| 18 | 10 |
| 19 | 13 |
| 20 | 11 |
| 21 | 2 |
| 22 | 7 |
| 23 | 7 |
| 24 | 13 |
| 25 | 15 |
| 26 | 32 |
| 27 | 14 |
| 28 | 7 |
| 29 | 22 |
| 30 | 4 |
| 31 | 7 |
| 32 | 4 |
| 33 | 2 |
| 34 | 1 |
| 35 | 2 |
| 36 | 2 |
| 37 | 1 |
| 41 | 1 |
| 44 | 2 |

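
The same frequency table can also be held as a data frame, which makes it easy to sort or to add percentages. This is a small sketch assuming the `mpg` data from ggplot2; the names `hwy_freq`, `count`, and `percent` are chosen here for illustration.

```r
library(ggplot2)  # provides the mpg data set

# One row per distinct highway mpg value, with its frequency
hwy_freq <- as.data.frame(table(hwy = mpg$hwy),
                          responseName = "count",
                          stringsAsFactors = FALSE)
# Share of all car models at each value, in percent
hwy_freq$percent <- round(100 * hwy_freq$count / sum(hwy_freq$count), 1)
# Most frequent values first
head(hwy_freq[order(-hwy_freq$count), ], 3)
```

Sorting by `count` puts the mode in the first row, which is a quick check on the value marked in the density plot.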
It is good practice to include the session info so that others can reproduce the results. For example, this document was produced with knitr version 0.8. Here is my session info:

```r
print(sessionInfo(), locale = FALSE)
```

```
## R version 2.15.1 (2012-06-22)
## Platform: x86_64-pc-mingw32/x64 (64-bit)
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
## [1] xtable_1.7-0    mgcv_1.7-21     ggplot2_0.9.2.1 knitr_0.8
##
## loaded via a namespace (and not attached):
##  [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2
##  [4] evaluate_0.4.2     formatR_0.6        grid_2.15.1
##  [7] gtable_0.1.1       labeling_0.1       lattice_0.20-10
## [10] MASS_7.3-21        Matrix_1.0-9       memoise_0.1
## [13] munsell_0.4        nlme_3.1-104       plyr_1.7.1
## [16] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1
## [19] scales_0.2.2       stringr_0.6.1      tools_2.15.1
```

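
When only one or two versions need to appear in the text, they can be pulled out individually rather than printing the full session info. A minimal sketch: the variable names are hypothetical, and `"stats"` is used only because it ships with every R installation.

```r
# Base R version string, e.g. for a sentence in the text
r_version <- R.version.string
# Version of one package; "stats" ships with R, so this call always succeeds.
# Swap in "knitr" or "ggplot2" when those packages are installed.
pkg_version <- as.character(packageVersion("stats"))
cat("Produced with", r_version, "and stats", pkg_version, "\n")
```
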
This work (R Tutorial for Education, by Jared E. Knowles), in service of the Wisconsin Department of Public Instruction, is free of known copyright restrictions.