DPI R Bootcamp
Jared Knowles
In this lesson we hope to learn:
names(object) helps
hist(df$readSS)
plot(df$readSS, df$mathSS)
plot(df$readSS, df$mathSS)
# lowess(x, y) must match the plot's axes: readSS on x, mathSS on y
lines(lowess(df$readSS, df$mathSS), col = "red")
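Argument order matters when overlaying a smoother: the x and y passed to lowess() should match the x and y passed to plot(). A minimal sketch on simulated data (the vectors x and y below are made up for illustration and only stand in for readSS and mathSS):

```r
# Simulated stand-ins for readSS (x) and mathSS (y); not the bootcamp data
set.seed(42)
x <- rnorm(200, mean = 500, sd = 50)
y <- 0.8 * x + rnorm(200, sd = 30)

# lowess() returns a list with components x (sorted) and y (smoothed values)
fit <- lowess(x, y)

plot(x, y)
lines(fit, col = "red")  # same x/y orientation as the scatterplot
```

Because lowess() returns its x values already sorted, the result can be passed straight to lines() without reordering.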
ggplot2 is pretty much the new standard in R
library(ggplot2)
library(eeptools)  # assumed source of theme_dpi(); eeptools appears in the session info
qplot(readSS, mathSS, data = df)
ggplot2, an R package, does just this by breaking plots into a few basic components
qplot(readSS, mathSS, data = df, alpha = I(0.3)) + theme_dpi()
readSS is the x coordinate and mathSS is the y coordinate for each observation in our data
Plot df$mathSS using 3 separate geoms:
qplot(mathSS, readSS, data = df) + theme_dpi()
qplot(mathSS, data = df) + theme_dpi()
qplot(factor(grade), mathSS, data = df, geom = "line", group = stuid, alpha = I(0.2)) +
theme_dpi()
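The same variable can be drawn with different geoms by swapping the layer. A hedged sketch using the mpg data that ships with ggplot2, chosen only so the example is self-contained (the column cty stands in for mathSS, and the binwidth is an arbitrary choice):

```r
library(ggplot2)

# One variable, two different geoms: histogram and density
p_hist <- ggplot(mpg, aes(x = cty)) + geom_histogram(binwidth = 2)
p_dens <- ggplot(mpg, aes(x = cty)) + geom_density()

# Adding a second variable makes a point geom meaningful
p_point <- ggplot(mpg, aes(x = displ, y = cty)) + geom_point()

print(p_hist)
```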
ggplot2 has an extended syntax that makes this obvious
ggplot(df, aes(x = readSS, y = mathSS)) + geom_point()
# Identical to: qplot(readSS,mathSS,data=df)
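One way to see what the extended syntax buys is to build a plot in steps. The sketch below, again on the built-in mpg data rather than the bootcamp data, defines the data and aesthetics once and then adds layers that inherit them:

```r
library(ggplot2)

# Base object: data plus aesthetic mappings, no layers yet
p <- ggplot(mpg, aes(x = displ, y = cty))

# Each `+` adds a layer that inherits the x and y mappings
p2 <- p + geom_point(alpha = 0.3) + geom_smooth(method = "lm", se = FALSE)

length(p$layers)   # 0
length(p2$layers)  # 2
```

Keeping the base object separate makes it easy to try several geoms against the same mappings.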
aes says we are specifying aesthetics; here we specified x and y to make a two-dimensional graphic
data(mpg)
qplot(displ, cty, data = mpg) + theme_dpi()
qplot(displ, cty, data = mpg, size = cyl) + theme_dpi()
qplot(displ, cty, data = mpg, shape = drv, size = I(3)) + theme_dpi()
qplot(displ, cty, data = mpg, color = class) + theme_dpi()
qplot(mathSS, readSS, data = df[1:100, ], size = race, alpha = I(0.8)) + theme_dpi()
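In the qplot() calls above, I() marks a value as a fixed setting rather than a variable to map. In the extended syntax the same distinction is made by position: anything inside aes() is mapped to data, anything outside is set to a constant. A small sketch on mpg (the particular size and alpha values are arbitrary):

```r
library(ggplot2)

# Mapped: color varies with the data and gets a legend
p_mapped <- ggplot(mpg, aes(displ, cty, color = class)) + geom_point()

# Set: every point gets the same fixed size and transparency
p_set <- ggplot(mpg, aes(displ, cty)) + geom_point(size = 3, alpha = 0.8)

names(p_mapped$mapping)  # includes "colour" (ggplot2 standardizes the spelling)
```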
# List levels from lowest to highest so the ordered factor ranks them correctly
df$proflvl2 <- factor(df$proflvl, levels = c("below basic", "basic", "proficient",
    "advanced"))
df$proflvl2 <- ordered(df$proflvl2)
qplot(mathSS, readSS, data = df[1:100, ], color = proflvl2, size = I(3)) + scale_color_brewer(type = "seq") +
theme_dpi()
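A sequential color scale follows the order of the factor levels, so it helps to check that order before plotting. A minimal base R sketch with made-up proficiency labels (the vector lvl is hypothetical, not drawn from df):

```r
# Hypothetical proficiency labels, for illustration only
lvl <- c("basic", "advanced", "below basic", "proficient", "basic")

# List levels from lowest to highest and mark the factor as ordered
prof <- factor(lvl,
               levels = c("below basic", "basic", "proficient", "advanced"),
               ordered = TRUE)

levels(prof)
prof[1] < prof[2]  # TRUE: "basic" ranks below "advanced"
```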
What can we map continuous characteristics like mathSS to, and what can we map discrete characteristics like race to?
qplot(factor(grade), readSS, data = df[1:100, ], color = mathSS, geom = "jitter",
size = I(3.2)) + theme_dpi()
qplot(factor(grade), readSS, data = df[1:100, ], color = dist, geom = "jitter",
size = I(3.2)) + theme_dpi()
| Aesthetic | Discrete | Continuous |
|---|---|---|
| Color | Disparate colors | Sequential or divergent colors |
| Size | Unique size for each value | Linear or logarithmic mapping to radius of value |
| Shape | A shape for each value | **does not make sense** |

| Aesthetic | Ordered | Unordered |
|---|---|---|
| Color | Sequential or divergent colors | Rainbow |
| Size | Increasing or decreasing radius | **does not make sense** |
| Shape | **does not make sense** | A shape for each value |
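The color row of these tables can be sketched in code. The example below uses mpg, where drv is discrete and hwy is continuous; the palette name and gradient endpoints are arbitrary choices for illustration:

```r
library(ggplot2)

# Discrete variable -> qualitative palette with distinct hues
p_disc <- ggplot(mpg, aes(displ, cty, color = drv)) +
  geom_point() +
  scale_color_brewer(type = "qual", palette = "Set1")

# Continuous variable -> sequential gradient between two colors
p_cont <- ggplot(mpg, aes(displ, cty, color = hwy)) +
  geom_point() +
  scale_color_gradient(low = "lightblue", high = "darkblue")

print(p_disc)
```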
qplot(readSS, mathSS, data = df) + facet_wrap(~grade) + theme_dpi(base_size = 12) +
geom_smooth(method = "lm", se = FALSE, size = I(1.2))
qplot(readSS, mathSS, data = df) + facet_grid(ell ~ grade) + theme_dpi(base_size = 12) +
geom_smooth(method = "lm", se = FALSE, size = I(1.2))
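The two faceting functions differ in layout: facet_wrap() wraps panels for one variable into rows and columns, while facet_grid() crosses a row variable with a column variable. A self-contained sketch on mpg, with cyl and drv standing in for grade and ell:

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ, cty)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# facet_wrap: one variable, panels wrapped into rows and columns
p_wrap <- base + facet_wrap(~ cyl)

# facet_grid: rows by one variable, columns by another
p_grid <- base + facet_grid(drv ~ cyl)
```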
colwheel <- "https://dl.dropbox.com/u/1811289/colorwheel.R"
dropbox_source(colwheel)
col.wheel("magenta", nearby = 2)
## [1] "plum" "violet" "darkmagenta" "magenta4" "magenta3"
## [6] "magenta2" "magenta" "magenta1" "orchid4" "orchid"
col.wheel("orange", nearby = 2)
## [1] "salmon1" "darksalmon" "orangered4" "orangered3"
## [5] "coral" "orangered2" "orangered" "orangered1"
## [9] "lightsalmon2" "lightsalmon" "peru" "tan3"
## [13] "darkorange2" "darkorange4" "darkorange3" "darkorange1"
## [17] "linen" "bisque3" "bisque1" "bisque2"
## [21] "darkorange" "antiquewhite3" "antiquewhite1" "papayawhip"
## [25] "moccasin" "orange2" "orange" "orange1"
## [29] "orange4" "wheat4" "orange3" "wheat"
## [33] "oldlace"
col.wheel("brown", nearby = 2)
## [1] "snow1" "snow2" "rosybrown" "rosybrown1" "rosybrown2"
## [6] "rosybrown3" "rosybrown4" "lightcoral" "indianred" "indianred1"
## [11] "indianred3" "brown" "brown4" "brown1" "brown3"
## [16] "brown2" "firebrick" "firebrick1" "chocolate" "chocolate4"
## [21] "saddlebrown" "seashell3" "seashell2" "seashell4" "sandybrown"
## [26] "peachpuff2" "peachpuff3"
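col.wheel() is loaded from an external script. Where that script is unavailable, a rough substitute is to search R's built-in color names with grep(). This matches on spelling rather than on hue, so it will not return the same neighbors as col.wheel():

```r
# All built-in color names that contain "orange"
orange_like <- grep("orange", colors(), value = TRUE)
head(orange_like)

# Convert a name to its RGB components to compare colors numerically
col2rgb("orange")
```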
+ scale_color_brewer(palette = "X")
library(grid)
p1 <- qplot(readSS, ..density.., data = df, fill = race, position = "fill",
geom = "density") + scale_fill_brewer(type = "qual", palette = 2)
p2 <- qplot(readSS, ..fill.., data = df, fill = race, position = "fill", geom = "density") +
scale_fill_brewer(type = "qual", palette = 2) + ylim(c(0, 1)) + theme_bw() +
opts(legend.position = "none", axis.text.x = theme_blank(), axis.text.y = theme_blank(),
axis.ticks = theme_blank(), panel.margin = unit(0, "lines")) + ylab("") +
xlab("")
vp <- viewport(x = unit(0.65, "npc"), y = unit(0.73, "npc"), width = unit(0.2,
"npc"), height = unit(0.2, "npc"))
print(p1)
print(p2, vp = vp)
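The code above targets the ggplot2 version listed in the session info; opts() and theme_blank() have since been removed from ggplot2. A hedged sketch of the same inset technique for more recent versions, using theme_void() to strip the inset and mpg in place of the bootcamp data (the viewport position and size are arbitrary):

```r
library(ggplot2)
library(grid)

# Main plot and a stripped-down inset built from the same data
main  <- ggplot(mpg, aes(displ, cty)) + geom_point()
inset <- ggplot(mpg, aes(cty)) + geom_histogram(binwidth = 2) + theme_void()

# Viewport in the upper right, sized as a fraction of the page ("npc")
vp <- viewport(x = unit(0.75, "npc"), y = unit(0.75, "npc"),
               width = unit(0.3, "npc"), height = unit(0.3, "npc"))

print(main)            # draw the full-size plot first
print(inset, vp = vp)  # then draw the inset inside the viewport
```

Printing order matters: the main plot starts a new page, and the inset is then drawn on top of it inside the viewport.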
Embed one plot in another plot in R using two different data elements from our data set. For example, plot a histogram of readSS inside a scatterplot of readSS and mathSS.
Explore some examples on the ggplot2 website. What are some ways to overlay more than 3 dimensions of data in a single plot?
What types of data work best for what types of visualizations?
It is good practice to include the session info; for example, this document was produced with knitr version 0.8. Here is my session info:
print(sessionInfo(), locale = FALSE)
## R version 2.15.2 (2012-10-26)
## Platform: i386-w64-mingw32/i386 (32-bit)
##
## attached base packages:
## [1] splines grid stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] Hmisc_3.10-1 snow_0.3-10 gbm_1.6-3.2 survival_2.36-14
## [5] caret_5.15-044 foreach_1.4.0 cluster_1.14.3 reshape_0.8.4
## [9] lme4_0.999999-0 Matrix_1.0-10 lattice_0.20-10 xtable_1.7-0
## [13] gridExtra_0.9.1 sandwich_2.2-9 quantreg_4.91 SparseM_0.96
## [17] mgcv_1.7-22 eeptools_0.1 mapproj_1.1-8.3 maps_2.2-6
## [21] proto_0.3-9.2 stringr_0.6.1 plyr_1.7.1 ggplot2_0.9.2.1
## [25] lmtest_0.9-30 zoo_1.7-9 knitr_0.8
##
## loaded via a namespace (and not attached):
## [1] codetools_0.2-8 colorspace_1.2-0 compiler_2.15.2
## [4] dichromat_1.2-4 digest_0.5.2 evaluate_0.4.2
## [7] formatR_0.6 gtable_0.1.1 iterators_1.0.6
## [10] labeling_0.1 markdown_0.5.3 MASS_7.3-22
## [13] memoise_0.1 munsell_0.4 nlme_3.1-105
## [16] RColorBrewer_1.0-5 reshape2_1.2.1 scales_0.2.2
## [19] stats4_2.15.1 tools_2.15.1
This work (R Tutorial for Education, by Jared E. Knowles), in service of the Wisconsin Department of Public Instruction, is free of known copyright restrictions.