Nikhil Gopal, Graham Kim, Uba Backonja
January 25th, 2016
install.packages("ggplot2")
install.packages("HSAUR3")
A data frame with 53 observations on the following 3 variables.
require(HSAUR3)
head(bp)
  logdose bloodp recovtime
1    2.26     66         7
2    1.81     52        10
3    1.78     72        18
4    1.54     67        NA
5    2.06     69        10
6    1.74     71        13
When your data is in the right data structure, summarizing it is a piece of cake:
summary(bp)
    logdose          bloodp        recovtime
 Min.   :1.180   Min.   :51.00   Min.   : 4.0
 1st Qu.:1.740   1st Qu.:60.00   1st Qu.:10.5
 Median :1.900   Median :67.00   Median :21.0
 Mean   :1.992   Mean   :66.34   Mean   :22.4
 3rd Qu.:2.260   3rd Qu.:69.00   3rd Qu.:26.5
 Max.   :2.780   Max.   :88.00   Max.   :72.0
                                 NA's   :10
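The NA's entry in the summary matters once you compute statistics yourself. Here is a minimal sketch of handling the missing recovery times by hand. It assumes a small stand-in data frame, bp_head (a name made up for this example), built from the six rows that head(bp) printed, so it runs without HSAUR3 installed.

```r
# Stand-in for the real data: the six rows printed by head(bp).
# bp_head is a hypothetical name used only in this example.
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

# Count the missing recovery times (what summary() reports as "NA's").
n_missing <- sum(is.na(bp_head$recovtime))

# Without na.rm = TRUE the mean is NA as soon as one value is missing.
mean_raw   <- mean(bp_head$recovtime)
mean_clean <- mean(bp_head$recovtime, na.rm = TRUE)

print(n_missing)
print(mean_raw)
print(mean_clean)
```

On the full bp data the same calls should apply unchanged; only the numbers differ.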
In R, graphing data is easy!
require(HSAUR3)
plot(bp)
ggplot2 is a visualization library based on the Grammar of Graphics (by Leland Wilkinson, who is now at Tableau).
In short, it is the idea that every graphic can be composed from the same building blocks: data, aesthetic mappings, geometric objects, scales, and a coordinate system.
require(ggplot2)
ggplot(bp, aes(logdose, bloodp)) +
geom_point() +
coord_cartesian()
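To make the "composed from parts" idea concrete, the plot above can be assembled one piece at a time. This is only a sketch: bp_head is a hypothetical stand-in built from the six rows of head(bp), and peeking at the layers element of a plot object relies on ggplot2 internals that may change between versions.

```r
library(ggplot2)

# Stand-in for the real data: the six rows printed by head(bp).
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

# Step 1: data plus aesthetic mappings, no layers yet.
base <- ggplot(bp_head, aes(logdose, bloodp))

# Step 2: add a geometric layer (points).
with_points <- base + geom_point()

# Step 3: state the coordinate system explicitly (Cartesian is the default).
full <- with_points + coord_cartesian()

# Each `+` returns a new plot object; only geoms add to the layer list.
print(length(base$layers))
print(length(with_points$layers))
print(length(full$layers))
```

Keeping the intermediate objects around is handy in class: you can print each one and see what that part contributes.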
What happens if we just run
ggplot(bp, aes(logdose, bloodp))
without any layers?
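Error: No layers in plot
Execution halted
(Older ggplot2 releases stop with this error. Recent releases draw an empty panel with axes and no data instead.)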
Variables are mapped to aesthetics with aes, but encoding details are provided in layers.
Can anyone describe how they might tweak the previous code to create this plot?
Previous code:
ggplot(bp, aes(logdose, bloodp)) +
geom_point() +
coord_cartesian()
If you deconstruct the plot, what is the main difference between the original plot and this one? The coordinate system!
ggplot(bp, aes(logdose, bloodp)) +
geom_point() +
coord_polar()
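One way to convince yourself that only the coordinate system changed is to compare the data each plot hands to its point layer. The sketch below uses layer_data() from ggplot2 and a hypothetical stand-in data frame, bp_head, built from the six rows of head(bp). It assumes, as I understand the build step, that coordinates are applied only when the plot is drawn.

```r
library(ggplot2)

# Stand-in for the real data: the six rows printed by head(bp).
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

cartesian <- ggplot(bp_head, aes(logdose, bloodp)) +
  geom_point() +
  coord_cartesian()

polar <- ggplot(bp_head, aes(logdose, bloodp)) +
  geom_point() +
  coord_polar()

# layer_data() returns the data a layer is about to draw.
xy_cartesian <- layer_data(cartesian)[, c("x", "y")]
xy_polar     <- layer_data(polar)[, c("x", "y")]

# Compare the positions the two plots pass to geom_point().
print(all.equal(xy_cartesian, xy_polar))
```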
What do you see? Can anyone describe how they might create this plot? What would you need to add to the stub code and where would you add it?
  logdose bloodp recovtime
1    2.26     66         7
2    1.81     52        10
3    1.78     72        18
4    1.54     67        NA
5    2.06     69        10
6    1.74     71        13
ggplot(bp,
aes(logdose, bloodp, color=recovtime)
) +
geom_point() +
scale_color_continuous(
low="#FFFFFF",
high="#990000"
)
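To see what the color scale actually does with recovtime, you can look at the colors assigned to each point. This sketch makes two assumptions: bp_head is a hypothetical stand-in built from the six rows of head(bp), and it uses scale_color_gradient(), the explicit two-color gradient scale that takes the same low and high arguments, in place of scale_color_continuous().

```r
library(ggplot2)

# Stand-in for the real data: the six rows printed by head(bp).
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

p <- ggplot(bp_head, aes(logdose, bloodp, color = recovtime)) +
  geom_point() +
  scale_color_gradient(low = "#FFFFFF", high = "#990000")

# The colour column holds the color each point will be drawn with.
colours <- layer_data(p)$colour

# Side by side: the data value and the color it was mapped to.
print(data.frame(recovtime = bp_head$recovtime, colour = colours))
```

Equal recovery times get equal colors, and the row with a missing recovery time falls back to the scale's na.value, which is grey by default.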
What do you see? Can anyone describe how they might create this plot?
  logdose bloodp recovtime
1    2.26     66         7
2    1.81     52        10
ggplot(bp,
aes(logdose, bloodp, color=recovtime, size=recovtime)
) +
geom_point() +
scale_color_continuous(
low="#FFFFFF",
high="#990000"
)
Do you remember anything tricky about shapes from previous lectures? What kind of scale is shape best used for? And what kind of scale do we have in our dataset?
I don't think you'd want to map shapes in this way. I've demonstrated this solely as an excuse to use the shape encoding!
ggplot(bp,
aes(logdose, bloodp, color=recovtime, shape=factor(recovtime > 50), size=recovtime)
) +
geom_point() +
scale_color_continuous(
low="#FFFFFF",
high="#990000"
)
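Shape works best for categories, and recovtime is continuous, so one option is to bin it first and map shape to the bins. A minimal sketch follows. The cut points (10 and 20) and the labels are made up for illustration and are not from the dataset documentation, and bp_head is a hypothetical stand-in built from the six rows of head(bp).

```r
library(ggplot2)

# Stand-in for the real data: the six rows printed by head(bp).
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

# Hypothetical cut points, chosen only for illustration.
bp_head$recov_group <- cut(
  bp_head$recovtime,
  breaks = c(0, 10, 20, Inf),
  labels = c("short", "medium", "long")
)

# Map the categorical version to shape, not the raw numbers.
p <- ggplot(bp_head, aes(logdose, bloodp, shape = recov_group)) +
  geom_point()

# How many observations land in each group, missing values included.
print(table(bp_head$recov_group, useNA = "ifany"))
```

Referring to columns by bare name inside aes(), not as bp$column, keeps the mapping tied to whatever data frame the plot is given.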
Can anyone describe how they might create this plot? What do you see?
ggplot(bp,
aes(logdose, bloodp, color=recovtime, size=recovtime)
) +
geom_point() +
geom_smooth(method=lm) +
theme(legend.position="none") +
scale_color_continuous(
low="#FFFFFF",
high="#990000"
)
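With method=lm and the default formula, geom_smooth() draws the line from a linear regression of y on x. Fitting that model directly shows the numbers behind the line. This is a sketch on bp_head, a hypothetical stand-in built from the six rows of head(bp), so the coefficients it prints are not those of the full dataset.

```r
# Stand-in for the real data: the six rows printed by head(bp).
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

# Regress blood pressure on log dose, the straight line geom_smooth() draws.
fit <- lm(bloodp ~ logdose, data = bp_head)

# Intercept and slope of the fitted line.
print(coef(fit))
```

Swapping bp_head for bp gives the fit that corresponds to the plot above.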
Always have titles and labels!
Can anyone describe how they might create this plot? What do you see?
ggplot(bp,
aes(logdose, bloodp)
) +
geom_point() +
coord_cartesian() +
ggtitle("log10-transformed hypotensive agent dosage \npositively correlated with\naverage systolic blood pressure (while drug was being administered)") +
theme(plot.title = element_text(
lineheight=.8,
face="bold")
)
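The code above sets a title but leaves the axis labels as the raw column names. A short sketch of setting all three with labs() follows. The label wording is my own paraphrase of the title text, with no units since none are given here, and bp_head is a hypothetical stand-in built from the six rows of head(bp).

```r
library(ggplot2)

# Stand-in for the real data: the six rows printed by head(bp).
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

# labs() sets the title and both axis labels in one call.
# The wording below is illustrative only.
p <- ggplot(bp_head, aes(logdose, bloodp)) +
  geom_point() +
  labs(
    title = "Blood pressure versus dose",
    x     = "Dose (log10 scale)",
    y     = "Average systolic blood pressure"
  )
```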
How might we make a histogram?
ggplot(bp,
aes(logdose)
) +
geom_histogram(
binwidth= 0.5
)
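A histogram is a count per bin, and you can inspect those counts without drawing anything. The sketch below pulls the computed bins out with layer_data(). It assumes bp_head, a hypothetical stand-in built from the six rows of head(bp), so the counts are for six observations only.

```r
library(ggplot2)

# Stand-in for the real data: the six rows printed by head(bp).
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

p <- ggplot(bp_head, aes(logdose)) +
  geom_histogram(binwidth = 0.5)

# One row per bar: bin edges and the number of observations in the bin.
bins <- layer_data(p)[, c("xmin", "xmax", "count")]
print(bins)
```

Checking the bin edges this way is a quick sanity test when a histogram looks different from what you expected.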
How might we make a histogram?
ggplot(bp,
aes(logdose,
fill=cut(
logdose,
breaks = c(1,1.5,2,2.5,3)
)
)
) +
geom_histogram(
binwidth= 0.5
)
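The fill aesthetic here is a factor made by cut(), so it helps to see which interval each dose falls into. This is a sketch on bp_head, a hypothetical stand-in built from the six rows of head(bp), using the same breaks as the plot code.

```r
# Stand-in for the real data: the six rows printed by head(bp).
bp_head <- data.frame(
  logdose   = c(2.26, 1.81, 1.78, 1.54, 2.06, 1.74),
  bloodp    = c(66, 52, 72, 67, 69, 71),
  recovtime = c(7, 10, 18, NA, 10, 13)
)

# cut() turns a numeric vector into a factor of half-open intervals (a, b].
dose_bin <- cut(bp_head$logdose, breaks = c(1, 1.5, 2, 2.5, 3))

# One count per interval; these intervals become the fill colors.
counts <- table(dose_bin)
print(counts)
```

Each level of the factor gets its own fill color in the legend.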
ggplot(bp,
aes(logdose,
fill=cut(
logdose,
breaks = c(1,1.5,2,2.5,3)
)
)
) +
geom_histogram(
binwidth= 0.5
) +
coord_polar()