ggplot2 makes data visualization simple: every graph is built from the same components, which include:

Property | Function |
---|---|
alpha | transparency |
color | line color |
fill | fill color |
linetype | line style |
size | thickness of line |

Graph type | Variable type | geom function | aes | Other parameters |
---|---|---|---|---|
Bar chart | Discrete | geom_bar | alpha, color, fill, linetype, size | position, width |
Pie chart | Discrete | geom_bar | alpha, color, fill, linetype, size | width, coord_polar() |
Histogram | Continuous | geom_histogram | y=(..density..), alpha, color, fill, linetype, size | bins, binwidth, position |
Shaded line | Continuous | geom_area(stat="bin") | y=(..density..), alpha, color, fill, linetype, size | |
Density plot | Continuous | geom_density | alpha, color, fill, linetype, size | |

Graph type | geom function | aes | Other parameters |
---|---|---|---|
Bar chart | geom_bar(stat="identity") | alpha, color, fill, linetype, size | position, width |
Boxplot | geom_boxplot | ymin, ymax, alpha, color, fill, linetype, shape, size | lower, middle, upper |
Violin plot | geom_violin | alpha, color, fill, linetype, size | trim, adjust, scale="count/area" |
Line graph | geom_line | alpha, color, linetype, size | |
Regression line | geom_smooth | alpha, color, fill, linetype, size | method=lm/loess/glm, level, se |
Scatterplot | geom_point | alpha, color, fill, shape, size | |
2D Density plot | geom_density2d | alpha, color, linetype, size | geom="raster/tile", contour=T/F |

Graph type | geom function | aes | Other parameters |
---|---|---|---|
Heat map | geom_tile | alpha, color, fill, linetype, size | |

Graph type | geom function | aes | Other parameters |
---|---|---|---|
Error bar | geom_errorbar | alpha, color, linetype, size, width | position |
Confidence region | geom_ribbon | alpha, color, linetype, size | |

Graph type | geom function | aes | Other parameters |
---|---|---|---|
Marginal rugs | geom_rug(sides="bl") | alpha, color, linetype, size | position |
Vertical line | geom_vline | aes(xintercept, color), linetype, size | |
Horizontal line | geom_hline | aes(yintercept, color), linetype, size | |
Angled line | geom_abline | aes(intercept, slope), linetype, size | |
Text | geom_text | aes(y, label=) | see text properties table below |
qq plot | geom_qq | aes(sample=) | |

Element | Function | Other parameters |
---|---|---|
Swap x, y axes | coord_flip | xlim, ylim |
x, y axes scaling ratio | coord_fixed | xlim, ylim, ratio=0.5 |
Polar coordinates | coord_polar | theta, start, direction |
Faceting | facet_grid | v ~ h, scales="free/free_x/free_y", labeller=c() |
Faceting with wrap | facet_wrap | ncol, nrow |
Remove legend | guides | fill=FALSE/guide_legend(reverse=TRUE, title=NULL) |
Annotation | annotate | see annotate table below |
Set title, axis, legend | labs | title, x, y, fill, color, size, shape |
Title | ggtitle | |
x, y label | xlab, ylab | |
Set x, y range | xlim, ylim | 0, 100 |
Range limit | expand_limits | y=0, x=0 |

Scale type | Applicable aes | Function | Other parameters |
---|---|---|---|
Map continuous var | alpha, color, fill, linetype, shape, size | scale_aes_continuous | name, labels, limits, breaks, values |
Map discrete var | alpha, color, fill, linetype, shape, size | scale_aes_discrete | name, labels, limits, breaks, values |
Manually specified visual | alpha, color, fill, linetype, shape, size | scale_aes_manual | |
HCL color wheel and lightness | color, fill | scale_aes_hue | guide=guide_legend(reverse=TRUE), l=30 |
Grey color | color, fill | scale_aes_grey | start, end, na.value |
RColorBrewer palette | color, fill | scale_aes_brewer | palette |
2 colors gradient (cont. var) | color, fill | scale_aes_gradient | low="black", high="white", breaks |
Gradient with 3 colors (cont. var) | color, fill | scale_aes_gradient2 | low="black", high="white", midpoint=110, breaks |
Gradient with n colors (cont. var) | color, fill | scale_aes_gradientn | colors=c("red", "orange", "yellow") |
Manually specified shape | shape | scale_shape | solid |
Manually specified size | size | scale_size_area | max |
Manually specified linetype | linetype | scale_linetype | |
Set x, y, range & tick | x, y | scale_aes_continuous | breaks=c(), labels=c(), name="title" |
Log x, y axis | x, y | scale_aes_log10 | |
x, y axis square root | x, y | scale_aes_sqrt | |
Reverse x, y order | x, y | scale_aes_reverse | |

Function | Name | Value |
---|---|---|
Black and white theme | theme_bw | |
Classic theme | theme_classic | |
Grey theme | theme_grey | |
Title | plot.title | element_text(), see text properties table below |
Plot color | plot.background | element_rect(fill, color, size) |
Graph grid line | panel.grid.major[.x/.y] | element_blank(), element_line() |
Graph grid line | panel.grid.minor[.x/.y] | element_blank(), element_line() |
Graph background color | panel.background | element_rect(fill, color, size) |
Axis label | axis.title[.x/.y] | element_blank(), element_text(), see text properties table below |
Tick label | axis.text[.x/.y] | element_blank(), element_text(), see text properties table below |
Legend title | legend.title | element_text(), see text properties table below |
Legend text | legend.text | element_text(), see text properties table below |
Legend position | legend.position | c(0.7, 0.4) or "top/left/right/bottom/none" |
Legend justification | legend.justification | c(0,1) |
Legend background color | legend.background | element_rect(fill, color, size) |
Facet label | strip.text[.x/.y] | element_blank(), element_text(), see text properties table below |
Facet background color | strip.background | element_rect() |

Type | Parameter |
---|---|
text | label, x, y, size, color, hjust, vjust |
text (math expression) | label, x, y, parse, size, color, hjust, vjust |
segment | x, y, xend, yend, arrow=arrow(ends="both", angle=90, length=unit(.2, "cm")) |
rect | xmin, xmax, ymin, ymax, alpha, fill |

Value | Style |
---|---|
stack | stack on top of one another |
fill | normalized height stack on top of one another |
identity | overlaid |
dodge | side by side |
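
To make the four position values concrete, the sketch below computes the bar segments each one would produce at a single x value. It is plain Python arithmetic, not ggplot2 code: the function name `bar_segments`, the default width of 0.9 and the example counts are all hypothetical, and the dodge branch only approximates how ggplot2 splits the bar width between groups.

```python
def bar_segments(heights, position, width=0.9):
    """Return (x_offset, ymin, ymax, bar_width) for each group at one x value.

    Hypothetical helper for illustration; not part of ggplot2.
    """
    n = len(heights)
    if position == "identity":
        # Every bar starts at zero, so the bars are drawn over one another
        return [(0.0, 0.0, float(h), width) for h in heights]
    if position in ("stack", "fill"):
        total = float(sum(heights))
        segments, base = [], 0.0
        for h in heights:
            # "fill" rescales each height so the whole stack sums to 1
            step = h / total if position == "fill" else float(h)
            segments.append((0.0, base, base + step, width))
            base += step
        return segments
    if position == "dodge":
        # Split the available width into one slot per group, side by side
        slot = width / n
        return [((i - (n - 1) / 2) * slot, 0.0, float(h), slot)
                for i, h in enumerate(heights)]
    raise ValueError("unknown position: " + str(position))


counts = [2, 3, 5]
for pos in ("stack", "fill", "identity", "dodge"):
    print(pos, bar_segments(counts, pos))
```

Reading the output side by side shows the difference: stack and fill shift each segment's `ymin` upward, identity leaves every `ymin` at zero, and dodge changes only the horizontal offset and width.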

Function | geom_text | element_text |
---|---|---|
Font | family | family |
Font style | fontface | face |
Font color | color | color |
Font size | size | size |
Horizontal alignment | hjust | hjust |
Vertical alignment | vjust | vjust |
Angle | angle | angle |

Data must be in a data frame and in long format to be plotted with ggplot2.
In R
df4 <- data.frame(ProductId = c(1, 2, 3), Regular = c(649, 749, 399), Discount = c(599,
699, 349))
print(df4)
## ProductId Regular Discount
## 1 1 649 599
## 2 2 749 699
## 3 3 399 349
library(reshape2)
df4_long <- melt(df4, id.vars = "ProductId", measure.vars = c("Regular", "Discount"),
variable.name = "conditions", value.name = "price")
print(df4_long)
## ProductId conditions price
## 1 1 Regular 649
## 2 2 Regular 749
## 3 3 Regular 399
## 4 1 Discount 599
## 5 2 Discount 699
## 6 3 Discount 349
In Python
import numpy as np
import pandas as pd
ProductId = [1, 2, 3]
Regular = [649, 749, 399]
Discount = [599, 699, 349]
df = pd.DataFrame(np.column_stack([ProductId, Regular, Discount]), columns=['ProductId', 'Regular', 'Discount'])
print(df)
## ProductId Regular Discount
## 0 1 649 599
## 1 2 749 699
## 2 3 399 349
df_long = pd.melt(df, id_vars=['ProductId'], value_vars=['Regular', 'Discount'])
print(df_long)
## ProductId variable value
## 0 1 Regular 649
## 1 2 Regular 749
## 2 3 Regular 399
## 3 1 Discount 599
## 4 2 Discount 699
## 5 3 Discount 349
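
The long table above carries pandas' default column names, `variable` and `value`, whereas the R version names them `conditions` and `price`. As a small sketch, `pd.melt` accepts `var_name` and `value_name` to produce matching names. The data frame is rebuilt here only so that the snippet runs on its own.

```python
import pandas as pd

# Same product prices as above, rebuilt so the snippet is self-contained
df = pd.DataFrame({
    "ProductId": [1, 2, 3],
    "Regular": [649, 749, 399],
    "Discount": [599, 699, 349],
})

# var_name / value_name replace the default "variable" / "value" headers
df_long = pd.melt(
    df,
    id_vars=["ProductId"],
    value_vars=["Regular", "Discount"],
    var_name="conditions",
    value_name="price",
)
print(df_long)
```

With named columns, a later `pivot_table` call would refer to `values='price'` and `columns=['conditions']` instead of the defaults.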
Converting back to wide format
In R
df4_wide <- dcast(df4_long, ProductId ~ conditions, value.var = "price")
print(df4_wide)
## ProductId Regular Discount
## 1 1 649 599
## 2 2 749 699
## 3 3 399 349
In Python
table = df_long.pivot_table(values='value', index=['ProductId'], columns=['variable']).reset_index()
print(table)
## variable ProductId Discount Regular
## 0 1 599 649
## 1 2 699 749
## 2 3 349 399
In R
library(ggplot2)
library(dplyr)
library(scales)
library(gridExtra)
mtcars$cyl <- as.factor((mtcars$cyl))
p1 <- ggplot(mtcars, aes(x = drat, fill = cyl)) + geom_histogram(bins = 3) +
ggtitle("Stacked histogram")
mtcars_summary <- group_by(mtcars, cyl) %>% summarise(avg_drat = mean(drat))
p2 <- ggplot(mtcars, aes(x = drat, fill = cyl)) + geom_histogram(binwidth = 0.7,
alpha = 0.3, position = "identity") + geom_vline(data = mtcars_summary,
aes(xintercept = avg_drat, color = cyl), linetype = "dashed", size = 1) +
ggtitle("Overlaid histogram ")
p3 <- ggplot(mtcars, aes(x = drat, fill = cyl)) + geom_histogram(binwidth = 0.5,
position = "dodge") + scale_fill_brewer(palette = "Pastel1") + ggtitle("Interleave histogram with custom pastel color")
p4 <- ggplot(mtcars, aes(x = drat)) + geom_histogram(binwidth = 0.5, color = "red",
fill = "yellow") + facet_grid(cyl ~ ., scales = "free") + geom_vline(data = mtcars_summary,
aes(xintercept = avg_drat), linetype = "dashed", size = 1.5, color = "blue") +
ggtitle("Faceted histogram") + geom_text(data = mtcars_summary, aes(label = cyl), x = max(mtcars$drat), y = 4)
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
In Python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
mtcars = pd.read_csv('http://photo.etangkk.com/python/mtcars.txt', sep='\t')
table = mtcars.pivot_table(values='drat', index=['name'], columns=['cyl']).reset_index()
fig, ax = plt.subplots(1,2,figsize=(10,5))
table.plot.hist(bins=3, stacked=True, ax=ax[0])
ax[0].title.set_text('Stacked histogram')
table.plot.hist(bins=3, alpha=0.5, ax=ax[1])
ax[1].title.set_text('Overlaid histogram')
display(fig)
g = sns.FacetGrid(mtcars, col="cyl", margin_titles=True)
g.map(plt.hist, "drat", color="steelblue", lw=0)
g.fig.subplots_adjust(top=.85)
g.fig.suptitle('Faceted histogram')
In R
p1 <- ggplot(mtcars, aes(x=drat, color=cyl)) + geom_density() + ggtitle("p1 Basic density plot")
p2 <- ggplot(mtcars, aes(x=drat, fill=cyl)) + geom_density(alpha=0.3) + ggtitle("p2 Density plot with semi-transparent fill")
grid.arrange(p1, p2, ncol=2)
In Python
fig, ax = plt.subplots(1,2,figsize=(10,5))
# Draw directly on the subplot axes; the data/x/hue keywords need seaborn >= 0.11
sns.kdeplot(data=mtcars, x="drat", hue="cyl", ax=ax[0])
ax[0].title.set_text('Basic density plot')
sns.kdeplot(data=mtcars, x="drat", hue="cyl", fill=True, alpha=0.3, ax=ax[1])
ax[1].title.set_text('Density plot with semi-transparent fill')
display(fig)
In R
p <- ggplot(mtcars, aes(x=drat, y=hp))
p1 <- p + geom_point() + stat_density2d() + ggtitle("Density contour with points")
p2 <- p + stat_density2d(aes(color=..level..)) + scale_y_continuous(breaks=seq(min(mtcars$hp), max(mtcars$hp), 50)) + ggtitle("Density contour with height color")
p3 <- p + stat_density2d(aes(fill=..density..), geom="raster", contour=FALSE) + ggtitle("Density contour with density fill color")
p4 <- p + geom_point() + stat_density2d(aes(alpha=..density..), geom="raster", contour=FALSE) + annotate("segment", x=3.25, xend=3.75, y=100, yend=100, color="blue", size=1, arrow=arrow()) + ggtitle("Density contour with points and arrow")
grid.arrange(p1, p2, p3, p4, ncol=2, nrow=2)
In Python
plt.clf()
fig, ax = plt.subplots(1,2,figsize=(8,4))
# Recent seaborn releases require x and y as keywords (levels/fill replace n_levels/shade)
sns.kdeplot(x=mtcars.drat, y=mtcars.hp, ax=ax[0])
ax[0].title.set_text('Density contour')
cmap = sns.cubehelix_palette(as_cmap=True, dark=0, light=1, reverse=True)
ax[1].title.set_text('Density contour with density fill color')
sns.kdeplot(x=mtcars.drat, y=mtcars.hp, cmap=cmap, levels=60, fill=True, ax=ax[1])
display(fig)
g = sns.jointplot(x="drat", y="hp", data=mtcars, kind="kde", color="m", xlim=(2.5,5), ylim=(0,350))
g.plot_joint(plt.scatter, c="w", s=30, linewidth=1, marker="+")
plt.title('Jointplot = kde + contour plot')
In R
p1 <- ggplot(mtcars, aes(x = cyl, y = drat)) + scale_y_reverse() + ylim(6, 2) +
geom_boxplot() + ggtitle("Basic box plot with reversed y-axis")
p <- ggplot(mtcars, aes(x = cyl, y = drat, fill = cyl))
p2 <- p + geom_boxplot() + scale_x_discrete(limits = c("4", "6")) + ggtitle("Colored box plot with x subset")
p3 <- p + geom_boxplot() + ggtitle("Box plot with flipped axes but no redundant legend") +
coord_flip() + guides(fill = FALSE)
p4 <- p + geom_boxplot() + ggtitle("Box plot with summary") + stat_summary(fun.y = mean,
geom = "point", shape = 5, size = 3)
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
In Python
plt.clf()
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows=2, ncols=2,figsize=(8,8))
sns.boxplot(x='cyl', y='drat', data=mtcars, palette="Set2", ax=ax1)
ax1.title.set_text('Basic boxplot')
sns.boxplot(x='cyl', y='drat', hue='vs', data=mtcars[mtcars['cyl'] <= 6], ax=ax2)
ax2.title.set_text('Boxplot with x subset, grouped by vs')
sns.boxplot(y='cyl', x='drat', data=mtcars, orient="h", ax=ax3)
ax3.title.set_text('Basic boxplot with flipped axes')
# Pass ax explicitly so both layers land on the fourth panel
sns.boxplot(x="cyl", y="drat", data=mtcars, ax=ax4)
sns.swarmplot(x="cyl", y="drat", data=mtcars, color=".25", ax=ax4)
ax4.title.set_text('Basic boxplot with datapoints')
display(fig)
In R
# Find average hp for each cyl, gear group
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$gear <- as.factor(mtcars$gear)
mtcars_short <- group_by(mtcars, cyl, gear) %>% summarise(avg_hp = round(mean(hp),
2))
p1 <- ggplot(mtcars_short, aes(x = cyl, y = avg_hp, fill = gear)) + geom_bar(position = "dodge",
stat = "identity") + geom_text(aes(label = avg_hp), vjust = 1.5, position = position_dodge(0.9),
color = "white") + ggtitle("Interleave bar chart")
# Calculate label y position
mtcars_short <- arrange(mtcars_short, cyl, gear) %>% group_by(cyl) %>% mutate(label_y = cumsum(avg_hp))
p2 <- ggplot(mtcars_short, aes(x = cyl, y = avg_hp, fill = gear)) + geom_bar(stat = "identity") +
geom_text(aes(y = label_y, label = avg_hp), vjust = 1.5, color = "white") +
ggtitle("Stacked bar chart with label under the tops of bars")
# Calculate label y position
mtcars_short <- arrange(mtcars_short, cyl, gear) %>% group_by(cyl) %>% mutate(label_y = cumsum(avg_hp) -
0.5 * avg_hp)
p3 <- ggplot(mtcars_short, aes(x = cyl, y = avg_hp, fill = gear)) + geom_bar(stat = "identity") +
geom_text(aes(y = label_y, label = avg_hp), color = "white") + ggtitle("Stacked bar chart with label in the middle of bars")
grid.arrange(p1, p2, p3, ncol = 2, nrow = 2)
In Python
plt.clf()
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows=2, ncols=2,figsize=(8,8))
grouped = pd.DataFrame(mtcars.groupby(['cyl', 'gear'])['hp'].mean()).reset_index()
print(grouped)
sns.barplot(x="cyl", y="hp", hue="gear", data=grouped, ax=ax1)
ax1.title.set_text('Interleave barchart')
# factorplot is figure-level and ignores ax; barplot draws on the given axes
sns.barplot(x="cyl", y="hp", hue="gear", data=grouped, palette="muted", ax=ax2)
ax2.title.set_text('Grouped barplots')
grouped.pivot(index='cyl', columns='gear', values='hp').plot(kind='bar', stacked=True, ax=ax3)
ax3.title.set_text('Pandas stacked barplot')
grouped.pivot(index='cyl', columns='gear', values='hp').plot(kind='bar', stacked=False, ax=ax4)
ax4.title.set_text('Pandas barplot')
display(fig)
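
The R version positions the labels on the stacked bars with a grouped cumulative sum. A rough pandas equivalent is sketched below on a small made-up table; the `avg_hp` values are illustrative only and are not the real mtcars averages. The column names `label_top` and `label_mid` are hypothetical: the first is the top of each segment, the second its midpoint.

```python
import pandas as pd

# Illustrative values only; not computed from mtcars
grouped = pd.DataFrame({
    "cyl": [4, 4, 6, 6, 8],
    "gear": [3, 4, 3, 4, 3],
    "avg_hp": [100.0, 80.0, 110.0, 120.0, 200.0],
})

grouped = grouped.sort_values(["cyl", "gear"])
# Running total within each cyl gives the top of every stacked segment
grouped["label_top"] = grouped.groupby("cyl")["avg_hp"].cumsum()
# Stepping back by half a segment gives its midpoint
grouped["label_mid"] = grouped["label_top"] - 0.5 * grouped["avg_hp"]
print(grouped)
```

Whether these positions line up with the drawn segments depends on the order in which the plotting library stacks the groups, so the sort order may need to be reversed to match.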
A Cleveland dot plot reduces visual clutter compared with a bar chart, making it easier to read.
In R
p1 <- ggplot(mtcars, aes(x = qsec, y = reorder(rownames(mtcars), qsec))) + geom_point(size = 2,
aes(color = cyl)) + ylab("Car") + ggtitle("Cleveland dot plot sorted by x variable")
p2 <- ggplot(mtcars, aes(x = qsec, y = reorder(rownames(mtcars), qsec))) + geom_point(size = 2,
aes(color = cyl)) + facet_grid(cyl ~ ., scales = "free_y", space = "free_y") +
ylab("Car") + guides(color = "none") + ggtitle("Faceted Cleveland dot plot")
grid.arrange(p1, p2, ncol = 2)
In Python
plt.clf()
fig, ax = plt.subplots(figsize=(12,5))
# Sort the whole data frame so that each car name stays paired with its own qsec value
sns.stripplot(x='qsec', y='name', hue='cyl', data=mtcars.sort_values('qsec', ascending=False), size=6, orient="h", edgecolor="gray", ax=ax)
ax.title.set_text('Cleveland dot plot sorted by x variable')
display(fig)
In R
p1 <- ggplot(mtcars, aes(x = qsec, y = hp, color = gear)) + geom_line(linetype = "dashed",
size = 1) + ggtitle("Line graphs with different color")
p2 <- ggplot(mtcars, aes(x = qsec, y = hp, linetype = gear)) + geom_line() +
ggtitle("Line graphs with different linetype")
p3 <- ggplot(mtcars, aes(x = qsec, y = hp, shape = gear)) + geom_line() + geom_point() +
ggtitle("Line graphs with different shape")
p4 <- ggplot(mtcars, aes(x = qsec, y = hp, fill = gear)) + geom_line() + geom_point(shape = 21) +
ggtitle("Line graphs with different shape color")
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
In Python
g = sns.lmplot(x="mpg", y="disp", hue="gear", data=mtcars, lowess=True)
plt.title('Line graphs with different color')
In R
p1 <- ggplot(mtcars, aes(x = mpg, y = disp, shape = gear)) + geom_point(size = 1.5) +
ggtitle("Scatterplot with different shape")
p2 <- ggplot(mtcars, aes(x = mpg, y = disp, shape = gear, color = gear)) + geom_point() +
ggtitle("Scatterplot with different shape and color")
p3 <- ggplot(mtcars, aes(x = mpg, y = disp, color = wt)) + geom_point() + ggtitle("Scatterplot with continuous variable mapped to color")
p4 <- ggplot(mtcars, aes(x = mpg, y = disp, size = wt)) + geom_point() + ggtitle("Scatterplot with continuous variable mapped to size")
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
In Python
plt.clf()
fig, axes = plt.subplots(1,2,figsize=(8,4))
sns.pointplot(x="mpg", y="disp", hue="gear", data=mtcars, markers=["o", "^", "s"], linestyles=["", "", ""], ax=axes[0])
axes[0].title.set_text('Scatterplot with different color and shape')
mtcars.plot.scatter(x='mpg', y='disp', c='wt', ax=axes[1])
axes[1].title.set_text('Continuous variable mapped to color')
# Rotate the x tick labels on both panels so they do not overlap
for ax in axes.flatten():
    for label in ax.get_xticklabels():
        label.set_rotation(90)
display(fig)
sns.lmplot(x='mpg', y='disp', hue='gear', data=mtcars, fit_reg=False, markers=["o", "x", "p"], palette="Set1")
plt.title('Scatterplot with different shape and color')
In R
p1 <- ggplot(mtcars, aes(x = qsec, y = disp)) + geom_point() + geom_rug(position = "jitter",
size = 0.2) + ggtitle("Scatterplot with marginal rug")
p2 <- ggplot(mtcars, aes(x = qsec, y = disp)) + geom_point() + annotate("text",
x = 16.46, y = 160, label = "Mazda RX4") + annotate("text", x = 20.22, y = 225,
label = "Valiant") + ggtitle("Scatterplot with manual label")
p3 <- ggplot(mtcars, aes(x = qsec, y = disp)) + geom_point() + geom_text(label = row.names(mtcars),
size = 3, vjust = -0.5, hjust = 0) + ggtitle("Scatterplot with automatic labels")
grid.arrange(p1, p2, p3, ncol = 2, nrow = 2)
In Python
g = sns.jointplot(x="qsec", y="disp", data=mtcars, space=0, height=6, ratio=50)
g.plot_joint(plt.scatter, color="g")
g.plot_marginals(sns.rugplot, height=1, color="g")
In R
p1 <- ggplot(mtcars, aes(x = mpg, y = disp, shape = gear, color = gear)) + geom_point() +
stat_smooth() + ggtitle("Scatterplot with LOESS fit")
p2 <- ggplot(mtcars, aes(x = mpg, y = disp, shape = gear, color = gear)) + geom_point() +
stat_smooth(method = lm) + ggtitle("Scatterplot with regression model line")
p3 <- ggplot(mtcars, aes(x = mpg, y = disp, shape = gear, color = gear)) + geom_point() +
stat_smooth(method = lm, level = 0.99) + ggtitle("Scatterplot with 99% confidence region")
p4 <- ggplot(mtcars, aes(x = mpg, y = disp, shape = gear, color = gear)) + geom_point() +
stat_smooth(method = lm, se = FALSE) + annotate("text", label = "r^2=0.5",
x = 30, y = 500) + ggtitle("Scatterplot with text annotation")
grid.arrange(p1, p2, p3, p4, ncol = 2, nrow = 2)
In Python
sns.lmplot(x="mpg", y="disp", hue="gear", data=mtcars)
plt.title('Scatterplot with regression line')
sns.lmplot(x="mpg", y="disp", hue="gear", data=mtcars, ci=99)
plt.title('Scatterplot with 99% confidence region')
In R
ggplot2 doesn’t provide an easy way to overlay two graphs that share the same x-axis, so I created a function to do it.
library(grid)
library(gtable)
overlay_graphs <- function(p1, p2) {
p1 <- p1 + theme_bw()
p2 <- p2 + theme_bw() %+replace% theme(panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), panel.border = element_blank(),
panel.background = element_blank())
# extract gtable
g1 <- ggplot_gtable(ggplot_build(p1))
g2 <- ggplot_gtable(ggplot_build(p2))
# overlap the panel of 2nd plot on that of 1st plot
pp <- c(subset(g1$layout, name == "panel", select = t:r))
g <- gtable_add_grob(g1, g2$grobs[[which(g2$layout$name == "panel")]], pp$t,
pp$l, pp$b, pp$l)
# axis tweaks
ia <- which(g2$layout$name == "axis-l")
ga <- g2$grobs[[ia]]
ax <- ga$children[[2]]
ax$widths <- rev(ax$widths)
ax$grobs <- rev(ax$grobs)
ax$grobs[[1]]$x <- ax$grobs[[1]]$x - unit(1, "npc") + unit(0.15, "cm")
g <- gtable_add_cols(g, g2$widths[g2$layout[ia, ]$l], length(g$widths) -
1)
g <- gtable_add_grob(g, ax, pp$t, length(g$widths) - 1, pp$b)
ia <- which(g2$layout$name == "ylab")
ylab <- g2$grobs[[ia]]
g <- gtable_add_cols(g, g2$widths[g2$layout[ia, ]$l], length(g$widths) -
1)
g <- gtable_add_grob(g, ylab, pp$t, length(g$widths) - 1, pp$b)
grid.draw(g)
}
mtcars_cyl <- group_by(mtcars, cyl) %>% summarise(cnt = length(cyl)) %>% mutate(pct = cnt/sum(cnt) *
100, cum_pct = cumsum(pct))
p1 <- ggplot(mtcars_cyl, aes(x = cyl, cnt)) + geom_bar(fill = "gray70", stat = "identity") +
ggtitle("Cyl distribution and cumulative percentage")
p2 <- ggplot(mtcars_cyl, aes(x = cyl, cum_pct, group = 1)) + geom_line(color = "red") +
expand_limits(y = 0)
overlay_graphs(p1, p2)
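
The summary table that feeds the overlay (count, percentage and cumulative percentage per `cyl`) can be sketched in pandas as follows. This is an approximate counterpart of the dplyr pipeline above, not a translation of `overlay_graphs`. The `cyl` column is built by hand from the cylinder counts of the standard mtcars data set (11, 7 and 14 cars) so that the snippet runs without the source file.

```python
import pandas as pd

# Cylinder counts of the standard mtcars data set: 11, 7 and 14 cars
cars = pd.DataFrame({"cyl": [4] * 11 + [6] * 7 + [8] * 14})

# One row per cyl with its count, like group_by() + summarise() in dplyr
mtcars_cyl = cars.groupby("cyl").size().reset_index(name="cnt")
# Share of all cars, then the running total of that share
mtcars_cyl["pct"] = mtcars_cyl["cnt"] / mtcars_cyl["cnt"].sum() * 100
mtcars_cyl["cum_pct"] = mtcars_cyl["pct"].cumsum()
print(mtcars_cyl)
```

In matplotlib, `Axes.twinx()` can then draw the cumulative line on a second y-axis that shares the x-axis with the bars.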