R - In ggplot2, what do the ends of the boxplot lines represent?
I can't find a description of what the end points of the lines of a boxplot represent.
For example, here there are point values above and below where the lines end.
(I realize that the bottom and top of the box are the 25th and 75th percentiles, and the centerline is the 50th.) I assume that, as there are points above and below the lines, the line ends do not represent the max/min values.
The "dots" beyond the ends of the boxplot lines represent outliers. There are a number of different rules for determining whether a point is an outlier, but the method that R and ggplot2 use is the "1.5 rule". If a data point is:
- less than Q1 - 1.5*IQR
- greater than Q3 + 1.5*IQR
then that point is classed as an "outlier". The line (whisker) extends to the last data point that is still inside the "1.5" cut-off. Note: IQR = Q3 - Q1
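The rule above can be sketched in a few lines of base R. This is only an illustration, not the code that R or ggplot2 run internally: the variable names are hypothetical, and the quartiles come from `quantile()` with its default settings, which is an assumption (base R's `boxplot()` computes its box from `fivenum()` hinges, so the numbers can differ slightly).

```r
# Illustrative sketch of the "1.5 rule"; all variable names are hypothetical.
set.seed(1)
x <- rlnorm(20, 1/2)                 # skewed sample data

# Q1 and Q3 using quantile()'s default method (type = 7); an assumption
q <- quantile(x, c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]                   # IQR = Q3 - Q1

lower_fence <- q[1] - 1.5 * iqr      # cut-off below the box
upper_fence <- q[2] + 1.5 * iqr      # cut-off above the box

# Points outside the cut-offs are drawn as individual dots
is_outlier <- x < lower_fence | x > upper_fence
outliers <- x[is_outlier]

# The whiskers end at the most extreme points still inside the cut-offs
whisker_low  <- min(x[!is_outlier])
whisker_high <- max(x[!is_outlier])
```

Note that the whiskers stop at actual data values, so they usually end short of the cut-offs themselves.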
Additional information
- See the Wikipedia boxplot page for alternative outlier rules.
- There are a variety of ways of calculating quantiles. Have a look at `?quantile` for a description of the nine different methods.
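To see how much the choice of quantile method can matter, the following sketch computes the first quartile of the same sample with each of the nine `type` values that `quantile()` accepts, alongside the lower hinge from `fivenum()`. The variable names are hypothetical, and the sample is just an assumed example.

```r
# Compare the nine quantile() methods for the first quartile (illustration only)
set.seed(1)
x <- rlnorm(20, 1/2)                 # assumed sample data

# One Q1 estimate per quantile type (1 to 9)
q1_by_type <- sapply(1:9, function(type) {
  quantile(x, 0.25, type = type, names = FALSE)
})

# Lower hinge from Tukey's five-number summary, for comparison
lower_hinge <- fivenum(x)[2]

print(round(q1_by_type, 4))
print(round(lower_hinge, 4))
```

With small samples the estimates differ, so the box edges and the 1.5*IQR cut-offs shift a little depending on the method used.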
Example
Consider the following example:
    > set.seed(1)
    > x = rlnorm(20, 1/2)  # skewed data
    > par(mfrow=c(1,3))
    > boxplot(x, range=1.7, main="range=1.7")
    > boxplot(x, range=1.5, main="range=1.5")  # default
    > boxplot(x, range=0, main="range=0")  # the same as range="very big number"
This gives the following plot:
As we decrease range from 1.7 to 1.5, we reduce the length of the whisker. However, range=0 is a special case: it is equivalent to "range=infinity", so the whiskers extend to the data extremes.
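The same behaviour can be checked numerically without drawing anything, using `boxplot.stats()`, whose `coef` argument plays the role of `range` in `boxplot()`. This is a minimal sketch; the object names `s17`, `s15` and `s0` are hypothetical.

```r
# Inspect whisker ends and outliers for different multipliers (illustration only)
set.seed(1)
x <- rlnorm(20, 1/2)                  # skewed data, as in the example above

s17 <- boxplot.stats(x, coef = 1.7)
s15 <- boxplot.stats(x, coef = 1.5)   # default multiplier
s0  <- boxplot.stats(x, coef = 0)     # special case: whiskers reach the extremes

# $stats holds: lower whisker, lower hinge, median, upper hinge, upper whisker
print(s17$stats[c(1, 5)])
print(s15$stats[c(1, 5)])
print(s0$stats[c(1, 5)])              # equals range(x)

# $out holds the points that would be drawn as dots
print(s15$out)
print(s0$out)                         # empty: no outliers are reported
```

A smaller positive `coef` can only shorten the whiskers or leave them unchanged, while `coef = 0` switches the outlier rule off entirely.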