# Box and whisker plot examples pdf

If you are being asked, or are asking yourself, genuine questions about real-world problems, you probably already have your data. On the other hand, if you simply want to learn R, you will need some data to play with. Luckily, R comes with a wealth of data sets. I searched through them while writing these notes, in order to find examples satisfying given properties.
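As a minimal sketch of how one might browse the bundled data, the following lists the data sets of the standard `datasets` package and loads one of them. The choice of `faithful` is only an illustration, not a data set the notes single out.

```r
# Names of the data sets shipped with the standard "datasets" package.
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Load one of them (an arbitrary choice) and look at its structure.
data(faithful)
str(faithful)
```

Calling `data()` with no argument opens the full list, including data sets from any other attached packages.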

For instance, if you measure squirrels, you will have one row per squirrel, one column for the weight, another for the tail length, another for the height, another for the fur colour, etc. There are sometimes ordered qualitative variables, for instance a variable whose values would be “never”, “seldom”, “sometimes”, “often”, “always”. These data are sometimes obtained by binning quantitative data. The values of a single quantitative variable form a vector; we shall often call this vector a “statistical series”. The “summary” function gives the usual numeric summaries of such a series: minimum, quartiles, median, mean and maximum. Always be critical when observing data: in particular, you should check that the extreme values are not aberrant, that they do not come from some mistake.
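To make this concrete, here is a small sketch that summarizes a quantitative series and bins it into an ordered qualitative variable. The data set (`faithful`), the break points and the labels are arbitrary choices made for the illustration.

```r
# A quantitative series: eruption durations, in minutes.
x <- faithful$eruptions

# Minimum, quartiles, median, mean and maximum; also a quick way
# of checking that the extreme values look plausible.
summary(x)

# Bin the series into an ordered qualitative variable.
# Break points and labels are illustrative assumptions.
duration <- cut(x,
                breaks = c(1, 2, 3, 4, 6),
                labels = c("short", "medium", "long", "very long"),
                ordered_result = TRUE)
table(duration)
```

Because the factor is ordered, comparisons such as `duration >= "long"` are meaningful.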

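But you might not be familiar with those notions: let us recall the links between the mean, variance, median and MAD. Look for the value m that is closest to the series in the sense that it minimizes the sum of the absolute deviations |x_i - m|: one can show that it is the median. Look instead for the value that minimizes the sum of the squared deviations (x_i - m)^2: one can show that it is the mean. This property of the mean is called the “Least Squares Property”.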
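These two properties (the median minimizes the sum of absolute deviations, the mean minimizes the sum of squared deviations) can be checked numerically. The sketch below uses a small made-up series and the base function `optimize`; the helper names `abs_loss` and `sq_loss` are introduced only for this example.

```r
# A small, made-up series with one large value.
x <- c(2, 3, 5, 8, 40)

# The two criteria, as functions of the candidate centre m.
abs_loss <- function(m) sum(abs(x - m))
sq_loss  <- function(m) sum((x - m)^2)

# Minimize each criterion numerically over the range of the data.
m_abs <- optimize(abs_loss, interval = range(x))$minimum
m_sq  <- optimize(sq_loss,  interval = range(x))$minimum

c(minimizer = m_abs, median = median(x))
c(minimizer = m_sq,  mean   = mean(x))
```

The large value pulls the mean well away from the bulk of the data while the median stays put, which is why the median (and the MAD, available as `mad`) are called robust.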

The median and the MAD are more robust to extreme values than the mean and the standard deviation. Yet, if your data are Gaussian, they are less precise. You can display high-dimensional datasets in the L1-L2 space: average value of the coordinates and standard deviation of the coordinates. If the data were Gaussian, the cloud of points should exhibit a linear shape. If the data are a mixture of Gaussians, i.e., if there are several clusters, you should see several lines. This representation can be used to spot outliers.
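The loss of precision of robust estimators on Gaussian data can be seen in a small simulation. This is only a sketch: the sample size, the number of replications and the seed are arbitrary, and the data are simulated rather than real.

```r
set.seed(42)
n_rep <- 2000   # number of simulated samples (arbitrary)
n     <- 50     # size of each sample (arbitrary)

# Each row is one Gaussian sample with true centre 0.
samples <- matrix(rnorm(n_rep * n), nrow = n_rep)

# Estimate the centre of each sample in two ways.
means   <- rowMeans(samples)
medians <- apply(samples, 1, median)

# Spread of each estimator around the true value.
sd_means   <- sd(means)
sd_medians <- sd(medians)
c(mean = sd_means, median = sd_medians)
```

On Gaussian data the sample median fluctuates more than the sample mean, so it needs more observations to reach the same precision.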

TODO: A plot, with financial data. Google the returns of half a dozen indices, say, FTSE100, CAC40, DAX, Nikkei225, DJIA.

But beware: normalization will only rescale your data; it will not solve other problems. In some situations, other transformations are meaningful: power scales, arcsine, logit, probit, Fisher, etc.
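As a rough sketch, normalization and the transformations just listed can all be written with base R functions. The input values below are made up for the illustration, and which transformation is appropriate depends on the data: proportions for arcsine, logit and probit, correlations for Fisher.

```r
# Normalization: subtract the mean, divide by the standard deviation.
x <- faithful$waiting
z <- as.vector(scale(x))
c(mean = mean(z), sd = sd(z))   # 0 and 1, up to rounding

# Power scales, for positive data.
sqrt_x <- sqrt(x)
log_x  <- log(x)

# Transformations for proportions in (0, 1) (made-up values).
p       <- c(0.05, 0.25, 0.50, 0.90)
arcsine <- asin(sqrt(p))
logit   <- qlogis(p)   # log(p / (1 - p))
probit  <- qnorm(p)    # Gaussian quantile function

# Fisher transformation, for correlations in (-1, 1) (made-up values).
r      <- c(-0.9, 0, 0.5)
fisher <- atanh(r)
```

Note that `z` has exactly the same shape as `x`: a histogram of one is a relabelled histogram of the other.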

Whatever the analysis you perform, it is very important to look at your data and to transform them if needed and possible. If you really want your distribution to be bell-shaped, you can “forcefully normalize” it, but bear in mind that this discards relevant information: for instance, if the distribution was bimodal, that feature is lost.

If you have not met moments before, you can skip to the next section. The variance is the mean of the squares of a centered series. One may replace the squares by another power: the k-th moment M_k of a series is the mean of its k-th powers.
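The text does not say how to “forcefully normalize” a series; one common way, assumed here, is to replace each value by the Gaussian quantile of its rank. The sketch below applies it to a bimodal series (`faithful$eruptions`, an arbitrary choice) to show what is lost.

```r
# A bimodal series: eruption durations.
x <- faithful$eruptions
n <- length(x)

# Rank-based transformation (an assumption, one of several possible
# conventions): map ranks into (0, 1), then to Gaussian quantiles.
z <- qnorm(rank(x) / (n + 1))

# The order of the observations is preserved...
cor(x, z, method = "spearman")

# ...but the two peaks of x have disappeared from z.
# Uncomment to compare the two histograms:
# par(mfrow = c(1, 2)); hist(x); hist(z)
```

Only the ranks of the observations survive this transformation; distances between values, and hence the gap between the two modes, do not.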

The third moment of a centered statistical series is called skewness. For a symmetric series, it is zero. To check if a series is symmetric and to quantify the departure from symmetry, it suffices to compute the third moment of the normalized series.

The fourth moment of the normalized series is called kurtosis; one usually subtracts 3, so that the kurtosis of a Gaussian distribution is zero, that of a fat-tailed one positive, and that of a no-tail one negative. To see this on financial returns, we have estimated the density of the returns and overlaid this curve with the density of a Gaussian distribution. You can notice two things: first, the distribution has a higher, narrower peak; second, there are more extreme values. By comparison, a Gaussian distribution would give a skewness of 0 and a fourth normalized moment of 3.
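Skewness and kurtosis, taken as the third and fourth moments of the normalized series, are easy to compute by hand. The sketch below defines hypothetical helpers (`moment`, `normalize`, `skewness`, `excess_kurtosis`, none of them base R functions) and applies them to simulated data. A Student t distribution stands in for fat-tailed returns and a uniform distribution for a “no-tail” one; both are assumptions made for the illustration, not the financial data mentioned in the text.

```r
# k-th moment: the mean of the k-th powers.
moment <- function(x, k) mean(x^k)

# Normalized series: centered, with unit standard deviation.
normalize <- function(x) (x - mean(x)) / sd(x)

# Third moment of the normalized series.
skewness <- function(x) moment(normalize(x), 3)

# Fourth moment of the normalized series, minus 3,
# so that a Gaussian series gives approximately zero.
excess_kurtosis <- function(x) moment(normalize(x), 4) - 3

set.seed(1)
gauss   <- rnorm(1e5)        # Gaussian
fat     <- rt(1e5, df = 5)   # fat tails (stand-in for returns)
no_tail <- runif(1e5)        # bounded support, no tails

sapply(list(gauss = gauss, fat = fat, no_tail = no_tail), skewness)
sapply(list(gauss = gauss, fat = fat, no_tail = no_tail), excess_kurtosis)

# Uncomment to overlay an estimated density with a Gaussian one:
# plot(density(fat)); curve(dnorm(x, mean(fat), sd(fat)), add = TRUE, lty = 2)
```

The signs come out as described: close to zero for the Gaussian sample, positive for the fat-tailed one, negative for the uniform one.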

