The mean, µ, or average of a set of N measurements is found by dividing the sum over the entire set of measurements by the total number of measurements N. For example, the mean (average) of {5, 15, 25, 10, 15} is µ = (5 + 15 + 25 + 10 + 15)/5 = 70/5 = 14.
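As a quick check of the arithmetic, the same calculation can be sketched in Python. The variable names are illustrative only.

```python
from statistics import mean

measurements = [5, 15, 25, 10, 15]

# Sum over the whole set, divided by the number of measurements N.
mu = sum(measurements) / len(measurements)
print(mu)  # 14.0

# The standard library's mean() gives the same value.
assert mean(measurements) == mu
```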
Consider this table of measurements:

Height in inches of a sample of 100 male adults
61 64 67 70 70 71 75 72 72 69 68 70 65 67 62 59 62 66 66 68
68 73 71 72 73 71 71 68 69 68 69 65 63 70 70 76 71 72 74 60
56 74 75 79 72 72 69 68 68 68 62 66 66 66 61 77 75 74 63 72
63 62 65 65 66 65 67 67 65 67 68 62 67 60 68 65 70 70 69 70
68 73 64 71 71 68 70 69 73 72 70 69 67 64 67 58 66 69 76 73
When rounded off to the nearest inch, many values are repeated. A frequency table keeps track of how often each individual value occurs (see the table below), and a histogram gives a visual representation. 56 is the smallest height recorded in this sample; 79 is the largest. 68 is the most common height observed, and most values in fact crowd close to it. 68 is known as the mode, while 68.20 is the mean (the 100 heights sum to 6820).
Frequency Table of Heights

Height   Frequency
  56         1
  57         0
  58         1
  59         1
  60         2
  61         2
  62         5
  63         3
  64         3
  65         7
  66         7
  67         8
  68        12
  69         8
  70        10
  71         7
  72         8
  73         5
  74         3
  75         3
  76         2
  77         1
  78         0
  79         1
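
The tally can be reproduced with a short Python sketch. It assumes the 100 heights are typed in exactly as listed in the table of measurements; the rows of asterisks are only a rough text stand-in for a histogram, not the original figure.

```python
from collections import Counter

# The 100 heights (inches), copied from the table of measurements.
heights = [
    61, 64, 67, 70, 70, 71, 75, 72, 72, 69, 68, 70, 65, 67, 62, 59, 62, 66, 66, 68,
    68, 73, 71, 72, 73, 71, 71, 68, 69, 68, 69, 65, 63, 70, 70, 76, 71, 72, 74, 60,
    56, 74, 75, 79, 72, 72, 69, 68, 68, 68, 62, 66, 66, 66, 61, 77, 75, 74, 63, 72,
    63, 62, 65, 65, 66, 65, 67, 67, 65, 67, 68, 62, 67, 60, 68, 65, 70, 70, 69, 70,
    68, 73, 64, 71, 71, 68, 70, 69, 73, 72, 70, 69, 67, 64, 67, 58, 66, 69, 76, 73,
]

counts = Counter(heights)

# Frequency table with a crude text histogram; heights that never
# occur (57 and 78) are printed with a count of 0.
for h in range(min(heights), max(heights) + 1):
    print(f"{h:3d} {counts[h]:3d} {'*' * counts[h]}")

# The mode is the most frequent value; the mean is sum / N.
mode, mode_count = counts.most_common(1)[0]
mean_height = sum(heights) / len(heights)
print("mode:", mode, "occurs", mode_count, "times")
print("mean:", mean_height)
```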

There are 36 possible outcomes to any single toss of two dice, only 1 of which gives a 2. So if a 2 comes up N_{2} times in N tosses, we expect N_{2}/N = 1/36. (The probability of a 2 = 1/36.)
There are two ways to score a 3: 1+2 or 2+1. So N_{3}/N = 2/36 = 1/18. Similarly N_{4}/N = 3/36 = 1/12, up past N_{7}/N = 6/36 = 1/6 (the probability of a 6 = 5/36)...
...all the way through the one way to score 12: N_{12}/N = 1/36.
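
The counting argument can be checked by listing all 36 outcomes. This is a minimal sketch assuming two fair six-sided dice; `ways` and `probability` are names chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Number of ways to score each total from 2 through 12.
ways = Counter(a + b for a, b in outcomes)

# Expected fraction N_k/N for each total k, kept as exact fractions.
probability = {k: Fraction(ways[k], len(outcomes)) for k in range(2, 13)}

for k, p in probability.items():
    print(k, ways[k], p)
```

The probabilities rise from 1/36 at a total of 2 to 1/6 at 7, then fall back symmetrically to 1/36 at 12.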
Averages identify a "central value" of any distribution. But they do not indicate how tightly clumped together the data might be. The range can give you an idea of how spread out the data are: Range = x_{max} - x_{min}
For the 100 men sampled above, the range in heights was 79 - 56 = 23 inches.
If a sample contains some rare, extreme measurements (see right), however, the range can be very misleading!
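
A small sketch of how one extreme measurement distorts the range. The five-value set is the one used earlier for the mean; the extra value 250 is made up purely for illustration, and `value_range` is a hypothetical helper name.

```python
def value_range(data):
    """Range = x_max - x_min."""
    return max(data) - min(data)

sample = [5, 15, 25, 10, 15]      # example set used earlier for the mean
with_outlier = sample + [250]     # 250 is a made-up extreme value

print(value_range(sample))        # 20
print(value_range(with_outlier))  # 245
```

Five of the six values in the second list still lie within 20 of one another, yet the range alone suggests a spread of 245.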

In fact, at right are shown 3 data samples which all have exactly the same mean and range! The data at the top, however, are much more consistent than those at the bottom: almost all the measurements have come out very close to one another. The data in the center actually display somewhat larger spread, despite the statistical accident of having a range that matches the top data set. The data set at the bottom shows an enormous amount of variation. We need a better way of describing this variation.
The statistical standard deviation is the square root of the variance; the variance is often described as the average difference from the mean. Taken literally, though, the average difference from the mean is always zero. For example, if four measurements differ from their mean by -2, -1, 0, and +3, then (-2 - 1 + 0 + 3)/4 = 0/4 = 0 (there is as much data below the mean as there is above). The average difference, then, can never be very informative. The variance actually averages the squares of such differences (avoiding the problem introduced by the negative numbers). For this example: {(-2)^{2} + (-1)^{2} + 0^{2} + 3^{2}}/4 = 14/4 = 3.5.
Finally, the standard deviation is equal to the square root of the variance: SQRT(3.5) = 1.87.
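
In symbols, the definition just described can be written as follows. The index i and the symbol σ are notation introduced here for convenience, and the sum is divided by N, matching the worked example.

```latex
\sigma^{2} = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^{2},
\qquad
\sigma = \sqrt{\sigma^{2}}

% Worked example with differences -2, -1, 0, +3 from the mean:
\sigma^{2} = \frac{(-2)^{2} + (-1)^{2} + 0^{2} + 3^{2}}{4} = \frac{14}{4} = 3.5,
\qquad
\sigma = \sqrt{3.5} \approx 1.87
```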
           X      X - µ    (X - µ)^{2}
           1       -3         9
           3       -1         1
           4        0         0
           4        0         0
           5        1         1
           7        3         9
SUM       24                 20
N          6                  6
SUM/N      4 (mean)           3.3333 (variance)
SQRT                          1.8257 (standard deviation)
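
The table can be reproduced column by column in Python. This is a sketch that divides by N, as the table does; the variable names are illustrative.

```python
from math import sqrt
from statistics import pstdev, pvariance

X = [1, 3, 4, 4, 5, 7]
N = len(X)

mu = sum(X) / N                          # SUM/N of the X column
deviations = [x - mu for x in X]         # X - mu column
squares = [d ** 2 for d in deviations]   # (X - mu)^2 column

variance = sum(squares) / N              # divide by N, as in the table
std_dev = sqrt(variance)

print(mu, sum(deviations), sum(squares))
print(round(variance, 4), round(std_dev, 4))

# The standard library's population forms also divide by N.
assert abs(pvariance(X) - variance) < 1e-12
assert abs(pstdev(X) - std_dev) < 1e-12
```

Note that `statistics.variance` and `statistics.stdev` divide by N - 1 instead and would give slightly larger values for the same data.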