Definition: value that appears most often in a set of data
In a grouped frequency distribution
identify the modal class (the class with the largest frequency)
use the formula:
\begin{equation}
\text{Mode} = L + \frac{{d_1}}{{d_1 + d_2}} \cdot h
\end{equation}
L: the lower boundary of the modal class
$d_1$ ($d_2$): absolute value of the difference between the frequency of the modal class and that of the class before (after) it
h: length of the interval of the modal class.
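The formula above can be sketched in Python as follows. The function name `grouped_mode` and the three example classes are hypothetical, chosen only to illustrate the arithmetic; the sketch assumes the modal class has a neighbour on each side and that $d_1 + d_2 \neq 0$.

```python
def grouped_mode(lower_boundary, freq_before, freq_modal, freq_after, width):
    """Mode = L + d1 / (d1 + d2) * h for a grouped frequency distribution."""
    d1 = abs(freq_modal - freq_before)  # gap to the class before the modal class
    d2 = abs(freq_modal - freq_after)   # gap to the class after the modal class
    return lower_boundary + d1 / (d1 + d2) * width


# Hypothetical classes: 10-20 (f=5), 20-30 (f=12, modal class), 30-40 (f=8)
mode = grouped_mode(lower_boundary=20, freq_before=5, freq_modal=12,
                    freq_after=8, width=10)
print(round(mode, 2))  # 20 + 7/11 * 10
```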
the median
Definition: the number in the middle of a sorted data set.
even-size data set: average of 2 middle numbers
odd-size data set: middle number
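A minimal sketch of the two cases above; the helper name `median_of` is hypothetical, and the standard library's `statistics.median` applies the same rule.

```python
def median_of(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]                    # odd size: the single middle number
    return (data[mid - 1] + data[mid]) / 2  # even size: average of the 2 middle numbers


print(median_of([7, 1, 3]))     # odd-size example
print(median_of([4, 1, 3, 2]))  # even-size example
```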
Range:
the difference between the largest and the smallest value
Percentile:
a measure indicating the value BELOW which a given percentage of the observations in a group falls
e.g. 20th percentile = the value below which 20% of the observations are found
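Find the location of the $P$th percentile: $L_p = (n+1) \cdot \frac{P}{100}$
$L_p$: location in the ordered array of the desired percentile
$n$: number of observations
$P$: desired percentile
Find the percentile from its location: $P_{\text{th}} = x_{\text{lower}} + d \cdot (x_{\text{higher}} - x_{\text{lower}})$, where $x_{\text{lower}}$ and $x_{\text{higher}}$ are the ordered values on either side of $L_p$ and $d$ is the fractional part of $L_p$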
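The location rule and the interpolation step can be sketched together. This is an illustration under assumptions: the function name `percentile` and the sample data are hypothetical, the interpolation weight is taken to be the fractional part of the location, and locations that fall outside the array are clamped to the smallest or largest value (the notes do not say how to handle that case).

```python
import math


def percentile(values, p):
    """P-th percentile using location L_p = (n + 1) * P / 100 (1-based)."""
    data = sorted(values)
    n = len(data)
    loc = (n + 1) * p / 100
    lower = math.floor(loc)  # 1-based position just below the location
    frac = loc - lower       # fractional part used as interpolation weight
    if lower < 1:            # assumption: clamp to the smallest value
        return data[0]
    if lower >= n:           # assumption: clamp to the largest value
        return data[-1]
    low_value, high_value = data[lower - 1], data[lower]
    return low_value + frac * (high_value - low_value)


sample = [2, 4, 6, 8, 10, 12, 14]  # hypothetical data, n = 7
print(percentile(sample, 25))      # location 2.0, so the 2nd ordered value
print(percentile(sample, 40))      # location 3.2, between the 3rd and 4th values
```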
Quartiles
Definition: three points that divide the data set into 4 equal groups, each comprising a quarter of the data
$Q_1 = P_{25}$, $Q_2 = P_{50}$, $Q_3 = P_{75}$
Q1: middle number between the smallest value and the median
Q2: median
Q3: middle value between the median and the highest value
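As a quick check of the three quartiles, Python's standard `statistics.quantiles` can be used. Its default `method="exclusive"` places the cut points at $(n+1) \cdot \frac{i}{4}$ in the sorted data, which matches the $(n+1)$ location rule used for percentiles in these notes. The data set is hypothetical.

```python
import statistics

data = [3, 5, 7, 8, 12, 13, 14, 18, 21]  # hypothetical sample, n = 9

# Three cut points that split the sorted data into four equal groups.
q1, q2, q3 = statistics.quantiles(data, n=4)

print(q1, q2, q3)
print(q2 == statistics.median(data))  # Q2 is the median
```

Here Q1 sits at location 2.5 (halfway between 5 and 7) and Q3 at location 7.5 (halfway between 14 and 18).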
Deciles
any of the nine values that divide the sorted data into 10 equal parts, each representing 10% of the sample / population
computed just like percentiles, e.g. 8th decile = 80th percentile
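The decile/percentile correspondence can be checked with the same standard-library function; the data below is hypothetical.

```python
import statistics

data = list(range(1, 20))  # hypothetical sample: 1, 2, ..., 19

deciles = statistics.quantiles(data, n=10)       # 9 cut points
percentiles = statistics.quantiles(data, n=100)  # 99 cut points

eighth_decile = deciles[7]             # index 7 holds the 8th decile
eightieth_percentile = percentiles[79]  # index 79 holds the 80th percentile
print(len(deciles), eighth_decile, eightieth_percentile)
```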
Percentile with grouped data
calculate from frequency table
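location: from $P_n$ calculate the location → determine the class from the cumulative frequency (round up)
$P_n = \text{lower value of the class} + (\text{distance from the previous cumulative frequency to the location}) \cdot \frac{\text{class range}}{\text{class frequency}}$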
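A sketch of the grouped-data calculation under stated assumptions: the location is taken as $n \cdot \frac{P}{100}$ (a common textbook convention for grouped data; the $(n+1)$ rule could be swapped in), and the function name `grouped_percentile` and the frequency table are hypothetical.

```python
def grouped_percentile(classes, p):
    """classes: list of (lower, upper, frequency) tuples in ascending order."""
    n = sum(freq for _, _, freq in classes)
    loc = n * p / 100  # assumption: location convention for grouped data
    cum_before = 0     # cumulative frequency of all classes before the current one
    for lower, upper, freq in classes:
        if freq > 0 and cum_before + freq >= loc:
            # Interpolate inside the class that contains the location.
            return lower + (loc - cum_before) * (upper - lower) / freq
        cum_before += freq
    return classes[-1][1]  # fallback: upper bound of the last class


# Hypothetical frequency table: 0-10 (f=5), 10-20 (f=10), 20-30 (f=5)
table = [(0, 10, 5), (10, 20, 10), (20, 30, 5)]
print(grouped_percentile(table, 50))  # location 10 falls in the 10-20 class
```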
Inter-quartile range (IQR)
$IQR = Q_3 - Q_1 = P_{75} - P_{25}$
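For comparison, the range and the IQR of one hypothetical data set can be computed side by side; the quartiles again come from `statistics.quantiles` with its default `(n+1)`-based method.

```python
import statistics

data = [2, 4, 6, 8, 10, 12, 14]  # hypothetical sample

data_range = max(data) - min(data)             # largest minus smallest value
q1, _, q3 = statistics.quantiles(data, n=4)    # Q1 and Q3 (Q2 is not needed here)
iqr = q3 - q1                                  # spread of the middle half of the data

print(data_range, iqr)
```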
Variance
the average of the squared deviations from the mean, taken over every value in a distribution
normal (ungrouped) variance: $$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$
$x_i$: individual observation
$\bar{x}$ : population mean
$n$: number of observations
Variance with grouped data: $$\sigma^2 = \frac{\sum f (M - \bar{x})^2}{n}$$
$f$: class frequency
$M$: class midpoint
$n$: number of observations
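Both variance calculations can be sketched in a few lines. The sketch assumes the population form (dividing by $n$, in line with "the average of the squared deviations"); the function names and the data are hypothetical, and for grouped data the mean is estimated from the class midpoints.

```python
def population_variance(values):
    """Average squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def grouped_variance(midpoints, freqs):
    """Variance from class midpoints M and class frequencies f (divides by n)."""
    n = sum(freqs)
    mean = sum(f * m for m, f in zip(midpoints, freqs)) / n
    return sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / n


print(population_variance([2, 4, 4, 4, 5, 5, 7, 9]))  # mean is 5
print(grouped_variance([5, 15, 25], [5, 10, 5]))      # mean is 15
```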
Standard deviation:
square root of the variance: $$\sigma = \sqrt{\sigma^2}$$
- In finance, standard deviation can measure the risk associated with an investment opportunity: the higher the standard deviation, the greater the risk
- Two investments with similar standard deviations may still have different distributions of returns
<!--ID: 1708098043547-->
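To illustrate the last point, here are two made-up return series (in percent) with the same mean and the same population standard deviation but differently shaped distributions. The numbers are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import statistics

returns_a = [-6, -2, 0, 2, 6]    # hypothetical, symmetric around 0
returns_b = [-2, -2, -2, -2, 8]  # hypothetical, mostly small losses and one large gain

std_a = statistics.pstdev(returns_a)  # population standard deviation
std_b = statistics.pstdev(returns_b)

print(std_a, std_b)
print(statistics.median(returns_a), statistics.median(returns_b))
```

Both series have a standard deviation of 4, yet their medians differ, so the standard deviation alone does not describe how the returns are distributed.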
## symmetry
- a symmetric distribution can be divided into two halves that mirror each other, with the same arithmetic mean, median, and mode (e.g. the "bell curve")
<!--ID: 1708098043549-->
## skewness
- the asymmetry of a frequency distribution curve; the mean, median, and mode differ
- positively skewed distribution: leans left, with the long tail to the right (like holding out the left thumb), $\text{mean} > \text{median}$
- negatively skewed distribution: leans right, with the long tail to the left (like holding out the right thumb), $\text{mean} < \text{median}$
<!--ID: 1708098043553-->
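The mean-versus-median rule of thumb from the skewness notes can be sketched as a small check. The function name `skew_direction` and the data are hypothetical, and comparing mean with median is only a heuristic, not a formal skewness coefficient.

```python
import statistics


def skew_direction(values):
    """Classify skew by comparing the mean with the median (rule of thumb only)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"
    if mean < median:
        return "negative"
    return "symmetric"


right_tail = [1, 2, 2, 3, 3, 3, 4, 20]   # hypothetical: one large value pulls the mean up
left_tail = [-x for x in right_tail]     # mirror image: the mean is pulled down
print(skew_direction(right_tail), skew_direction(left_tail))
```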
## hard parts -> cheat sheet
- interpretation of the Quartiles
<!--ID: 1708098043557-->