Learn descriptive statistics with the essential formulas for grouped and ungrouped data. Includes measures of central tendency and dispersion, and practical examples for data analysis.
$n$: Total number of observations
$m$: Number of classes (for grouped data)
$x_i$: $i$-th observation (ungrouped data)
$f_i$: Absolute frequency of the $i$-th class
$F_i$: Cumulative frequency up to the $i$-th class
$L_{\text{inf}}$: Lower limit of a class
$c_i$: Class width (length)
$\overline{X}$: Arithmetic mean
$\tilde{X}$: Median
$Mo$: Mode
$$\overline{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

$$\tilde{X} = \begin{cases}
x_{\left(\frac{n+1}{2}\right)} & \text{if } n \text{ is odd} \\[10pt]
\dfrac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2} & \text{if } n \text{ is even}
\end{cases}$$

$$Mo = \text{value(s) with the highest absolute frequency}$$
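The three measures above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the sample data are assumptions made for the example, and the median formula is read as applying to the values sorted in ascending order, so that $x_{(j)}$ is the $j$-th smallest value.

```python
from collections import Counter


def mean(xs):
    """Arithmetic mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)


def median(xs):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 in 1-based indexing is index n // 2 in 0-based.
        return s[mid]
    # Even n: average of positions n / 2 and n / 2 + 1 (1-based).
    return (s[mid - 1] + s[mid]) / 2


def modes(xs):
    """All values that share the highest absolute frequency, in ascending order."""
    counts = Counter(xs)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 5, 7, 9]
print(mean(data))    # 31 / 6
print(median(data))  # 4.5
print(modes(data))   # [4]
```

Returning a list from `modes` keeps bimodal and multimodal samples from being silently reduced to a single value.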
$$R = x_{\max} - x_{\min}$$

$$S^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{X})^2$$
Practical formula:
$$S^2 = \frac{\sum_{i=1}^{n} x_i^2}{n} - \overline{X}^2$$
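For readers who want to see why the practical formula agrees with the definition, the following short derivation expands the square and uses $\frac{1}{n}\sum_{i=1}^{n} x_i = \overline{X}$. It introduces no symbols beyond those defined above.

```latex
\begin{aligned}
S^2 &= \frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{X})^2 \\
    &= \frac{1}{n} \sum_{i=1}^{n} x_i^2
       - 2\,\overline{X} \cdot \frac{1}{n} \sum_{i=1}^{n} x_i
       + \overline{X}^2 \\
    &= \frac{\sum_{i=1}^{n} x_i^2}{n} - 2\,\overline{X}^2 + \overline{X}^2 \\
    &= \frac{\sum_{i=1}^{n} x_i^2}{n} - \overline{X}^2
\end{aligned}
```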
$$S = \sqrt{S^2}$$

$$CV = \frac{S}{\overline{X}}$$
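As a rough sketch of the dispersion measures for ungrouped data, the code below computes the range, the variance (by the definition and by the practical formula), the standard deviation, and the coefficient of variation. The function names and the sample are assumptions for illustration. The variance divides by $n$, as in the formulas above, not by $n - 1$.

```python
import math


def variance(xs):
    """Variance by the definition: mean squared deviation from the mean."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n


def variance_practical(xs):
    """Variance by the practical formula: mean of the squares minus the squared mean."""
    n = len(xs)
    m = sum(xs) / n
    return sum(x * x for x in xs) / n - m ** 2


def dispersion(xs):
    """Range, variance, standard deviation, and CV (as a ratio, not a percentage)."""
    m = sum(xs) / len(xs)
    var = variance(xs)
    sd = math.sqrt(var)
    return {
        "range": max(xs) - min(xs),
        "variance": var,
        "std": sd,
        "cv": sd / m,  # undefined when the mean is 0; not handled in this sketch
    }


# Hypothetical sample: mean 5, variance 4, standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(dispersion(data))
print(variance_practical(data))
```

Both variance functions should agree up to floating-point rounding. The practical formula can lose precision when the mean is large relative to the spread, which is one reason to keep the definitional form available.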
$$\alpha_3 = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{X})^3}{S^3}$$

$\alpha_3 = 0$: Symmetric distribution
$\alpha_3 > 0$: Positive skew (tail to the right)
$\alpha_3 < 0$: Negative skew (tail to the left)
$$\alpha_4 = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \overline{X})^4}{S^4}$$

$\alpha_4 = 3$: Mesokurtic distribution (like the normal)
$\alpha_4 > 3$: Leptokurtic (more peaked)
$\alpha_4 < 3$: Platykurtic (less peaked)
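The skewness and kurtosis coefficients are both ratios of central moments, which suggests one small helper. The sketch below assumes that reading. The names `central_moment`, `skewness`, and `kurtosis` are illustrative, and the kurtosis returned is the raw $\alpha_4$ (compared against 3), not the excess kurtosis some libraries report.

```python
def central_moment(xs, r):
    """r-th central moment: (1/n) * sum of (x_i - mean)^r."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** r for x in xs) / n


def skewness(xs):
    """alpha_3: third central moment divided by S^3."""
    # S^3 equals the variance raised to the power 3/2.
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def kurtosis(xs):
    """alpha_4: fourth central moment divided by S^4 (a normal distribution gives 3)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2


# Hypothetical samples: one symmetric, one with a longer right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [2, 4, 4, 4, 5, 5, 7, 9]

print(skewness(symmetric))     # 0.0
print(skewness(right_tailed))  # positive
print(kurtosis(symmetric))     # below 3 (platykurtic)
```

Both functions divide by a power of the variance, so constant data (variance 0) raises `ZeroDivisionError`. This sketch leaves that case unhandled.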
$$\overline{X} = \frac{1}{n} \sum_{i=1}^{m} f_i \cdot x_i$$

where $x_i$ is the class mark (midpoint).
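A small sketch of the grouped mean follows. The frequency table, the `(lower, upper)` representation of class limits, and the function names are assumptions chosen for the example. Each class mark is taken as the midpoint of its limits.

```python
def class_marks(limits):
    """Midpoint of each class, given (lower, upper) limit pairs."""
    return [(lower + upper) / 2 for lower, upper in limits]


def grouped_mean(limits, freqs):
    """Mean of grouped data: frequency-weighted average of the class marks."""
    n = sum(freqs)
    return sum(f * x for f, x in zip(freqs, class_marks(limits))) / n


# Hypothetical frequency table with m = 4 classes and n = 30 observations.
limits = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]

print(class_marks(limits))          # [5.0, 15.0, 25.0, 35.0]
print(grouped_mean(limits, freqs))  # 620 / 30, roughly 20.67
```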
$$\tilde{X} = L_{\text{med}} + \left[ \frac{\frac{n}{2} - F_{i-1}}{f_{\text{med}}} \right] \cdot c_{\text{med}}$$

$L_{\text{med}}$: Lower limit of the median class
$f_{\text{med}}$: Frequency of the median class
$F_{i-1}$: Cumulative frequency preceding the median class
$c_{\text{med}}$: Width of the median class
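The interpolation can be sketched as follows. The code assumes the median class is the first class whose cumulative frequency reaches $n/2$, which is the usual convention. The table and the function name are hypothetical.

```python
from itertools import accumulate


def grouped_median(limits, freqs):
    """Median of grouped data by linear interpolation inside the median class."""
    n = sum(freqs)
    cumulative = list(accumulate(freqs))
    # Median class: first class whose cumulative frequency reaches n / 2.
    i = next(j for j, F in enumerate(cumulative) if F >= n / 2)
    lower, upper = limits[i]
    preceding = cumulative[i - 1] if i > 0 else 0  # F_{i-1}
    width = upper - lower                          # c_med
    return lower + (n / 2 - preceding) / freqs[i] * width


# Hypothetical frequency table: cumulative frequencies are 5, 13, 25, 30.
limits = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]

# n / 2 = 15 falls in the class 20-30, so the median is 20 + (15 - 13) / 12 * 10.
print(grouped_median(limits, freqs))  # roughly 21.67
```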
$$Mo = L_{\text{Mo}} + \left[ \frac{\Delta_1}{\Delta_1 + \Delta_2} \right] \cdot c_{\text{Mo}}$$

$\Delta_1 = f_{\text{Mo}} - f_{\text{Mo}-1}$
$\Delta_2 = f_{\text{Mo}} - f_{\text{Mo}+1}$
$L_{\text{Mo}}$: Lower limit of the modal class
$$R = L_{\text{sup, last}} - L_{\text{inf, first}}$$

$$S^2 = \frac{1}{n} \sum_{i=1}^{m} f_i (x_i - \overline{X})^2 = \frac{\sum_{i=1}^{m} f_i x_i^2}{n} - \overline{X}^2$$

$$S = \sqrt{S^2}$$

$$CV = \frac{S}{\overline{X}}$$

$$\alpha_3 = \frac{\frac{1}{n} \sum_{i=1}^{m} f_i (x_i - \overline{X})^3}{S^3}$$

$$\alpha_4 = \frac{\frac{1}{n} \sum_{i=1}^{m} f_i (x_i - \overline{X})^4}{S^4}$$
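The grouped dispersion and shape measures all weight powers of the deviations of the class marks by $f_i$, so they can be computed together. The sketch below assumes a hypothetical table stored as `(lower, upper)` limit pairs. The function name and the dictionary keys are illustrative only.

```python
import math


def grouped_summary(limits, freqs):
    """Range, variance, standard deviation, CV, skewness, and kurtosis for grouped data."""
    n = sum(freqs)
    marks = [(lower + upper) / 2 for lower, upper in limits]
    mean = sum(f * x for f, x in zip(freqs, marks)) / n

    def moment(r):
        # r-th central moment, with each class mark weighted by its frequency.
        return sum(f * (x - mean) ** r for f, x in zip(freqs, marks)) / n

    var = moment(2)
    sd = math.sqrt(var)
    return {
        "range": limits[-1][1] - limits[0][0],  # upper limit of last class minus lower limit of first
        "variance": var,
        "std": sd,
        "cv": sd / mean,  # undefined when the mean is 0; not handled in this sketch
        "skewness": moment(3) / sd ** 3,
        "kurtosis": moment(4) / sd ** 4,
    }


# Hypothetical frequency table with n = 30 observations.
limits = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]

summary = grouped_summary(limits, freqs)
print(summary["range"])     # 40
print(summary["variance"])  # equals 15550 / 30 - (620 / 30) ** 2, roughly 91.22
```

The variance is computed from the definition. Comparing it with the right-hand side of the variance identity, as in the final comment, is a quick consistency check.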
$$P_k = L_k + \left[ \frac{n \cdot k - F_{i-1}}{f_k} \right] \cdot c_k$$

where:
$P_k$: Desired fractile
$L_k$: Lower limit of the fractile class
$k$: Corresponding proportion (e.g., 0.25 for $Q_1$)
$f_k$: Frequency of the fractile class
$F_{i-1}$: Preceding cumulative frequency
$c_k$: Width of the fractile class
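The general fractile formula can be sketched as a single function of the proportion $k$. As with the median, the code assumes the fractile class is the first class whose cumulative frequency reaches $n \cdot k$. The table and the function name are hypothetical.

```python
from itertools import accumulate


def grouped_fractile(limits, freqs, k):
    """Fractile P_k of grouped data for a proportion 0 < k < 1."""
    if not 0 < k < 1:
        raise ValueError("k must be strictly between 0 and 1")
    n = sum(freqs)
    target = n * k
    cumulative = list(accumulate(freqs))
    # Fractile class: first class whose cumulative frequency reaches n * k.
    i = next(j for j, F in enumerate(cumulative) if F >= target)
    lower, upper = limits[i]
    preceding = cumulative[i - 1] if i > 0 else 0  # F_{i-1}
    return lower + (target - preceding) / freqs[i] * (upper - lower)


# Hypothetical frequency table: cumulative frequencies are 5, 13, 25, 30.
limits = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]

print(grouped_fractile(limits, freqs, 0.25))  # Q1 = 10 + (7.5 - 5) / 8 * 10 = 13.125
print(grouped_fractile(limits, freqs, 0.50))  # median, roughly 21.67
print(grouped_fractile(limits, freqs, 0.75))  # Q3 = 20 + (22.5 - 13) / 12 * 10
```

With $k = 0.5$ the expression reduces to the grouped median formula, since $n \cdot k = n/2$.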
| Type | Symbol | Proportion ($k$) | Example |
|---|---|---|---|
| Quartiles | $Q_1, Q_2, Q_3$ | 0.25, 0.50, 0.75 | $Q_3$: $k = 0.75$ |
| Deciles | $D_1, D_2, \ldots, D_9$ | 0.10, 0.20, …, 0.90 | $D_5$ = Median |
| Percentiles | $P_1, P_2, \ldots, P_{99}$ | 0.01, 0.02, …, 0.99 | $P_{90}$: $k = 0.90$ |
Important relationships:
$Q_2 = D_5 = P_{50}$ = Median
$D_1 = P_{10}$, $D_9 = P_{90}$
Grouped data: The mean, variance, skewness, and kurtosis formulas use the class mark (midpoint) $x_i$ as the representative value of each interval.
Median and fractiles: Their calculation first requires identifying the corresponding class by analyzing the cumulative frequencies.
Mode: A distribution can be unimodal, bimodal, or multimodal. For grouped data, the mode is estimated by interpolation within the modal class.
Interpreting the coefficient of variation (expressed as a percentage, $CV \times 100\%$):
$CV < 15\%$: Low relative dispersion
$15\% \leq CV \leq 30\%$: Moderate dispersion
$CV > 30\%$: High relative dispersion
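These thresholds translate directly into a small classifier. The sketch below assumes the coefficient of variation is passed as a ratio ($S / \overline{X}$, not already multiplied by 100) and that the mean is positive. The function name and the labels are illustrative.

```python
def classify_cv(cv):
    """Label the relative dispersion for a CV given as a ratio (0.20 means 20%)."""
    # Compare against ratios directly to avoid rounding surprises from cv * 100.
    if cv < 0.15:
        return "low"
    if cv <= 0.30:
        return "moderate"
    return "high"


print(classify_cv(0.10))  # low
print(classify_cv(0.15))  # moderate (the 15% boundary belongs to the middle band)
print(classify_cv(0.40))  # high
```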
Units of measurement:
Variance is expressed in the original units squared
Standard deviation is expressed in the original units
The coefficients of variation, skewness, and kurtosis are dimensionless