### Chebyshev Inequality with estimated Mean and Variance

```Chebyshev Inequality
with estimated Mean and
Variance
John G. Saw, Mark C.K. Yang, Tse Chin Mo
The American Statistician, 1984
Coffeetalk, 29 Jan 2014
woensdag 29 januari 2014
Chebyshev Inequality
• The inequality
1
P [|X − µ| > λσ] ≤ 2
λ
• Holds for finite variances
• Useful to check that values far away from the mean are
very unlikely:
pX
µ
1
λ2
λσ
2
woensdag 29 januari 2014
Chebyshev Ineq. on estimated vars
• For Chebychev
1
P [|X − µ| > λσ] ≤ 2
λ
you need the true µ and σ .
¯ and Q (estimated on n points),
• If you have estimates X
the paper shows:
2
2
g
(nλ
/(n
−
1
+
λ
))
n+1
¯ > λQ] ≤
P [|X − X|
n+1
• Uhm, what is gn+1 (.) ?
3
woensdag 29 januari 2014
The function g_m
• Define integer n ≥ 2 , real value x ≥ 1
• Define integer m = n + 1
• Define integer ν = �m/x�


if ν is even
ν
2
gm (x) = ν
if ν is odd and x < a


2
ν − 1 if ν is odd and x ≥ a
• where:
m(m − ν)
a =
1 + ν(m − ν)
2
• Huh?!?!?!
4
woensdag 29 januari 2014
The function g_m
• Ok, just show it, for n=10, m=11:
g
11
10
9
8
g11 (x)
7
gm
6
5
4
3
2
1
0
0
5
10
x
x
15
5
woensdag 29 januari 2014
And the inequality...
• And we used it in:
2
2
g
(nλ
/(n
−
1
+
λ
))
n+1
¯ > λQ] ≤
P [|X − X|
n+1
bound on P(|.|>h Q) for n=10
1
0.9
• Gives
(n=10):
0.8
bound on P(|.|>h Q)
0.7
0.6
0.5
0.4
P [|...| > λQ]
0.3
0.2
0.1
0
1
1.5
2
2.5
3
h
3.5
4
4.5
5
λ
6
woensdag 29 januari 2014
And the inequality...
• And we used it in:
2
2
g
(nλ
/(n
−
1
+
λ
))
n+1
¯ > λQ] ≤
P [|X − X|
n+1
bound on P(|.|>h Q) for n=10
1
0.9
• Gives
(n=10):
0.8
bound on P(|.|>h Q)
0.7
0.6
the prob.
does not go
to 0...
0.5
0.4
P [|...| > λQ]
0.3
0.2
0.1
0
1
1.5
2
2.5
3
h
3.5
4
4.5
5
λ
6
woensdag 29 januari 2014
But for larger and larger n...
• For larger n it becomes standard Chebyshev:
bound on P(|.|>h Q) for n=100
bound on P(|.|>h Q) for n=1000
1
1
0.9
0.9
n = 100
0.8
0.7
bound on P(|.|>h Q)
bound on P(|.|>h Q)
0.7
0.6
0.5
0.4
0.6
0.5
0.4
0.3
0.3
0.2
0.2
0.1
0.1
0
n = 1000
0.8
1
1.5
2
2.5
3
h
3.5
4
4.5
5
0
1
1.5
2
2.5
3
h
3.5
4
4.5
5
7
woensdag 29 januari 2014
Limit for large n...
• If you ignore some details (like rounding, and even/odd
distinction) we sort-of have:
n+1
gn+1 (x) ≈
x
gn+1 (nλ2 /(n − 1 + λ2 ))
1
n+1
≈
• So:
n+1
n + 1 nλ2 /(n − 1 + λ2 )
n − 1 + λ2
n−1 1
1
=
=
+
2
2
nλ
n λ
n
• Hurray:
1
n−1 1
1
+
lim (
)
=
n→∞
n λ2
n
λ2
8
woensdag 29 januari 2014
Chebyshev on training sample
• It is possible!
2
2
gn+1 (nλ /(n − 1 + λ ))
¯
P [|X − X| > λQ] ≤
n+1
• The proof is quite hard (but only one page long...)
• The probability for λ → ∞ does not go to zero!
• I wanted to use to find out:
how many samples n do I need to estimate an average
{error, 1-NN distance, ...}?
(But apparently
9
woensdag 29 januari 2014
```